From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id C714A1FF165
	for <inbox@lore.proxmox.com>; Thu, 22 May 2025 08:56:28 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id C96812EB78;
	Thu, 22 May 2025 08:56:29 +0200 (CEST)
Date: Thu, 22 May 2025 09:55:49 +0300
To: =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <mailman.538.1747833190.394.pve-devel@lists.proxmox.com>
	<1283184248.17536.1747895442851@webmail.proxmox.com>
In-Reply-To: <1283184248.17536.1747895442851@webmail.proxmox.com>
MIME-Version: 1.0
Message-ID: <mailman.548.1747896989.394.pve-devel@lists.proxmox.com>
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Post: <mailto:pve-devel@lists.proxmox.com>
From: Denis Kanchev via pve-devel <pve-devel@lists.proxmox.com>
Precedence: list
Cc: Denis Kanchev <denis.kanchev@storpool.com>
X-Mailman-Version: 2.1.29
X-BeenThere: pve-devel@lists.proxmox.com
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>,
	<mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>,
	<mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
Subject: Re: [pve-devel] PVE child process behavior question
Content-Type: multipart/mixed; boundary="===============3495916773310996226=="
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

--===============3495916773310996226==
Content-Type: message/rfc822
Content-Disposition: inline

Return-Path: <denis.kanchev@storpool.com>
Message-ID: <857cbd6c-6866-417d-a71f-f5b5297bf09c@storpool.com>
Date: Thu, 22 May 2025 09:55:49 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [pve-devel] PVE child process behavior question
To: =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <mailman.538.1747833190.394.pve-devel@lists.proxmox.com>
	<1283184248.17536.1747895442851@webmail.proxmox.com>
Content-Language: en-US
From: Denis Kanchev <denis.kanchev@storpool.com>
In-Reply-To: <1283184248.17536.1747895442851@webmail.proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

The parent of the storage migration process gets killed.
It seems that this is the desired behavior and, as far as I understand it,
the child worker is detached from the parent and has nothing to do with it
after spawning.

Thanks for the information, it was very helpful.

On 22.05.25 at 9:30, Fabian Grünbichler wrote:
>> Denis Kanchev via pve-devel <pve-devel@lists.proxmox.com> wrote on 21.05.2025 15:13 CEST:
>> Hello,
>>
>> We had an issue with a customer migrating a VM between nodes using our
>> shared storage solution.
>>
>> On the target host the OOM killer killed the main migration process, but
>> the child process (which actually performs the migration) kept on
>> working, which we did not expect, and that caused some issues.
> could you be more specific which process got killed?
>
> when you do a migration, a task worker is forked and its UPID is returned
> to the caller for further querying.
>
> as part of the migration, other processes get spawned:
> - ssh tunnel to the target node
> - storage migration processes (on both nodes)
> - VM state management CLI calls (on the target node)
>
> which of those is the "main migration process"? which is the child process?
>
>> This leads us to the broader question - after a request is submitted,
>> the parent can be terminated, and not return a response to the client,
>> while the work is being done, and the request can be wrongly retried or
>> considered unfinished.
> the parent should return almost immediately, as all it is doing at that
> point is returning the UPID to the client (the process then continues to
> do other work though, but that is no longer related to this task).
>
> the only exception is for "sync" task workers, like in a CLI context,
> where the "parent" has no other work to do, so it waits for the child/task
> to finish and prints its output while doing so, and some "bulk action"
> style API calls that fork multiple task workers and poll them themselves.
>
>> Should the child processes terminate together with the parent to guard
>> against this, or is this expected behavior?
> the parent (API worker process) and child (task worker process) have no
> direct relation after the task worker has been spawned.
>
>> Here is an example patch to do this:
>>
>> diff --git a/src/PVE/RESTEnvironment.pm b/src/PVE/RESTEnvironment.pm
>> index bfde7e6..744fffc 100644
>> --- a/src/PVE/RESTEnvironment.pm
>> +++ b/src/PVE/RESTEnvironment.pm
>> @@ -13,8 +13,9 @@ use Fcntl qw(:flock);
>>  use IO::File;
>>  use IO::Handle;
>>  use IO::Select;
>> -use POSIX qw(:sys_wait_h EINTR);
>> +use POSIX qw(:sys_wait_h EINTR SIGKILL);
>>  use AnyEvent;
>> +use Linux::Prctl qw(set_pdeathsig);
>>
>>  use PVE::Exception qw(raise raise_perm_exc);
>>  use PVE::INotify;
>> @@ -549,6 +550,9 @@ sub fork_worker {
>>  POSIX::setsid();
>>  }
>>
>> + # The signal that the calling process will get when its parent dies
>> + set_pdeathsig(SIGKILL);
> that has weird implications with regards to threads, so I don't think that
> is a good idea..
>
>> +
>>  POSIX::close ($psync[0]);
>>  POSIX::close ($ctrlfd[0]) if $sync;
>>  POSIX::close ($csync[1]);

--===============3495916773310996226==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

--===============3495916773310996226==--