From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id C714A1FF165
	for <inbox@lore.proxmox.com>; Thu, 22 May 2025 08:56:28 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id C96812EB78;
	Thu, 22 May 2025 08:56:29 +0200 (CEST)
Date: Thu, 22 May 2025 09:55:49 +0300
To: =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>,
 Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <mailman.538.1747833190.394.pve-devel@lists.proxmox.com>
 <1283184248.17536.1747895442851@webmail.proxmox.com>
In-Reply-To: <1283184248.17536.1747895442851@webmail.proxmox.com>
MIME-Version: 1.0
Message-ID: <mailman.548.1747896989.394.pve-devel@lists.proxmox.com>
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Post: <mailto:pve-devel@lists.proxmox.com>
From: Denis Kanchev via pve-devel <pve-devel@lists.proxmox.com>
Precedence: list
Cc: Denis Kanchev <denis.kanchev@storpool.com>
X-Mailman-Version: 2.1.29
X-BeenThere: pve-devel@lists.proxmox.com
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
Subject: Re: [pve-devel] PVE child process behavior question
Content-Type: multipart/mixed; boundary="===============3495916773310996226=="
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

--===============3495916773310996226==
Content-Type: message/rfc822
Content-Disposition: inline

Return-Path: <denis.kanchev@storpool.com>
X-Original-To: pve-devel@lists.proxmox.com
Delivered-To: pve-devel@lists.proxmox.com
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by lists.proxmox.com (Postfix) with ESMTPS id 9B9B4D04DB
	for <pve-devel@lists.proxmox.com>; Thu, 22 May 2025 08:56:28 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 7311D2EADB
	for <pve-devel@lists.proxmox.com>; Thu, 22 May 2025 08:55:58 +0200 (CEST)
Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com [IPv6:2a00:1450:4864:20::536])
	(using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by firstgate.proxmox.com (Proxmox) with ESMTPS
	for <pve-devel@lists.proxmox.com>; Thu, 22 May 2025 08:55:57 +0200 (CEST)
Received: by mail-ed1-x536.google.com with SMTP id 4fb4d7f45d1cf-601dd3dfc1fso8873263a12.0
        for <pve-devel@lists.proxmox.com>; Wed, 21 May 2025 23:55:57 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=storpool.com; s=google; t=1747896951; x=1748501751; darn=lists.proxmox.com;
        h=content-transfer-encoding:in-reply-to:from:content-language
         :references:to:subject:user-agent:mime-version:date:message-id:from
         :to:cc:subject:date:message-id:reply-to;
        bh=GC2VBCyGc6xnkavqIxYMsFXtDqkBbtcEX8IEDSBB6C8=;
        b=QPRuQMz4s6pLj40pHwgMUN8Y+7RN3Z/fMMIeZwIcbg0NP7mPq27pZ0pyQPxC1IFfQG
         rL/u7OclV8g3sztekAmtgUsweb8kOy+nifm/QRPezSZF7auxUweHBVi333CA/vgKfa9l
         foNfKahGCOeZ5BQjSBbyq/UEP8DBx7RNwOp1gNgHA8JEVWOGZepEAousJoayGdl9TynU
         hELTOTKMtP33ldtl6LrvkuThimZWSbnZkGjJpsJ8+MMfkAgmAgQ4umtjp8O9UVPgeJpI
         qcQT+rXL4XY3GnTYtHC1DhisJ9HUhRm/jVeAvkOGEp22SzH14Oo4FMI9GWl2npeL8fOz
         RIrA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1747896951; x=1748501751;
        h=content-transfer-encoding:in-reply-to:from:content-language
         :references:to:subject:user-agent:mime-version:date:message-id
         :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=GC2VBCyGc6xnkavqIxYMsFXtDqkBbtcEX8IEDSBB6C8=;
        b=SkfuxVfLUo8Et0kwRUz60wssafASqoEbh+O+hQp+2yDHrNR2JFCC9wzzkrEEJeuzQt
         635Mv8Bfw9Qnp+Nbx5Mqx887iJUH6rkIkJYko/+wKnuENYT7azdVS0Y6Vi7OXDMEg+H0
         Oc7hfEwYM12hNPYZst8JdirW0/w91ZRkNy81rwy6QsYmiGrnbvMdGzt4yJC1lUQ6FxhO
         EkyioGMbtASBT/apFnXObX2SUJOcq1PpD2U8TAMMOFoTLEUr9b4HEqeh0TzYKtjoGXI9
         RMllKrgEgAIhTfQB58t2NgqHvCAygJp51nM9MOU8C+nz4e+sBxfK4oyoPg3LzQK5gWQ1
         Vdwg==
X-Forwarded-Encrypted: i=1; AJvYcCXX/kcAUakvSHBaPJS1ifciW0ot9x1Nha7DGSkW/aJdFijwk8kEfL4o8N32FKxlPQHXzKE3t3drRoA=@lists.proxmox.com
X-Gm-Message-State: AOJu0YxjgIIMVRNIisHV+lEUAZIB3f7U/H888imJPZ8RB4HG23hlNslt
	C6nRfq1Ksm+UdIOPpGe2MWK3frZjz5twEYdo65SKJFFpgShG835OtAbILO63hVK9/N8=
X-Gm-Gg: ASbGncupwJhmmKNZOWsWJDozbX5uk1l+dc6gx0bjwNQ9jZQtSbYahUd/pqKW3H29h5e
	6zAE1fXFaL9GmNqLJA5dQSsTAoB+NoK7PCfeNqmgpyl0XPXcHqupIlf/fxBtTL+L0Oz11OdBpUG
	MUxXtjrVa670jx+Ot262ovMufIu51kcJ1ZiMS+R7u0G3/XbTG7hiaQlduOAniSfBxdt7NI5+s4t
	dzijbM1ccjh1ZadVxqCfnkGs9lzkxKLcKz0W0QG7lRb/s1EHl0eCvl093xEQfp7OHE2kxjefqbs
	IPJzIP0wrw5yqeWaTmnPzSN7DQb64mHmahsf3aYueLUJWXWCq53TOqRuKuTNKb3DAyzA30nx1X9
	txK4i6IHHXV+YnhNv3eQ=
X-Google-Smtp-Source: AGHT+IF6HRPix5uv8LgpezxGb6cEo43PIItVVw8nKsdXG6KPYvllo3+4OHWeJx7IgTpWmkC4GDKoUQ==
X-Received: by 2002:a05:6402:2706:b0:5ff:ef95:333a with SMTP id 4fb4d7f45d1cf-6011409b2d2mr21131951a12.13.1747896950976;
        Wed, 21 May 2025 23:55:50 -0700 (PDT)
Received: from [192.168.0.165] (79-100-232-190.ip.btc-net.bg. [79.100.232.190])
        by smtp.gmail.com with ESMTPSA id 4fb4d7f45d1cf-6005ae3918esm9991563a12.68.2025.05.21.23.55.50
        (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
        Wed, 21 May 2025 23:55:50 -0700 (PDT)
Message-ID: <857cbd6c-6866-417d-a71f-f5b5297bf09c@storpool.com>
Date: Thu, 22 May 2025 09:55:49 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [pve-devel] PVE child process behavior question
To: =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>,
 Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <mailman.538.1747833190.394.pve-devel@lists.proxmox.com>
 <1283184248.17536.1747895442851@webmail.proxmox.com>
Content-Language: en-US
From: Denis Kanchev <denis.kanchev@storpool.com>
In-Reply-To: <1283184248.17536.1747895442851@webmail.proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
	AWL                     0.001 Adjusted score from AWL reputation of From: address
	BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
	DKIM_SIGNED               0.1 Message has a DKIM or DK signature, not necessarily valid
	DKIM_VALID               -0.1 Message has at least one valid DKIM or DK signature
	DKIM_VALID_AU            -0.1 Message has a valid DKIM or DK signature from author's domain
	DKIM_VALID_EF            -0.1 Message has a valid DKIM or DK signature from envelope-from domain
	DMARC_PASS               -0.1 DMARC pass policy
	RCVD_IN_DNSWL_NONE     -0.0001 Sender listed at https://www.dnswl.org/, no trust
	SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
	SPF_PASS               -0.001 SPF: sender matches SPF record
	URIBL_BLOCKED           0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked.  See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [storpool.com,restenvironment.pm]

The parent of the storage migration process gets killed.

It seems that this is the desired behavior and, as far as I understand
it, the child worker is detached from the parent and has nothing to do
with it after spawning.
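
To double-check that understanding, here is a minimal standalone sketch (not PVE code). The helper name worker_survives_parent_kill and the pipe-based handshake are made up for illustration. It assumes a Linux host and that the task worker only calls POSIX::setsid() after the fork, as the fork_worker context in the patch suggests. A stand-in "API worker" forks a stand-in "task worker", the API worker is then killed with SIGKILL (roughly what the OOM killer does), and the script checks whether the task worker still finishes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper, not part of PVE: returns 1 if a forked worker
# finishes its job even though its parent was killed with SIGKILL.
sub worker_survives_parent_kill {
    pipe(my $rd, my $wr) or die "pipe failed: $!\n";

    my $parent = fork();
    die "fork failed: $!\n" if !defined($parent);

    if ($parent == 0) {
        # Stand-in for the API worker process.
        close($rd);
        my $worker = fork();
        POSIX::_exit(1) if !defined($worker);

        if ($worker == 0) {
            # Stand-in for the task worker: move into its own session.
            POSIX::setsid();
            syswrite($wr, "started\n");
            sleep(1);    # still busy while the parent gets killed
            syswrite($wr, "done\n");
            POSIX::_exit(0);
        }

        sleep(30);       # parent carries on with unrelated work
        POSIX::_exit(0);
    }

    close($wr);
    my $started = <$rd>;      # wait until the worker is running
    kill('KILL', $parent);    # simulate the OOM killer hitting the parent
    waitpid($parent, 0);
    my $done = <$rd>;         # undef (EOF) if the worker died as well
    close($rd);

    return (defined($started) && defined($done) && $done eq "done\n") ? 1 : 0;
}

print "worker survived parent kill: ",
    (worker_survives_parent_kill() ? "yes" : "no"), "\n";
```

On a typical Linux system I would expect this to report that the worker survived: the kernel does not signal children when their parent dies unless a parent-death signal was requested via prctl(PR_SET_PDEATHSIG), which is what the example patch tried to add.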

Thanks for the information, it was very helpful.

On 22.05.25 at 9:30, Fabian Grünbichler wrote:
>> Denis Kanchev via pve-devel <pve-devel@lists.proxmox.com> wrote on 21.05.2025 15:13 CEST:
>> Hello,
>>
>> We had an issue with a customer migrating a VM between nodes using our
>> shared storage solution.
>>
>> On the target host the OOM killer killed the main migration process, but
>> the child process (which actually performs the migration) kept on
>> working, which we did not expect, and that caused some issues.
> could you be more specific which process got killed?
>
> when you do a migration, a task worker is forked and its UPID is returned
> to the caller for further querying.
>
> as part of the migration, other processes get spawned:
> - ssh tunnel to the target node
> - storage migration processes (on both nodes)
> - VM state management CLI calls (on the target node)
>
> which of those is the "main migration process"? which is the child process?
>
>> This leads us to the broader question - after a request is submitted,
>> the parent can be terminated, and not return a response to the client,
>> while the work is being done, and the request can be wrongly retried or
>> considered unfinished.
> the parent should return almost immediately, as all it is doing at that
> point is returning the UPID to the client (the process then continues to
> do other work though, but that is no longer related to this task).
>
> the only exception is for "sync" task workers, like in a CLI context,
> where the "parent" has no other work to do, so it waits for the child/task
> to finish and prints its output while doing so, and some "bulk action"
> style API calls that fork multiple task workers and poll them themselves.
>   
>> Should the child processes terminate together with the parent to guard
>> against this, or is this expected behavior?
> the parent (API worker process) and child (task worker process) have no
> direct relation after the task worker has been spawned.
>
>> Here is an example patch to do this:
>>
>>
>> diff --git a/src/PVE/RESTEnvironment.pm b/src/PVE/RESTEnvironment.pm
>> index bfde7e6..744fffc 100644
>> --- a/src/PVE/RESTEnvironment.pm
>> +++ b/src/PVE/RESTEnvironment.pm
>> @@ -13,8 +13,9 @@ use Fcntl qw(:flock);
>>    use IO::File;
>>    use IO::Handle;
>>    use IO::Select;
>> -use POSIX qw(:sys_wait_h EINTR);
>> +use POSIX qw(:sys_wait_h EINTR SIGKILL);
>>    use AnyEvent;
>> +use Linux::Prctl qw(set_pdeathsig);
>>
>>    use PVE::Exception qw(raise raise_perm_exc);
>>    use PVE::INotify;
>> @@ -549,6 +550,9 @@ sub fork_worker {
>> POSIX::setsid();
>>       }
>>
>> +   # The signal that the calling process will get when its parent dies
>> +   set_pdeathsig(SIGKILL);
> that has weird implications with regards to threads, so I don't think that
> is a good idea..
>
>> +
>> POSIX::close ($psync[0]);
>> POSIX::close ($ctrlfd[0]) if $sync;
>> POSIX::close ($csync[1]);


--===============3495916773310996226==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

--===============3495916773310996226==--