* [pve-devel] Consistency in volume deletion in process of concurrent VM deletion
From: Andrei Perepiolkin <andrei.perepiolkin@open-e.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: [pve-devel] Consistency in volume deletion in process of concurrent VM deletion
Date: Tue, 21 Oct 2025 11:33:27 -0400
Message-ID: <7cf85c82-28d9-4883-9826-39e60bfa3450@open-e.com>
Hi Proxmox Community,
There may be a consistency problem with Proxmox VM deletion: if Proxmox
receives multiple concurrent VM deletion requests, where each VM has
multiple disks located on shared storage, the deletion process may fail
or hang when attempting to acquire the storage lock
(https://github.com/proxmox/pve-storage/blob/master/src/PVE/Storage.pm#L1196C1-L1209C7).
...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
trying to acquire cfs lock 'storage-jdss-Pool-2' ...
cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
...
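For what it's worth, a rough way to watch this contention while
reproducing (this assumes pmxcfs exposes these cluster-wide locks as
directories under /etc/pve/priv/lock/, which is my understanding rather
than something I verified in the source):

  # in a second shell on the node running the destroys
  watch -n1 'ls /etc/pve/priv/lock/'

Each concurrent destroy task queues on the same 'storage-jdss-Pool-2'
lock, and tasks that wait longer than the cfs lock timeout fail with
the errors shown above.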
Eventually, the VM configuration files in /etc/pve are removed, but some
VM disks may remain.
Additionally, the Web UI shows all deletions as successful, even though
some disks were not deleted.
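A quick way to check for such leftovers (a sketch using the VMIDs and
storage from the reproduction further down; it assumes the usual
vm-<vmid>-disk-N volume naming and that the VMs live on the node where
this runs):

  for i in `seq 401 410`; do
      if [ ! -f /etc/pve/qemu-server/$i.conf ]; then
          pvesm list jdss-Pool-2 | grep "vm-$i-"
      fi
  done

Any output here is a volume whose owning VM config is already gone even
though the destroy task was reported as successful.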
In my opinion, a VM should either be deleted completely—including all
dependent resources—or the deletion should fail, leaving the VM
configuration file with an updated state.
I'm reproducing this by:

for i in `seq 401 420`; do qm clone 104 $i --name "win-$i" --full --storage jdss-Pool-2; done

for i in `seq 401 410`; do qm destroy $i --destroy-unreferenced-disks 1 --purge 1 & done
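For comparison, I expect the same destroys run one at a time to avoid
the contention entirely, since each 'qm destroy' then holds the storage
lock alone; a minimal sequential variant of the loop above:

  for i in `seq 401 410`; do
      qm destroy $i --destroy-unreferenced-disks 1 --purge 1 || echo "destroy of VM $i failed"
  done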
I have to note that the SSH session I use to run the 'qm destroy'
commands gets terminated by Proxmox.

I've also filed this as a bug at:
https://bugzilla.proxmox.com/show_bug.cgi?id=6957

Is this a bug, and will it be addressed in the near future?
Best regards,
Andrei Perepiolkin
* Re: [pve-devel] Consistency in volume deletion in process of concurrent VM deletion
From: Fabian Grünbichler @ 2025-10-22 9:49 UTC
To: Proxmox VE development discussion
On October 21, 2025 5:33 pm, Andrei Perepiolkin via pve-devel wrote:
> Hi Proxmox Community,
>
> There might be a potential consistency problem with Proxmox vm deletion.
>
> If Proxmox receives multiple concurrent VM deletion requests, where each
> VM has multiple disks located on shared storage.
>
> The deletion process may fail or hang when attempting to acquire the
> storage
> lock(https://github.com/proxmox/pve-storage/blob/master/src/PVE/Storage.pm#L1196C1-L1209C7).
>
> ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
> ...
>
> Eventually, the VM configuration files in /etc/pve are removed, but some
> VM disks may remain.
>
> Additionally, the Web UI shows all deletions as successful, even though
> some disks were not deleted.
>
> In my opinion, a VM should either be deleted completely—including all
> dependent resources—or the deletion should fail, leaving the VM
> configuration file with an updated state.
the underlying issue is that the scope of the lock taken for certain
storage operations is very big for shared storages. we could probably
reduce it to a more meaningful level for most such storages:
https://bugzilla.proxmox.com/show_bug.cgi?id=1962
but the error handling might also be lacking in this case, would
have to double-check.
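to make the scope concrete: any two volume operations on the same shared
storage currently serialize on that one storage-wide cfs lock, even when
they touch unrelated volumes. a rough illustration (storage name, VMIDs
and volume names below are just placeholders):

  pvesm alloc jdss-Pool-2 901 vm-901-disk-0 1G
  pvesm alloc jdss-Pool-2 902 vm-902-disk-0 1G
  # both frees target different volumes, but the second one still has to
  # wait for the storage-wide lock held by the first
  pvesm free jdss-Pool-2:vm-901-disk-0 &
  pvesm free jdss-Pool-2:vm-902-disk-0 &
  wait

with per-volume (or otherwise finer-grained) locking, those could run in
parallel instead of queueing.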
>
> I'm reproducing this by:
>
> for i in `seq 401 420` ; do qm clone 104 $i --name "win-$i" --full
> --storage jdss-Pool-2 ; done;
>
> for i in `seq 401 410` ; do qm destroy $i
> --destroy-unreferenced-disks 1 --purge 1 & done ;
>
>
> I have to note that the SSH session I use to run the 'qm destroy'
> commands gets terminated by Proxmox.
that seems unexpected, are you sure this is caused by PVE?
> I've also filed this as a bug at:
> https://bugzilla.proxmox.com/show_bug.cgi?id=6957
it would be better to either send a mail or file a bug, to not risk
splitting the discussion..
> Is this a bug and will it be addressed in near future?
nobody picked up the work regarding the lock granularity, but it would
be a nice improvement IMHO!
Fabian
* Re: [pve-devel] Consistency in volume deletion in process of concurrent VM deletion
From: Andrei Perepiolkin <andrei.perepiolkin@open-e.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>, "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] Consistency in volume deletion in process of concurrent VM deletion
Date: Wed, 22 Oct 2025 10:38:45 -0400
Message-ID: <e14b6374-9460-4655-8bd5-55bd90245919@open-e.com>
Hi Fabian,
I can try to prototype a proof-of-concept solution for the lock
granularity issue. Once that is done, the question of the SSH session
termination should also become clearer.

I'm new to mail-based contribution and to the Proxmox code itself, so I
will probably have questions on various topics. Should I send these
questions via email, as messages in Bugzilla, or via some other tool?
Best regards,
Andrei Perepiolkin
On 10/22/25 05:49, Fabian Grünbichler wrote:
> On October 21, 2025 5:33 pm, Andrei Perepiolkin via pve-devel wrote:
>> Hi Proxmox Community,
>>
>> There might be a potential consistency problem with Proxmox vm deletion.
>>
>> If Proxmox receives multiple concurrent VM deletion requests, where each
>> VM has multiple disks located on shared storage.
>>
>> The deletion process may fail or hang when attempting to acquire the
>> storage
>> lock(https://github.com/proxmox/pve-storage/blob/master/src/PVE/Storage.pm#L1196C1-L1209C7).
>>
>> ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> trying to acquire cfs lock 'storage-jdss-Pool-2' ...
>> cfs-lock 'storage-jdss-Pool-2' error: got lock request timeout
>> ...
>>
>> Eventually, the VM configuration files in /etc/pve are removed, but some
>> VM disks may remain.
>>
>> Additionally, the Web UI shows all deletions as successful, even though
>> some disks were not deleted.
>>
>> In my opinion, a VM should either be deleted completely—including all
>> dependent resources—or the deletion should fail, leaving the VM
>> configuration file with an updated state.
> the underlying issue is that the scope of the lock taken for certain
> storage operations is very big for shared storages. we could probably
> reduce it to a more meaningful level for most such storages:
>
> https://bugzilla.proxmox.com/show_bug.cgi?id=1962
>
> but the error handling might also be lacking in this case, would
> have to double-check.
>
>> I'm reproducing this by:
>>
>> for i in `seq 401 420` ; do qm clone 104 $i --name "win-$i" --full
>> --storage jdss-Pool-2 ; done;
>>
>> for i in `seq 401 410` ; do qm destroy $i
>> --destroy-unreferenced-disks 1 --purge 1 & done ;
>>
>>
>> I have to note that the SSH session I use to run the 'qm destroy'
>> commands gets terminated by Proxmox.
> that seems unexpected, are you sure this is caused by PVE?
>
>> I've also filed this as a bug at:
>> https://bugzilla.proxmox.com/show_bug.cgi?id=6957
> it would be better to either send a mail or file a bug, to not risk
> splitting the discussion..
>
>> Is this a bug and will it be addressed in near future?
> nobody picked up the work regarding the lock granularity, but it would
> be a nice improvement IMHO!
>
> Fabian
>