public inbox for pve-user@lists.proxmox.com
* [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command
@ 2021-09-15  8:15 Marco Gaiarin
       [not found] ` <mailman.90.1631705429.440.pve-user@lists.proxmox.com>
  2021-09-15 13:21 ` Fabian Grünbichler
  0 siblings, 2 replies; 7+ messages in thread
From: Marco Gaiarin @ 2021-09-15  8:15 UTC (permalink / raw)
  To: pve-user


We are trying to move some VM disks from a cluster (PVE 5, LVM-thin
storage) to an NFS storage (a PVE 6 server running the standard Debian
buster NFS server), using QCOW2 as the destination image format.

We have successfully moved some smaller disks (200GB), but when we try to
move a 'big' disk, we get:

	Sep 14 22:48:18 pveod1 pvedaemon[31552]: <root@pam> starting task UPID:pveod1:00007BE2:A90E5224:61410A92:qmmove:100:root@pam:
	Sep 14 22:49:18 pveod1 pvedaemon[31552]: <root@pam> end task UPID:pveod1:00007BE2:A90E5224:61410A92:qmmove:100:root@pam: storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command

Only this log line appears; there are no kernel/NFS errors on the NFS
server or on the source machine.


I've tried to google this error, and 'nfs lock timeout', but nothing
relevant (to me) turned up.

Does anyone have any feedback? Thanks.

-- 
dott. Marco Gaiarin				        GNUPG Key ID: 240A3D66
  Associazione ``La Nostra Famiglia''          http://www.lanostrafamiglia.it/
  Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento (PN)
  marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797

		Donate your 5 PER MILLE to LA NOSTRA FAMIGLIA!
      http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
	(tax code 00307430132, category ONLUS or RICERCA SANITARIA)




* Re: [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command
       [not found] ` <mailman.90.1631705429.440.pve-user@lists.proxmox.com>
@ 2021-09-15 13:12   ` Marco Gaiarin
  0 siblings, 0 replies; 7+ messages in thread
From: Marco Gaiarin @ 2021-09-15 13:12 UTC (permalink / raw)
  To: pve-user

Hello, Stefan M. Radman via pve-user!
  On that day it was said...

> The error message says "error with cfs lock".
> If you google that you should get a lot of relevant information.

I've found some hits about CIFS (and I'm using NFS) and about permission
trouble (and, as just stated, I've successfully moved a 200GB disk to the
NFS storage; it is the 2TB one that doesn't move...).

So nothing seems relevant to me... if you have a direct hit, thanks.


* Re: [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command
  2021-09-15  8:15 [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command Marco Gaiarin
       [not found] ` <mailman.90.1631705429.440.pve-user@lists.proxmox.com>
@ 2021-09-15 13:21 ` Fabian Grünbichler
  2021-09-15 14:47   ` Marco Gaiarin
  1 sibling, 1 reply; 7+ messages in thread
From: Fabian Grünbichler @ 2021-09-15 13:21 UTC (permalink / raw)
  To: Proxmox VE user list

On September 15, 2021 10:15 am, Marco Gaiarin wrote:
> 
> We are trying to move some VM disks from a cluster (PVE 5, LVM-thin
> storage) to an NFS storage (a PVE 6 server running the standard Debian
> buster NFS server), using QCOW2 as the destination image format.
> 
> We have successfully moved some smaller disks (200GB), but when we try to
> move a 'big' disk, we get:
> 
> 	Sep 14 22:48:18 pveod1 pvedaemon[31552]: <root@pam> starting task UPID:pveod1:00007BE2:A90E5224:61410A92:qmmove:100:root@pam:
> 	Sep 14 22:49:18 pveod1 pvedaemon[31552]: <root@pam> end task UPID:pveod1:00007BE2:A90E5224:61410A92:qmmove:100:root@pam: storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command
> 
> Only this log line appears; there are no kernel/NFS errors on the NFS
> server or on the source machine.
> 
> 
> I've tried to google this error, and 'nfs lock timeout', but nothing
> relevant (to me) turned up.
> 
> Does anyone have any feedback? Thanks.

this is an issue with certain shared-storage operations in PVE - they have to 
happen under a pmxcfs lock, which has a hard timeout. if the operation 
takes too long, the lock runs into the timeout and the operation 
fails.
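
One way to see how close a given storage comes to the timeout is to time the
allocation by hand. The following is a minimal sketch, not part of PVE:
`time_create` and the `QEMU_IMG` override are made-up names for illustration,
and the mount point in the usage comment is an assumption. Only
`qemu-img create` with the qcow2 `preallocation` option is the real tool.

```shell
#!/bin/sh
# Hypothetical helper: time qcow2 image creation for one preallocation mode.
# Usage: time_create DIR SIZE MODE   (MODE is e.g. "metadata" or "off")
# Prints the elapsed wall-clock time in whole seconds.
time_create() {
    dir=$1 size=$2 mode=$3
    img="$dir/lock-timeout-test.qcow2"
    start=$(date +%s)
    # QEMU_IMG is an illustrative override; it defaults to the real qemu-img.
    "${QEMU_IMG:-qemu-img}" create -f qcow2 -o "preallocation=$mode" \
        "$img" "$size" >/dev/null || return 1
    end=$(date +%s)
    rm -f "$img"    # the test image is only needed for the measurement
    echo $((end - start))
}

# Example on a PVE node (the mount point is an assumption):
#   time_create /mnt/pve/nfs-scratch 2T metadata
#   time_create /mnt/pve/nfs-scratch 2T off
```

If the `metadata` run takes anywhere near the hard timeout for the intended
image size, a move to that storage is likely to fail in the same way.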

there has been some recent development to improve the situation:

 https://lists.proxmox.com/pipermail/pve-devel/2021-September/049879.html

but it hasn't been finalized yet.





* Re: [PVE-User] pve-user Digest, Vol 162, Issue 12
       [not found] <mailman.5.1631700001.9824.pve-user@lists.proxmox.com>
@ 2021-09-15 14:03 ` Christoph Weber
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Weber @ 2021-09-15 14:03 UTC (permalink / raw)
  To: 'pve-user@lists.proxmox.com'

> Message: 1
> Date: Wed, 15 Sep 2021 10:15:08 +0200
> From: Marco Gaiarin <gaio@sv.lnf.it>
> To: pve-user@pve.proxmox.com
> Subject: [PVE-User] storage migration failed: error with cfs lock
> 	'storage-nfs-scratch': unable to create image: got lock timeout -
> 	aborting command
> Message-ID: <20210915081508.GC3261@sv.lnf.it>


> We have successfully moved some smaller disks (200GB), but when we try to move a
> 'big' disk, we get:
> 
> 	Sep 14 22:48:18 pveod1 pvedaemon[31552]: <root@pam> starting
> task UPID:pveod1:00007BE2:A90E5224:61410A92:qmmove:100:root@pam:
> 	Sep 14 22:49:18 pveod1 pvedaemon[31552]: <root@pam> end task
> UPID:pveod1:00007BE2:A90E5224:61410A92:qmmove:100:root@pam:
> storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to
> create image: got lock timeout - aborting command

I think this thread might be relevant

  https://forum.proxmox.com/threads/error-with-cfs-lock-unable-to-create-image-got-lock-timeout-aborting-command.65786/

Quote:
>>> we have a hard timeout of 60s for any operation obtaining a cluster lock, which includes volume allocation on shared storages.
...
>>> your storage is simply too slow when allocating bigger images it seems. you need to manually allocate them, for example using qemu-img create or convert.






* Re: [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command
  2021-09-15 13:21 ` Fabian Grünbichler
@ 2021-09-15 14:47   ` Marco Gaiarin
  2021-09-16  7:26     ` Fabian Grünbichler
  0 siblings, 1 reply; 7+ messages in thread
From: Marco Gaiarin @ 2021-09-15 14:47 UTC (permalink / raw)
  To: Proxmox VE user list

Hello, Fabian Grünbichler!
  On that day it was said...

> this is an issue with certain shared-storage operations in PVE - they have to 
> happen under a pmxcfs lock, which has a hard timeout. if the operation 
> takes too long, the lock runs into the timeout and the operation 
> fails.

OK. Good to know. But...


Hello, Christoph Weber!
  On that day it was said...

> I think this thread might be relevant
>   https://forum.proxmox.com/threads/error-with-cfs-lock-unable-to-create-image-got-lock-timeout-aborting-command.65786/

...it seems I have exactly the same trouble. After some more tests, it
seems that the timeout does not happen if I use RAW, only with QCOW2; but
on the temporary NFS storage I don't have enough space for the RAW disk...

In that thread someone says:

	you can manually create the image (with qemu-img create) and then rescan to reference it as unused volume in the configuration

but I need to move the disk, not create it...


* Re: [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command
  2021-09-15 14:47   ` Marco Gaiarin
@ 2021-09-16  7:26     ` Fabian Grünbichler
  2021-09-16  7:46       ` Marco Gaiarin
  0 siblings, 1 reply; 7+ messages in thread
From: Fabian Grünbichler @ 2021-09-16  7:26 UTC (permalink / raw)
  To: Proxmox VE user list

On September 15, 2021 4:47 pm, Marco Gaiarin wrote:
> Hello, Fabian Grünbichler!
>   On that day it was said...
> 
>> this is an issue with certain shared-storage operations in PVE - they have to 
>> happen under a pmxcfs lock, which has a hard timeout. if the operation 
>> takes too long, the lock runs into the timeout and the operation 
>> fails.
> 
> OK. Good to know. But...
> 
> 
> Hello, Christoph Weber!
>   On that day it was said...
> 
>> I think this thread might be relevant
>>   https://forum.proxmox.com/threads/error-with-cfs-lock-unable-to-create-image-got-lock-timeout-aborting-command.65786/
> 
> ...it seems I have exactly the same trouble. After some more tests, it
> seems that the timeout does not happen if I use RAW, only with QCOW2; but
> on the temporary NFS storage I don't have enough space for the RAW disk...

the problem (as described in the patch I linked earlier) is that for 
qcow2, we currently always allocate the metadata for the qcow2 file. if 
the image file is big enough, and the storage slow enough, this can take 
too long. for raw there is no metadata (well there is, but it does not 
scale with the size of the file ;))

the patches allow selecting no pre-allocation for storages where this is 
an issue - it basically trades a small performance hit while the 
image file is being filled with data for less work when initially 
creating the image file.

> In that thread someone says:
> 
> 	you can manually create the image (with qemu-img create) and then rescan to reference it as unused volume in the configuration
> 
> but I need to move the disk, not create it...

a manual offline move would also be possible, it boils down to:
- create new volume (qemu-img create)
- qemu-img convert old volume to new volume
- change references in guest config to point to new volume
- delete old volume or add it back as unused to the guest config (the 
  latter happens automatically if you do a rescan)
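
The steps above can be sketched roughly as follows. This is an illustration
under stated assumptions, not a tested procedure: `offline_move` and the `RUN`
dry-run switch are made-up names, the paths, volume ID and `scsi0` in the usage
comment are examples only, and the guest is assumed to be shut down. The new
image must be at least as large as the old volume.

```shell
#!/bin/sh
# Sketch of a manual offline move (guest must be shut down).
# By default every command is only printed; set RUN= (empty) to execute.
RUN=${RUN-echo}

# offline_move VMID SRC DST SIZE VOLID BUS   (hypothetical helper)
#   VMID   guest ID
#   SRC    old volume as a block device (LVM-thin), read as raw
#   DST    path of the new qcow2 file on the NFS mount
#   SIZE   virtual size of the disk, e.g. 2T
#   VOLID  PVE volume ID that corresponds to DST
#   BUS    disk key in the guest config, e.g. scsi0
offline_move() {
    vmid=$1 src=$2 dst=$3 size=$4 volid=$5 bus=$6

    # 1. create the new volume, skipping qcow2 metadata preallocation
    $RUN qemu-img create -f qcow2 -o preallocation=off "$dst" "$size"
    # 2. copy the data; -n makes convert reuse the existing target file
    $RUN qemu-img convert -p -n -f raw -O qcow2 "$src" "$dst"
    # 3. point the guest config at the new volume
    $RUN qm set "$vmid" "--$bus" "$volid"
    # 4. rescan so the old volume is added back as unused
    $RUN qm rescan --vmid "$vmid"
}

# Example with assumed names (prints the four commands, runs nothing):
#   offline_move 100 /dev/pve/vm-100-disk-0 \
#       /mnt/pve/nfs-scratch/images/100/vm-100-disk-0.qcow2 2T \
#       nfs-scratch:100/vm-100-disk-0.qcow2 scsi0
```

Printing the commands first makes it easy to review paths and IDs before
anything touches a real volume.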

a manual online move should only be done if you really understand the 
machinery involved, but it is also an option in theory.

last, you could temporarily switch out the hardcoded
'preallocation=metadata' in /usr/share/perl5/PVE/Storage/Plugin.pm with 
'preallocation=off', then reload pveproxy and pvedaemon. running 'apt 
install --reinstall libpve-storage-perl' reverts to the original code 
(either after you're done, or if something goes wrong).
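
A minimal sketch of that temporary switch, assuming the string appears
literally in the file as described above. `switch_prealloc` is a made-up
helper name; it keeps a `.bak` copy next to the file and refuses to run if the
pattern is not found.

```shell
#!/bin/sh
# switch_prealloc FILE FROM TO   (hypothetical helper)
# Replaces "preallocation=FROM" with "preallocation=TO" in FILE and
# leaves the previous content in FILE.bak.
switch_prealloc() {
    file=$1 from=$2 to=$3
    if ! grep -q "preallocation=$from" "$file"; then
        echo "preallocation=$from not found in $file" >&2
        return 1
    fi
    sed -i.bak "s/preallocation=$from/preallocation=$to/g" "$file"
}

# On a PVE node, as root, the procedure described above would then be:
#   switch_prealloc /usr/share/perl5/PVE/Storage/Plugin.pm metadata off
#   systemctl reload pveproxy pvedaemon
# and, to get back to the packaged code afterwards:
#   apt install --reinstall libpve-storage-perl
#   systemctl reload pveproxy pvedaemon
```

The same caveat applies as for everything else here: this edits packaged code
and should be tried on a non-production node first.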

obviously all of this should be carefully tested with non-production 
images/guests/systems first, as you are leaving supported/tested 
territory!





* Re: [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command
  2021-09-16  7:26     ` Fabian Grünbichler
@ 2021-09-16  7:46       ` Marco Gaiarin
  0 siblings, 0 replies; 7+ messages in thread
From: Marco Gaiarin @ 2021-09-16  7:46 UTC (permalink / raw)
  To: pve-user

Hello, Fabian Grünbichler!
  On that day it was said...

> the problem (as described in the patch I linked earlier) is that for 
> qcow2, we currently always allocate the metadata for the qcow2 file. if 
> the image file is big enough, and the storage slow enough, this can take 
> too long. for raw there is no metadata (well there is, but it does not 
> scale with the size of the file ;))

Perfectly clear. Thanks.


> obviously all of this should be carefully tested with non-production 
> images/guests/systems first, as you are leaving supported/tested 
> territory!

I've solved this in a simpler way: I've discovered that a 'thin'
filesystem (in my case, ZFS) can hold a 2TB RAW image in an 800GB
ZFS volume, provided of course that there is no more than 800GB of data
in the image.
So I've simply moved the image as RAW.
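
This presumably works because the RAW file ends up sparse: its apparent size
is the full virtual size, but only blocks that hold data are allocated. A
small sketch of that effect with generic GNU tools follows; `sparse_demo` is a
made-up helper and is neither ZFS- nor PVE-specific.

```shell
#!/bin/sh
# sparse_demo DIR   (hypothetical helper, needs GNU truncate and stat)
# Creates a 1 GiB sparse file in DIR and prints
# "<apparent size in bytes> <allocated size in bytes>".
sparse_demo() {
    f="$1/sparse-demo.raw"
    truncate -s 1G "$f"             # sets the size without writing data
    apparent=$(stat -c %s "$f")     # size as seen by applications
    blocks=$(stat -c %b "$f")       # number of blocks actually allocated
    bsize=$(stat -c %B "$f")        # size in bytes of each such block
    rm -f "$f"
    echo "$apparent $((blocks * bsize))"
}

# Example:
#   sparse_demo /tmp
# The move itself would be something like the following, where the guest
# ID, disk key and storage name are assumptions:
#   qm move_disk 100 scsi0 nfs-scratch --format raw
```

On a filesystem that supports sparse files the second number stays far below
the first until data is actually written.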


Thanks!

