Date: Thu, 16 Sep 2021 09:26:15 +0200
From: Fabian Grünbichler
To: Proxmox VE user list
Subject: Re: [PVE-User] storage migration failed: error with cfs lock 'storage-nfs-scratch': unable to create image: got lock timeout - aborting command

On September 15, 2021 4:47 pm, Marco Gaiarin wrote:
> Hello, Fabian Grünbichler!
> The other day you wrote...
> 
>> this is an issue with certain shared-storage operations in PVE - they
>> have to happen under a pmxcfs lock, which has a hard timeout. if the
>> operation takes too long, the lock runs into the timeout, and the
>> operation fails.
> 
> OK, good to know. But...
> 
> 
> Hello, Christoph Weber!
> The other day you wrote...
> 
>> I think this thread might be relevant:
>> https://forum.proxmox.com/threads/error-with-cfs-lock-unable-to-create-image-got-lock-timeout-aborting-command.65786/
> 
> ...it seems I have exactly the same problem: after some more tests, the
> timeout does not happen if I use raw, only with qcow2. but on the
> temporary NFS storage I don't have enough space for the raw disk...

the problem (as described in the patch I linked earlier) is that for 
qcow2, we currently always pre-allocate the metadata of the qcow2 file. 
if the image file is big enough, and the storage slow enough, this can 
take too long.
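for illustration, you can see the cost of metadata pre-allocation 
directly with qemu-img on the NFS mount (path and size here are 
made-up examples, adjust to your setup):

    # pre-allocates all qcow2 metadata up front - lots of small writes,
    # which on a slow NFS storage can exceed the pmxcfs lock timeout
    time qemu-img create -f qcow2 -o preallocation=metadata \
        /mnt/pve/nfs-scratch/test.qcow2 500G

    # without pre-allocation the file is created almost instantly,
    # metadata is allocated lazily as the guest writes data
    time qemu-img create -f qcow2 -o preallocation=off \
        /mnt/pve/nfs-scratch/test.qcow2 500G

    rm /mnt/pve/nfs-scratch/test.qcow2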
for raw there is no such metadata (well, there is, but it does not 
scale with the size of the file ;))

the patches allow selecting no pre-allocation for storages where this 
is an issue - it basically trades a small performance hit while the 
image file is first filled with data against much less work when 
initially creating the image file.

> In this link someone says:
> 
>> you can manually create the image (with qemu-img create) and then
>> rescan to reference it as an unused volume in the configuration
> 
> but I need to move the disk, not create it...

a manual offline move would also be possible, it boils down to:

- create the new volume (qemu-img create)
- convert the old volume to the new volume (qemu-img convert)
- change the references in the guest config to point to the new volume
- delete the old volume, or add it back as unused to the guest config 
  (the latter happens automatically if you do a rescan)

(a concrete sketch with made-up names follows at the end of this mail)

a manual online move should only be attempted if you really understand 
the machinery involved, but in theory it is an option as well.

last, you could temporarily switch out the hardcoded 
'preallocation=metadata' in /usr/share/perl5/PVE/Storage/Plugin.pm with 
'preallocation=off', then reload pveproxy and pvedaemon. running 'apt 
install --reinstall libpve-storage-perl' reverts to the original code 
(either after you're done, or if something goes wrong). see the second 
sketch at the end of this mail.

obviously all of this should be carefully tested with non-production 
images/guests/systems first, as you are leaving supported/tested 
territory!
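to make the offline move concrete, here is a sketch with made-up VMID, 
storage names, paths and sizes - adjust everything to your setup, and 
make sure the guest is shut down first:

    # 1. create the new volume on the target storage, without metadata
    #    pre-allocation so the create finishes quickly
    qemu-img create -f qcow2 -o preallocation=off \
        /mnt/pve/nfs-scratch/images/100/vm-100-disk-1.qcow2 32G

    # 2. copy the data from the old volume into the new one
    qemu-img convert -p -f qcow2 -O qcow2 \
        /var/lib/vz/images/100/vm-100-disk-0.qcow2 \
        /mnt/pve/nfs-scratch/images/100/vm-100-disk-1.qcow2

    # 3. edit /etc/pve/qemu-server/100.conf and point the disk at the
    #    new volume, e.g. change
    #        scsi0: local:100/vm-100-disk-0.qcow2,size=32G
    #    to
    #        scsi0: nfs-scratch:100/vm-100-disk-1.qcow2,size=32G

    # 4. pick up the old volume as 'unused' (or delete it manually)
    qm rescan --vmid 100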
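and the temporary Plugin.pm swap, again just a sketch - the exact code 
may differ between libpve-storage-perl versions, so check what grep 
finds before editing anything:

    # find the hardcoded pre-allocation setting
    grep -n 'preallocation=metadata' /usr/share/perl5/PVE/Storage/Plugin.pm

    # switch it off (temporarily!)
    sed -i 's/preallocation=metadata/preallocation=off/' \
        /usr/share/perl5/PVE/Storage/Plugin.pm

    # make the running daemons pick up the changed code
    systemctl reload pveproxy pvedaemon

    # ... do the disk move ...

    # revert to the packaged code when done (or if something goes wrong)
    apt install --reinstall libpve-storage-perl
    systemctl reload pveproxy pvedaemon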