From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id ED79368EE7 for ; Tue, 22 Mar 2022 10:38:51 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id E084F1C0A7 for ; Tue, 22 Mar 2022 10:38:51 +0100 (CET) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id 3FCA91C09E for ; Tue, 22 Mar 2022 10:38:51 +0100 (CET) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 101C94205D for ; Tue, 22 Mar 2022 10:38:45 +0100 (CET) Date: Tue, 22 Mar 2022 10:38:42 +0100 From: Wolfgang Bumiller To: Stefan Sterz Cc: pbs-devel@lists.proxmox.com Message-ID: <20220322093842.62ehezxisls6xsy2@olga.proxmox.com> References: <20220318140655.170656-1-s.sterz@proxmox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220318140655.170656-1-s.sterz@proxmox.com> X-SPAM-LEVEL: Spam detection results: 0 AWL 0.351 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record T_SCC_BODY_TEXT_LINE -0.01 - Subject: Re: [pbs-devel] [RFC PATCH 0/5] fix #3935: refactor datastore locking to use tmpfs X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Mar 2022 09:38:52 -0000 On Fri, Mar 18, 2022 at 03:06:50PM +0100, Stefan Sterz wrote: > This RFC overhauls the locking mechanism for snapshots and backup > groups in pbs-datastore and wherever groups and snapshots are > accessed. Sending this as an RFC in case I missed places where > locking occurs on groups, snapshots or manifests. > > The first patch in this series moves most of the locking into new > functions that are part of the `datastore` trait. These functions > create a mirror of the directory structure under > `/run/proxmox-backup/locks` that is used for locking. The second > patch adds one more locking function that enables shared locks on > groups. This functionality has not been needed so far, but might be > useful in the future. I am unsure whether it should be included. > > The third patch moves the `/run/proxmox-backup/locks` path into a > constant in order to be able to switch the lock location more easily > latter. Sending this as its own patch as it also affects manifest > locking in a minor way, unlike the previous patches. > > Lastly, patch four and five fix issues that are locking related and > currently present in the codebase. For patch four, see my comment > there. Patch five resolves a race condition when two clients try to > change the owner of a data store via the API. Patch five depends on > patch one, but patch four could be applied independently. So I think we'll move ahead with this series with minor changes and possibly extend it a little further. We had an off-list discussion and basically came to the following conclusion: Since this series moves the locks into a tmpfs, adding `stat()` calls on the locked fd and the path afterwards should be cheap enough. By comparing the inodes we can make sure the file is still the same one and we didn't race against a deletion. This way we don't need to leave the files lying around. >From what I can tell you currently use directories for locks. But since the locks aren't actually the backup group directory anymore I think they should be files instead. Also, since we'd like to actually clean up the locks, the lock paths shouldn't be nested directories as they are now, but flat files (still based on the path but escaped, otherwise cleaning up would become even more complex ;-) ). Other ideas which floated around: - A lock daemon (or simply a fuse implementation where 'unlink' cancels pending flocks). (This could be extended in various ways for async locking, timeouts, etc.) - A "hashmap lock file": An in-memory file filled with robust-futex where the group name simply gets hashed to get to a positioin in the file. Hash clashes aren't an issue since all they'd do is cause multiple groups to be guarded by the same lock, which wouldn't technically be wrong. But the tmpfs+stat approach seems good enough to - at least for now - not really bother with these other options. I also think patch 3 should be merged into 1 (the constant should be there right away), and patch 2 (dead_code) can just be dropped for now. As for the manifest lock (patch 4), we could probably switch to doing a double-stat() there as well instead.