From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pbs-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id B18A21FF15E
	for <inbox@lore.proxmox.com>; Tue, 25 Mar 2025 11:12:42 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 3ACC5611E;
	Tue, 25 Mar 2025 11:12:38 +0100 (CET)
Message-ID: <cccbc471-8f26-4da0-87a9-61c1c4c100e7@proxmox.com>
Date: Tue, 25 Mar 2025 11:12:33 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: Shannon Sterz <s.sterz@proxmox.com>,
 Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
References: <20250324125157.165976-1-s.sterz@proxmox.com>
 <20250324125157.165976-2-s.sterz@proxmox.com>
 <41f554be-cd53-4ea3-9692-11ce238f6b15@proxmox.com>
 <D8P9ASEBBW7F.29VZZSGU1ZEIC@proxmox.com>
Content-Language: en-US, de-DE
From: Christian Ebner <c.ebner@proxmox.com>
In-Reply-To: <D8P9ASEBBW7F.29VZZSGU1ZEIC@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v8 1/4] datastore/api/backup:
 prepare for fix of #3935 by adding lock helpers
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox Backup Server development discussion
 <pbs-devel@lists.proxmox.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: pbs-devel-bounces@lists.proxmox.com
Sender: "pbs-devel" <pbs-devel-bounces@lists.proxmox.com>

On 3/25/25 10:57, Shannon Sterz wrote:
> On Tue Mar 25, 2025 at 10:35 AM CET, Christian Ebner wrote:
>> some small nits inline
>>
>> On 3/24/25 13:51, Shannon Sterz wrote:
>>> to avoid duplicate code, add helpers for locking groups and snapshots
>>> to the BackupGroup and BackupDir traits respectively and refactor
>>> existing code to use them.
>>>
>>> this also adapts error handling by adding relevant context to each
>>> locking helper call site. otherwise, we might lose valuable
>>> information useful for debugging. note, however, that users that
>>> relied on specific error messages will break.
>>>
>>> Signed-off-by: Shannon Sterz <s.sterz@proxmox.com>
>>> ---
> 
> -->8 snip 8<--
> 
>>> +
>>> +    /// Lock a group exclusively
>>> +    pub fn lock(&self) -> Result<DirLockGuard, Error> {
>>> +        lock_dir_noblock(
>>> +            &self.full_group_path(),
>>
>> this allocates the group path...
>>
>>> +            "backup group",
>>> +            "possible running backup",
>>> +        )
>>> +    }
>>>    }
>>>
>>>    impl AsRef<pbs_api_types::BackupNamespace> for BackupGroup {
>>> @@ -442,15 +454,33 @@ impl BackupDir {
>>>                .map_err(|err| format_err!("unable to acquire manifest lock {:?} - {}", &path, err))
>>>        }
>>>
>>> +    /// Lock this snapshot exclusively
>>> +    pub fn lock(&self) -> Result<DirLockGuard, Error> {
>>> +        lock_dir_noblock(
>>> +            &self.full_path(),
>>
>> ... this allocates the snapshot path ...
>>
>>> +            "snapshot",
>>> +            "backup is running or snapshot is in use",
>>> +        )
>>> +    }
>>> +
>>> +    /// Acquire a shared lock on this snapshot
>>> +    pub fn lock_shared(&self) -> Result<DirLockGuard, Error> {
>>> +        lock_dir_noblock_shared(
>>> +            &self.full_path(),
>>> +            "snapshot",
>>> +            "backup is running or snapshot is in use, could not acquire shared lock",
>>> +        )
>>> +    }
>>> +
>>>        /// Destroy the whole snapshot, bails if it's protected
>>>        ///
>>>        /// Setting `force` to true skips locking and thus ignores if the backup is currently in use.
>>>        pub fn destroy(&self, force: bool) -> Result<(), Error> {
>>> -        let full_path = self.full_path();
>>> -
>>>            let (_guard, _manifest_guard);
>>>            if !force {
>>> -            _guard = lock_dir_noblock(&full_path, "snapshot", "possibly running or in use")?;
>>> +            _guard = self
>>> +                .lock()
>>> +                .with_context(|| format!("while destroying snapshot '{self:?}'"))?;
>>>                _manifest_guard = self.lock_manifest()?;
>>>            }
>>>
>>> @@ -458,6 +488,7 @@ impl BackupDir {
>>>                bail!("cannot remove protected snapshot"); // use special error type?
>>>            }
>>>
>>> +        let full_path = self.full_path();
>>
>> ... and here it reallocates the snapshot path again.
>>
>>>            log::info!("removing backup snapshot {:?}", full_path);
>>>            std::fs::remove_dir_all(&full_path).map_err(|err| {
>>>                format_err!("removing backup snapshot {:?} failed - {}", full_path, err,)
>>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>>> index a6a91ca7..1cbd0038 100644
>>> --- a/pbs-datastore/src/datastore.rs
>>> +++ b/pbs-datastore/src/datastore.rs
>>> @@ -5,7 +5,7 @@ use std::os::unix::io::AsRawFd;
>>>    use std::path::{Path, PathBuf};
>>>    use std::sync::{Arc, LazyLock, Mutex};
>>>
>>> -use anyhow::{bail, format_err, Error};
>>> +use anyhow::{bail, format_err, Context, Error};
>>>    use nix::unistd::{unlinkat, UnlinkatFlags};
>>>    use tracing::{info, warn};
>>>
>>> @@ -13,8 +13,8 @@ use proxmox_human_byte::HumanByte;
>>>    use proxmox_schema::ApiType;
>>>
>>>    use proxmox_sys::error::SysError;
>>> +use proxmox_sys::fs::DirLockGuard;
>>>    use proxmox_sys::fs::{file_read_optional_string, replace_file, CreateOptions};
>>> -use proxmox_sys::fs::{lock_dir_noblock, DirLockGuard};
>>>    use proxmox_sys::linux::procfs::MountInfo;
>>>    use proxmox_sys::process_locker::ProcessLockSharedGuard;
>>>    use proxmox_worker_task::WorkerTaskContext;
>>> @@ -774,41 +774,35 @@ impl DataStore {
>>>        ///
>>>        /// This also acquires an exclusive lock on the directory and returns the lock guard.
>>>        pub fn create_locked_backup_group(
>>> -        &self,
>>> +        self: &Arc<Self>,
>>>            ns: &BackupNamespace,
>>>            backup_group: &pbs_api_types::BackupGroup,
>>>            auth_id: &Authid,
>>>        ) -> Result<(Authid, DirLockGuard), Error> {
>>> -        // create intermediate path first:
>>> -        let mut full_path = self.base_path();
>>> -        for ns in ns.components() {
>>> -            full_path.push("ns");
>>> -            full_path.push(ns);
>>> -        }
>>> -        full_path.push(backup_group.ty.as_str());
>>> -        std::fs::create_dir_all(&full_path)?;
>>> +        let backup_group = self.backup_group(ns.clone(), backup_group.clone());
>>>
>>> -        full_path.push(&backup_group.id);
>>> +        // create intermediate path first
>>> +        let full_path = backup_group.full_group_path();
>>
>> ... same here, this reallocates the group path.
>>
>> I see that this will change with the new locking mechanism in the
>> following patches, but might it make sense to also generate the
>> respective paths together with the locking in order to avoid the
>> double allocation?
>>
>> However, it's not that relevant since it will get phased out once
>> fully switched to the new locking mechanism anyway.
> 
> if i understand you correctly then we have basically two choices. a)
> make the locking handlers take a parameter that will eventually be
> ignored, which is the actual path of the group or b) let the locking
> helper return the path.
> 
> both would mean that there is now more churn, as we are messing with the
> signatures of the locking helpers now and when we remove the old locking
> mechanism completely.
> 
> also i don't think having the path allocated twice is that much of a
> performance hit, especially considering that both options mean that
> during the transition period we would always create both the locking
> path and the full path, even when already using the new locking
> mechanism, where we don't technically need the latter at all call
> sites anymore.
> 
> there is also option c) which could add a field to these structs like
> `full_path: Option<PathBuf>` (or similar) that caches these allocations,
> but that seems a bit like over-engineering to me tbh. at least i
> wouldn't want to add that unless it has actual performance impact.
> 
> -->8 snip 8<--

Agreed, just wanted to bring this to your attention in case it had not
been considered. Nothing in the path creation call stack seems expensive
enough to justify any of your suggested approaches, especially given
that the locking path will change anyway.
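For reference, option c) could look roughly like the following. This is
a purely hypothetical sketch: the struct and its fields are illustrative
stand-ins, not the actual pbs-datastore layout. A `std::sync::OnceLock`
gives a simple one-shot cache so the path is only allocated on first use:

```rust
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

// Hypothetical stand-in for a BackupDir-like struct; field names are
// illustrative only.
struct BackupDir {
    base: PathBuf,
    group: String,
    time_string: String,
    // One-shot cache for the full snapshot path (option c).
    full_path_cache: OnceLock<PathBuf>,
}

impl BackupDir {
    /// Returns the full snapshot path, allocating it at most once.
    fn full_path(&self) -> &Path {
        self.full_path_cache.get_or_init(|| {
            let mut path = self.base.clone();
            path.push(&self.group);
            path.push(&self.time_string);
            path
        })
    }
}

fn main() {
    let dir = BackupDir {
        base: PathBuf::from("/datastore"),
        group: "vm/100".into(),
        time_string: "2025-03-24T12:51:57Z".into(),
        full_path_cache: OnceLock::new(),
    };
    // Repeated calls hand out references into the same cached allocation.
    assert_eq!(dir.full_path(), dir.full_path());
    println!("{}", dir.full_path().display());
}
```

That said, as noted above, this only pays off if the allocations show up
in profiling, and interior mutability in these structs may have other
implications, so it is probably not worth it here.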


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel