public inbox for pbs-devel@lists.proxmox.com
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>
Cc: Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>,
	 Stefan Sterz <s.sterz@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] fix #3336: api: remove backup group if the last snapshot is removed
Date: Mon, 14 Mar 2022 10:36:17 +0100
Message-ID: <20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com>
In-Reply-To: <717c8999-d3f8-a01b-a8f5-da0f5960d23f@proxmox.com>

On Fri, Mar 11, 2022 at 01:20:22PM +0100, Thomas Lamprecht wrote:
> On 09.03.22 14:50, Stefan Sterz wrote:
> > Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
> > ---
> >  pbs-datastore/src/datastore.rs | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> > 
> > diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> > index d416c8d8..623b7688 100644
> > --- a/pbs-datastore/src/datastore.rs
> > +++ b/pbs-datastore/src/datastore.rs
> > @@ -346,6 +346,28 @@ impl DataStore {
> >                  )
> >              })?;
> >  
> > +        // check if this was the last snapshot and if so remove the group
> > +        if backup_dir
> > +            .group()
> > +            .list_backups(&self.base_path())?
> > +            .is_empty()
> > +        {
> 
> a log::info could be appropriate in the "success" (i.e., delete dir) case.
> 
> I'd factor the block below out into a non-pub (or pub(crate)) remove_empty_group_dir fn.
> 
> > +            let group_path = self.group_path(backup_dir.group());
> > +            let _guard = proxmox_sys::fs::lock_dir_noblock(
> > +                &group_path,
> > +                "backup group",
> > +                "possible running backup",
> > +            )?;
> > +
> > +            std::fs::remove_dir_all(&group_path).map_err(|err| {
> 
> this is still unsafe as there's a TOCTOU race; the lock does not protect you from the
> following sequence with two threads/async-executions t1 and t2
> 
> t1.1 snapshot deleted
> t1.2 empty dir check holds up, entering "delete group dir" code branch
> t2.1                                        create new snapshot in group -> lock group dir
> t2.2                                        finish new snapshot in group -> unlock group dir
> t1.3 lock group dir
> t1.4 delete all files, including the new snapshot made in-between.
> 
> Rather, just use the safer "remove_dir" variant; that way the TOCTOU race doesn't matter and
> the check merely becomes a shortcut. If we'd explicitly check for
>   `err.kind() != ErrorKind::DirectoryNotEmpty`
> and silence the not-empty error, we could even do away with the check. That should result in
> the same number of syscalls in the best case (one rmdir vs. one readdir) and can be better on
> success (readdir + rmdir vs. rmdir only), not that performance matters much in this case.
> 
> fyi, "remove_backup_group", the place where I think you copied this part from, can use
> remove_dir_all safely because there's no check to be made there, so no TOCTOU.
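
For illustration, the rmdir-only variant suggested above could look roughly
like the sketch below. It is not the actual patch: the function name is taken
from the suggestion, the return type is my own choice, and the error is
matched via the raw Linux errno (an assumption; newer Rust versions also
expose ErrorKind::DirectoryNotEmpty). It also assumes the group directory
contains nothing besides snapshot directories.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Linux errno for "directory not empty" (assumption: Linux-only target).
const ENOTEMPTY: i32 = 39;

/// Hypothetical helper: remove `group_path` only if it is empty.
///
/// Returns Ok(true) if the directory was removed and Ok(false) if it still
/// had entries. Because rmdir(2) itself refuses to delete a non-empty
/// directory, a snapshot created concurrently cannot be wiped by accident.
pub fn remove_empty_group_dir(group_path: &Path) -> io::Result<bool> {
    match fs::remove_dir(group_path) {
        Ok(()) => Ok(true),
        // Not empty: someone created a snapshot in the meantime, keep the group.
        Err(err) if err.raw_os_error() == Some(ENOTEMPTY) => Ok(false),
        // Anything else (permissions, I/O, ...) is a real error.
        Err(err) => Err(err),
    }
}
```

The caller could log on Ok(true) and simply ignore Ok(false), so no separate
list_backups() check is needed.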

Correct me if I'm wrong, but I think we need to rethink our locking
there in general. We can't lock the directory itself if we also want to
be allowed to delete it (same reasoning as with regular files):

-> A locks backup group
    -> B begins locking: opens dir handle
-> A deletes group, group is now gone
        -> C recreates the backup group, _locked_
-> A drops directory handle (& with it the lock)
    -> B acquires lock on deleted directory handle which works just fine

now B and C both think they're holding an exclusive lock

We *could* use a lock helper that also stats before and after the lock
(on the handle first, then on the *path* for the second one) to see if
the inode changed, to catch this...
Or we just live with empty directories or (hidden) lock files lingering.
(which would only be safe to clean up during a maintenance mode
operation).
Or we introduce a create/delete lock one level up, held only for the
duration of mkdir()/rmdir() calls.

(But in any case, all the current inline `lock_dir_noblock` calls should
instead go over a safe helper dealing with this properly...)
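
To make the first option above (stat on the handle, lock, then stat on the
path) more concrete, here is a minimal sketch. The helper name is
hypothetical, and the actual locking step is passed in as a closure because
plain std (at least in older Rust versions) has no flock wrapper; in our tree
that part would be whatever lock_dir_noblock does internally. Treat it as an
illustration of the inode comparison only.

```rust
use std::fs::{self, File};
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical helper: open `path`, run `lock` on the handle, then verify
/// that `path` still refers to the inode we locked.
pub fn lock_dir_checked<F>(path: &Path, lock: F) -> io::Result<File>
where
    F: FnOnce(&File) -> io::Result<()>,
{
    let handle = File::open(path)?;
    // fstat on the *handle*: identifies the inode we are about to lock.
    let before = handle.metadata()?;

    // Acquire the lock (e.g. flock); this may wait or fail.
    lock(&handle)?;

    // stat on the *path*: fails with NotFound if the directory was deleted,
    // and yields a different inode if it was deleted and recreated.
    let after = fs::metadata(path)?;
    if before.dev() != after.dev() || before.ino() != after.ino() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            "directory was replaced while acquiring the lock",
        ));
    }
    Ok(handle)
}
```

With this, B in the sequence above would get an error instead of a lock on
the already deleted directory. The caller could then retry or bail out.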



