* [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
@ 2025-03-06 12:08 Maximiliano Sandoval
2025-03-07 10:37 ` Shannon Sterz
0 siblings, 1 reply; 10+ messages in thread
From: Maximiliano Sandoval @ 2025-03-06 12:08 UTC (permalink / raw)
To: pbs-devel
When the last snapshot of a group is deleted, we remove the entire
group directory, which in turn removes the owner of the group.
Without this change, the user is unable to change the owner of the group
after the last snapshot has been deleted. That would prevent new
backups to the same group from a different owner.
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
---
src/api2/admin/datastore.rs | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index dbb7ae47..305673f1 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -423,10 +423,20 @@ pub async fn delete_snapshot(
&backup_dir.group,
)?;
- let snapshot = datastore.backup_dir(ns, backup_dir)?;
+ let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
snapshot.destroy(false)?;
+ let group = BackupGroup::from(snapshot);
+ if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
+ if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
+ log::error!(
+ "error while cleaning group {path:?} - {err}",
+ path = group.full_group_path()
+ );
+ }
+ }
+
Ok(Value::Null)
})
.await?
--
2.39.5
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-06 12:08 [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot Maximiliano Sandoval
@ 2025-03-07 10:37 ` Shannon Sterz
2025-03-07 15:33 ` Wolfgang Bumiller
0 siblings, 1 reply; 10+ messages in thread
From: Shannon Sterz @ 2025-03-07 10:37 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Maximiliano Sandoval
On Thu Mar 6, 2025 at 1:08 PM CET, Maximiliano Sandoval wrote:
> When the last snapshot from a group is deleted we clear the entire
> group, this in turn cleans the owner for the group.
>
> Without this change, the user is unable to change the owner of the group
> after the last snapshot has been deleted. This would prevent a new
> backups to the same group from a different owner.
>
> Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> ---
> src/api2/admin/datastore.rs | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> index dbb7ae47..305673f1 100644
> --- a/src/api2/admin/datastore.rs
> +++ b/src/api2/admin/datastore.rs
> @@ -423,10 +423,20 @@ pub async fn delete_snapshot(
> &backup_dir.group,
> )?;
>
> - let snapshot = datastore.backup_dir(ns, backup_dir)?;
> + let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
>
> snapshot.destroy(false)?;
>
> + let group = BackupGroup::from(snapshot);
> + if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
> + if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
> + log::error!(
> + "error while cleaning group {path:?} - {err}",
> + path = group.full_group_path()
> + );
> + }
> + }
> +
this bug... looks so harmless, but comes back to haunt me every time. as
explained by wobu here [1] it is not really possible to cleanly fix this
bug without reworking our locking mechanism. i did send some patches for
that (checks notes) almost 3 years ago ^^', but they are still not
applied and definitely need a rework at this point [2]. i can pick
this up again, sorry, i got focused on other things in the meantime.
[1]: https://lore.proxmox.com/pbs-devel/20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com/
[2]: https://lore.proxmox.com/pbs-devel/20220824124829.392189-1-s.sterz@proxmox.com/
> Ok(Value::Null)
> })
> .await?
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-07 10:37 ` Shannon Sterz
@ 2025-03-07 15:33 ` Wolfgang Bumiller
2025-03-07 15:53 ` Shannon Sterz
0 siblings, 1 reply; 10+ messages in thread
From: Wolfgang Bumiller @ 2025-03-07 15:33 UTC (permalink / raw)
To: Shannon Sterz; +Cc: Proxmox Backup Server development discussion
On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
> On Thu Mar 6, 2025 at 1:08 PM CET, Maximiliano Sandoval wrote:
> > When the last snapshot from a group is deleted we clear the entire
> > group, this in turn cleans the owner for the group.
> >
> > Without this change, the user is unable to change the owner of the group
> > after the last snapshot has been deleted. This would prevent a new
> > backups to the same group from a different owner.
> >
> > Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> > ---
> > src/api2/admin/datastore.rs | 12 +++++++++++-
> > 1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> > index dbb7ae47..305673f1 100644
> > --- a/src/api2/admin/datastore.rs
> > +++ b/src/api2/admin/datastore.rs
> > @@ -423,10 +423,20 @@ pub async fn delete_snapshot(
> > &backup_dir.group,
> > )?;
> >
> > - let snapshot = datastore.backup_dir(ns, backup_dir)?;
> > + let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
> >
> > snapshot.destroy(false)?;
> >
> > + let group = BackupGroup::from(snapshot);
> > + if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
> > + if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
> > + log::error!(
> > + "error while cleaning group {path:?} - {err}",
> > + path = group.full_group_path()
> > + );
> > + }
> > + }
> > +
>
> this bug... looks so harmless, but comes back to haunt me every time. as
> explained by wobu here [1] it is not really possible to cleanly fix this
> bug without reworking our locking mechanism. i did send some patches for
> that (checks notes) almost 3 years ago ^^', but they are still not
> applied and definitively need a rework at this point [2]. i can pick
> this up again, sorry got focused on other things in the meantime.
>
> [1]: https://lore.proxmox.com/pbs-devel/20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com/
> [2]: https://lore.proxmox.com/pbs-devel/20220824124829.392189-1-s.sterz@proxmox.com/
IIRC we need to figure out a good upgrade strategy so running processes
don't use different locking.
One idea was to have the postinst script create a file in run (eg
`/run/proxmox-backup/old-locking`) which, when present, instructs the
daemons to keep using the old strategy.
The new one would then automatically be used after either a reboot, or
manually removing the file between stop & start of the daemons.
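The postinst part of that idea could look roughly like the following sketch. The flag path matches the one proposed above; the helper name and the `3.*` version boundary are assumptions for illustration only, not actual proxmox-backup packaging code (a real postinst would likely use `dpkg --compare-versions` on the old version it receives):

```shell
# Hypothetical postinst fragment: when upgrading from a pre-rework
# version, drop a flag file under /run so already-running daemons and
# the freshly started ones keep using the old locking strategy.
# /run is a tmpfs, so the flag disappears on reboot.
FLAG_DIR="${FLAG_DIR:-/run/proxmox-backup}"

maybe_keep_old_locking() {
    # $1 = previously installed version (empty on a fresh install)
    old_version="$1"
    [ -n "$old_version" ] || return 0     # fresh install: nothing to do
    case "$old_version" in
        3.*)                              # upgrading from the old series
            mkdir -p "$FLAG_DIR"
            touch "$FLAG_DIR/old-locking" # cleared automatically on reboot
            ;;
    esac
}
```

A real maintainer script would call this from the `configure` branch, where dpkg passes the most recently configured version as the second argument.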
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-07 15:33 ` Wolfgang Bumiller
@ 2025-03-07 15:53 ` Shannon Sterz
2025-03-10 10:53 ` Wolfgang Bumiller
0 siblings, 1 reply; 10+ messages in thread
From: Shannon Sterz @ 2025-03-07 15:53 UTC (permalink / raw)
To: Wolfgang Bumiller; +Cc: Proxmox Backup Server development discussion
On Fri Mar 7, 2025 at 4:33 PM CET, Wolfgang Bumiller wrote:
> On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
>> On Thu Mar 6, 2025 at 1:08 PM CET, Maximiliano Sandoval wrote:
>> > When the last snapshot from a group is deleted we clear the entire
>> > group, this in turn cleans the owner for the group.
>> >
>> > Without this change, the user is unable to change the owner of the group
>> > after the last snapshot has been deleted. This would prevent a new
>> > backups to the same group from a different owner.
>> >
>> > Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
>> > ---
>> > src/api2/admin/datastore.rs | 12 +++++++++++-
>> > 1 file changed, 11 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
>> > index dbb7ae47..305673f1 100644
>> > --- a/src/api2/admin/datastore.rs
>> > +++ b/src/api2/admin/datastore.rs
>> > @@ -423,10 +423,20 @@ pub async fn delete_snapshot(
>> > &backup_dir.group,
>> > )?;
>> >
>> > - let snapshot = datastore.backup_dir(ns, backup_dir)?;
>> > + let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
>> >
>> > snapshot.destroy(false)?;
>> >
>> > + let group = BackupGroup::from(snapshot);
>> > + if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
>> > + if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
>> > + log::error!(
>> > + "error while cleaning group {path:?} - {err}",
>> > + path = group.full_group_path()
>> > + );
>> > + }
>> > + }
>> > +
>>
>> this bug... looks so harmless, but comes back to haunt me every time. as
>> explained by wobu here [1] it is not really possible to cleanly fix this
>> bug without reworking our locking mechanism. i did send some patches for
>> that (checks notes) almost 3 years ago ^^', but they are still not
>> applied and definitively need a rework at this point [2]. i can pick
>> this up again, sorry got focused on other things in the meantime.
>>
>> [1]: https://lore.proxmox.com/pbs-devel/20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com/
>> [2]: https://lore.proxmox.com/pbs-devel/20220824124829.392189-1-s.sterz@proxmox.com/
>
> IIRC we need to figure out a good upgrade strategy so running processes
> don't use different locking.
>
> One idea was to have the postinst script create a file in run (eg
> `/run/proxmox-backup/old-locking`) which, when present, instructs the
> daemons to keep using the old strategy.
>
> The new one would then automatically be used after either a reboot, or
> manually removing the file between stop & start of the daemons.
yeah i remember that being a blocker, but since pbs 4 is coming up
soon-ish, couldn't we just apply it then? upgrading between 3 -> 4
requiring a reboot seems reasonable to me, though maybe i'm missing
something (e.g. could it be problematic to have the services running,
even shortly, before the reboot?).
if that would be an option, it'd be much simpler than carrying around
that switching logic forever (or at least one major version?). also,
what would happen if a user accidentally creates that file after the new
locking is already in place? do we consider this "bad luck" or do we
want some kind of protection in place?
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-07 15:53 ` Shannon Sterz
@ 2025-03-10 10:53 ` Wolfgang Bumiller
2025-03-10 15:19 ` Fabian Grünbichler
0 siblings, 1 reply; 10+ messages in thread
From: Wolfgang Bumiller @ 2025-03-10 10:53 UTC (permalink / raw)
To: Shannon Sterz
Cc: Thomas Lamprecht, Proxmox Backup Server development discussion
On Fri, Mar 07, 2025 at 04:53:14PM +0100, Shannon Sterz wrote:
> On Fri Mar 7, 2025 at 4:33 PM CET, Wolfgang Bumiller wrote:
> > On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
> >> On Thu Mar 6, 2025 at 1:08 PM CET, Maximiliano Sandoval wrote:
> >> > When the last snapshot from a group is deleted we clear the entire
> >> > group, this in turn cleans the owner for the group.
> >> >
> >> > Without this change, the user is unable to change the owner of the group
> >> > after the last snapshot has been deleted. This would prevent a new
> >> > backups to the same group from a different owner.
> >> >
> >> > Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> >> > ---
> >> > src/api2/admin/datastore.rs | 12 +++++++++++-
> >> > 1 file changed, 11 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> >> > index dbb7ae47..305673f1 100644
> >> > --- a/src/api2/admin/datastore.rs
> >> > +++ b/src/api2/admin/datastore.rs
> >> > @@ -423,10 +423,20 @@ pub async fn delete_snapshot(
> >> > &backup_dir.group,
> >> > )?;
> >> >
> >> > - let snapshot = datastore.backup_dir(ns, backup_dir)?;
> >> > + let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
> >> >
> >> > snapshot.destroy(false)?;
> >> >
> >> > + let group = BackupGroup::from(snapshot);
> >> > + if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
> >> > + if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
> >> > + log::error!(
> >> > + "error while cleaning group {path:?} - {err}",
> >> > + path = group.full_group_path()
> >> > + );
> >> > + }
> >> > + }
> >> > +
> >>
> >> this bug... looks so harmless, but comes back to haunt me every time. as
> >> explained by wobu here [1] it is not really possible to cleanly fix this
> >> bug without reworking our locking mechanism. i did send some patches for
> >> that (checks notes) almost 3 years ago ^^', but they are still not
> >> applied and definitively need a rework at this point [2]. i can pick
> >> this up again, sorry got focused on other things in the meantime.
> >>
> >> [1]: https://lore.proxmox.com/pbs-devel/20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com/
> >> [2]: https://lore.proxmox.com/pbs-devel/20220824124829.392189-1-s.sterz@proxmox.com/
> >
> > IIRC we need to figure out a good upgrade strategy so running processes
> > don't use different locking.
> >
> > One idea was to have the postinst script create a file in run (eg
> > `/run/proxmox-backup/old-locking`) which, when present, instructs the
> > daemons to keep using the old strategy.
> >
> > The new one would then automatically be used after either a reboot, or
> > manually removing the file between stop & start of the daemons.
>
> yeah i remember that being a blocker, but since pbs 4 is coming up
> soon-ish, couldn't we just apply it then? upgrading between 3 -> 4
> requiring a reboot seems reasonable to me, though maybe i'm missing
> something (e.g. could it be problematic to have the services running,
> even shortly, before the reboot?).
>
> if that would be an option, it'd be much simpler than carrying around
> that switching logic forever (or at least one major version?). also,
I suppose that could work. Depending on how we want to deal with
old-version support there? @Thomas?
> what would happen if a user accidentally creates that file after the new
> locking is already in-place? do we consider this "bad luck" or do we
> want some kind of protection in-place?
For upgrade->downgrade->upgrade switcheroo the postinst script should be
able to prevent this, since it gets old and new versions.
For anything else I'm putting "accidentally creating that file" in the
same category as "accidentally modifying files in their PBS storage" and
"accidentally putting their hard drives in the microwave".
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-10 10:53 ` Wolfgang Bumiller
@ 2025-03-10 15:19 ` Fabian Grünbichler
2025-03-10 15:22 ` Shannon Sterz
0 siblings, 1 reply; 10+ messages in thread
From: Fabian Grünbichler @ 2025-03-10 15:19 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Wolfgang Bumiller,
Shannon Sterz
Cc: Thomas Lamprecht
> Wolfgang Bumiller <w.bumiller@proxmox.com> hat am 10.03.2025 11:53 CET geschrieben:
>
>
> On Fri, Mar 07, 2025 at 04:53:14PM +0100, Shannon Sterz wrote:
> > On Fri Mar 7, 2025 at 4:33 PM CET, Wolfgang Bumiller wrote:
> > > On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
> > >> On Thu Mar 6, 2025 at 1:08 PM CET, Maximiliano Sandoval wrote:
> > >> > When the last snapshot from a group is deleted we clear the entire
> > >> > group, this in turn cleans the owner for the group.
> > >> >
> > >> > Without this change, the user is unable to change the owner of the group
> > >> > after the last snapshot has been deleted. This would prevent a new
> > >> > backups to the same group from a different owner.
> > >> >
> > >> > Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> > >> > ---
> > >> > src/api2/admin/datastore.rs | 12 +++++++++++-
> > >> > 1 file changed, 11 insertions(+), 1 deletion(-)
> > >> >
> > >> > diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> > >> > index dbb7ae47..305673f1 100644
> > >> > --- a/src/api2/admin/datastore.rs
> > >> > +++ b/src/api2/admin/datastore.rs
> > >> > @@ -423,10 +423,20 @@ pub async fn delete_snapshot(
> > >> > &backup_dir.group,
> > >> > )?;
> > >> >
> > >> > - let snapshot = datastore.backup_dir(ns, backup_dir)?;
> > >> > + let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
> > >> >
> > >> > snapshot.destroy(false)?;
> > >> >
> > >> > + let group = BackupGroup::from(snapshot);
> > >> > + if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
> > >> > + if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
> > >> > + log::error!(
> > >> > + "error while cleaning group {path:?} - {err}",
> > >> > + path = group.full_group_path()
> > >> > + );
> > >> > + }
> > >> > + }
> > >> > +
> > >>
> > >> this bug... looks so harmless, but comes back to haunt me every time. as
> > >> explained by wobu here [1] it is not really possible to cleanly fix this
> > >> bug without reworking our locking mechanism. i did send some patches for
> > >> that (checks notes) almost 3 years ago ^^', but they are still not
> > >> applied and definitively need a rework at this point [2]. i can pick
> > >> this up again, sorry got focused on other things in the meantime.
> > >>
> > >> [1]: https://lore.proxmox.com/pbs-devel/20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com/
> > >> [2]: https://lore.proxmox.com/pbs-devel/20220824124829.392189-1-s.sterz@proxmox.com/
> > >
> > > IIRC we need to figure out a good upgrade strategy so running processes
> > > don't use different locking.
> > >
> > > One idea was to have the postinst script create a file in run (eg
> > > `/run/proxmox-backup/old-locking`) which, when present, instructs the
> > > daemons to keep using the old strategy.
> > >
> > > The new one would then automatically be used after either a reboot, or
> > > manually removing the file between stop & start of the daemons.
> >
> > yeah i remember that being a blocker, but since pbs 4 is coming up
> > soon-ish, couldn't we just apply it then? upgrading between 3 -> 4
> > requiring a reboot seems reasonable to me, though maybe i'm missing
> > something (e.g. could it be problematic to have the services running,
> > even shortly, before the reboot?).
> >
> > if that would be an option, it'd be much simpler than carrying around
> > that switching logic forever (or at least one major version?). also,
>
> I suppose that could work. Depending on how we want to deal with
> old-version support there? @Thomas?
I don't think that works, unless we want to require putting all datastores
into maintenance mode prior to the upgrade and until the system has been
rebooted?
otherwise, the upgrade from 3.x to 4.x is just like any other, with all the
same problems if the old proxy still has a backup or other lock-holding task
running and the new one uses different locking..
> > what would happen if a user accidentally creates that file after the new
> > locking is already in-place? do we consider this "bad luck" or do we
> > want some kind of protection in-place?
>
> For upgrade->downgrade->upgrade switcheroo the postinst script should be
> able to prevent this, since it gets old and new versions.
>
> For anything else I'm putting "accidentally creating that file" in the
> same category as "accidentally modifying files in their PBS storage" and
> "accidentally putting their hard drives in the microwave".
agreed on that part :)
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-10 15:19 ` Fabian Grünbichler
@ 2025-03-10 15:22 ` Shannon Sterz
2025-03-11 9:21 ` Thomas Lamprecht
0 siblings, 1 reply; 10+ messages in thread
From: Shannon Sterz @ 2025-03-10 15:22 UTC (permalink / raw)
To: Fabian Grünbichler,
Proxmox Backup Server development discussion, Wolfgang Bumiller
Cc: Thomas Lamprecht
On Mon Mar 10, 2025 at 4:19 PM CET, Fabian Grünbichler wrote:
>
>> Wolfgang Bumiller <w.bumiller@proxmox.com> hat am 10.03.2025 11:53 CET geschrieben:
>>
>>
>> On Fri, Mar 07, 2025 at 04:53:14PM +0100, Shannon Sterz wrote:
>> > On Fri Mar 7, 2025 at 4:33 PM CET, Wolfgang Bumiller wrote:
>> > > On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
>> > >> On Thu Mar 6, 2025 at 1:08 PM CET, Maximiliano Sandoval wrote:
>> > >> > When the last snapshot from a group is deleted we clear the entire
>> > >> > group, this in turn cleans the owner for the group.
>> > >> >
>> > >> > Without this change, the user is unable to change the owner of the group
>> > >> > after the last snapshot has been deleted. This would prevent a new
>> > >> > backups to the same group from a different owner.
>> > >> >
>> > >> > Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
>> > >> > ---
>> > >> > src/api2/admin/datastore.rs | 12 +++++++++++-
>> > >> > 1 file changed, 11 insertions(+), 1 deletion(-)
>> > >> >
>> > >> > diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
>> > >> > index dbb7ae47..305673f1 100644
>> > >> > --- a/src/api2/admin/datastore.rs
>> > >> > +++ b/src/api2/admin/datastore.rs
>> > >> > @@ -423,10 +423,20 @@ pub async fn delete_snapshot(
>> > >> > &backup_dir.group,
>> > >> > )?;
>> > >> >
>> > >> > - let snapshot = datastore.backup_dir(ns, backup_dir)?;
>> > >> > + let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
>> > >> >
>> > >> > snapshot.destroy(false)?;
>> > >> >
>> > >> > + let group = BackupGroup::from(snapshot);
>> > >> > + if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
>> > >> > + if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
>> > >> > + log::error!(
>> > >> > + "error while cleaning group {path:?} - {err}",
>> > >> > + path = group.full_group_path()
>> > >> > + );
>> > >> > + }
>> > >> > + }
>> > >> > +
>> > >>
>> > >> this bug... looks so harmless, but comes back to haunt me every time. as
>> > >> explained by wobu here [1] it is not really possible to cleanly fix this
>> > >> bug without reworking our locking mechanism. i did send some patches for
>> > >> that (checks notes) almost 3 years ago ^^', but they are still not
>> > >> applied and definitively need a rework at this point [2]. i can pick
>> > >> this up again, sorry got focused on other things in the meantime.
>> > >>
>> > >> [1]: https://lore.proxmox.com/pbs-devel/20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com/
>> > >> [2]: https://lore.proxmox.com/pbs-devel/20220824124829.392189-1-s.sterz@proxmox.com/
>> > >
>> > > IIRC we need to figure out a good upgrade strategy so running processes
>> > > don't use different locking.
>> > >
>> > > One idea was to have the postinst script create a file in run (eg
>> > > `/run/proxmox-backup/old-locking`) which, when present, instructs the
>> > > daemons to keep using the old strategy.
>> > >
>> > > The new one would then automatically be used after either a reboot, or
>> > > manually removing the file between stop & start of the daemons.
>> >
>> > yeah i remember that being a blocker, but since pbs 4 is coming up
>> > soon-ish, couldn't we just apply it then? upgrading between 3 -> 4
>> > requiring a reboot seems reasonable to me, though maybe i'm missing
>> > something (e.g. could it be problematic to have the services running,
>> > even shortly, before the reboot?).
>> >
>> > if that would be an option, it'd be much simpler than carrying around
>> > that switching logic forever (or at least one major version?). also,
>>
>> I suppose that could work. Depending on how we want to deal with
>> old-version support there? @Thomas?
>
> I don't think that works, unless we want to require putting all datastores
> into maintenance mode prior to the upgrade and until the system has been
> rebooted?
>
> otherwise, the upgrade from 3.x to 4.x is just like any other, with all the
> same problems if the old proxy still has a backup or other lock-holding task
> running and the new one uses different locking..
i feel like it would be fine to do that though? we already optionally
recommended that when upgrading from 2 -> 3 [1]. so requiring that and
documenting it in the upgrade notes sounds fine to me.
[1]: https://pbs.proxmox.com/wiki/index.php/Upgrade_from_2_to_3#Optional:_Enable_Maintenance_Mode
>
>> > what would happen if a user accidentally creates that file after the new
>> > locking is already in-place? do we consider this "bad luck" or do we
>> > want some kind of protection in-place?
>>
>> For upgrade->downgrade->upgrade switcheroo the postinst script should be
>> able to prevent this, since it gets old and new versions.
>>
>> For anything else I'm putting "accidentally creating that file" in the
>> same category as "accidentally modifying files in their PBS storage" and
>> "accidentally putting their hard drives in the microwave".
>
> agreed on that part :)
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-10 15:22 ` Shannon Sterz
@ 2025-03-11 9:21 ` Thomas Lamprecht
2025-03-11 9:52 ` Shannon Sterz
0 siblings, 1 reply; 10+ messages in thread
From: Thomas Lamprecht @ 2025-03-11 9:21 UTC (permalink / raw)
To: Shannon Sterz, Fabian Grünbichler,
Proxmox Backup Server development discussion, Wolfgang Bumiller
On 10/03/2025 16:22, Shannon Sterz wrote:
> On Mon Mar 10, 2025 at 4:19 PM CET, Fabian Grünbichler wrote:
>>> Wolfgang Bumiller <w.bumiller@proxmox.com> hat am 10.03.2025 11:53 CET geschrieben:
>>> On Fri, Mar 07, 2025 at 04:53:14PM +0100, Shannon Sterz wrote:
>>>> On Fri Mar 7, 2025 at 4:33 PM CET, Wolfgang Bumiller wrote:
>>>>> On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
>>>>> IIRC we need to figure out a good upgrade strategy so running processes
>>>>> don't use different locking.
>>>>>
>>>>> One idea was to have the postinst script create a file in run (eg
>>>>> `/run/proxmox-backup/old-locking`) which, when present, instructs the
>>>>> daemons to keep using the old strategy.
>>>>>
>>>>> The new one would then automatically be used after either a reboot, or
>>>>> manually removing the file between stop & start of the daemons.
>>>>
>>>> yeah i remember that being a blocker, but since pbs 4 is coming up
>>>> soon-ish, couldn't we just apply it then? upgrading between 3 -> 4
>>>> requiring a reboot seems reasonable to me, though maybe i'm missing
>>>> something (e.g. could it be problematic to have the services running,
>>>> even shortly, before the reboot?).
>>>>
>>>> if that would be an option, it'd be much simpler than carrying around
>>>> that switching logic forever (or at least one major version?). also,
One major version would be enough for all practical cases, as we require
a reboot for the new kernel after an upgrade, which as a side effect
ensures that no old PBS processes are around anymore. People staying on
an EOL version for a while and then doing two major upgrades in quick
succession without any reboot are a bit on their own anyway. Adding a
hint to the upgrade docs for that case would be cheap though, and would
cover those admins that actually try to do a good job and are just
fixing such an outdated setup without having been the one that caused it
to exist in the first place.
>> I don't think that works, unless we want to require putting all datastores
>> into maintenance mode prior to the upgrade and until the system has been
>> rebooted?
>>
>> otherwise, the upgrade from 3.x to 4.x is just like any other, with all the
>> same problems if the old proxy still has a backup or other lock-holding task
>> running and the new one uses different locking..
That's what the flag is for: touch it on upgrade before the new daemon
starts, then have the new daemon set an internal global OnceLock-guarded
(or the like) flag and use that to determine whether old or new locking
needs to be used. On the next reboot the flag won't be there anymore,
and thus the new locking mode is used.
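A minimal sketch of that daemon-side check, assuming the `/run/proxmox-backup/old-locking` path proposed earlier in the thread; the helper name is illustrative and not part of the actual proxmox-backup code:

```rust
use std::path::Path;
use std::sync::OnceLock;

// Decided once per process: if the flag file existed the first time the
// daemon asked, keep the old locking strategy for the whole process
// lifetime, even if the file is later removed.
static USE_OLD_LOCKING: OnceLock<bool> = OnceLock::new();

fn use_old_locking() -> bool {
    *USE_OLD_LOCKING
        .get_or_init(|| Path::new("/run/proxmox-backup/old-locking").exists())
}

fn main() {
    if use_old_locking() {
        println!("keeping old locking strategy until next reboot");
    } else {
        println!("using new locking strategy");
    }
}
```

Since `/run` is a tmpfs, the flag vanishes on reboot and the cached decision naturally flips to the new strategy for all daemons started afterwards.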
> i feel like it would be fine to do that though? we already optionally
> recommended that when upgrading from 2 -> 3 [1]. so requiring that and
> documenting it in the upgrade notes sounds fine to me.
>
> [1]: https://pbs.proxmox.com/wiki/index.php/Upgrade_from_2_to_3#Optional:_Enable_Maintenance_Mode
can be an option, but not requiring users to do this is always nicer for
them and our support.
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-11 9:21 ` Thomas Lamprecht
@ 2025-03-11 9:52 ` Shannon Sterz
2025-03-11 10:40 ` Thomas Lamprecht
0 siblings, 1 reply; 10+ messages in thread
From: Shannon Sterz @ 2025-03-11 9:52 UTC (permalink / raw)
To: Thomas Lamprecht, Fabian Grünbichler,
Proxmox Backup Server development discussion, Wolfgang Bumiller
On Tue Mar 11, 2025 at 10:21 AM CET, Thomas Lamprecht wrote:
> On 10/03/2025 16:22, Shannon Sterz wrote:
>> On Mon Mar 10, 2025 at 4:19 PM CET, Fabian Grünbichler wrote:
>>>> Wolfgang Bumiller <w.bumiller@proxmox.com> hat am 10.03.2025 11:53 CET geschrieben:
>>>> On Fri, Mar 07, 2025 at 04:53:14PM +0100, Shannon Sterz wrote:
>>>>> On Fri Mar 7, 2025 at 4:33 PM CET, Wolfgang Bumiller wrote:
>>>>>> On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
>>>>>> IIRC we need to figure out a good upgrade strategy so running processes
>>>>>> don't use different locking.
>>>>>>
>>>>>> One idea was to have the postinst script create a file in run (eg
>>>>>> `/run/proxmox-backup/old-locking`) which, when present, instructs the
>>>>>> daemons to keep using the old strategy.
>>>>>>
>>>>>> The new one would then automatically be used after either a reboot, or
>>>>>> manually removing the file between stop & start of the daemons.
>>>>>
>>>>> yeah i remember that being a blocker, but since pbs 4 is coming up
>>>>> soon-ish, couldn't we just apply it then? upgrading between 3 -> 4
>>>>> requiring a reboot seems reasonable to me, though maybe i'm missing
>>>>> something (e.g. could it be problematic to have the services running,
>>>>> even shortly, before the reboot?).
>>>>>
>>>>> if that would be an option, it'd be much simpler than carrying around
>>>>> that switching logic forever (or at least one major version?). also,
>
> One major version would be enough for all practical cases, as we require
> a reboot for the new kernel after an upgrade, which as a side effect
> ensures that no old PBS processes are around anymore. So people staying
> on an EOL version for a while and then doing two major upgrades in quick
> succession without any reboot are a bit on their own anyway. Adding a
> hint to the upgrade docs for that case would be cheap though, and would
> cover those admins who actually try to do a good job and are just fixing
> such an outdated setup without having been the one that caused it to
> exist in the first place.
>
>>> I don't think that works, unless we want to require putting all datastores
>>> into maintenance mode prior to the upgrade and until the system has been
>>> rebooted?
>>>
>>> otherwise, the upgrade from 3.x to 4.x is just like any other, with all the
>>> same problems if the old proxy still has a backup or other lock-holding task
>>> running and the new one uses different locking..
>
> That's what the flag is for, touch it on upgrade before the new daemon
> starts, in the new daemon set an internal global OnceLock (or the like)
> guarded flag and use that to determine if old or new locking needs to be
> used. On the next reboot the flag won't be there anymore and thus new
> locking mode is used.
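
[A minimal sketch of the mechanism described above: the daemon checks the marker file once and caches the decision in a `OnceLock` so every code path in the process agrees for its whole lifetime. The path and helper name are illustrative assumptions, not the actual PBS implementation.]

```rust
use std::path::Path;
use std::sync::OnceLock;

// Marker the (hypothetical) postinst would create before the new daemons
// start; /run is a tmpfs, so it is gone after the next reboot.
const OLD_LOCKING_FLAG: &str = "/run/proxmox-backup/old-locking";

static USE_OLD_LOCKING: OnceLock<bool> = OnceLock::new();

/// Returns true if this process must keep the pre-upgrade locking strategy.
/// The flag file is checked exactly once per process; removing the marker
/// later does not change the answer for an already-running daemon.
fn use_old_locking() -> bool {
    *USE_OLD_LOCKING.get_or_init(|| Path::new(OLD_LOCKING_FLAG).exists())
}

fn main() {
    if use_old_locking() {
        println!("keeping old locking strategy until reboot");
    } else {
        println!("using new locking strategy");
    }
}
```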
hm since this was sparked by the group removal bug (the one that leaves
the owner file in place), that would mean we can only fix that once we
are sure that the new locking mechanism is used? or do we build in that
contingency there too?
can send a v7 that adds the locking strategy switch later today
probably.
>> i feel like it would be fine to do that though? we already optionally
>> recommended that when upgrading from 2 -> 3 [1]. so requiring that and
>> documenting it in the upgrade notes sounds fine to me.
>>
>> [1]: https://pbs.proxmox.com/wiki/index.php/Upgrade_from_2_to_3#Optional:_Enable_Maintenance_Mode
>
> can be an option, but not requiring users to do this is always nicer for
> them and our support.
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot
2025-03-11 9:52 ` Shannon Sterz
@ 2025-03-11 10:40 ` Thomas Lamprecht
0 siblings, 0 replies; 10+ messages in thread
From: Thomas Lamprecht @ 2025-03-11 10:40 UTC (permalink / raw)
To: Shannon Sterz, Fabian Grünbichler,
Proxmox Backup Server development discussion, Wolfgang Bumiller
On 11/03/2025 10:52, Shannon Sterz wrote:
>> That's what the flag is for, touch it on upgrade before the new daemon
>> starts, in the new daemon set an internal global OnceLock (or the like)
>> guarded flag and use that to determine if old or new locking needs to be
>> used. On the next reboot the flag won't be there anymore and thus new
>> locking mode is used.
> hm since this was sparked by the group removal bug (the one that leaves
> the owner file in place), that would mean we can only fix that once we
> are sure that the new locking mechanism is used? or do we build in that
> contingency there too?
At least the global flag could be checked there too and one could select
behavior based on that, if sensible.
While the overhead on our (dev) side is slightly higher, it's not that
high with Rust. And FWIW, dropping it again should be relatively easy:
one can just remove the flag and then clean up all compile errors that
are caused by that removal, rinse and repeat style.
I do not want to set that solution in stone, but IMO our maintenance
cost amplification will be relatively low while the one for users won't
be, simply because we have a few orders of magnitude more users than
devs. So if the above is really feasible (it has been a while since
we thought this through and tbh. I did not recheck if that was the
exact same solution we came up with last time), then it should be worth
the small hassle we have to deal with for one major release.
> can send a v7 that adds the locking strategy switch later today
> probably.
Could be worth waiting to see if Fabian has a solid objection here, or
discussing that directly – given you both are desk neighbours ^^ – to
avoid potential extra work.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-03-11 10:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-06 12:08 [pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot Maximiliano Sandoval
2025-03-07 10:37 ` Shannon Sterz
2025-03-07 15:33 ` Wolfgang Bumiller
2025-03-07 15:53 ` Shannon Sterz
2025-03-10 10:53 ` Wolfgang Bumiller
2025-03-10 15:19 ` Fabian Grünbichler
2025-03-10 15:22 ` Shannon Sterz
2025-03-11 9:21 ` Thomas Lamprecht
2025-03-11 9:52 ` Shannon Sterz
2025-03-11 10:40 ` Thomas Lamprecht