From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Hannes Laimer <h.laimer@proxmox.com>, pbs-devel@lists.proxmox.com
Cc: Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [pbs-devel] [PATCH FOLLOW-UP proxmox-backup 2/4] task tracking: actually reset entry if desynced
Date: Thu, 20 Nov 2025 11:22:53 +0100
Message-ID: <1763633501.91d4npp4ky.astroid@yuna.none>
In-Reply-To: <b74d6f3e-e241-4aed-88dd-d7485c567208@proxmox.com>

On November 20, 2025 10:37 am, Hannes Laimer wrote:
> hmm, I'm not sure pushing a new 0/0 entry in that case adds much...
> logging this though makes a lot of sense
> 
> actually, I think my patch is not correct. If we have `0/0` and call
> update with -1 we'd end up with a -1 count in the tracking file.
> decrementing is also a problem with a 0 counter, not just with 
> non-existing entries.

that's true. maybe we should first answer the question of how we want to
handle such a mismatch, and then think about implementation details ;)

AFAICT:

- we add an operation during datastore lookup (two calls)
- we add an operation when cloning a datastore instance (one call)
- we remove an operation when dropping a datastore instance (one call)
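
as a rough illustration of that lifecycle (all names are made up, and an
in-memory map stands in for the tracking file, so this is not the actual
pbs-datastore code): lookup and clone register an operation, drop
unregisters it.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for the on-disk tracking file: datastore name -> count.
type Tracker = Rc<RefCell<HashMap<String, i64>>>;

/// Hypothetical handle, NOT the real datastore type.
struct Handle {
    name: String,
    tracker: Tracker,
}

impl Handle {
    /// Looking up a datastore registers one active operation.
    fn lookup(name: &str, tracker: &Tracker) -> Self {
        *tracker.borrow_mut().entry(name.to_string()).or_insert(0) += 1;
        Handle {
            name: name.to_string(),
            tracker: Rc::clone(tracker),
        }
    }
}

impl Clone for Handle {
    /// Cloning registers another operation for the same datastore.
    fn clone(&self) -> Self {
        Handle::lookup(&self.name, &self.tracker)
    }
}

impl Drop for Handle {
    /// Dropping unregisters the operation again.
    fn drop(&mut self) {
        if let Some(count) = self.tracker.borrow_mut().get_mut(&self.name) {
            *count -= 1;
        }
    }
}

/// Current count for a datastore (0 if there is no entry).
fn active(tracker: &Tracker, name: &str) -> i64 {
    tracker.borrow().get(name).copied().unwrap_or(0)
}
```

as long as every handle is dropped, increments and decrements pair up. a
handle that is leaked (e.g. via std::mem::forget) never runs its drop
handler, so its increment stays in the count.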

there are some more calls which are only used by examples and should
maybe be dropped..

if a process crashes without executing the drop handler, a left-over
entry could exist. but such an entry will be cleaned up by the next
update_active_operations call since the PID is no longer valid.
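
a minimal sketch of that cleanup, assuming a simplified entry layout and a
Linux-only /proc check (both are assumptions for illustration, the real
tracking file and liveness check may look different):

```rust
/// Hypothetical per-process entry; the real file format may differ.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    pid: u32,
    read: i64,
    write: i64,
}

/// Keep only entries whose process still exists. The liveness check is
/// passed in so the pruning logic can be exercised without real processes.
fn prune_stale(entries: Vec<Entry>, is_alive: impl Fn(u32) -> bool) -> Vec<Entry> {
    entries.into_iter().filter(|e| is_alive(e.pid)).collect()
}

/// Assumed liveness check: does /proc/<pid> exist? (Linux only; a bare PID
/// check like this cannot tell a reused PID from the original process.)
fn pid_exists(pid: u32) -> bool {
    std::path::Path::new(&format!("/proc/{pid}")).exists()
}
```

calling prune_stale(entries, pid_exists) on every update would drop the
counts of crashed processes without any explicit decrement.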

so the only remaining issues would be:
- explicitly leaking instead of dropping a datastore (should never be
  done)
- manually editing the active operations file
- unlinking the lock file while it is used

effectively, if we ever end up with an active operation count < 0 for a
given PID, we know something is wrong. but we cannot recover for this
particular PID, so maybe we should add a poison flag (or use a negative
count as such), and require that process to exit before considering the
datastore to be "sane" again?
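
to make the poison idea a bit more concrete, here is one possible shape
(purely a sketch: the Count type and its apply method are made up, and it
uses a separate variant rather than a negative count as the marker):

```rust
/// Hypothetical per-PID counter state; `Poisoned` plays the role of the
/// proposed poison flag.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Count {
    Active(u64),
    Poisoned,
}

impl Count {
    /// Apply a signed delta. An update that would take the counter below
    /// zero (or overflow it) poisons the entry, and a poisoned entry stays
    /// poisoned no matter what is applied afterwards.
    fn apply(self, delta: i64) -> Count {
        match self {
            Count::Poisoned => Count::Poisoned,
            Count::Active(n) => match n.checked_add_signed(delta) {
                Some(v) => Count::Active(v),
                None => Count::Poisoned,
            },
        }
    }
}
```

with this, the `0/0` plus -1 case from above ends up as Poisoned instead of
a -1 count in the file, and the entry would only go away once the PID is
pruned after the process exits.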

there are only a few places where the operation counts matter:
- removal from the cache map to close FDs when the last task exits, in
  case a certain maintenance mode is set
- waiting for active tasks to be done before activating certain
  maintenance modes

neither of these can be done (safely) if we can no longer tell whether
there are active tasks..
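
in other words, both places would need a third outcome besides "tasks
running" and "no tasks". a sketch of such a check, with a hypothetical
Stats summary and Decision type that do not exist in the code base:

```rust
/// Hypothetical summary of the tracking file across all PIDs.
struct Stats {
    read: u64,
    write: u64,
    /// Set if any PID's accounting is known to be inconsistent.
    poisoned: bool,
}

#[derive(Debug, PartialEq)]
enum Decision {
    /// No active tasks: safe to close FDs / activate the maintenance mode.
    Proceed,
    /// Tasks are still running: wait for them to finish.
    WaitForTasks,
    /// Accounting is inconsistent: we cannot tell, so do neither.
    Unknown,
}

fn may_proceed(stats: &Stats) -> Decision {
    if stats.poisoned {
        Decision::Unknown
    } else if stats.read + stats.write > 0 {
        Decision::WaitForTasks
    } else {
        Decision::Proceed
    }
}
```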

> 
> On 11/20/25 10:03, Fabian Grünbichler wrote:
>> and warn about it. this *should* never happen unless the tracking file got
>> somehow messed with manually..
>> 
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>> ---
>> This one fixes the replied-to patch to also correctly store an entry with no
>> tasks for the current PID, instead of just returning that there are none..
>> 
>> I am actually not sure how we should handle such a desync, we now pretend it's
>> the last task even though we don't know for sure.. maybe we should just error
>> out and let the Drop handler (not) handle it?
>> 
>>   pbs-datastore/src/task_tracking.rs | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>> 
>> diff --git a/pbs-datastore/src/task_tracking.rs b/pbs-datastore/src/task_tracking.rs
>> index 10afebbe2..755d88fdf 100644
>> --- a/pbs-datastore/src/task_tracking.rs
>> +++ b/pbs-datastore/src/task_tracking.rs
>> @@ -94,7 +94,7 @@ pub fn get_active_operations_locked(
>>   pub fn update_active_operations(
>>       name: &str,
>>       operation: Operation,
>> -    count: i64,
>> +    mut count: i64,
>>   ) -> Result<ActiveOperationStats, Error> {
>>       let path = PathBuf::from(format!("{}/{}", crate::ACTIVE_OPERATIONS_DIR, name));
>>   
>> @@ -131,7 +131,15 @@ pub fn update_active_operations(
>>           None => Vec::new(),
>>       };
>>   
>> -    if !found_entry && count > 0 {
>> +    if !found_entry {
>> +        if count < 0 {
>> +            // if we don't have any operations at the moment, decrementing is not possible..
>> +            log::warn!(
>> +                "Active operations tracking mismatch - no current entry for {pid} but asked
>> +to decrement by {count}!"
>> +            );
>> +            count = 0;
>> +        };
>>           match operation {
>>               Operation::Read => updated_active_operations.read = count,
>>               Operation::Write => updated_active_operations.write = count,
> 
> 

