public inbox for pbs-devel@lists.proxmox.com
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Hannes Laimer <h.laimer@proxmox.com>,
	Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] task tracking: improve pruning of reused-PID stale entries
Date: Mon, 17 Nov 2025 13:07:26 +0100
Message-ID: <1763378788.sn0xt8s5og.astroid@yuna.none>
In-Reply-To: <8d1f6060-d546-4007-bc64-f95f51db9110@proxmox.com>

On November 17, 2025 10:47 am, Hannes Laimer wrote:
> On 11/17/25 10:30, Fabian Grünbichler wrote:
>> On November 17, 2025 9:58 am, Hannes Laimer wrote:
>>> Keep entries only when check_process_running(pid) reports a
>>> starttime equal to the stored one. This improves pruning of stale
>>> entries for all PIDs (not just the current one) and aligns update
>>> with the read path. Counting behavior and semantics are unchanged.
>> 
>> I don't think this is correct? the starttime comparison is only valid
>> for the current process, an old process will almost certainly have a
>> different starting time and we still want to keep its entry if it is
>> still running..
>> 
> 
> But not the same PID. This drops an entry only if the starttime for a 
> pid in `/proc/<pid>/stat` and the tracking file don't match.
> we can't have a reused pid with the old process not dead
> 
> hope I'm not missing something :P
> 

ah right, the starttime comparison there is between the pidstat of the
process currently running under that PID and the task entry, not against
the pidstat we did for our own current PID, sorry..

then we could simply re-order things like this to make this more
readable:

    let mut updated_tasks: Vec<TaskOperations> = match file_read_optional_string(&path)? {
        Some(data) => serde_json::from_str::<Vec<TaskOperations>>(&data)?
            .into_iter()
            .filter_map(
                |mut task| match procfs::check_process_running(task.pid as pid_t) {
                    // Drop entries for recycled PIDs
                    Some(stat) if stat.starttime != task.starttime => None,
                    // Update entry for current PID
                    Some(_stat) if pid == task.pid => {
                        found_entry = true;
                        match operation {
                            Operation::Read => task.active_operations.read += count,
                            Operation::Write => task.active_operations.write += count,
                            Operation::Lookup => (), // no IO must happen there
                        };
                        updated_active_operations = task.active_operations;
                        Some(task)
                    }
                    // Keep other entries
                    Some(_stat) => Some(task),
                    // Drop entries for PIDs which are not running..
                    None => None,
                },
            )
            .collect(),
        None => Vec::new(),
    };

or, since we already have a helper implementing these semantics:

    let mut updated_tasks: Vec<TaskOperations> = match file_read_optional_string(&path)? {
        Some(data) => serde_json::from_str::<Vec<TaskOperations>>(&data)?
            .into_iter()
            .filter_map(|mut task| {
                match procfs::check_process_running_pstart(task.pid as pid_t, task.starttime) {
                    // Update entry for current PID
                    Some(_stat) if pid == task.pid => {
                        found_entry = true;
                        match operation {
                            Operation::Read => task.active_operations.read += count,
                            Operation::Write => task.active_operations.write += count,
                            Operation::Lookup => (), // no IO must happen there
                        };
                        updated_active_operations = task.active_operations;
                        Some(task)
                    }
                    // Keep other entries
                    Some(_stat) => Some(task),
                    // Drop entries for PIDs which are not running or have been recycled
                    None => None,
                }
            })
            .collect(),
        None => Vec::new(),
    };
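
to make the keep/drop rules from both variants easy to try out without
procfs, here is a minimal self-contained sketch. all names in it (`Entry`,
`running_with_starttime`, `prune_and_update`) are hypothetical, the
HashMap stands in for the real /proc lookup, and it only tracks a single
read counter, so it is an illustration of the semantics and not the
actual task_tracking code:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one tracking-file entry (not the real TaskOperations).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    pid: u32,
    starttime: u64,
    reads: i64,
}

/// Hypothetical stand-in for the procfs helper: `procs` maps each running
/// PID to its starttime. True only if the PID is running AND the starttime
/// matches the recorded one, i.e. the PID has not been recycled.
fn running_with_starttime(procs: &HashMap<u32, u64>, pid: u32, starttime: u64) -> bool {
    procs.get(&pid) == Some(&starttime)
}

/// Drop entries of dead or recycled PIDs, add `count` to the entry of
/// `own_pid`, keep all other entries unchanged. Returns the remaining
/// entries and whether an entry for `own_pid` was found.
fn prune_and_update(
    entries: Vec<Entry>,
    procs: &HashMap<u32, u64>,
    own_pid: u32,
    count: i64,
) -> (Vec<Entry>, bool) {
    let mut found = false;
    let kept: Vec<Entry> = entries
        .into_iter()
        .filter_map(|mut entry| {
            if !running_with_starttime(procs, entry.pid, entry.starttime) {
                // not running, or running with a different starttime
                return None;
            }
            if entry.pid == own_pid {
                found = true;
                entry.reads += count;
            }
            Some(entry)
        })
        .collect();
    (kept, found)
}
```

with a process table that knows PID 100 (matching starttime) and PID 200
(different starttime), and entries for PIDs 100, 200 and 300, only the
entry for PID 100 would survive, with its counter updated.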

we also only ever call update_active_operations with a count of 1 or -1,
which might be worth fixing as well ;) right now, if for some reason we
call it with -1 but there is no entry, we'd end up with 1 active
operation instead of 0 or an error..
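
one possible shape for such a fix, purely as a sketch: replace the raw
count with an explicit direction and turn "decrement without an entry"
into an error. `Change` and `apply_change` are made-up names for
illustration, and the u32 counter type is an assumption as well:

```rust
/// Hypothetical replacement for a raw +1 / -1 count.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Change {
    Increment,
    Decrement,
}

/// Apply a change to the counter of an existing entry (`Some`) or to a
/// missing entry (`None`). Decrementing a missing entry or a zero counter
/// is reported as an error instead of silently producing a count.
fn apply_change(current: Option<u32>, change: Change) -> Result<u32, String> {
    match (current, change) {
        // first operation of a new entry
        (None, Change::Increment) => Ok(1),
        (Some(n), Change::Increment) => n
            .checked_add(1)
            .ok_or_else(|| "active operation counter overflow".to_string()),
        (Some(n), Change::Decrement) => n
            .checked_sub(1)
            .ok_or_else(|| "active operation counter is already 0".to_string()),
        (None, Change::Decrement) => Err("no entry to decrement".to_string()),
    }
}
```

whether the missing-entry case should be a hard error or just be logged
and treated as 0 is a separate design question, the sketch picks the
error variant only to make the case visible.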

> 
>> if we want to improve this, we would need to query the process starttime
>> for all entries, and then compare, but that would make this more
>> expensive..
>> 
>>>
>>> Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
>>> ---
>>> noticed while looking through recent proposed changes to the tracking
>>> logic. this isn't a problem, but should keep the tracking file cleaner
>>> and remove a match arm in the code
>>>
>>>   pbs-datastore/src/task_tracking.rs | 3 +--
>>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/pbs-datastore/src/task_tracking.rs b/pbs-datastore/src/task_tracking.rs
>>> index 44a4522d..4fcbbaa4 100644
>>> --- a/pbs-datastore/src/task_tracking.rs
>>> +++ b/pbs-datastore/src/task_tracking.rs
>>> @@ -114,8 +114,7 @@ pub fn update_active_operations(
>>>               .iter_mut()
>>>               .filter_map(
>>>                   |task| match procfs::check_process_running(task.pid as pid_t) {
>>> -                    Some(stat) if pid == task.pid && stat.starttime != task.starttime => None,
>>> -                    Some(_) => {
>>> +                    Some(stat) if stat.starttime == task.starttime => {
>>>                           if pid == task.pid {
>>>                               found_entry = true;
>>>                               match operation {
>>> -- 
>>> 2.47.3
>>>
>>>
>>>
>>> _______________________________________________
>>> pbs-devel mailing list
>>> pbs-devel@lists.proxmox.com
>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

Thread overview: 5+ messages
2025-11-17  8:58 Hannes Laimer
2025-11-17  9:30 ` Fabian Grünbichler
2025-11-17  9:47   ` Hannes Laimer
2025-11-17 12:07     ` Fabian Grünbichler [this message]
2025-11-20  6:05 ` [pbs-devel] superseded: " Hannes Laimer