From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Hannes Laimer <h.laimer@proxmox.com>,
Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] task tracking: improve pruning of reused-PID stale entries
Date: Mon, 17 Nov 2025 13:07:26 +0100
Message-ID: <1763378788.sn0xt8s5og.astroid@yuna.none>
In-Reply-To: <8d1f6060-d546-4007-bc64-f95f51db9110@proxmox.com>
On November 17, 2025 10:47 am, Hannes Laimer wrote:
> On 11/17/25 10:30, Fabian Grünbichler wrote:
>> On November 17, 2025 9:58 am, Hannes Laimer wrote:
>>> Keep entries only when check_process_running(pid) reports a
>>> starttime equal to the stored one. This improves pruning of stale
>>> entries for all PIDs (not just the current one) and aligns update
>>> with the read path. Counting behavior and semantics are unchanged.
>>
>> I don't think this is correct? the starttime comparison is only valid
>> for the current process, an old process will almost certainly have a
>> different starting time and we still want to keep its entry if it is
>> still running..
>>
>
> But not the same PID. This drops an entry only if the starttime for a
> pid in `/proc/<pid>/stat` and the tracking file don't match.
> we can't have a reused pid with the old process not dead
>
> hope I'm not missing something :P
>
ah right, the starttime comparison there is between the running process'
pidstat and the task entry, not against the pidstat of our own (current)
PID, sorry..

then we could simply re-order things like this to make it more
readable:

    let mut updated_tasks: Vec<TaskOperations> = match file_read_optional_string(&path)? {
        Some(data) => serde_json::from_str::<Vec<TaskOperations>>(&data)?
            .into_iter()
            .filter_map(
                |mut task| match procfs::check_process_running(task.pid as pid_t) {
                    // Drop entries for recycled PIDs
                    Some(stat) if stat.starttime != task.starttime => None,
                    // Update entry for current PID
                    Some(_stat) if pid == task.pid => {
                        found_entry = true;
                        match operation {
                            Operation::Read => task.active_operations.read += count,
                            Operation::Write => task.active_operations.write += count,
                            Operation::Lookup => (), // no IO must happen there
                        };
                        updated_active_operations = task.active_operations;
                        Some(task)
                    }
                    // Keep other entries
                    Some(_stat) => Some(task),
                    // Drop entries for PIDs which are not running..
                    None => None,
                },
            )
            .collect(),
        None => Vec::new(),
    };

or, since we already have a helper implementing these semantics:

    let mut updated_tasks: Vec<TaskOperations> = match file_read_optional_string(&path)? {
        Some(data) => serde_json::from_str::<Vec<TaskOperations>>(&data)?
            .into_iter()
            .filter_map(|mut task| {
                match procfs::check_process_running_pstart(task.pid as pid_t, task.starttime) {
                    // Update entry for current PID
                    Some(_stat) if pid == task.pid => {
                        found_entry = true;
                        match operation {
                            Operation::Read => task.active_operations.read += count,
                            Operation::Write => task.active_operations.write += count,
                            Operation::Lookup => (), // no IO must happen there
                        };
                        updated_active_operations = task.active_operations;
                        Some(task)
                    }
                    // Keep other entries
                    Some(_stat) => Some(task),
                    // Drop entries for PIDs which are not running or have been recycled
                    None => None,
                }
            })
            .collect(),
        None => Vec::new(),
    };

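to illustrate the keep/drop rule that both variants implement, here is a
minimal standalone sketch. It is not the actual code: `Entry`, `prune_stale`
and the `lookup` closure are hypothetical names made up for this example, and
`lookup` merely stands in for reading the start time from `/proc/<pid>/stat`:

```rust
/// Hypothetical, simplified stand-in for a task tracking entry.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    pid: u32,
    starttime: u64,
}

/// Keep only entries whose PID is still running with the recorded start time.
///
/// `lookup` is an assumed stand-in for querying procfs: it returns the start
/// time of the process currently owning `pid`, or `None` if there is none.
fn prune_stale<F>(entries: Vec<Entry>, lookup: F) -> Vec<Entry>
where
    F: Fn(u32) -> Option<u64>,
{
    entries
        .into_iter()
        // A dead PID yields None, a recycled PID yields a different start
        // time; in both cases the comparison fails and the entry is dropped.
        .filter(|entry| lookup(entry.pid) == Some(entry.starttime))
        .collect()
}
```

with a `lookup` that reports the recorded start time for one PID, a different
start time for a second PID and nothing for a third, only the first entry
survives.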
we also only ever call update_active_operations with a count of 1 or -1,
which might be worth fixing as well ;) right now, if for some reason we
call it with -1 but there is no entry for our PID, we'd end up with 1
active operation instead of 0 or an error..
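
one possible way to handle that case, as a rough sketch only: `apply_count`
is a hypothetical helper that does not exist in the code base, and returning
an error (rather than clamping to 0) is just one of the two options mentioned
above:

```rust
/// Hypothetical helper: compute the new counter value for an entry.
///
/// `current` is the existing counter, or `None` if there is no entry yet.
/// Returns an error instead of letting the counter go below zero.
fn apply_count(current: Option<i64>, count: i64) -> Result<i64, String> {
    let base = current.unwrap_or(0);
    let updated = base
        .checked_add(count)
        .ok_or_else(|| "active operation counter overflow".to_string())?;
    if updated < 0 {
        return Err(format!(
            "active operation count would become negative ({base} + {count})"
        ));
    }
    Ok(updated)
}
```

with this, a decrement without an existing entry is reported as an error
instead of silently creating an entry with one active operation.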
>
>> if we want to improve this, we would need to query the process starttime
>> for all entries, and then compare, but that would make this more
>> expensive..
>>
>>>
>>> Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
>>> ---
>>> noticed while looking through recent proposed changes to the tracking
>>> logic. this isn't a problem, but should keep the tracking file cleaner
>>> and remove a match arm in the code
>>>
>>> pbs-datastore/src/task_tracking.rs | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/pbs-datastore/src/task_tracking.rs b/pbs-datastore/src/task_tracking.rs
>>> index 44a4522d..4fcbbaa4 100644
>>> --- a/pbs-datastore/src/task_tracking.rs
>>> +++ b/pbs-datastore/src/task_tracking.rs
>>> @@ -114,8 +114,7 @@ pub fn update_active_operations(
>>> .iter_mut()
>>> .filter_map(
>>> |task| match procfs::check_process_running(task.pid as pid_t) {
>>> - Some(stat) if pid == task.pid && stat.starttime != task.starttime => None,
>>> - Some(_) => {
>>> + Some(stat) if stat.starttime == task.starttime => {
>>> if pid == task.pid {
>>> found_entry = true;
>>> match operation {
>>> --
>>> 2.47.3
>>>
>>>
>>>
>>> _______________________________________________
>>> pbs-devel mailing list
>>> pbs-devel@lists.proxmox.com
>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>>
>>>
>>>
Thread overview: 4+ messages
2025-11-17 8:58 Hannes Laimer
2025-11-17 9:30 ` Fabian Grünbichler
2025-11-17 9:47 ` Hannes Laimer
2025-11-17 12:07 ` Fabian Grünbichler [this message]