From: "Michael Köppl" <m.koeppl@proxmox.com>
To: "Michael Köppl" <m.koeppl@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: superseded: [PATCH proxmox-backup v4 0/3] fix #7400: improve handling of corrupted job statefiles
Date: Mon, 13 Apr 2026 15:21:41 +0200
Message-ID: <DHS24AT81BUH.RPRX6WKPVRT9@proxmox.com>
In-Reply-To: <20260403132628.210128-1-m.koeppl@proxmox.com>
Superseded by:
https://lore.proxmox.com/pbs-devel/20260413132000.49889-1-m.koeppl@proxmox.com
On Fri Apr 3, 2026 at 3:26 PM CEST, Michael Köppl wrote:
> This patch series fixes a problem [0] where an empty or corrupted job
> statefile (due to an I/O error, abrupt shutdown, ...) would cause the
> API endpoints for listing jobs to return an error. This broke the web
> UI for users, because they could not view any of their configured jobs
> of that type. It would also cause proxmox-backup-proxy to skip the
> affected jobs indefinitely until a user manually triggered them,
> causing the statefile to be rewritten.
>
> 1/3 is a preparatory patch that centralizes job statefile loading
> in compute_schedule_status instead of having every handler function
> open the statefile, handle potential errors and then pass the
> JobState to compute_schedule_status.
>
> 2/3 introduces a new JobState `Unknown`, representing cases in which the
> job state could not be determined. The patch also updates the
> scheduling functions so that errors while reading a statefile result
> in the Unknown state.
>
> 3/3 then uses this Unknown state and adapts the scheduling functions
> so that a job in the Unknown state has its statefile overwritten with
> a new Created state and runs again at its next scheduled run.
>
> changes since v3:
> - adapted the commit message of 1/3 to mention the change in behavior
>   regarding the handling of UPID parsing errors with garbage collection
>   state files
> - in 2/3, adapted JobState::load to return early with JobState::Unknown
> - defined a constant for the scheduling offset used when calculating
>   the last run time. The constant is introduced in 2/3 and also used in
>   3/3
>
> Thanks for the feedback on v3, @Christian!
>
> changes since v2:
> - introduced the Unknown state in 2/3, adapted 3/3 accordingly (thanks,
>   @Fabian and @Christian)
> - made sure the "could not open statefile" error is also printed in
>   garbage_collection_status if status_in_memory.upid is None (thanks,
>   @Christian)
> - inlined the jobtype and err variables in error logging
>
> changes since v1:
> - added preparatory patch 1/3, centralizing the statefile loading before
>   adapting the handling of the error case in that centralized place
>   (compute_schedule_status). Thanks, Christian, for the suggestion!
> - adapted the error message shown if job statefile loading fails to
>   make clear that the default status will be returned as a fallback
>
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=7400
>
> proxmox-backup:
>
> Michael Köppl (3):
>   api: move statefile loading into compute_schedule_status
>   fix #7400: api: gracefully handle corrupted job statefiles
>   fix #7400: proxy: self-heal corrupted job statefiles
>
>  src/api2/admin/datastore.rs     | 15 +++-----
>  src/api2/admin/prune.rs         |  9 ++---
>  src/api2/admin/sync.rs          |  9 ++---
>  src/api2/admin/verify.rs        |  9 ++---
>  src/api2/tape/backup.rs         |  9 ++---
>  src/bin/proxmox-backup-proxy.rs |  6 ++-
>  src/server/jobstate.rs          | 65 +++++++++++++++++++++++++++++----
>  7 files changed, 80 insertions(+), 42 deletions(-)
>
>
> Summary over all repositories:
> 7 files changed, 80 insertions(+), 42 deletions(-)
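
For readers skimming the thread, the flow described for 2/3 and 3/3 can
be sketched roughly as below. This is only an illustration under stated
assumptions, not the code from the series: the two-variant JobState, the
"created:<epoch>" statefile format, and the names parse_state,
load_state and heal_state are all hypothetical. Treating a missing
statefile as a fresh Created state is likewise an assumption made for
the sketch.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical, simplified stand-in for the job state discussed above.
#[derive(Debug, Clone, PartialEq)]
enum JobState {
    /// The job exists but has not run since `time` (epoch seconds).
    Created { time: i64 },
    /// The statefile could not be read or parsed.
    Unknown,
}

/// Parse a statefile body. The "created:<epoch>" format is made up for
/// this sketch and is not the real on-disk format.
fn parse_state(raw: &str) -> Option<JobState> {
    let time = raw.trim().strip_prefix("created:")?.parse::<i64>().ok()?;
    Some(JobState::Created { time })
}

/// Load the state without ever returning an error to the caller:
/// an empty, corrupted or unreadable file maps to `Unknown`.
fn load_state(path: &Path, now: i64) -> JobState {
    match fs::read_to_string(path) {
        Ok(raw) => parse_state(&raw).unwrap_or(JobState::Unknown),
        // Assumption: a missing file simply means the job is new.
        Err(err) if err.kind() == ErrorKind::NotFound => JobState::Created { time: now },
        Err(_) => JobState::Unknown,
    }
}

/// Self-heal step: if the state is `Unknown`, overwrite the statefile
/// with a fresh `Created` state so the job is scheduled again.
fn heal_state(path: &Path, now: i64) -> std::io::Result<JobState> {
    let state = load_state(path, now);
    if state != JobState::Unknown {
        return Ok(state);
    }
    fs::write(path, format!("created:{now}"))?;
    Ok(JobState::Created { time: now })
}
```

In this sketch a listing endpoint would call load_state and display
Unknown instead of failing, while the scheduler would call heal_state
so that a damaged statefile is replaced the next time it is inspected.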