From: Michael Köppl <m.koeppl@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [PATCH proxmox-backup v3 0/3] fix #7400: improve handling of corrupted job statefiles
Date: Wed, 25 Mar 2026 17:06:14 +0100
Message-ID: <20260325160617.342295-1-m.koeppl@proxmox.com>

This patch series fixes a problem where an empty or corrupted job state
file (due to I/O error, abrupt shutdown, ...)
would cause API endpoints for listing jobs to return an error, breaking
the web UI for users because they could not view any of their configured
jobs of that type. It would also cause proxmox-backup-proxy to skip the
affected job indefinitely, until a user manually triggered it and
thereby caused the statefile to be rewritten.

1/3 is a preparatory patch that centralizes job statefile loading in
compute_schedule_status, instead of having every handler function open
the statefile, handle potential errors, and then pass the JobState to
compute_schedule_status.

2/3 introduces a new JobState `Unknown`, representing cases in which the
job state could not be determined. In addition, the patch updates the
scheduling functions so that errors while reading a statefile result in
the Unknown state.

3/3 then uses this Unknown state and adapts the scheduling functions so
that an Unknown state leads to the statefile being overwritten with a
new Created state and the job running again at its next scheduled run.

changes since v2:
- introduced the Unknown state in 2/3, adapted 3/3 accordingly (thanks,
  @Fabian and @Christian)
- make sure the "could not open statefile" error is also printed in
  garbage_collection_status if status_in_memory.upid is None (thanks,
  @Christian)
- inline jobtype and err variables in error logging

changes since v1:
- added preparatory patch 1/3, centralizing the statefile loading before
  adapting the handling of the error case in that centralized place
  (compute_schedule_status). Thanks, Christian, for the suggestion!
- adapted the error message logged when job statefile loading fails to
  make clear that the default status will be returned as a fallback


proxmox-backup:

Michael Köppl (3):
  api: move statefile loading into compute_schedule_status
  fix #7400: api: gracefully handle corrupted job statefiles
  fix #7400: proxy: self-heal corrupted job statefiles

 src/api2/admin/datastore.rs     | 15 ++++------
 src/api2/admin/prune.rs         |  9 ++----
 src/api2/admin/sync.rs          |  9 ++----
 src/api2/admin/verify.rs        |  9 ++----
 src/api2/tape/backup.rs         |  9 ++----
 src/bin/proxmox-backup-proxy.rs |  4 ++-
 src/server/jobstate.rs          | 52 +++++++++++++++++++++++++++++----
 7 files changed, 67 insertions(+), 40 deletions(-)


Summary over all repositories:
  7 files changed, 67 insertions(+), 40 deletions(-)

-- 
Generated by murpp 0.11.0
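
For readers who want the idea behind 2/3 and 3/3 without opening the
patches, here is a minimal, standalone sketch. It is not the code from
src/server/jobstate.rs: the simplified JobState enum, the one-line text
format, and the helper names parse_state, load_state and heal_if_unknown
are all assumptions made for illustration. It only shows the pattern
described above: a statefile that cannot be read or parsed maps to
Unknown rather than an error, and Unknown is later replaced by a fresh
Created state written back to disk.

```rust
use std::fs;
use std::path::Path;

/// Simplified stand-in for the real job state (assumption: the actual
/// enum carries more data, such as task UPIDs and results).
#[derive(Debug, Clone, PartialEq)]
enum JobState {
    Created { time: i64 },
    Finished { endtime: i64 },
    /// The state could not be determined (empty or corrupted statefile).
    Unknown,
}

/// Hypothetical one-line format: "created <epoch>" or "finished <epoch>".
fn parse_state(raw: &str) -> Option<JobState> {
    let mut parts = raw.split_whitespace();
    let kind = parts.next()?;
    let time: i64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // trailing garbage counts as corruption
    }
    match kind {
        "created" => Some(JobState::Created { time }),
        "finished" => Some(JobState::Finished { endtime: time }),
        _ => None,
    }
}

/// Listing side: never fail, fall back to Unknown on any read or parse
/// error so that the remaining jobs can still be shown.
fn load_state(path: &Path) -> JobState {
    match fs::read_to_string(path) {
        Ok(raw) => parse_state(&raw).unwrap_or(JobState::Unknown),
        Err(_) => JobState::Unknown,
    }
}

/// Scheduler side: overwrite an Unknown state with a fresh Created state
/// so the job becomes eligible again at its next scheduled run.
fn heal_if_unknown(path: &Path, state: JobState, now: i64) -> std::io::Result<JobState> {
    if state != JobState::Unknown {
        return Ok(state);
    }
    fs::write(path, format!("created {now}\n"))?;
    Ok(JobState::Created { time: now })
}
```

In this sketch, the listing path would call only load_state and report
Unknown to the caller, while the scheduling path would call load_state
followed by heal_if_unknown. Handling of a missing statefile is left out
on purpose here and simply maps to Unknown as well.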