From: "Michael Köppl" <m.koeppl@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [PATCH proxmox-backup v3 2/3] fix #7400: api: gracefully handle corrupted job statefiles
Date: Wed, 25 Mar 2026 17:06:16 +0100
Message-ID: <20260325160617.342295-3-m.koeppl@proxmox.com>
In-Reply-To: <20260325160617.342295-1-m.koeppl@proxmox.com>

Introduce a JobState::Unknown variant to explicitly represent cases
where the state could not be determined, e.g. if the statefile was
corrupted or missing. Update JobState::load to handle parse errors
(for the statefile itself as well as for the UPIDs it contains) by
logging them and returning the Unknown state. Update
compute_schedule_status to handle the new Unknown state as well and to
fall back to a default JobScheduleStatus if loading fails, so that API
endpoints no longer return an error that prevents users from viewing
their jobs.

Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
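Note (not part of the commit): the fallback described above can be
sketched in isolation as follows. This is a minimal illustration under
stated assumptions, not the code from src/server/jobstate.rs: the
reduced JobState enum, the toy "created:<epoch>" / "started:<upid>"
statefile format, and the helper names parse_state and load_state are
all hypothetical and exist only for this example. The real code
deserializes the statefile with serde_json.

```rust
// Hypothetical, reduced stand-in for the real JobState enum.
#[derive(Debug, PartialEq)]
enum JobState {
    Created { time: i64 },
    Started { upid: String },
    Unknown,
}

// Toy parser for an assumed "created:<epoch>" / "started:<upid>" format.
// It stands in for the serde_json deserialization used by the real code.
fn parse_state(raw: &str) -> Result<JobState, String> {
    match raw.trim().split_once(':') {
        Some(("created", time)) => time
            .parse::<i64>()
            .map(|time| JobState::Created { time })
            .map_err(|err| format!("invalid time: {err}")),
        Some(("started", upid)) if !upid.is_empty() => Ok(JobState::Started {
            upid: upid.to_string(),
        }),
        _ => Err(format!("unrecognized statefile content: {raw:?}")),
    }
}

// A parse error is logged and mapped to Unknown instead of being
// propagated to the caller, so callers always receive some state.
fn load_state(jobname: &str, raw: &str) -> JobState {
    parse_state(raw).unwrap_or_else(|err| {
        eprintln!("could not parse statefile for {jobname}: {err}");
        JobState::Unknown
    })
}
```

With this shape, a caller such as a status endpoint can match on
Unknown and present a neutral status rather than failing the request.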
 src/server/jobstate.rs | 48 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/src/server/jobstate.rs b/src/server/jobstate.rs
index ceac8dde8..4163656e8 100644
--- a/src/server/jobstate.rs
+++ b/src/server/jobstate.rs
@@ -66,6 +66,7 @@ pub enum JobState {
         state: TaskState,
         updated: Option<i64>,
     },
+    Unknown,
 }
 
 /// Represents a Job and holds the correct lock
@@ -155,6 +156,7 @@ pub fn update_job_last_run_time(jobtype: &str, jobname: &str) -> Result<(), Erro
             state,
             updated: Some(time),
         },
+        JobState::Unknown => bail!("cannot update last run time for unknown job state"),
     };
     job.write_state()
 }
@@ -179,6 +181,7 @@ pub fn last_run_time(jobtype: &str, jobname: &str) -> Result<i64, Error> {
                 .map_err(|err| format_err!("could not parse upid from state: {err}"))?;
             Ok(upid.starttime)
         }
+        JobState::Unknown => bail!("statefile could not be parsed or was empty"),
     }
 }
 
@@ -191,11 +194,20 @@ impl JobState {
     /// This does not update the state in the file.
     pub fn load(jobtype: &str, jobname: &str) -> Result<Self, Error> {
         if let Some(state) = file_read_optional_string(get_path(jobtype, jobname))? {
-            match serde_json::from_str(&state)? {
+            let job_state = serde_json::from_str(&state).unwrap_or_else(|err| {
+                log::error!("could not parse statefile for {jobname}: {err}");
+                JobState::Unknown
+            });
+
+            match job_state {
                 JobState::Started { upid } => {
-                    let parsed: UPID = upid
-                        .parse()
-                        .map_err(|err| format_err!("error parsing upid: {err}"))?;
+                    let parsed: UPID = match upid.parse() {
+                        Ok(parsed) => parsed,
+                        Err(err) => {
+                            log::error!("error parsing upid for {jobname}: {err}");
+                            return Ok(JobState::Unknown);
+                        }
+                    };
 
                     if !worker_is_active_local(&parsed) {
                         let state = upid_read_status(&parsed).unwrap_or(TaskState::Unknown {
@@ -211,6 +223,21 @@ impl JobState {
                         Ok(JobState::Started { upid })
                     }
                 }
+                JobState::Finished {
+                    upid,
+                    state,
+                    updated,
+                } => {
+                    if let Err(err) = upid.parse::<UPID>() {
+                        log::error!("error parsing upid for {jobname}: {err}");
+                        return Ok(JobState::Unknown);
+                    }
+                    Ok(JobState::Finished {
+                        upid,
+                        state,
+                        updated,
+                    })
+                }
                 other => Ok(other),
             }
         } else {
@@ -263,6 +290,7 @@ impl Job {
             JobState::Created { .. } => bail!("cannot finish when not started"),
             JobState::Started { upid } => upid,
             JobState::Finished { upid, .. } => upid,
+            JobState::Unknown => bail!("cannot finish job with unknown status"),
         }
         .to_string();
 
@@ -305,8 +333,15 @@ pub fn compute_schedule_status(
     jobname: &str,
     schedule: Option<&str>,
 ) -> Result<JobScheduleStatus, Error> {
-    let job_state = JobState::load(jobtype, jobname)
-        .map_err(|err| format_err!("could not open statefile for {jobname}: {err}"))?;
+    let job_state = match JobState::load(jobtype, jobname) {
+        Ok(job_state) => job_state,
+        Err(err) => {
+            log::error!(
+                "could not open statefile for {jobname}: {err} - falling back to default job schedule status",
+            );
+            return Ok(JobScheduleStatus::default());
+        }
+    };
 
     let (upid, endtime, state, last) = match job_state {
         JobState::Created { time } => (None, None, None, time),
@@ -327,6 +362,7 @@ pub fn compute_schedule_status(
                 last,
             )
         }
+        JobState::Unknown => (None, None, None, proxmox_time::epoch_i64() - 30),
     };
 
     let mut status = JobScheduleStatus {
-- 
2.47.3





