public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH proxmox-backup] fix #4895: jobs: ignore task log not found error
@ 2023-09-20 14:11 Gabriel Goller
  2023-09-27 15:41 ` [pbs-devel] applied: " Thomas Lamprecht
  0 siblings, 1 reply; 3+ messages in thread
From: Gabriel Goller @ 2023-09-20 14:11 UTC (permalink / raw)
  To: pbs-devel

Use job starttime as endtime when it is stuck in `JobState::Started`
and no task log exists.
A user experienced a power loss, which left a gc job in the `Started`
state, but the task log did not exist. This breaks the schedule, so no
further gc job runs. Now the error is simply ignored and a new gc job is
started on the next occurrence.

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
---
 src/server/jobstate.rs | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/server/jobstate.rs b/src/server/jobstate.rs
index a13d768b..be9dac42 100644
--- a/src/server/jobstate.rs
+++ b/src/server/jobstate.rs
@@ -198,8 +198,9 @@ impl JobState {
                         .map_err(|err| format_err!("error parsing upid: {err}"))?;
 
                     if !worker_is_active_local(&parsed) {
-                        let state = upid_read_status(&parsed)
-                            .map_err(|err| format_err!("error reading upid log status: {err}"))?;
+                        let state = upid_read_status(&parsed).unwrap_or(TaskState::Unknown {
+                            endtime: parsed.starttime,
+                        });
 
                         Ok(JobState::Finished {
                             upid,
-- 
2.39.2
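
For illustration, the fallback in the hunk above can be reduced to a
standalone sketch. Everything here is a simplified stand-in and an
assumption, not the real proxmox-backup code: the two-variant `TaskState`,
the reduced `Upid` struct, the `read_status` helper (playing the role of
`upid_read_status`) and the "TASK OK" check are only there to make the
example compile on its own.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Simplified stand-in for the real `TaskState`; only two variants are modelled.
#[derive(Debug, PartialEq)]
enum TaskState {
    Ok { endtime: i64 },
    Unknown { endtime: i64 },
}

/// Reduced, hypothetical form of a parsed UPID: only the start time is kept.
struct Upid {
    starttime: i64,
}

/// Stand-in for `upid_read_status`: returns an error if the log cannot be read.
fn read_status(log: &Path, upid: &Upid) -> io::Result<TaskState> {
    let content = fs::read_to_string(log)?;
    // Assumption for this sketch: a last line ending in "TASK OK" means success.
    let ok = content
        .lines()
        .last()
        .map_or(false, |line| line.ends_with("TASK OK"));
    Ok(if ok {
        TaskState::Ok { endtime: upid.starttime }
    } else {
        TaskState::Unknown { endtime: upid.starttime }
    })
}

/// Mirrors the patched call site: any read error degrades to `Unknown`,
/// using the task start time as the end time.
fn state_or_unknown(log: &Path, upid: &Upid) -> TaskState {
    read_status(log, upid).unwrap_or(TaskState::Unknown {
        endtime: upid.starttime,
    })
}
```

Note that `unwrap_or` swallows every error, not only a missing log file, so
a permission problem would also end up as `Unknown` in this sketch.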






* [pbs-devel] applied: [PATCH proxmox-backup] fix #4895: jobs: ignore task log not found error
  2023-09-20 14:11 [pbs-devel] [PATCH proxmox-backup] fix #4895: jobs: ignore task log not found error Gabriel Goller
@ 2023-09-27 15:41 ` Thomas Lamprecht
  2023-09-28  7:32   ` Gabriel Goller
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Lamprecht @ 2023-09-27 15:41 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Gabriel Goller

On 20/09/2023 at 16:11, Gabriel Goller wrote:
> Use job starttime as endtime when it is stuck in `JobState::Started`
> and no task log exists.
> A user experienced a power loss, which left a gc job in the `Started`
> state, but the task log did not exist. This breaks the schedule, so no
> further gc job runs. Now the error is simply ignored and a new gc job is
> started on the next occurrence.
> 
> Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
> ---
>  src/server/jobstate.rs | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
>

applied, thanks!

IMO it might be worth doing this more centrally, e.g., catch an ENOENT in
upid_read_status's `File::open(path)` call and return either
`TaskState::Unknown { endtime: upid.starttime }`, which is also the
default of upid_read_status on other (parsing) errors, or add a new
`TaskState::NotFound` state to distinguish between an unknown result and
this situation, and make it more likely that call sites handle this
explicitly.
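
A rough sketch of the second option, to make the idea concrete. This is
not existing code: `TaskState` is reduced to three variants, `NotFound` is
the hypothetical new one, `read_status` only stands in for
upid_read_status, and the "TASK OK" check is an assumption made to keep
the example self-contained.

```rust
use std::fs::File;
use std::io::{self, ErrorKind, Read};
use std::path::Path;

/// Simplified stand-in for the real `TaskState`.
#[derive(Debug, PartialEq)]
enum TaskState {
    Ok { endtime: i64 },
    Unknown { endtime: i64 },
    /// Hypothetical variant: the task log does not exist at all.
    NotFound,
}

/// Stand-in for `upid_read_status` that handles a missing log centrally.
fn read_status(log: &Path, starttime: i64) -> io::Result<TaskState> {
    let mut file = match File::open(log) {
        Ok(file) => file,
        // ENOENT is reported as `ErrorKind::NotFound` by the standard library.
        Err(err) if err.kind() == ErrorKind::NotFound => return Ok(TaskState::NotFound),
        // Every other I/O error is still propagated to the caller.
        Err(err) => return Err(err),
    };

    let mut content = String::new();
    file.read_to_string(&mut content)?;

    // Assumption for this sketch: a last line ending in "TASK OK" means success.
    let ok = content
        .lines()
        .last()
        .map_or(false, |line| line.ends_with("TASK OK"));
    Ok(if ok {
        TaskState::Ok { endtime: starttime }
    } else {
        TaskState::Unknown { endtime: starttime }
    })
}
```

With a dedicated variant, a `match` at the call site has to name `NotFound`
explicitly, or fall into a wildcard arm on purpose.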

What do you think?





* Re: [pbs-devel] applied: [PATCH proxmox-backup] fix #4895: jobs: ignore task log not found error
  2023-09-27 15:41 ` [pbs-devel] applied: " Thomas Lamprecht
@ 2023-09-28  7:32   ` Gabriel Goller
  0 siblings, 0 replies; 3+ messages in thread
From: Gabriel Goller @ 2023-09-28  7:32 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox Backup Server development discussion

On 9/27/23 17:41, Thomas Lamprecht wrote:

> [..]
> applied, thanks!
>
> IMO it might be worth doing this more centrally, e.g., catch an ENOENT in
> upid_read_status's `File::open(path)` call and return either
> `TaskState::Unknown { endtime: upid.starttime }`, which is also the
> default of upid_read_status on other (parsing) errors, or add a new
> `TaskState::NotFound` state to distinguish between an unknown result and
> this situation, and make it more likely that call sites handle this
> explicitly.
>
> What do you think?

That could make sense, because we quite often do:
```
upid_read_status(&info.upid).unwrap_or(TaskState::Unknown { endtime: now });
```
The problem is that we use a different endtime (mostly either 0 or 'now')
at every call site. So we would have to convert this to a match statement
that matches `TaskState::Unknown` and changes the endtime, AFAICT.
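
A minimal sketch of what such a match could look like, assuming the status
is already read centrally and comes back as `Unknown` with some default
endtime. The two-variant `TaskState` and the helper name
`with_unknown_endtime` are made up for this example and do not exist in
the code base.

```rust
/// Simplified stand-in for the real `TaskState`.
#[derive(Debug, PartialEq)]
enum TaskState {
    Ok { endtime: i64 },
    Unknown { endtime: i64 },
}

/// Hypothetical helper: lets a call site pick its own endtime (e.g. 0 or
/// 'now') for an `Unknown` state and leaves every other state untouched.
fn with_unknown_endtime(state: TaskState, endtime: i64) -> TaskState {
    match state {
        TaskState::Unknown { .. } => TaskState::Unknown { endtime },
        other => other,
    }
}
```

A call site that currently passes `endtime: now` to `unwrap_or` would then
call something like `with_unknown_endtime(state, now)` instead.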





