public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal
@ 2025-01-28 11:47 Christian Ebner
  2025-01-28 12:43 ` Gabriel Goller
  2025-02-10 10:47 ` [pbs-devel] applied: " Thomas Lamprecht
  0 siblings, 2 replies; 4+ messages in thread
From: Christian Ebner @ 2025-01-28 11:47 UTC (permalink / raw)
  To: pbs-devel

Since commit 8ea00f6e ("allow to abort verify jobs") errors
propagated up to the verify jobs worker call side are interpreted as
job aborts.

The manifest update did not honor this, leading to the verify job
being aborted with the misleading log entry:
`verification failed - job aborted`

Instead, handle the manifest update error non-fatal just like any
other verification related error, log it including the error message
and continue verification with the next item.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 src/backup/verify.rs | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index 840a37859..02478b165 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -3,7 +3,7 @@ use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::{Arc, Mutex};
 use std::time::Instant;
 
-use anyhow::{bail, format_err, Error};
+use anyhow::{bail, Error};
 use nix::dir::Dir;
 use tracing::{error, info, warn};
 
@@ -399,12 +399,20 @@ pub fn verify_backup_dir_with_lock(
         state: verify_result,
         upid,
     };
-    let verify_state = serde_json::to_value(verify_state)?;
-    backup_dir
-        .update_manifest(|manifest| {
+
+    if let Err(err) = {
+        let verify_state = serde_json::to_value(verify_state)?;
+        backup_dir.update_manifest(|manifest| {
             manifest.unprotected["verify_state"] = verify_state;
         })
-        .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
+    } {
+        info!(
+            "verify {}:{} - manifest update error: {err}",
+            verify_worker.datastore.name(),
+            backup_dir.dir(),
+        );
+        return Ok(false);
+    }
 
     Ok(error_count == 0)
 }
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal
  2025-01-28 11:47 [pbs-devel] [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal Christian Ebner
@ 2025-01-28 12:43 ` Gabriel Goller
  2025-01-29  8:15   ` Christian Ebner
  2025-02-10 10:47 ` [pbs-devel] applied: " Thomas Lamprecht
  1 sibling, 1 reply; 4+ messages in thread
From: Gabriel Goller @ 2025-01-28 12:43 UTC (permalink / raw)
  To: Christian Ebner; +Cc: pbs-devel

On 28.01.2025 12:47, Christian Ebner wrote:
>Since commit 8ea00f6e ("allow to abort verify jobs") errors
>propagated up to the verify jobs worker call side are interpreted as
>job aborts.
>
>The manifest update did not honor this, leading to the verify job
>being aborted with the misleading log entry:
>`verification failed - job aborted`
>
>Instead, handle the manifest update error non-fatal just like any
>other verification related error, log it including the error message
>and continue verification with the next item.
>
>Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>---
> src/backup/verify.rs | 18 +++++++++++++-----
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
>diff --git a/src/backup/verify.rs b/src/backup/verify.rs
>index 840a37859..02478b165 100644
>--- a/src/backup/verify.rs
>+++ b/src/backup/verify.rs
>@@ -3,7 +3,7 @@ use std::sync::atomic::{AtomicUsize, Ordering};
> use std::sync::{Arc, Mutex};
> use std::time::Instant;
>
>-use anyhow::{bail, format_err, Error};
>+use anyhow::{bail, Error};
> use nix::dir::Dir;
> use tracing::{error, info, warn};
>
>@@ -399,12 +399,20 @@ pub fn verify_backup_dir_with_lock(
>         state: verify_result,
>         upid,
>     };
>-    let verify_state = serde_json::to_value(verify_state)?;
>-    backup_dir
>-        .update_manifest(|manifest| {
>+
>+    if let Err(err) = {
>+        let verify_state = serde_json::to_value(verify_state)?;
>+        backup_dir.update_manifest(|manifest| {
>             manifest.unprotected["verify_state"] = verify_state;
>         })
>-        .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
>+    } {
>+        info!(
>+            "verify {}:{} - manifest update error: {err}",
>+            verify_worker.datastore.name(),
>+            backup_dir.dir(),
>+        );

Is there any reason for not using tracing::error? This would be nice to
find in the syslog as well. Also using "{err:#}" would print the whole
error chain/context.

>+        return Ok(false);
>+    }
>
>     Ok(error_count == 0)
> }


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal
  2025-01-28 12:43 ` Gabriel Goller
@ 2025-01-29  8:15   ` Christian Ebner
  0 siblings, 0 replies; 4+ messages in thread
From: Christian Ebner @ 2025-01-29  8:15 UTC (permalink / raw)
  To: Gabriel Goller; +Cc: pbs-devel

On 1/28/25 13:43, Gabriel Goller wrote:
> On 28.01.2025 12:47, Christian Ebner wrote:
>> Since commit 8ea00f6e ("allow to abort verify jobs") errors
>> propagated up to the verify jobs worker call side are interpreted as
>> job aborts.
>>
>> The manifest update did not honor this, leading to the verify job
>> being aborted with the misleading log entry:
>> `verification failed - job aborted`
>>
>> Instead, handle the manifest update error non-fatal just like any
>> other verification related error, log it including the error message
>> and continue verification with the next item.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> src/backup/verify.rs | 18 +++++++++++++-----
>> 1 file changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
>> index 840a37859..02478b165 100644
>> --- a/src/backup/verify.rs
>> +++ b/src/backup/verify.rs
>> @@ -3,7 +3,7 @@ use std::sync::atomic::{AtomicUsize, Ordering};
>> use std::sync::{Arc, Mutex};
>> use std::time::Instant;
>>
>> -use anyhow::{bail, format_err, Error};
>> +use anyhow::{bail, Error};
>> use nix::dir::Dir;
>> use tracing::{error, info, warn};
>>
>> @@ -399,12 +399,20 @@ pub fn verify_backup_dir_with_lock(
>>         state: verify_result,
>>         upid,
>>     };
>> -    let verify_state = serde_json::to_value(verify_state)?;
>> -    backup_dir
>> -        .update_manifest(|manifest| {
>> +
>> +    if let Err(err) = {
>> +        let verify_state = serde_json::to_value(verify_state)?;
>> +        backup_dir.update_manifest(|manifest| {
>>             manifest.unprotected["verify_state"] = verify_state;
>>         })
>> -        .map_err(|err| format_err!("unable to update manifest blob - 
>> {}", err))?;
>> +    } {
>> +        info!(
>> +            "verify {}:{} - manifest update error: {err}",
>> +            verify_worker.datastore.name(),
>> +            backup_dir.dir(),
>> +        );
> 
> Is there any reason for not using tracing::error? This would be nice to
> find in the syslog as well. Also using "{err:#}" would print the whole
> error chain/context.

The reason I used `info` over `error` here is that all other errors for 
verification are logged the same way, so I would argue that logging only 
this one case as error is not correct. If we however decide for using 
the `error` over the `info`, I think this should be adapted for all 
verification errors being logged, to be consistent.

Regarding error formatting, as the errors by update_manifest do not add 
error context, this has no effect currently as far as I see, but you are 
right in that this would be more future proof.

Should I send a v2 for that?


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [pbs-devel] applied: [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal
  2025-01-28 11:47 [pbs-devel] [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal Christian Ebner
  2025-01-28 12:43 ` Gabriel Goller
@ 2025-02-10 10:47 ` Thomas Lamprecht
  1 sibling, 0 replies; 4+ messages in thread
From: Thomas Lamprecht @ 2025-02-10 10:47 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Christian Ebner

Am 28.01.25 um 12:47 schrieb Christian Ebner:
> Since commit 8ea00f6e ("allow to abort verify jobs") errors
> propagated up to the verify jobs worker call side are interpreted as
> job aborts.
> 
> The manifest update did not honor this, leading to the verify job
> being aborted with the misleading log entry:
> `verification failed - job aborted`
> 
> Instead, handle the manifest update error non-fatal just like any
> other verification related error, log it including the error message
> and continue verification with the next item.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  src/backup/verify.rs | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
>

applied, thanks!

I kept your patch as is, keeping this consistently using info-level for now
is fine, should be changed for all log-sites, if wanted. The error context
would not hurt to use in general, but as it doesn't changes anything now
this also might be better suited in a dedicated patch switching more than
just this call-site over.


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-02-10 10:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-01-28 11:47 [pbs-devel] [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal Christian Ebner
2025-01-28 12:43 ` Gabriel Goller
2025-01-29  8:15   ` Christian Ebner
2025-02-10 10:47 ` [pbs-devel] applied: " Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal