public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Gabriel Goller <g.goller@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 1/3] fix #3786: api: add resync-corrupt option to sync jobs
Date: Thu, 21 Nov 2024 11:04:07 +0100	[thread overview]
Message-ID: <dhxpybs2vlloe4cfzs3n2dqvamyqklsjcydhuofrk5wb3kglkr@3nyavus7dspf> (raw)
In-Reply-To: <173210826421.198988.14774192201672116937@yuna.proxmox.com>

On 20.11.2024 14:11, Fabian Grünbichler wrote:
>a few small nits inline, looks good to me otherwise, but given the size of this
>and the size of the push series, I'd rather this be rebased on top of the other
>one ;)

Sure, shouldn't be a lot of work. Should I send a rebased version on top
of the current push series as a v4?

>Quoting Gabriel Goller (2024-11-05 11:40:13)
>> diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
>> index 414ec878d01a..e6174322dad6 100644
>> --- a/pbs-datastore/src/backup_info.rs
>> +++ b/pbs-datastore/src/backup_info.rs
>> @@ -8,7 +8,8 @@ use anyhow::{bail, format_err, Error};
>>  use proxmox_sys::fs::{lock_dir_noblock, replace_file, CreateOptions};
>>
>>  use pbs_api_types::{
>> -    Authid, BackupNamespace, BackupType, GroupFilter, BACKUP_DATE_REGEX, BACKUP_FILE_REGEX,
>> +    Authid, BackupNamespace, BackupType, GroupFilter, VerifyState, BACKUP_DATE_REGEX,
>> +    BACKUP_FILE_REGEX,
>>  };
>>  use pbs_config::{open_backup_lockfile, BackupLockGuard};
>>
>> @@ -583,6 +584,15 @@ impl BackupDir {
>>
>>          Ok(())
>>      }
>> +
>> +    /// Load the verify state from the manifest.
>> +    pub fn verify_state(&self) -> Option<VerifyState> {
>
>should this be a Result<Option<..>> to allow differentiation between no
>verification state, and failure to parse or load the manifest? that would allow
>us to resync totally corrupted snapshots as well (although that might be
>considered out of scope based on the parameter description ;))

Yep it was already like this in the first version, no idea why I changed
it. Like this we can return the load_manifest error with the Result and
swallow the inner error with a `ok()` as it doesn't matter anymore.

     pub fn verify_state(&self) -> Result<Option<VerifyState>, anyhow::Error> {
         let manifest = self.load_manifest()?;
         Ok(manifest.0.verify_state().ok().flatten().map(|svs| svs.state))
     }


I think we also want to resync on errors when reading the manifest, I'll
include that in the next version! Something like this maybe:

     match local_dir.verify_state() {
         Ok(Some(state)) => {
             if state == VerifyState::Failed {
                 return Some((dir, true));
             }
         },
         Ok(None) => {
             // This means there either was an error parsing the manifest, or the 
             // verify_state item was not found. This could be a new backup.
         }
         Err(_) => {
             // There was an error loading the manifest, probably better if we
             // resync.
             return Some((dir, true));
         }
     }

>> +        if let Ok(manifest) = self.load_manifest() {
>> +            manifest.0.verify_state()
>> +        } else {
>> +            None
>> +        }
>> +    }
>>  }
>>
>>  impl AsRef<pbs_api_types::BackupNamespace> for BackupDir {
>> diff --git a/pbs-datastore/src/manifest.rs b/pbs-datastore/src/manifest.rs
>> index c3df014272a0..623c1499c0bb 100644
>> --- a/pbs-datastore/src/manifest.rs
>> +++ b/pbs-datastore/src/manifest.rs
>> @@ -5,7 +5,7 @@ use anyhow::{bail, format_err, Error};
>>  use serde::{Deserialize, Serialize};
>>  use serde_json::{json, Value};
>>
>> -use pbs_api_types::{BackupType, CryptMode, Fingerprint};
>> +use pbs_api_types::{BackupType, CryptMode, Fingerprint, SnapshotVerifyState, VerifyState};
>>  use pbs_tools::crypt_config::CryptConfig;
>>
>>  pub const MANIFEST_BLOB_NAME: &str = "index.json.blob";
>> @@ -242,6 +242,17 @@ impl BackupManifest {
>>          let manifest: BackupManifest = serde_json::from_value(json)?;
>>          Ok(manifest)
>>      }
>> +
>> +    /// Get the verify state of the snapshot
>> +    ///
>> +    /// Note: New snapshots, which have not been verified yet, do not have a status and this
>> +    /// function will return `None`.
>> +    pub fn verify_state(&self) -> Option<VerifyState> {
>
>should this be a Result<Option<..>> to allow differentiation between no
>verification state, and failure to parse?

Hmm so I could return a Result<Option<..>> by checking the error of the
serde_json::from_value call. I could check if the "verify_state" value
wasn't found in the manifest by calling `is_eof` [0] and if not, return
a Ok(None), otherwise return an Error. This will make it more
complicated for all the callers though – also 99% of the callers will
treat Err the same as Ok(None) anyways. LTMK what you think!

     /// Get the verify state of the snapshot
     ///
     /// Note: New snapshots, which have not been verified yet, do not have a status and this
     /// function will return `Ok(None)`.
     pub fn verify_state(&self) -> Result<Option<SnapshotVerifyState>, anyhow::Error> {
         let verify = self.unprotected["verify_state"].clone();
         match serde_json::from_value::<SnapshotVerifyState>(verify) {
             Err(err) => {
                 // `verify_state` item has not been found
                 if err.is_eof() {
                     Ok(None)
                 }else {
                     Err(err.into())
                 }
             },
             Ok(svs) => {
                 Ok(Some(svs))
             }
         }
     }


Else I could just return a Result<SnapshotVerifyState>.

[0]: https://docs.rs/serde_json/latest/serde_json/struct.Error.html#method.is_eof

>also, it would be great if existing code retrieving this could be adapted to
>use these new helpers, which would require having the Result there as well..

Yep, overlooked those, my bad.

>> @@ -381,7 +388,7 @@ async fn pull_snapshot<'a>(
>>          let mut path = snapshot.full_path();
>>          path.push(&item.filename);
>>
>> -        if path.exists() {
>> +        if !corrupt && path.exists() {
>>              match ArchiveType::from_path(&item.filename)? {
>>                  ArchiveType::DynamicIndex => {
>>                      let index = DynamicIndexReader::open(&path)?;
>> @@ -443,6 +450,7 @@ async fn pull_snapshot_from<'a>(
>>      reader: Arc<dyn SyncSourceReader + 'a>,
>>      snapshot: &'a pbs_datastore::BackupDir,
>>      downloaded_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
>> +    corrupt: bool,
>>  ) -> Result<SyncStats, Error> {
>>      let (_path, is_new, _snap_lock) = snapshot
>>          .datastore()
>> @@ -451,7 +459,7 @@ async fn pull_snapshot_from<'a>(
>>      let sync_stats = if is_new {
>
>is_new and corrupt are never both true..
>
>>          info!("sync snapshot {}", snapshot.dir());
>>
>> -        match pull_snapshot(reader, snapshot, downloaded_chunks).await {
>> +        match pull_snapshot(reader, snapshot, downloaded_chunks, corrupt).await {
>
>so this should be always false ;)

Agree, wrote a comment and passed directly `false`.

>>              Err(err) => {
>>                  if let Err(cleanup_err) = snapshot.datastore().remove_backup_dir(
>>                      snapshot.backup_ns(),
>> @@ -468,8 +476,15 @@ async fn pull_snapshot_from<'a>(
>>              }
>>          }
>>      } else {
>> -        info!("re-sync snapshot {}", snapshot.dir());
>> -        pull_snapshot(reader, snapshot, downloaded_chunks).await?
>> +        if corrupt {
>> +            info!(
>> +                "re-sync snapshot {} due to bad verification result",
>
>nit: why not call it "corrupt", since that is what the parameter is called?

ack

>> +                snapshot.dir()
>> +            );
>> +        } else {
>> +            info!("re-sync snapshot {}", snapshot.dir());
>> +        }
>> +        pull_snapshot(reader, snapshot, downloaded_chunks, corrupt).await?
>>      };
>>
>>      Ok(sync_stats)


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

  reply	other threads:[~2024-11-21 10:04 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-05 10:40 [pbs-devel] [PATCH proxmox-backup v3 0/3] fix #3786: resync corrupt chunks in sync-job Gabriel Goller
2024-11-05 10:40 ` [pbs-devel] [PATCH proxmox-backup v3 1/3] fix #3786: api: add resync-corrupt option to sync jobs Gabriel Goller
2024-11-20 13:11   ` Fabian Grünbichler
2024-11-21 10:04     ` Gabriel Goller [this message]
2024-11-21 10:09       ` Fabian Grünbichler
2024-11-21 10:30         ` Gabriel Goller
2024-11-05 10:40 ` [pbs-devel] [PATCH proxmox-backup v3 2/3] fix #3786: ui/cli: add resync-corrupt option on sync-jobs Gabriel Goller
2024-11-20 13:12   ` Fabian Grünbichler
2024-11-21 10:18     ` Gabriel Goller
2024-11-05 10:40 ` [pbs-devel] [PATCH proxmox-backup v3 3/3] fix #3786: docs: add resync-corrupt option to sync-job Gabriel Goller
2024-11-21 13:34 ` [pbs-devel] [PATCH proxmox-backup v3 0/3] fix #3786: resync corrupt chunks in sync-job Gabriel Goller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dhxpybs2vlloe4cfzs3n2dqvamyqklsjcydhuofrk5wb3kglkr@3nyavus7dspf \
    --to=g.goller@proxmox.com \
    --cc=f.gruenbichler@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal