public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling
@ 2021-07-22 13:40 Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 1/7] tape: media_catalog: improve chunk_archive interface Dominik Csapak
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:40 UTC (permalink / raw)
  To: pbs-devel

this series combines my previous catalog-related patch series [0][1][2]

it changes the catalog interface to be more concise, optimizes catalog
commit calls during restore, and implements a fast catalog for the
GUI that contains only the snapshot list

changes from v1:
* only write snapshot list in new 'finish' method of the catalog
* add 'finish' also to pool writer
* replace the pending offset counter by reducing the chunk_archive
  interface of the catalog

0: https://lists.proxmox.com/pipermail/pbs-devel/2021-July/003711.html
1: https://lists.proxmox.com/pipermail/pbs-devel/2021-July/003715.html
2: https://lists.proxmox.com/pipermail/pbs-devel/2021-July/003714.html

Dominik Csapak (7):
  tape: media_catalog: improve chunk_archive interface
  tape: media_catalog: add fast_catalog beside normal catalog
  tape: pool_writer: finish the catalog when its done
  tape: media_catalog: add local type aliases to make code more clear
  api2: tape/backup: commit pool_writer even on error
  api2: tape/restore: finish temporary catalog at the end
  api2: tape: media: use MediaCatalog::snapshot_list for content listing

 src/api2/tape/backup.rs             | 117 +++++++++----------
 src/api2/tape/media.rs              |  44 ++++----
 src/api2/tape/restore.rs            |  11 +-
 src/tape/media_catalog.rs           | 167 +++++++++++++++++++++++++---
 src/tape/pool_writer/catalog_set.rs |   9 +-
 src/tape/pool_writer/mod.rs         |  15 ++-
 6 files changed, 250 insertions(+), 113 deletions(-)

-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v2 1/7] tape: media_catalog: improve chunk_archive interface
  2021-07-22 13:40 [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling Dominik Csapak
@ 2021-07-22 13:41 ` Dominik Csapak
  2021-07-26  8:21   ` [pbs-devel] applied: " Dietmar Maurer
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 2/7] tape: media_catalog: add fast_catalog beside normal catalog Dominik Csapak
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:41 UTC (permalink / raw)
  To: pbs-devel

instead of having public start/end_chunk_archive and register_chunk
methods, simply expose a single 'register_chunk_archive' method, since
we always have a complete list of chunks wherever we want to add them
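
a condensed sketch of what this looks like at a call site (assuming
'chunks' is the Vec<[u8; 32]> collected while reading the archive):

    // previously: three calls that had to happen in a fixed order
    //   catalog.start_chunk_archive(uuid, file_number, store)?;
    //   for digest in &chunks { catalog.register_chunk(digest)?; }
    //   catalog.end_chunk_archive()?;

    // now: a single call, the start/register/end sequence stays internal
    catalog.register_chunk_archive(uuid, file_number, store, &chunks[..])?;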

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/api2/tape/restore.rs            |  7 ++-----
 src/tape/media_catalog.rs           | 22 +++++++++++++++++++---
 src/tape/pool_writer/catalog_set.rs |  6 +-----
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/src/api2/tape/restore.rs b/src/api2/tape/restore.rs
index f959a856..1b0bc03f 100644
--- a/src/api2/tape/restore.rs
+++ b/src/api2/tape/restore.rs
@@ -1121,16 +1121,13 @@ fn restore_archive<'a>(
                 };
 
                 if let Some(chunks) = chunks {
-                    catalog.start_chunk_archive(
+                    catalog.register_chunk_archive(
                         Uuid::from(header.uuid),
                         current_file_number,
                         &source_datastore,
+                        &chunks[..],
                     )?;
-                    for digest in chunks.iter() {
-                        catalog.register_chunk(&digest)?;
-                    }
                     task_log!(worker, "register {} chunks", chunks.len());
-                    catalog.end_chunk_archive()?;
                     catalog.commit_if_large()?;
                 }
                 return Ok(());
diff --git a/src/tape/media_catalog.rs b/src/tape/media_catalog.rs
index 65b52a42..086c8a7d 100644
--- a/src/tape/media_catalog.rs
+++ b/src/tape/media_catalog.rs
@@ -503,10 +503,26 @@ impl MediaCatalog {
         Ok(())
     }
 
+    /// Register a chunk archive
+    pub fn register_chunk_archive(
+        &mut self,
+        uuid: Uuid, // Uuid form MediaContentHeader
+        file_number: u64,
+        store: &str,
+        chunk_list: &[[u8; 32]],
+    ) -> Result<(), Error> {
+        self.start_chunk_archive(uuid, file_number, store)?;
+        for digest in chunk_list {
+            self.register_chunk(digest)?;
+        }
+        self.end_chunk_archive()?;
+        Ok(())
+    }
+
     /// Register a chunk
     ///
     /// Only valid after start_chunk_archive.
-    pub fn register_chunk(
+    fn register_chunk(
         &mut self,
         digest: &[u8;32],
     ) -> Result<(), Error> {
@@ -557,7 +573,7 @@ impl MediaCatalog {
     }
 
     /// Start a chunk archive section
-    pub fn start_chunk_archive(
+    fn start_chunk_archive(
         &mut self,
         uuid: Uuid, // Uuid form MediaContentHeader
         file_number: u64,
@@ -606,7 +622,7 @@ impl MediaCatalog {
     }
 
     /// End a chunk archive section
-    pub fn end_chunk_archive(&mut self) -> Result<(), Error> {
+    fn end_chunk_archive(&mut self) -> Result<(), Error> {
 
         match self.current_archive.take() {
             None => bail!("end_chunk_archive failed: not started"),
diff --git a/src/tape/pool_writer/catalog_set.rs b/src/tape/pool_writer/catalog_set.rs
index fbca3e97..d5903413 100644
--- a/src/tape/pool_writer/catalog_set.rs
+++ b/src/tape/pool_writer/catalog_set.rs
@@ -97,11 +97,7 @@ impl CatalogSet {
     ) -> Result<(), Error> {
         match self.catalog {
             Some(ref mut catalog) => {
-                catalog.start_chunk_archive(uuid, file_number, store)?;
-                for digest in chunk_list {
-                    catalog.register_chunk(digest)?;
-                }
-                catalog.end_chunk_archive()?;
+                catalog.register_chunk_archive(uuid, file_number, store, chunk_list)?;
             }
             None => bail!("no catalog loaded - internal error"),
         }
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v2 2/7] tape: media_catalog: add fast_catalog beside normal catalog
  2021-07-22 13:40 [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 1/7] tape: media_catalog: improve chunk_archive interface Dominik Csapak
@ 2021-07-22 13:41 ` Dominik Csapak
  2021-07-27  6:53   ` Dietmar Maurer
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 3/7] tape: pool_writer: finish the catalog when its done Dominik Csapak
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:41 UTC (permalink / raw)
  To: pbs-devel

for some parts of the UI, we only need the snapshot list from the catalog,
and reading the whole catalog (which can be several hundred MiB) is not
really necessary.

Instead, on every commit of the catalog, write the complete content list
into a separate .index file that can be read to get only the snapshot
list.
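
the .index file is plain text with one '<store>:<snapshot>' entry per
line, split at the first ':'. A rough sketch of a reader (essentially the
load_fast_catalog helper added below):

    use std::fs::File;
    use std::io::{BufRead, BufReader};
    use std::path::Path;

    use anyhow::{format_err, Error};

    // read "<store>:<snapshot>" pairs from the fast catalog index
    fn read_index(path: &Path) -> Result<Vec<(String, String)>, Error> {
        let mut list = Vec::new();
        for line in BufReader::new(File::open(path)?).lines() {
            let mut line = line?;
            let idx = line
                .find(':')
                .ok_or_else(|| format_err!("invalid line format (no store found)"))?;
            let snapshot = line.split_off(idx + 1);
            line.truncate(idx); // drop the trailing ':'
            list.push((line, snapshot));
        }
        Ok(list)
    }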

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/tape/media_catalog.rs | 114 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 112 insertions(+), 2 deletions(-)

diff --git a/src/tape/media_catalog.rs b/src/tape/media_catalog.rs
index 086c8a7d..f7fcdd0a 100644
--- a/src/tape/media_catalog.rs
+++ b/src/tape/media_catalog.rs
@@ -1,8 +1,8 @@
 use std::convert::TryFrom;
 use std::fs::File;
-use std::io::{Write, Read, BufReader, Seek, SeekFrom};
+use std::io::{Write, Read, BufRead, BufReader, Seek, SeekFrom};
 use std::os::unix::io::AsRawFd;
-use std::path::Path;
+use std::path::{Path, PathBuf};
 use std::collections::{HashSet, HashMap};
 
 use anyhow::{bail, format_err, Error};
@@ -53,6 +53,7 @@ impl DatastoreContent {
 ///
 /// We use a simple binary format to store data on disk.
 pub struct MediaCatalog  {
+    base_path: PathBuf,
 
     uuid: Uuid, // BackupMedia uuid
 
@@ -108,7 +109,11 @@ impl MediaCatalog {
 
         let mut path = base_path.to_owned();
         path.push(uuid.to_string());
+        let mut fast_catalog = path.clone();
         path.set_extension("log");
+        fast_catalog.set_extension("index");
+
+        let _ = std::fs::remove_file(fast_catalog); // ignore errors
 
         match std::fs::remove_file(path) {
             Ok(()) => Ok(()),
@@ -217,6 +222,7 @@ impl MediaCatalog {
                 .map_err(|err| format_err!("fchown failed - {}", err))?;
 
             let mut me = Self {
+                base_path: base_path.to_path_buf(),
                 uuid: uuid.clone(),
                 file: None,
                 log_to_stdout: false,
@@ -294,6 +300,7 @@ impl MediaCatalog {
             let file = Self::create_temporary_database_file(base_path, uuid)?;
 
             let mut me = Self {
+                base_path: base_path.to_path_buf(),
                 uuid: uuid.clone(),
                 file: Some(file),
                 log_to_stdout: false,
@@ -360,6 +367,109 @@ impl MediaCatalog {
         &self.content
     }
 
+    fn load_fast_catalog(
+        file: &mut File,
+    ) -> Result<Vec<(String, String)>, Error> {
+        let mut list = Vec::new();
+        let file = BufReader::new(file);
+        for line in file.lines() {
+            let mut line = line?;
+
+            let idx = line
+                .find(':')
+                .ok_or_else(|| format_err!("invalid line format (no store found)"))?;
+
+            let snapshot = line.split_off(idx + 1);
+            line.truncate(idx);
+            list.push((line, snapshot));
+        }
+
+        Ok(list)
+    }
+
+    /// Returns a list of (store, snapshot) for a given MediaId
+    pub fn snapshot_list(
+        base_path: &Path,
+        media_id: &MediaId,
+    ) -> Result<Vec<(String, String)>, Error> {
+        let uuid = &media_id.label.uuid;
+
+        let mut path = base_path.to_owned();
+        path.push(uuid.to_string());
+        path.set_extension("index");
+
+
+        let list = proxmox::try_block!({
+
+            Self::create_basedir(base_path)?;
+
+            let mut file = match std::fs::OpenOptions::new().read(true).open(&path) {
+                Ok(file) => file,
+                Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
+                    // open normal catalog and write fast index
+                    let catalog = Self::open(base_path, media_id, false, false)?;
+                    catalog.write_snapshot_list()?;
+                    let mut list = Vec::new();
+                    for (store, content) in catalog.content() {
+                        for snapshot in content.snapshot_index.keys() {
+                            list.push((store.to_string(), snapshot.to_string()));
+                        }
+                    }
+                    return Ok(list);
+                },
+                Err(err) => bail!(err),
+            };
+
+            Self::load_fast_catalog(&mut file)
+        }).map_err(|err: Error| {
+            format_err!("unable to open fast media catalog {:?} - {}", path, err)
+        })?;
+
+        Ok(list)
+    }
+
+    // writes the full snapshot list into <uuid>.index
+    fn write_snapshot_list(&self) -> Result<(), Error> {
+        let mut data = String::new();
+
+        for (store, content) in self.content() {
+            for snapshot in content.snapshot_index.keys() {
+                data.push_str(store);
+                data.push_str(":");
+                data.push_str(snapshot);
+                data.push_str("\n");
+            }
+        }
+
+        let mut path = self.base_path.clone();
+        path.push(self.uuid.to_string());
+        path.set_extension("index");
+
+        let backup_user = crate::backup::backup_user()?;
+        let options = if cfg!(test) {
+            // cannot chown the file in the test environment
+            CreateOptions::new()
+        } else {
+            CreateOptions::new().owner(backup_user.uid).group(backup_user.gid)
+        };
+
+        proxmox::tools::fs::replace_file(
+            path,
+            data.as_bytes(),
+            options,
+        )
+    }
+
+    /// Finishes the catalog by writing all pending data and a fast
+    /// index with only the snapshot list
+    pub fn finish(&mut self) -> Result<(), Error> {
+        self.commit()?;
+        self.write_snapshot_list().map_err(|err| {
+            format_err!("could not write fast catalog: {}", err)
+        })?;
+        Ok(())
+    }
+
     /// Commit pending changes
     ///
     /// This is necessary to store changes persistently.
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v2 3/7] tape: pool_writer: finish the catalog when its done
  2021-07-22 13:40 [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 1/7] tape: media_catalog: improve chunk_archive interface Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 2/7] tape: media_catalog: add fast_catalog beside normal catalog Dominik Csapak
@ 2021-07-22 13:41 ` Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 4/7] tape: media_catalog: add local type aliases to make code more clear Dominik Csapak
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:41 UTC (permalink / raw)
  To: pbs-devel

by adding a 'finish' method to the pool writer itself, which must
be called before it is dropped,

and by finishing the catalog when it gets moved into the read-only
section of the catalog set
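
a condensed sketch of the two call sites changed below:

    // in backup_worker: instead of a plain commit at the end
    pool_writer.finish()?;

    // in CatalogSet::append_catalog: finish the old catalog before it
    // becomes part of the read-only set
    if let Some(mut catalog) = self.catalog.take() {
        catalog.finish()?;
        self.media_set_catalog.append_catalog(catalog)?;
    }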

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/api2/tape/backup.rs             |  2 +-
 src/tape/pool_writer/catalog_set.rs |  3 ++-
 src/tape/pool_writer/mod.rs         | 15 +++++++++++++--
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/api2/tape/backup.rs b/src/api2/tape/backup.rs
index 8119482f..45ee4bad 100644
--- a/src/api2/tape/backup.rs
+++ b/src/api2/tape/backup.rs
@@ -520,7 +520,7 @@ fn backup_worker(
         }
     }
 
-    pool_writer.commit()?;
+    pool_writer.finish()?;
 
     if need_catalog {
         task_log!(worker, "append media catalog");
diff --git a/src/tape/pool_writer/catalog_set.rs b/src/tape/pool_writer/catalog_set.rs
index d5903413..da39b659 100644
--- a/src/tape/pool_writer/catalog_set.rs
+++ b/src/tape/pool_writer/catalog_set.rs
@@ -58,7 +58,8 @@ impl CatalogSet {
     pub fn append_catalog(&mut self, new_catalog: MediaCatalog) -> Result<(), Error> {
 
         // append current catalog to read-only set
-        if let Some(catalog) = self.catalog.take() {
+        if let Some(mut catalog) = self.catalog.take() {
+            catalog.finish()?;
             self.media_set_catalog.append_catalog(catalog)?;
         }
 
diff --git a/src/tape/pool_writer/mod.rs b/src/tape/pool_writer/mod.rs
index 6f887c60..dc908ce5 100644
--- a/src/tape/pool_writer/mod.rs
+++ b/src/tape/pool_writer/mod.rs
@@ -186,8 +186,7 @@ impl PoolWriter {
 
     /// commit changes to tape and catalog
     ///
-    /// This is done automatically during a backupsession, but needs to
-    /// be called explicitly before dropping the PoolWriter
+    /// This is done automatically during a backupsession
     pub fn commit(&mut self) -> Result<(), Error> {
          if let Some(PoolWriterState {ref mut drive, .. }) = self.status {
             drive.sync()?; // sync all data to the tape
@@ -196,6 +195,18 @@ impl PoolWriter {
         Ok(())
     }
 
+    /// commit changes to tape and finished catalog
+    ///
+    /// this needs be called before the PoolWriter is dropped
+    pub fn finish(&mut self) -> Result<(), Error> {
+        self.commit()?;
+        let mut catalog_set = self.catalog_set.lock().unwrap();
+        if let Some(ref mut catalog) = catalog_set.catalog {
+            catalog.finish()?;
+        }
+        Ok(())
+    }
+
     /// Load a writable media into the drive
     pub fn load_writable_media(&mut self, worker: &WorkerTask) -> Result<Uuid, Error> {
         let last_media_uuid = match self.status {
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v2 4/7] tape: media_catalog: add local type aliases to make code more clear
  2021-07-22 13:40 [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling Dominik Csapak
                   ` (2 preceding siblings ...)
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 3/7] tape: pool_writer: finish the catalog when its done Dominik Csapak
@ 2021-07-22 13:41 ` Dominik Csapak
  2021-07-27  7:10   ` Dietmar Maurer
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 5/7] api2: tape/backup: commit pool_writer even on error Dominik Csapak
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:41 UTC (permalink / raw)
  To: pbs-devel

by adding some type aliases like 'type Store = String',
the more complex types/return values are easier to read.

For example
HashMap<String, u64>

turns into:
HashMap<Snapshot, FileNr>

since those types are not public, the generated cargo docs, do not
contain them

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/tape/media_catalog.rs | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/src/tape/media_catalog.rs b/src/tape/media_catalog.rs
index f7fcdd0a..296d1d2f 100644
--- a/src/tape/media_catalog.rs
+++ b/src/tape/media_catalog.rs
@@ -31,9 +31,14 @@ use crate::{
     },
 };
 
+type Store = String;
+type Snapshot = String;
+type FileNr = u64;
+type Chunk = [u8; 32];
+
 pub struct DatastoreContent {
-    pub snapshot_index: HashMap<String, u64>, // snapshot => file_nr
-    pub chunk_index: HashMap<[u8;32], u64>, // chunk => file_nr
+    pub snapshot_index: HashMap<Snapshot, FileNr>,
+    pub chunk_index: HashMap<Chunk, FileNr>,
 }
 
 impl DatastoreContent {
@@ -61,11 +66,11 @@ pub struct MediaCatalog  {
 
     log_to_stdout: bool,
 
-    current_archive: Option<(Uuid, u64, String)>, // (uuid, file_nr, store)
+    current_archive: Option<(Uuid, FileNr, Store)>,
 
-    last_entry: Option<(Uuid, u64)>,
+    last_entry: Option<(Uuid, FileNr)>,
 
-    content: HashMap<String, DatastoreContent>,
+    content: HashMap<Store, DatastoreContent>,
 
     pending: Vec<u8>,
 }
@@ -363,13 +368,13 @@ impl MediaCatalog {
     }
 
     /// Accessor to content list
-    pub fn content(&self) -> &HashMap<String, DatastoreContent> {
+    pub fn content(&self) -> &HashMap<Store, DatastoreContent> {
         &self.content
     }
 
     fn load_fast_catalog(
         file: &mut File,
-    ) -> Result<Vec<(String, String)>, Error> {
+    ) -> Result<Vec<(Store, Snapshot)>, Error> {
         let mut list = Vec::new();
         let file = BufReader::new(file);
         for line in file.lines() {
@@ -391,7 +396,7 @@ impl MediaCatalog {
     pub fn snapshot_list(
         base_path: &Path,
         media_id: &MediaId,
-    ) -> Result<Vec<(String, String)>, Error> {
+    ) -> Result<Vec<(Store, Snapshot)>, Error> {
         let uuid = &media_id.label.uuid;
 
         let mut path = base_path.to_owned();
@@ -531,7 +536,7 @@ impl MediaCatalog {
     }
 
     /// Returns the snapshot archive file number
-    pub fn lookup_snapshot(&self, store: &str, snapshot: &str) -> Option<u64> {
+    pub fn lookup_snapshot(&self, store: &str, snapshot: &str) -> Option<FileNr> {
         match self.content.get(store) {
             None => None,
             Some(content) => content.snapshot_index.get(snapshot).copied(),
@@ -539,7 +544,7 @@ impl MediaCatalog {
     }
 
     /// Test if the catalog already contain a chunk
-    pub fn contains_chunk(&self, store: &str, digest: &[u8;32]) -> bool {
+    pub fn contains_chunk(&self, store: &str, digest: &Chunk) -> bool {
         match self.content.get(store) {
             None => false,
             Some(content) => content.chunk_index.contains_key(digest),
@@ -547,7 +552,7 @@ impl MediaCatalog {
     }
 
     /// Returns the chunk archive file number
-    pub fn lookup_chunk(&self, store: &str, digest: &[u8;32]) -> Option<u64> {
+    pub fn lookup_chunk(&self, store: &str, digest: &Chunk) -> Option<FileNr> {
         match self.content.get(store) {
             None => None,
             Some(content) => content.chunk_index.get(digest).copied(),
@@ -634,7 +639,7 @@ impl MediaCatalog {
     /// Only valid after start_chunk_archive.
     fn register_chunk(
         &mut self,
-        digest: &[u8;32],
+        digest: &Chunk,
     ) -> Result<(), Error> {
 
         let (file_number, store) = match self.current_archive {
@@ -1052,7 +1057,7 @@ impl MediaSetCatalog {
     }
 
     /// Returns the media uuid and snapshot archive file number
-    pub fn lookup_snapshot(&self, store: &str, snapshot: &str) -> Option<(&Uuid, u64)> {
+    pub fn lookup_snapshot(&self, store: &str, snapshot: &str) -> Option<(&Uuid, FileNr)> {
         for (uuid, catalog) in self.catalog_list.iter() {
             if let Some(nr) = catalog.lookup_snapshot(store, snapshot) {
                 return Some((uuid, nr));
@@ -1062,7 +1067,7 @@ impl MediaSetCatalog {
     }
 
     /// Test if the catalog already contain a chunk
-    pub fn contains_chunk(&self, store: &str, digest: &[u8;32]) -> bool {
+    pub fn contains_chunk(&self, store: &str, digest: &Chunk) -> bool {
         for catalog in self.catalog_list.values() {
             if catalog.contains_chunk(store, digest) {
                 return true;
@@ -1072,7 +1077,7 @@ impl MediaSetCatalog {
     }
 
     /// Returns the media uuid and chunk archive file number
-    pub fn lookup_chunk(&self, store: &str, digest: &[u8;32]) -> Option<(&Uuid, u64)> {
+    pub fn lookup_chunk(&self, store: &str, digest: &Chunk) -> Option<(&Uuid, FileNr)> {
         for (uuid, catalog) in self.catalog_list.iter() {
             if let Some(nr) = catalog.lookup_chunk(store, digest) {
                 return Some((uuid, nr));
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v2 5/7] api2: tape/backup: commit pool_writer even on error
  2021-07-22 13:40 [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling Dominik Csapak
                   ` (3 preceding siblings ...)
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 4/7] tape: media_catalog: add local type aliases to make code more clear Dominik Csapak
@ 2021-07-22 13:41 ` Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 6/7] api2: tape/restore: finish temporary catalog at the end Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 7/7] api2: tape: media: use MediaCatalog::snapshot_list for content listing Dominik Csapak
  6 siblings, 0 replies; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:41 UTC (permalink / raw)
  To: pbs-devel

this way we store all finished snapshots/chunk archives in the
catalog, and not only those written up to the last commit
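
the control flow boils down to the following pattern (a simplified
sketch; the actual loop body is unchanged and only re-indented in the
diff below):

    let res: Result<(), Error> = proxmox::try_block!({
        for group in group_list {
            // back up the group; may fail or skip snapshots
        }
        Ok(())
    });

    // always sync to tape and finish the catalog, even if the loop failed
    pool_writer.finish()?;

    let _ = res?; // only now bubble a loop error up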

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/api2/tape/backup.rs | 115 +++++++++++++++++++++-------------------
 1 file changed, 60 insertions(+), 55 deletions(-)

diff --git a/src/api2/tape/backup.rs b/src/api2/tape/backup.rs
index 45ee4bad..c78b697b 100644
--- a/src/api2/tape/backup.rs
+++ b/src/api2/tape/backup.rs
@@ -456,72 +456,77 @@ fn backup_worker(
 
     let mut need_catalog = false; // avoid writing catalog for empty jobs
 
-    for (group_number, group) in group_list.into_iter().enumerate() {
-        progress.done_groups = group_number as u64;
-        progress.done_snapshots = 0;
-        progress.group_snapshots = 0;
-
-        let snapshot_list = group.list_backups(&datastore.base_path())?;
-
-        // filter out unfinished backups
-        let mut snapshot_list = snapshot_list
-            .into_iter()
-            .filter(|item| item.is_finished())
-            .collect();
-
-        BackupInfo::sort_list(&mut snapshot_list, true); // oldest first
-
-        if latest_only {
-            progress.group_snapshots = 1;
-            if let Some(info) = snapshot_list.pop() {
-                if pool_writer.contains_snapshot(datastore_name, &info.backup_dir.to_string()) {
-                    task_log!(worker, "skip snapshot {}", info.backup_dir);
-                    continue;
-                }
+    let res: Result<(), Error> = proxmox::try_block!({
+        for (group_number, group) in group_list.into_iter().enumerate() {
+            progress.done_groups = group_number as u64;
+            progress.done_snapshots = 0;
+            progress.group_snapshots = 0;
+
+            let snapshot_list = group.list_backups(&datastore.base_path())?;
+
+            // filter out unfinished backups
+            let mut snapshot_list = snapshot_list
+                .into_iter()
+                .filter(|item| item.is_finished())
+                .collect();
+
+            BackupInfo::sort_list(&mut snapshot_list, true); // oldest first
+
+            if latest_only {
+                progress.group_snapshots = 1;
+                if let Some(info) = snapshot_list.pop() {
+                    if pool_writer.contains_snapshot(datastore_name, &info.backup_dir.to_string()) {
+                        task_log!(worker, "skip snapshot {}", info.backup_dir);
+                        continue;
+                    }
 
-                need_catalog = true;
+                    need_catalog = true;
 
-                let snapshot_name = info.backup_dir.to_string();
-                if !backup_snapshot(worker, &mut pool_writer, datastore.clone(), info.backup_dir)? {
-                    errors = true;
-                } else {
-                    summary.snapshot_list.push(snapshot_name);
-                }
-                progress.done_snapshots = 1;
-                task_log!(
-                    worker,
-                    "percentage done: {}",
-                    progress
-                );
-            }
-        } else {
-            progress.group_snapshots = snapshot_list.len() as u64;
-            for (snapshot_number, info) in snapshot_list.into_iter().enumerate() {
-                if pool_writer.contains_snapshot(datastore_name, &info.backup_dir.to_string()) {
-                    task_log!(worker, "skip snapshot {}", info.backup_dir);
-                    continue;
+                    let snapshot_name = info.backup_dir.to_string();
+                    if !backup_snapshot(worker, &mut pool_writer, datastore.clone(), info.backup_dir)? {
+                        errors = true;
+                    } else {
+                        summary.snapshot_list.push(snapshot_name);
+                    }
+                    progress.done_snapshots = 1;
+                    task_log!(
+                        worker,
+                        "percentage done: {}",
+                        progress
+                    );
                 }
+            } else {
+                progress.group_snapshots = snapshot_list.len() as u64;
+                for (snapshot_number, info) in snapshot_list.into_iter().enumerate() {
+                    if pool_writer.contains_snapshot(datastore_name, &info.backup_dir.to_string()) {
+                        task_log!(worker, "skip snapshot {}", info.backup_dir);
+                        continue;
+                    }
 
-                need_catalog = true;
+                    need_catalog = true;
 
-                let snapshot_name = info.backup_dir.to_string();
-                if !backup_snapshot(worker, &mut pool_writer, datastore.clone(), info.backup_dir)? {
-                    errors = true;
-                } else {
-                    summary.snapshot_list.push(snapshot_name);
+                    let snapshot_name = info.backup_dir.to_string();
+                    if !backup_snapshot(worker, &mut pool_writer, datastore.clone(), info.backup_dir)? {
+                        errors = true;
+                    } else {
+                        summary.snapshot_list.push(snapshot_name);
+                    }
+                    progress.done_snapshots = snapshot_number as u64 + 1;
+                    task_log!(
+                        worker,
+                        "percentage done: {}",
+                        progress
+                    );
                 }
-                progress.done_snapshots = snapshot_number as u64 + 1;
-                task_log!(
-                    worker,
-                    "percentage done: {}",
-                    progress
-                );
             }
         }
-    }
+        Ok(())
+    });
 
     pool_writer.finish()?;
 
+    let _ = res?; // bubble errors up
+
     if need_catalog {
         task_log!(worker, "append media catalog");
 
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v2 6/7] api2: tape/restore: finish temporary catalog at the end
  2021-07-22 13:40 [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling Dominik Csapak
                   ` (4 preceding siblings ...)
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 5/7] api2: tape/backup: commit pool_writer even on error Dominik Csapak
@ 2021-07-22 13:41 ` Dominik Csapak
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 7/7] api2: tape: media: use MediaCatalog::snapshot_list for content listing Dominik Csapak
  6 siblings, 0 replies; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:41 UTC (permalink / raw)
  To: pbs-devel

in 'restore_archive', we reach that 'catalog.commit()' for
* every skipped snapshot (we already call 'commit_if_large' before that)
* every skipped chunk archive (no change in catalog since we do not read
  the chunk archive in that case)
* after reading a catalog (no change in catalog)

in all other cases, we call 'commit_if_large' and return early,
meaning the 'commit' there was executed too often and unnecessarily.
So replace it with a single 'finish' call after the loop over the files,
before the call to 'finish_temporary_database' (which only renames
the file into place)
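
the resulting flow in restore_media, sketched (the loop itself is
unchanged):

    // for every file on the tape:
    //     restore_archive(...)?;   // only calls commit_if_large() now

    // after the loop: one final commit plus the fast .index file,
    // instead of a commit() at the end of every restore_archive call
    catalog.finish()?;
    MediaCatalog::finish_temporary_database(status_path, &media_id.label.uuid, true)?;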

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/api2/tape/restore.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/api2/tape/restore.rs b/src/api2/tape/restore.rs
index 1b0bc03f..806304aa 100644
--- a/src/api2/tape/restore.rs
+++ b/src/api2/tape/restore.rs
@@ -998,6 +998,8 @@ pub fn restore_media(
         restore_archive(worker.clone(), reader, current_file_number, target, &mut catalog, checked_chunks_map, verbose)?;
     }
 
+    catalog.finish()?;
+
     MediaCatalog::finish_temporary_database(status_path, &media_id.label.uuid, true)?;
 
     Ok(())
@@ -1150,8 +1152,6 @@ fn restore_archive<'a>(
          _ =>  bail!("unknown content magic {:?}", header.content_magic),
     }
 
-    catalog.commit()?;
-
     Ok(())
 }
 
-- 
2.30.2






* [pbs-devel] [PATCH proxmox-backup v2 7/7] api2: tape: media: use MediaCatalog::snapshot_list for content listing
  2021-07-22 13:40 [pbs-devel] [PATCH proxmox-backup v2 0/7] improve catalog handling Dominik Csapak
                   ` (5 preceding siblings ...)
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 6/7] api2: tape/restore: finish temporary catalog at the end Dominik Csapak
@ 2021-07-22 13:41 ` Dominik Csapak
  6 siblings, 0 replies; 11+ messages in thread
From: Dominik Csapak @ 2021-07-22 13:41 UTC (permalink / raw)
  To: pbs-devel

this should make the API call much faster, since it no longer reads
the whole catalog
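
a condensed sketch of the difference (full change in the diff below):

    // before: deserialize the complete catalog, including the chunk index
    //   let catalog = MediaCatalog::open(status_path, &media_id, false, false)?;
    //   for (store, content) in catalog.content() { ... }

    // after: only read the small .index file written by the catalog's finish()
    for (store, snapshot) in MediaCatalog::snapshot_list(status_path, &media_id)? {
        // build the MediaContentEntry from store/snapshot as before
    }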

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/api2/tape/media.rs | 44 +++++++++++++++++++-----------------------
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/src/api2/tape/media.rs b/src/api2/tape/media.rs
index 8351b2be..cc2d896d 100644
--- a/src/api2/tape/media.rs
+++ b/src/api2/tape/media.rs
@@ -502,32 +502,28 @@ pub fn list_content(
             .generate_media_set_name(&set.uuid, template)
             .unwrap_or_else(|_| set.uuid.to_string());
 
-        let catalog = MediaCatalog::open(status_path, &media_id, false, false)?;
+        for (store, snapshot) in MediaCatalog::snapshot_list(status_path, &media_id)? {
+            let backup_dir: BackupDir = snapshot.parse()?;
 
-        for (store, content) in catalog.content() {
-            for snapshot in content.snapshot_index.keys() {
-                let backup_dir: BackupDir = snapshot.parse()?;
-
-                if let Some(ref backup_type) = filter.backup_type {
-                    if backup_dir.group().backup_type() != backup_type { continue; }
-                }
-                if let Some(ref backup_id) = filter.backup_id {
-                    if backup_dir.group().backup_id() != backup_id { continue; }
-                }
-
-                list.push(MediaContentEntry {
-                    uuid: media_id.label.uuid.clone(),
-                    label_text: media_id.label.label_text.to_string(),
-                    pool: set.pool.clone(),
-                    media_set_name: media_set_name.clone(),
-                    media_set_uuid: set.uuid.clone(),
-                    media_set_ctime: set.ctime,
-                    seq_nr: set.seq_nr,
-                    snapshot: snapshot.to_owned(),
-                    store: store.to_owned(),
-                    backup_time: backup_dir.backup_time(),
-                });
+            if let Some(ref backup_type) = filter.backup_type {
+                if backup_dir.group().backup_type() != backup_type { continue; }
+            }
+            if let Some(ref backup_id) = filter.backup_id {
+                if backup_dir.group().backup_id() != backup_id { continue; }
             }
+
+            list.push(MediaContentEntry {
+                uuid: media_id.label.uuid.clone(),
+                label_text: media_id.label.label_text.to_string(),
+                pool: set.pool.clone(),
+                media_set_name: media_set_name.clone(),
+                media_set_uuid: set.uuid.clone(),
+                media_set_ctime: set.ctime,
+                seq_nr: set.seq_nr,
+                snapshot: snapshot.to_owned(),
+                store: store.to_owned(),
+                backup_time: backup_dir.backup_time(),
+            });
         }
     }
 
-- 
2.30.2






* [pbs-devel] applied: [PATCH proxmox-backup v2 1/7] tape: media_catalog: improve chunk_archive interface
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 1/7] tape: media_catalog: improve chunk_archive interface Dominik Csapak
@ 2021-07-26  8:21   ` Dietmar Maurer
  0 siblings, 0 replies; 11+ messages in thread
From: Dietmar Maurer @ 2021-07-26  8:21 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

applied

On 7/22/21 3:41 PM, Dominik Csapak wrote:
> instead of having public start/end_chunk_archive and register_chunk
> methods, simply expose a single 'register_chunk_archive' method, since
> we always have a complete list of chunks wherever we want to add them
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>   src/api2/tape/restore.rs            |  7 ++-----
>   src/tape/media_catalog.rs           | 22 +++++++++++++++++++---
>   src/tape/pool_writer/catalog_set.rs |  6 +-----
>   3 files changed, 22 insertions(+), 13 deletions(-)
>
> diff --git a/src/api2/tape/restore.rs b/src/api2/tape/restore.rs
> index f959a856..1b0bc03f 100644
> --- a/src/api2/tape/restore.rs
> +++ b/src/api2/tape/restore.rs
> @@ -1121,16 +1121,13 @@ fn restore_archive<'a>(
>                   };
>   
>                   if let Some(chunks) = chunks {
> -                    catalog.start_chunk_archive(
> +                    catalog.register_chunk_archive(
>                           Uuid::from(header.uuid),
>                           current_file_number,
>                           &source_datastore,
> +                        &chunks[..],
>                       )?;
> -                    for digest in chunks.iter() {
> -                        catalog.register_chunk(&digest)?;
> -                    }
>                       task_log!(worker, "register {} chunks", chunks.len());
> -                    catalog.end_chunk_archive()?;
>                       catalog.commit_if_large()?;
>                   }
>                   return Ok(());
> diff --git a/src/tape/media_catalog.rs b/src/tape/media_catalog.rs
> index 65b52a42..086c8a7d 100644
> --- a/src/tape/media_catalog.rs
> +++ b/src/tape/media_catalog.rs
> @@ -503,10 +503,26 @@ impl MediaCatalog {
>           Ok(())
>       }
>   
> +    /// Register a chunk archive
> +    pub fn register_chunk_archive(
> +        &mut self,
> +        uuid: Uuid, // Uuid form MediaContentHeader
> +        file_number: u64,
> +        store: &str,
> +        chunk_list: &[[u8; 32]],
> +    ) -> Result<(), Error> {
> +        self.start_chunk_archive(uuid, file_number, store)?;
> +        for digest in chunk_list {
> +            self.register_chunk(digest)?;
> +        }
> +        self.end_chunk_archive()?;
> +        Ok(())
> +    }
> +
>       /// Register a chunk
>       ///
>       /// Only valid after start_chunk_archive.
> -    pub fn register_chunk(
> +    fn register_chunk(
>           &mut self,
>           digest: &[u8;32],
>       ) -> Result<(), Error> {
> @@ -557,7 +573,7 @@ impl MediaCatalog {
>       }
>   
>       /// Start a chunk archive section
> -    pub fn start_chunk_archive(
> +    fn start_chunk_archive(
>           &mut self,
>           uuid: Uuid, // Uuid form MediaContentHeader
>           file_number: u64,
> @@ -606,7 +622,7 @@ impl MediaCatalog {
>       }
>   
>       /// End a chunk archive section
> -    pub fn end_chunk_archive(&mut self) -> Result<(), Error> {
> +    fn end_chunk_archive(&mut self) -> Result<(), Error> {
>   
>           match self.current_archive.take() {
>               None => bail!("end_chunk_archive failed: not started"),
> diff --git a/src/tape/pool_writer/catalog_set.rs b/src/tape/pool_writer/catalog_set.rs
> index fbca3e97..d5903413 100644
> --- a/src/tape/pool_writer/catalog_set.rs
> +++ b/src/tape/pool_writer/catalog_set.rs
> @@ -97,11 +97,7 @@ impl CatalogSet {
>       ) -> Result<(), Error> {
>           match self.catalog {
>               Some(ref mut catalog) => {
> -                catalog.start_chunk_archive(uuid, file_number, store)?;
> -                for digest in chunk_list {
> -                    catalog.register_chunk(digest)?;
> -                }
> -                catalog.end_chunk_archive()?;
> +                catalog.register_chunk_archive(uuid, file_number, store, chunk_list)?;
>               }
>               None => bail!("no catalog loaded - internal error"),
>           }





* Re: [pbs-devel] [PATCH proxmox-backup v2 2/7] tape: media_catalog: add fast_catalog beside normal catalog
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 2/7] tape: media_catalog: add fast_catalog beside normal catalog Dominik Csapak
@ 2021-07-27  6:53   ` Dietmar Maurer
  0 siblings, 0 replies; 11+ messages in thread
From: Dietmar Maurer @ 2021-07-27  6:53 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

comments inline

On 7/22/21 3:41 PM, Dominik Csapak wrote:
> for some parts of the UI, we only need the snapshot list from the catalog,
> and reading the whole catalog (which can be several hundred MiB) is not
> really necessary.
>
> Instead, on every commit of the catalog, write the complete content list
> into a separate .index file that can be read to get only the snapshot
> list.

This description seems wrong to me? Also, what about 
destroy_unrelated_catalog()?

I guess that function should also remove the .index file?







* Re: [pbs-devel] [PATCH proxmox-backup v2 4/7] tape: media_catalog: add local type aliases to make code more clear
  2021-07-22 13:41 ` [pbs-devel] [PATCH proxmox-backup v2 4/7] tape: media_catalog: add local type aliases to make code more clear Dominik Csapak
@ 2021-07-27  7:10   ` Dietmar Maurer
  0 siblings, 0 replies; 11+ messages in thread
From: Dietmar Maurer @ 2021-07-27  7:10 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

comments below

On 7/22/21 3:41 PM, Dominik Csapak wrote:
> by adding some type aliases like 'type Store = String',
> the more complex types/return values are easier to read.
>
> For example
> HashMap<String, u64>
>
> turns into:
> HashMap<Snapshot, FileNr>
>
> since those types are not public, the generated cargo docs do not
> contain them
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>   src/tape/media_catalog.rs | 35 ++++++++++++++++++++---------------
>   1 file changed, 20 insertions(+), 15 deletions(-)
>
> diff --git a/src/tape/media_catalog.rs b/src/tape/media_catalog.rs
> index f7fcdd0a..296d1d2f 100644
> --- a/src/tape/media_catalog.rs
> +++ b/src/tape/media_catalog.rs
> @@ -31,9 +31,14 @@ use crate::{
>       },
>   };
>   
> +type Store = String;
> +type Snapshot = String;
> +type FileNr = u64;

I am not a big fan , but OK.

> +type Chunk = [u8; 32];
> +

Introducing a special type here makes no sense - we use &[u8;32] for chunk
digests everywhere, so we would want to replace that
everywhere (not only inside this file)







