* [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
This series of patches implements a metadata-based file change
detection mechanism that improves pxar file-level backup creation
speed for unchanged files.
The chosen approach is to split pxar archives on creation via the
proxmox-backup-client into two separate data and upload streams:
one exclusively for regular file payloads, the other for the rest
of the pxar archive, which is mostly metadata.
On consecutive runs, the metadata archive of the previous backup run,
which is limited in size and therefore quick to access, is used to
look up and compare the metadata of the entries to encode.
This assumes that the connection speed to the Proxmox Backup Server is
sufficiently fast, allowing the chunks of that index to be downloaded
and cached.
Changes to regular files are detected by comparing the file's full
metadata object, including mtime, ACLs, etc. If no changes are
detected, the previous payload index is used to look up chunks that
can be reused in the payload stream of the new archive.
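As a rough illustration of that comparison, here is a minimal sketch
using hypothetical types rather than the actual pxar metadata objects:
```rust
/// Simplified stand-in for a pxar entry's metadata (hypothetical type,
/// for illustration only).
#[derive(PartialEq)]
struct EntryMetadata {
    mtime_secs: i64,
    mtime_nanos: u32,
    mode: u32,
    uid: u32,
    gid: u32,
    acls: Vec<u8>,   // serialized ACL entries
    xattrs: Vec<u8>, // serialized extended attributes
}

/// An entry only qualifies for payload chunk reuse if its complete
/// metadata object matches the entry stored in the previous metadata
/// archive.
fn entry_unchanged(previous: Option<&EntryMetadata>, current: &EntryMetadata) -> bool {
    matches!(previous, Some(prev) if prev == current)
}
```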
In order to reduce possible chunk fragmentation, the decision whether
to reuse or reencode a file payload is deferred until enough
information has been gathered by adding entries to a look-ahead cache.
If the padding introduced by reusing chunks falls below a threshold,
the entries are referenced and the chunks are reused and injected into
the pxar payload upload stream; otherwise they are discarded and the
files are encoded regularly.
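Conceptually, this deferred decision can be sketched as follows
(hypothetical names and accounting; the actual logic lives in the
archiver's look-ahead cache):
```rust
/// Accounting for a chain of cached entries whose payload could be
/// served by reusing chunks of the previous backup (hypothetical type).
struct ReuseCandidate {
    /// Total size of the chunks that would be referenced and injected.
    total_chunk_size: u64,
    /// Bytes within those chunks actually covered by the cached entries.
    referenced_payload: u64,
}

impl ReuseCandidate {
    /// Reuse the chunks only if the padding, i.e. the unreferenced bytes
    /// dragged along by referencing whole chunks, stays below the given
    /// threshold ratio; otherwise the cached entries are reencoded.
    fn should_reuse(&self, padding_threshold: f64) -> bool {
        let padding = self.total_chunk_size.saturating_sub(self.referenced_payload);
        (padding as f64) < padding_threshold * (self.total_chunk_size as f64)
    }
}
```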
Note:
Patches up to patch 36 only compile without the pxar-side patches
be5d68aa8a3d5848e0fbbf651834514a35ed6dd8 and
0983094c87345284ee56ae9eeab47be8375cd730; patch 36 and following,
however, require them.
The following lists the most notable changes included in this series
since version 8:
- Fix an issue with entries being reencoded even when they could be
reused.
- Reordered and regrouped patches based on parts of the code they
modify.
- Smaller refactoring as outlined in the individual patches
The following lists the most notable changes included in this series
since version 7:
- Fixed incorrectly squashed patches during rebase
The following lists the most notable changes included in this series
since version 6:
- Allow using the `.pxar` extension in CLI commands for convenience
- Refactor the input/output interface of the pxar encoder, decoder and
accessor to use a `PxarVariant` enum, in order to guarantee that the
payload related input/output is always attached for split archives.
- Refactor the look-ahead caching logic in pxar's `Archiver` to
improve overall code readability.
- Add a helper method for file name matching and use it where
possible, so that matching is handled in a single place.
- Extend documentation to include additional information about which
metadata is compared to the previous snapshot
- Fix an issue with `pxar list`, which failed for metadata-only pxar
archives.
- Fix an issue in the payload chunker test where the context was not
updated accordingly.
- Various clippy fixes, smaller refactoring and reordering of patches
The following lists the most notable changes included in this series
since version 5:
- Fix an issue where the payload chunker was not correctly reset after
suggested or forced boundaries.
- Added regression tests for payload chunker and chunk stream.
The following lists the most notable changes included in this series
since version 4:
- Increase the open file handle limit to the hard limit and adapt the
look-ahead cache size dynamically (thanks a lot to Thomas for pointing
this out and providing the necessary background information). This
helps with reusing multiple entries contained within the same chunk,
which would otherwise exceed the padding threshold and therefore be
reencoded instead.
- Fix the payload chunker scan to only scan up to the chunk position
in case a suggested boundary is chosen.
- Fix an issue with the decoder state not being set to the correct
`InDirectory` after reading the prelude and getting the root directory
entry.
- Fix an issue with kept-back chunk injection when the chunk follows a
range discontinuity.
- Add regression test for pxar create with metadata archive and payload
index reference.
The following lists the most notable changes included in this series
since version 3:
- Rework the whole reused chunk injection and accounting logic and use
lockless async `mpsc::channel`s instead of `Arc<Mutex<VecDeque<..>>>`.
- Reworked the look-ahead caching logic to use payload ranges and
check for possible range continuation instead of looking up the
reusable dynamic entries immediately in case of a reusable entry
chain. This also avoids edge cases not covered in the previous version
of the patch series. The current version therefore tends to reencode
small files more aggressively, since they might introduce additional
unwanted padding.
- Correctly cover hardlinks in the reuse logic as well, avoiding
reencoding these entries.
- Add an additional dedicated chunker implementation for the payload
data stream, allowing the archiver to suggest boundaries to the
chunker to reduce padding for reused chunks.
- Add an additional `change-detection-mode=data` to allow creating
split archives with fully reencoded payload data.
- Add additional payload input readers for pxar accessor type
implementations where needed.
- Add additional consistency check in pxar encoder when dropping state
or encoder instance.
- `CliParams` was renamed to the more opaque `Prelude`, since the pxar
archive does not care about its contents, and this might be extended
to store other information about the archive as well.
- Add the missing proxmox-file-restore support for split archives and
fix restore of tar/zip archives via the WebUI. This is handled by the
same decoder logic, and needed an updated payload input content range
to read the data from the correct location in the payload data
archive.
- Additional refactoring to use the pxar reader helpers where possible.
The following lists the most notable changes included in this series
since version 2:
- many bugfixes regarding incorrect archive encoding caused by wrong
offset generation, adding additional sanity checks so that encoding
rather fails than produces an incorrectly encoded archive
- different approach for deciding whether to reuse or reencode the
entries. Previously, the entries were encoded as soon as a cached
payload size threshold was reached. Now, the padding introduced by
reusable chunks is tracked, and the entries are only reused if the
padding does not exceed the set threshold. This reduces the possible
padding at the cost of reencoding more entries, and also avoids
reusing chunks that now have large padding holes because of
moved/removed files contained within them.
- added headers for metadata archive and payload file
- added documentation
A backup run with these patches is now invoked as:
```bash
proxmox-backup-client backup <label>.pxar:<source-path> --change-detection-mode=metadata
```
During the first run, no reference index is available, but the pxar
archive will nevertheless be split into the two parts.
Subsequent backups will then utilize the pxar archive accessor and
index files of the previous run to perform file change detection.
As benchmarks, the Linux source code as well as the COCO dataset for
computer vision and pattern recognition can be used.
The benchmarks can be performed by running:
```bash
proxmox-backup-test-suite detection-mode-bench prepare --target /<path-to-bench-source-target>
proxmox-backup-test-suite detection-mode-bench run linux.pxar:/<path-to-bench-source-target>/linux
proxmox-backup-test-suite detection-mode-bench run coco.pxar:/<path-to-bench-source-target>/coco
```
The above command invocations assume that the default repository and
credentials are set as environment variables; they may, however, be
passed as additional optional parameters instead.
Christian Ebner (58):
client: pxar: switch to stack based encoder state
client: pxar: combine writers into struct
client: pxar: optionally split metadata and payload streams
client: helper: add helpers for creating reader instances
client: helper: add method for split archive name mapping
client: tools: helper to check pxar filename extensions
client: restore: read payload from dedicated index
client: tools: cover extension for split pxar archives
client: mount: make split pxar archives mountable
api: datastore: attach split archive payload chunk reader
catalog: shell: make split pxar archives accessible
www: cover metadata extension for pxar archives
file restore: cover extension for split pxar archives
file restore: factor out getting pxar reader
file restore: cover split metadata and payload archives
file restore: show more error context when extraction fails
pxar: bin: add optional payload input for archive restore
pxar: bin: cover listing for split archives
pxar: bin: add more context to extraction error
client: pxar: include payload offset in entry listing
client: pxar: helper for lookup of reusable dynamic entries
upload stream: implement reused chunk injector
client: chunk stream: add struct to hold injection state
chunker: add method to reset chunker state
client: streams: add channels for dynamic entry injection
specs: add backup detection mode specification
client: implement prepare reference method
client: pxar: add method for metadata comparison
pxar: caching: add look-ahead cache
client: pxar: refactor catalog encoding for directories
fix #3174: client: pxar: enable caching and meta comparison
client: backup writer: add injected chunk count to stats
pxar: create: keep track of reused chunks and files
pxar: create: show chunk injection stats info output
client: backup writer: make backup info output more concise
client: pxar: add helper to handle optional preludes
client: pxar: opt encode cli exclude patterns as Prelude
client: pxar: allow to restore prelude to optional path
pxar: bin: show padding in debug output on archive list
pxar: bin: ignore version and prelude entries in listing
pxar: bin: test `pxar list` with payload-input
pxar: bin: support creation of split pxar archives via cli
pxar: add optional payload input to mount archive
datastore: chunker: add Chunker trait
datastore: chunker: implement chunker for payload stream
chunker: tests: add regression tests for payload chunker
chunk stream: tests: add regression tests for payload chunker
client: chunk stream: switch payload stream chunker
client: pxar: add archive creation with reference test
client: tools: add helper to raise nofile rlimit
client: pxar: set cache limit based on nofile rlimit
api: datastore: add endpoint to lookup entries via pxar archive
api: datastore: add optional archive-name to file-restore
www: content: lookup via metadata archive instead of catalog
docs: file formats: describe split pxar archive file layout
docs: add section describing change detection mode
test-suite: add detection mode change benchmark
test-suite: Makefile: add debian package and related files
Cargo.toml | 1 +
Makefile | 18 +-
debian/control | 7 +
debian/proxmox-backup-client.bash-completion | 1 +
debian/proxmox-backup-test-suite.bc | 8 +
debian/proxmox-backup-test-suite.install | 3 +
docs/Makefile | 2 +
docs/backup-client.rst | 47 +
docs/command-line-tools.rst | 5 +
docs/command-syntax.rst | 4 +
docs/conf.py | 1 +
docs/file-formats.rst | 46 +
docs/meta-format-overview.dot | 50 +
.../proxmox-backup-test-suite/description.rst | 2 +
docs/proxmox-backup-test-suite/man1.rst | 17 +
docs/technical-overview.rst | 3 +
examples/test_chunk_size.rs | 9 +-
examples/test_chunk_speed.rs | 7 +-
examples/test_chunk_speed2.rs | 2 +-
pbs-client/src/backup_specification.rs | 26 +
pbs-client/src/backup_writer.rs | 125 ++-
pbs-client/src/chunk_stream.rs | 238 ++++-
pbs-client/src/inject_reused_chunks.rs | 127 +++
pbs-client/src/lib.rs | 3 +-
pbs-client/src/pxar/create.rs | 908 +++++++++++++++++-
pbs-client/src/pxar/extract.rs | 28 +-
pbs-client/src/pxar/look_ahead_cache.rs | 162 ++++
pbs-client/src/pxar/mod.rs | 5 +-
pbs-client/src/pxar/tools.rs | 123 ++-
pbs-client/src/pxar_backup_stream.rs | 71 +-
pbs-client/src/tools/mod.rs | 124 ++-
pbs-datastore/src/chunker.rs | 267 ++++-
pbs-datastore/src/dynamic_index.rs | 9 +-
pbs-datastore/src/lib.rs | 2 +-
pbs-pxar-fuse/src/lib.rs | 14 +-
proxmox-backup-client/src/catalog.rs | 29 +-
proxmox-backup-client/src/helper.rs | 72 ++
proxmox-backup-client/src/main.rs | 293 +++++-
proxmox-backup-client/src/mount.rs | 33 +-
proxmox-backup-test-suite/Cargo.toml | 18 +
.../src/detection_mode_bench.rs | 294 ++++++
proxmox-backup-test-suite/src/main.rs | 17 +
proxmox-file-restore/src/main.rs | 74 +-
.../src/proxmox_restore_daemon/api.rs | 20 +-
pxar-bin/Cargo.toml | 1 +
pxar-bin/src/main.rs | 158 ++-
pxar-bin/tests/pxar.rs | 135 +++
src/api2/admin/datastore.rs | 178 +++-
src/api2/tape/restore.rs | 22 +-
src/bin/proxmox_backup_debug/diff.rs | 2 +-
src/tape/file_formats/snapshot_archive.rs | 8 +-
tests/catar.rs | 7 +-
tests/pxar/backup-client-pxar-data.mpxar | Bin 0 -> 15070 bytes
tests/pxar/backup-client-pxar-data.ppxar.didx | Bin 0 -> 8096 bytes
tests/pxar/backup-client-pxar-expected.mpxar | Bin 0 -> 15086 bytes
tests/pxar/backup-client-pxar-expected.ppxar | Bin 0 -> 104859264 bytes
www/datastore/Content.js | 37 +-
zsh-completions/_proxmox-backup-test-suite | 13 +
58 files changed, 3526 insertions(+), 350 deletions(-)
create mode 100644 debian/proxmox-backup-test-suite.bc
create mode 100644 debian/proxmox-backup-test-suite.install
create mode 100644 docs/meta-format-overview.dot
create mode 100644 docs/proxmox-backup-test-suite/description.rst
create mode 100644 docs/proxmox-backup-test-suite/man1.rst
create mode 100644 pbs-client/src/inject_reused_chunks.rs
create mode 100644 pbs-client/src/pxar/look_ahead_cache.rs
create mode 100644 proxmox-backup-client/src/helper.rs
create mode 100644 proxmox-backup-test-suite/Cargo.toml
create mode 100644 proxmox-backup-test-suite/src/detection_mode_bench.rs
create mode 100644 proxmox-backup-test-suite/src/main.rs
create mode 100644 tests/pxar/backup-client-pxar-data.mpxar
create mode 100644 tests/pxar/backup-client-pxar-data.ppxar.didx
create mode 100644 tests/pxar/backup-client-pxar-expected.mpxar
create mode 100644 tests/pxar/backup-client-pxar-expected.ppxar
create mode 100644 zsh-completions/_proxmox-backup-test-suite
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 01/58] client: pxar: switch to stack based encoder state
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
... and adapt to the new reader/writer variants for the encoder and
decoder/accessor, attaching a dedicated payload input/output for split
pxar archives.
This is done in preparation for look-ahead caching, where passing
around per-directory-level encoder instances with internal references
is not feasible.
Previously, a new encoder instance was generated for each directory
level, restricting possible implementation errors. These encoder
instances were internally linked by references to keep track of state
changes in a parent-child relationship.
This is, however, not feasible when the encoder has to be passed by
mutable reference, as required by the look-ahead cache implementation.
The encoder has therefore been adapted to a single-instance
implementation with an internal stack keeping track of the state.
Depends on the bumped pxar library version, including the patches to
attach the corresponding variant for the pxar reader/writer
instantiation.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- add missing hunk for pxar's dump_archive method to use PxarVariant as
parameter for decoder
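To illustrate the commit message above: the state handling
conceptually changes as sketched below (hypothetical names, not the
actual pxar API):
```rust
// One encoder instance with an explicit state stack replaces the
// previous chain of nested per-directory encoder instances.
struct EncoderState {
    // per-directory data, e.g. current offset and goodbye table items
}

struct Encoder {
    state: Vec<EncoderState>, // stack of directory levels
}

impl Encoder {
    fn create_directory(&mut self) {
        // Entering a directory pushes a new level instead of creating
        // a new encoder instance borrowing its parent.
        self.state.push(EncoderState {});
    }

    fn finish(&mut self) {
        // Leaving a directory pops one level instead of dropping a
        // nested instance, so the encoder can be passed around by
        // `&mut` as the look-ahead cache requires.
        self.state.pop();
    }
}
```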
pbs-client/src/pxar/create.rs | 8 +++++---
pbs-pxar-fuse/src/lib.rs | 2 +-
proxmox-backup-client/src/catalog.rs | 3 ++-
proxmox-backup-client/src/main.rs | 2 +-
proxmox-backup-client/src/mount.rs | 3 ++-
proxmox-file-restore/src/main.rs | 4 ++--
pxar-bin/src/main.rs | 4 ++--
src/api2/admin/datastore.rs | 2 +-
src/api2/tape/restore.rs | 5 +++--
src/bin/proxmox_backup_debug/diff.rs | 2 +-
src/tape/file_formats/snapshot_archive.rs | 7 +++++--
11 files changed, 25 insertions(+), 17 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 60efb0ce5..1b1bac2d4 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -170,7 +170,7 @@ where
set.insert(stat.st_dev);
}
- let mut encoder = Encoder::new(&mut writer, &metadata).await?;
+ let mut encoder = Encoder::new(pxar::PxarVariant::Unified(&mut writer), &metadata).await?;
let mut patterns = options.patterns;
@@ -203,6 +203,8 @@ where
.archive_dir_contents(&mut encoder, source_dir, true)
.await?;
encoder.finish().await?;
+ encoder.close().await?;
+
Ok(())
}
@@ -663,7 +665,7 @@ impl Archiver {
) -> Result<(), Error> {
let dir_name = OsStr::from_bytes(dir_name.to_bytes());
- let mut encoder = encoder.create_directory(dir_name, metadata).await?;
+ encoder.create_directory(dir_name, metadata).await?;
let old_fs_magic = self.fs_magic;
let old_fs_feature_flags = self.fs_feature_flags;
@@ -686,7 +688,7 @@ impl Archiver {
log::info!("skipping mount point: {:?}", self.path);
Ok(())
} else {
- self.archive_dir_contents(&mut encoder, dir, false).await
+ self.archive_dir_contents(encoder, dir, false).await
};
self.fs_magic = old_fs_magic;
diff --git a/pbs-pxar-fuse/src/lib.rs b/pbs-pxar-fuse/src/lib.rs
index bf196b6c4..377635b2a 100644
--- a/pbs-pxar-fuse/src/lib.rs
+++ b/pbs-pxar-fuse/src/lib.rs
@@ -66,7 +66,7 @@ impl Session {
let file = std::fs::File::open(archive_path)?;
let file_size = file.metadata()?.len();
let reader: Reader = Arc::new(accessor::sync::FileReader::new(file));
- let accessor = Accessor::new(reader, file_size).await?;
+ let accessor = Accessor::new(pxar::PxarVariant::Unified(reader), file_size).await?;
Self::mount(accessor, options, verbose, mountpoint)
}
diff --git a/proxmox-backup-client/src/catalog.rs b/proxmox-backup-client/src/catalog.rs
index 72b22e67f..e72b6a1e0 100644
--- a/proxmox-backup-client/src/catalog.rs
+++ b/proxmox-backup-client/src/catalog.rs
@@ -220,7 +220,8 @@ async fn catalog_shell(param: Value) -> Result<(), Error> {
let reader = BufferedDynamicReader::new(index, chunk_reader);
let archive_size = reader.archive_size();
let reader: pbs_pxar_fuse::Reader = Arc::new(BufferedDynamicReadAt::new(reader));
- let decoder = pbs_pxar_fuse::Accessor::new(reader, archive_size).await?;
+ let decoder =
+ pbs_pxar_fuse::Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
client.download(CATALOG_NAME, &mut tmpfile).await?;
let index = DynamicIndexReader::new(tmpfile)
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 4453c7756..ad2bc5a66 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -1458,7 +1458,7 @@ async fn restore(
if let Some(target) = target {
pbs_client::pxar::extract_archive(
- pxar::decoder::Decoder::from_std(reader)?,
+ pxar::decoder::Decoder::from_std(pxar::PxarVariant::Unified(reader))?,
Path::new(target),
feature_flags,
|path| {
diff --git a/proxmox-backup-client/src/mount.rs b/proxmox-backup-client/src/mount.rs
index 4a2f83357..4d352b6e4 100644
--- a/proxmox-backup-client/src/mount.rs
+++ b/proxmox-backup-client/src/mount.rs
@@ -296,7 +296,8 @@ async fn mount_do(param: Value, pipe: Option<OwnedFd>) -> Result<Value, Error> {
let reader = BufferedDynamicReader::new(index, chunk_reader);
let archive_size = reader.archive_size();
let reader: pbs_pxar_fuse::Reader = Arc::new(BufferedDynamicReadAt::new(reader));
- let decoder = pbs_pxar_fuse::Accessor::new(reader, archive_size).await?;
+ let decoder =
+ pbs_pxar_fuse::Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
let session =
pbs_pxar_fuse::Session::mount(decoder, options, false, Path::new(target.unwrap()))
diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index 50875a636..6a6379f27 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -457,7 +457,7 @@ async fn extract(
let archive_size = reader.archive_size();
let reader = LocalDynamicReadAt::new(reader);
- let decoder = Accessor::new(reader, archive_size).await?;
+ let decoder = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
extract_to_target(decoder, &path, target, format, zstd).await?;
}
ExtractPath::VM(file, path) => {
@@ -483,7 +483,7 @@ async fn extract(
false,
)
.await?;
- let decoder = Decoder::from_tokio(reader).await?;
+ let decoder = Decoder::from_tokio(pxar::PxarVariant::Unified(reader)).await?;
extract_sub_dir_seq(&target, decoder).await?;
// we extracted a .pxarexclude-cli file auto-generated by the VM when encoding the
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 2bbe90e34..68f3dcb5c 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -26,7 +26,7 @@ fn extract_archive_from_reader<R: std::io::Read>(
options: PxarExtractOptions,
) -> Result<(), Error> {
pbs_client::pxar::extract_archive(
- pxar::decoder::Decoder::from_std(reader)?,
+ pxar::decoder::Decoder::from_std(pxar::PxarVariant::Unified(reader))?,
Path::new(target),
feature_flags,
|path| {
@@ -436,7 +436,7 @@ async fn mount_archive(archive: String, mountpoint: String, verbose: bool) -> Re
)]
/// List the contents of an archive.
fn dump_archive(archive: String) -> Result<(), Error> {
- for entry in pxar::decoder::Decoder::open(archive)? {
+ for entry in pxar::decoder::Decoder::open(pxar::PxarVariant::Unified(archive))? {
let entry = entry?;
if log::log_enabled!(log::Level::Debug) {
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index ca72a2f2b..af1c12cc0 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1813,7 +1813,7 @@ pub fn pxar_file_download(
let (reader, archive_size) =
get_local_pxar_reader(datastore.clone(), &manifest, &backup_dir, pxar_name)?;
- let decoder = Accessor::new(reader, archive_size).await?;
+ let decoder = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
let root = decoder.open_root().await?;
let path = OsStr::from_bytes(file_path).to_os_string();
let file = root
diff --git a/src/api2/tape/restore.rs b/src/api2/tape/restore.rs
index 84557bce1..9184ff934 100644
--- a/src/api2/tape/restore.rs
+++ b/src/api2/tape/restore.rs
@@ -1069,7 +1069,8 @@ fn restore_snapshots_to_tmpdir(
"File {file_num}: snapshot archive {source_datastore}:{snapshot}",
);
- let mut decoder = pxar::decoder::sync::Decoder::from_std(reader)?;
+ let mut decoder =
+ pxar::decoder::sync::Decoder::from_std(pxar::PxarVariant::Unified(reader))?;
let target_datastore = match store_map.target_store(&source_datastore) {
Some(datastore) => datastore,
@@ -1685,7 +1686,7 @@ fn restore_snapshot_archive<'a>(
reader: Box<dyn 'a + TapeRead>,
snapshot_path: &Path,
) -> Result<bool, Error> {
- let mut decoder = pxar::decoder::sync::Decoder::from_std(reader)?;
+ let mut decoder = pxar::decoder::sync::Decoder::from_std(pxar::PxarVariant::Unified(reader))?;
match try_restore_snapshot_archive(worker, &mut decoder, snapshot_path) {
Ok(_) => Ok(true),
Err(err) => {
diff --git a/src/bin/proxmox_backup_debug/diff.rs b/src/bin/proxmox_backup_debug/diff.rs
index 5b68941a4..e6767c17c 100644
--- a/src/bin/proxmox_backup_debug/diff.rs
+++ b/src/bin/proxmox_backup_debug/diff.rs
@@ -277,7 +277,7 @@ async fn open_dynamic_index(
let reader = BufferedDynamicReader::new(index, chunk_reader);
let archive_size = reader.archive_size();
let reader: Arc<dyn ReadAt + Send + Sync> = Arc::new(LocalDynamicReadAt::new(reader));
- let accessor = Accessor::new(reader, archive_size).await?;
+ let accessor = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
Ok((lookup_index, accessor))
}
diff --git a/src/tape/file_formats/snapshot_archive.rs b/src/tape/file_formats/snapshot_archive.rs
index 252384b50..82f466980 100644
--- a/src/tape/file_formats/snapshot_archive.rs
+++ b/src/tape/file_formats/snapshot_archive.rs
@@ -58,8 +58,10 @@ pub fn tape_write_snapshot_archive<'a>(
));
}
- let mut encoder =
- pxar::encoder::sync::Encoder::new(PxarTapeWriter::new(writer), &root_metadata)?;
+ let mut encoder = pxar::encoder::sync::Encoder::new(
+ pxar::PxarVariant::Unified(PxarTapeWriter::new(writer)),
+ &root_metadata,
+ )?;
for filename in file_list.iter() {
let mut file = snapshot_reader.open_file(filename).map_err(|err| {
@@ -89,6 +91,7 @@ pub fn tape_write_snapshot_archive<'a>(
}
}
encoder.finish()?;
+ encoder.close()?;
Ok(())
});
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 02/58] client: pxar: combine writers into struct
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Introduce a `PxarWriters` struct to bundle all writer instances
required for pxar archive creation into a single object, limiting
the number of function call parameters.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/pxar/create.rs | 23 +++++++++++++++----
pbs-client/src/pxar/mod.rs | 2 +-
pbs-client/src/pxar_backup_stream.rs | 8 ++++---
.../src/proxmox_restore_daemon/api.rs | 8 ++++---
pxar-bin/src/main.rs | 8 +++----
tests/catar.rs | 5 ++--
6 files changed, 35 insertions(+), 19 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 1b1bac2d4..cc75f0262 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -18,7 +18,7 @@ use nix::sys::stat::{FileStat, Mode};
use pathpatterns::{MatchEntry, MatchFlag, MatchList, MatchType, PatternFlag};
use proxmox_sys::error::SysError;
use pxar::encoder::{LinkOffset, SeqWrite};
-use pxar::Metadata;
+use pxar::{Metadata, PxarVariant};
use proxmox_io::vec;
use proxmox_lang::c_str;
@@ -135,12 +135,25 @@ struct Archiver {
type Encoder<'a, T> = pxar::encoder::aio::Encoder<'a, T>;
+pub struct PxarWriters<T> {
+ archive: PxarVariant<T, T>,
+ catalog: Option<Arc<Mutex<dyn BackupCatalogWriter + Send>>>,
+}
+
+impl<T> PxarWriters<T> {
+ pub fn new(
+ archive: PxarVariant<T, T>,
+ catalog: Option<Arc<Mutex<dyn BackupCatalogWriter + Send>>>,
+ ) -> Self {
+ Self { archive, catalog }
+ }
+}
+
pub async fn create_archive<T, F>(
source_dir: Dir,
- mut writer: T,
+ writers: PxarWriters<T>,
feature_flags: Flags,
callback: F,
- catalog: Option<Arc<Mutex<dyn BackupCatalogWriter + Send>>>,
options: PxarCreateOptions,
) -> Result<(), Error>
where
@@ -170,7 +183,7 @@ where
set.insert(stat.st_dev);
}
- let mut encoder = Encoder::new(pxar::PxarVariant::Unified(&mut writer), &metadata).await?;
+ let mut encoder = Encoder::new(writers.archive, &metadata).await?;
let mut patterns = options.patterns;
@@ -188,7 +201,7 @@ where
fs_magic,
callback: Box::new(callback),
patterns,
- catalog,
+ catalog: writers.catalog,
path: PathBuf::new(),
entry_counter: 0,
entry_limit: options.entries_max,
diff --git a/pbs-client/src/pxar/mod.rs b/pbs-client/src/pxar/mod.rs
index 14674b9b9..b7dcf8362 100644
--- a/pbs-client/src/pxar/mod.rs
+++ b/pbs-client/src/pxar/mod.rs
@@ -56,7 +56,7 @@ pub(crate) mod tools;
mod flags;
pub use flags::Flags;
-pub use create::{create_archive, PxarCreateOptions};
+pub use create::{create_archive, PxarCreateOptions, PxarWriters};
pub use extract::{
create_tar, create_zip, extract_archive, extract_sub_dir, extract_sub_dir_seq, ErrorHandler,
OverwriteFlags, PxarExtractContext, PxarExtractOptions,
diff --git a/pbs-client/src/pxar_backup_stream.rs b/pbs-client/src/pxar_backup_stream.rs
index 22a6ffdc2..8dc3fd088 100644
--- a/pbs-client/src/pxar_backup_stream.rs
+++ b/pbs-client/src/pxar_backup_stream.rs
@@ -17,6 +17,8 @@ use proxmox_io::StdChannelWriter;
use pbs_datastore::catalog::CatalogWriter;
+use crate::pxar::create::PxarWriters;
+
/// Stream implementation to encode and upload .pxar archives.
///
/// The hyper client needs an async Stream for file upload, so we
@@ -53,16 +55,16 @@ impl PxarBackupStream {
StdChannelWriter::new(tx),
));
- let writer = pxar::encoder::sync::StandardWriter::new(writer);
+ let writer =
+ pxar::PxarVariant::Unified(pxar::encoder::sync::StandardWriter::new(writer));
if let Err(err) = crate::pxar::create_archive(
dir,
- writer,
+ PxarWriters::new(writer, Some(catalog)),
crate::pxar::Flags::DEFAULT,
move |path| {
log::debug!("{:?}", path);
Ok(())
},
- Some(catalog),
options,
)
.await
diff --git a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
index cb7b53e11..95c9f4619 100644
--- a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
+++ b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
@@ -23,7 +23,9 @@ use proxmox_sortable_macro::sortable;
use proxmox_sys::fs::read_subdir;
use pbs_api_types::file_restore::{FileRestoreFormat, RestoreDaemonStatus};
-use pbs_client::pxar::{create_archive, Flags, PxarCreateOptions, ENCODER_MAX_ENTRIES};
+use pbs_client::pxar::{
+ create_archive, Flags, PxarCreateOptions, PxarWriters, ENCODER_MAX_ENTRIES,
+};
use pbs_datastore::catalog::{ArchiveEntry, DirEntryAttribute};
use pbs_tools::json::required_string_param;
@@ -360,8 +362,8 @@ fn extract(
skip_e2big_xattr: false,
};
- let pxar_writer = TokioWriter::new(writer);
- create_archive(dir, pxar_writer, Flags::DEFAULT, |_| Ok(()), None, options)
+ let pxar_writer = pxar::PxarVariant::Unified(TokioWriter::new(writer));
+ create_archive(dir, PxarWriters::new(pxar_writer, None), Flags::DEFAULT, |_| Ok(()), options)
.await
}
.await;
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 68f3dcb5c..8108ec0fb 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -13,7 +13,8 @@ use tokio::signal::unix::{signal, SignalKind};
use pathpatterns::{MatchEntry, MatchType, PatternFlag};
use pbs_client::pxar::{
- format_single_line_entry, Flags, OverwriteFlags, PxarExtractOptions, ENCODER_MAX_ENTRIES,
+ format_single_line_entry, Flags, OverwriteFlags, PxarExtractOptions, PxarWriters,
+ ENCODER_MAX_ENTRIES,
};
use proxmox_router::cli::*;
@@ -373,16 +374,15 @@ async fn create_archive(
feature_flags.remove(Flags::WITH_SOCKETS);
}
- let writer = pxar::encoder::sync::StandardWriter::new(writer);
+ let writer = pxar::PxarVariant::Unified(pxar::encoder::sync::StandardWriter::new(writer));
pbs_client::pxar::create_archive(
dir,
- writer,
+ PxarWriters::new(writer, None),
feature_flags,
move |path| {
log::debug!("{:?}", path);
Ok(())
},
- None,
options,
)
.await?;
diff --git a/tests/catar.rs b/tests/catar.rs
index 36bb4f3bc..932df61a9 100644
--- a/tests/catar.rs
+++ b/tests/catar.rs
@@ -19,7 +19,7 @@ fn run_test(dir_name: &str) -> Result<(), Error> {
.write(true)
.truncate(true)
.open("test-proxmox.catar")?;
- let writer = pxar::encoder::sync::StandardWriter::new(writer);
+ let writer = pxar::PxarVariant::Unified(pxar::encoder::sync::StandardWriter::new(writer));
let dir = nix::dir::Dir::open(
dir_name,
@@ -35,10 +35,9 @@ fn run_test(dir_name: &str) -> Result<(), Error> {
let rt = tokio::runtime::Runtime::new().unwrap();
rt.block_on(create_archive(
dir,
- writer,
+ PxarWriters::new(writer, None),
Flags::DEFAULT,
|_| Ok(()),
- None,
options,
))?;
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 03/58] client: pxar: optionally split metadata and payload streams
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
... and attach the split payload writer variant to the pxar archive
creation. With this, metadata and payload data will create separate
dynamic indexes, allowing payload chunks to be looked up and reused
without the additional overhead of the pxar archive's metadata.
For now this functionality remains disabled; it will be enabled in a
later patch once the logic for reusing the payload chunks is in
place.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
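For context, the writer variant used in the hunks below distinguishes
the two cases roughly like this (a sketch of the shape, not the exact
pxar definition):
```rust
// Unified: a single archive output; Split: an additional dedicated
// payload output next to the metadata stream.
enum PxarVariant<A, P> {
    Unified(A),
    Split(A, P),
}
```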
pbs-client/src/pxar_backup_stream.rs | 51 ++++++++++++++-----
proxmox-backup-client/src/main.rs | 75 +++++++++++++++++++++++++---
2 files changed, 105 insertions(+), 21 deletions(-)
diff --git a/pbs-client/src/pxar_backup_stream.rs b/pbs-client/src/pxar_backup_stream.rs
index 8dc3fd088..3541eddb5 100644
--- a/pbs-client/src/pxar_backup_stream.rs
+++ b/pbs-client/src/pxar_backup_stream.rs
@@ -42,21 +42,37 @@ impl PxarBackupStream {
dir: Dir,
catalog: Arc<Mutex<CatalogWriter<W>>>,
options: crate::pxar::PxarCreateOptions,
- ) -> Result<Self, Error> {
- let (tx, rx) = std::sync::mpsc::sync_channel(10);
-
+ separate_payload_stream: bool,
+ ) -> Result<(Self, Option<Self>), Error> {
let buffer_size = 256 * 1024;
- let error = Arc::new(Mutex::new(None));
- let error2 = Arc::clone(&error);
- let handler = async move {
- let writer = TokioWriterAdapter::new(std::io::BufWriter::with_capacity(
+ let (tx, rx) = std::sync::mpsc::sync_channel(10);
+ let writer = TokioWriterAdapter::new(std::io::BufWriter::with_capacity(
+ buffer_size,
+ StdChannelWriter::new(tx),
+ ));
+ let writer = pxar::encoder::sync::StandardWriter::new(writer);
+
+ let (writer, payload_rx) = if separate_payload_stream {
+ let (tx, rx) = std::sync::mpsc::sync_channel(10);
+ let payload_writer = TokioWriterAdapter::new(std::io::BufWriter::with_capacity(
buffer_size,
StdChannelWriter::new(tx),
));
+ (
+ pxar::PxarVariant::Split(
+ writer,
+ pxar::encoder::sync::StandardWriter::new(payload_writer),
+ ),
+ Some(rx),
+ )
+ } else {
+ (pxar::PxarVariant::Unified(writer), None)
+ };
- let writer =
- pxar::PxarVariant::Unified(pxar::encoder::sync::StandardWriter::new(writer));
+ let error = Arc::new(Mutex::new(None));
+ let error2 = Arc::clone(&error);
+ let handler = async move {
if let Err(err) = crate::pxar::create_archive(
dir,
PxarWriters::new(writer, Some(catalog)),
@@ -78,21 +94,30 @@ impl PxarBackupStream {
let future = Abortable::new(handler, registration);
tokio::spawn(future);
- Ok(Self {
+ let backup_stream = Self {
+ rx: Some(rx),
+ handle: Some(handle.clone()),
+ error: Arc::clone(&error),
+ };
+
+ let backup_payload_stream = payload_rx.map(|rx| Self {
rx: Some(rx),
handle: Some(handle),
error,
- })
+ });
+
+ Ok((backup_stream, backup_payload_stream))
}
pub fn open<W: Write + Send + 'static>(
dirname: &Path,
catalog: Arc<Mutex<CatalogWriter<W>>>,
options: crate::pxar::PxarCreateOptions,
- ) -> Result<Self, Error> {
+ separate_payload_stream: bool,
+ ) -> Result<(Self, Option<Self>), Error> {
let dir = nix::dir::Dir::open(dirname, OFlag::O_DIRECTORY, Mode::empty())?;
- Self::new(dir, catalog, options)
+ Self::new(dir, catalog, options, separate_payload_stream)
}
}
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index ad2bc5a66..25556d672 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -187,18 +187,24 @@ async fn backup_directory<P: AsRef<Path>>(
client: &BackupWriter,
dir_path: P,
archive_name: &str,
+ payload_target: Option<&str>,
chunk_size: Option<usize>,
catalog: Arc<Mutex<CatalogWriter<TokioWriterAdapter<StdChannelWriter<Error>>>>>,
pxar_create_options: pbs_client::pxar::PxarCreateOptions,
upload_options: UploadOptions,
-) -> Result<BackupStats, Error> {
+) -> Result<(BackupStats, Option<BackupStats>), Error> {
if upload_options.fixed_size.is_some() {
bail!("cannot backup directory with fixed chunk size!");
}
- let pxar_stream = PxarBackupStream::open(dir_path.as_ref(), catalog, pxar_create_options)?;
- let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size);
+ let (pxar_stream, payload_stream) = PxarBackupStream::open(
+ dir_path.as_ref(),
+ catalog,
+ pxar_create_options,
+ payload_target.is_some(),
+ )?;
+ let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size);
let (tx, rx) = mpsc::channel(10); // allow to buffer 10 chunks
let stream = ReceiverStream::new(rx).map_err(Error::from);
@@ -210,11 +216,36 @@ async fn backup_directory<P: AsRef<Path>>(
}
});
- let stats = client
- .upload_stream(archive_name, stream, upload_options)
- .await?;
+ let stats = client.upload_stream(archive_name, stream, upload_options.clone());
- Ok(stats)
+ if let Some(payload_stream) = payload_stream {
+ let payload_target = payload_target
+ .ok_or_else(|| format_err!("got payload stream, but no target archive name"))?;
+
+ let mut payload_chunk_stream = ChunkStream::new(payload_stream, chunk_size);
+ let (payload_tx, payload_rx) = mpsc::channel(10); // allow to buffer 10 chunks
+ let stream = ReceiverStream::new(payload_rx).map_err(Error::from);
+
+ // spawn payload chunker inside a separate task so that it can run parallel
+ tokio::spawn(async move {
+ while let Some(v) = payload_chunk_stream.next().await {
+ let _ = payload_tx.send(v).await;
+ }
+ });
+
+ let payload_stats = client.upload_stream(&payload_target, stream, upload_options);
+
+ match futures::join!(stats, payload_stats) {
+ (Ok(stats), Ok(payload_stats)) => Ok((stats, Some(payload_stats))),
+ (Err(err), Ok(_)) => Err(format_err!("upload failed: {err}")),
+ (Ok(_), Err(err)) => Err(format_err!("upload failed: {err}")),
+ (Err(err), Err(payload_err)) => {
+ Err(format_err!("upload failed: {err} - {payload_err}"))
+ }
+ }
+ } else {
+ Ok((stats.await?, None))
+ }
}
async fn backup_image<P: AsRef<Path>>(
@@ -985,6 +1016,23 @@ async fn create_backup(
manifest.add_file(target, stats.size, stats.csum, crypto.mode)?;
}
(BackupSpecificationType::PXAR, false) => {
+ let metadata_mode = false; // Until enabled via param
+
+ let target_base = if let Some(base) = target_base.strip_suffix(".pxar") {
+ base.to_string()
+ } else {
+ bail!("unexpected suffix in target: {target_base}");
+ };
+
+ let (target, payload_target) = if metadata_mode {
+ (
+ format!("{target_base}.mpxar.{extension}"),
+ Some(format!("{target_base}.ppxar.{extension}")),
+ )
+ } else {
+ (target, None)
+ };
+
// start catalog upload on first use
if catalog.is_none() {
let catalog_upload_res =
@@ -1015,16 +1063,27 @@ async fn create_backup(
..UploadOptions::default()
};
- let stats = backup_directory(
+ let (stats, payload_stats) = backup_directory(
&client,
&filename,
&target,
+ payload_target.as_deref(),
chunk_size_opt,
catalog.clone(),
pxar_options,
upload_options,
)
.await?;
+
+ if let Some(payload_stats) = payload_stats {
+ manifest.add_file(
+ payload_target
+ .ok_or_else(|| format_err!("missing payload target archive"))?,
+ payload_stats.size,
+ payload_stats.csum,
+ crypto.mode,
+ )?;
+ }
manifest.add_file(target, stats.size, stats.csum, crypto.mode)?;
catalog.lock().unwrap().end_directory()?;
}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 04/58] client: helper: add helpers for creating reader instances
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Add a module to hold helper methods which are used in different
submodules of the client.
Add `get_pxar_fuse_reader`, `get_buffered_pxar_reader` and
`get_pxar_fuse_accessor` to create reader instances to access pxar
archives.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
proxmox-backup-client/src/helper.rs | 72 +++++++++++++++++++++++++++++
proxmox-backup-client/src/main.rs | 2 +
2 files changed, 74 insertions(+)
create mode 100644 proxmox-backup-client/src/helper.rs
diff --git a/proxmox-backup-client/src/helper.rs b/proxmox-backup-client/src/helper.rs
new file mode 100644
index 000000000..5b21b6720
--- /dev/null
+++ b/proxmox-backup-client/src/helper.rs
@@ -0,0 +1,72 @@
+use std::sync::Arc;
+
+use anyhow::Error;
+use pbs_client::{BackupReader, RemoteChunkReader};
+use pbs_datastore::BackupManifest;
+use pbs_tools::crypt_config::CryptConfig;
+
+use crate::{BufferedDynamicReadAt, BufferedDynamicReader, IndexFile};
+
+pub(crate) async fn get_pxar_fuse_accessor(
+ archive_name: &str,
+ payload_archive_name: Option<&str>,
+ client: Arc<BackupReader>,
+ manifest: &BackupManifest,
+ crypt_config: Option<Arc<CryptConfig>>,
+) -> Result<pbs_pxar_fuse::Accessor, Error> {
+ let (reader, archive_size) =
+ get_pxar_fuse_reader(archive_name, client.clone(), manifest, crypt_config.clone()).await?;
+
+ let reader = if let Some(payload_archive_name) = payload_archive_name {
+ let (payload_reader, payload_size) = get_pxar_fuse_reader(
+ payload_archive_name,
+ client.clone(),
+ manifest,
+ crypt_config.clone(),
+ )
+ .await?;
+
+ pxar::PxarVariant::Split(reader, (payload_reader, payload_size))
+ } else {
+ pxar::PxarVariant::Unified(reader)
+ };
+
+ let accessor = pbs_pxar_fuse::Accessor::new(reader, archive_size).await?;
+
+ Ok(accessor)
+}
+
+pub(crate) async fn get_pxar_fuse_reader(
+ archive_name: &str,
+ client: Arc<BackupReader>,
+ manifest: &BackupManifest,
+ crypt_config: Option<Arc<CryptConfig>>,
+) -> Result<(pbs_pxar_fuse::Reader, u64), Error> {
+ let reader = get_buffered_pxar_reader(archive_name, client, manifest, crypt_config).await?;
+ let archive_size = reader.archive_size();
+ let reader: pbs_pxar_fuse::Reader = Arc::new(BufferedDynamicReadAt::new(reader));
+
+ Ok((reader, archive_size))
+}
+
+pub(crate) async fn get_buffered_pxar_reader(
+ archive_name: &str,
+ client: Arc<BackupReader>,
+ manifest: &BackupManifest,
+ crypt_config: Option<Arc<CryptConfig>>,
+) -> Result<BufferedDynamicReader<RemoteChunkReader>, Error> {
+ let index = client
+ .download_dynamic_index(manifest, archive_name)
+ .await?;
+
+ let most_used = index.find_most_used_chunks(8);
+ let file_info = manifest.lookup_file_info(archive_name)?;
+ let chunk_reader = RemoteChunkReader::new(
+ client.clone(),
+ crypt_config.clone(),
+ file_info.chunk_crypt_mode(),
+ most_used,
+ );
+
+ Ok(BufferedDynamicReader::new(index, chunk_reader))
+}
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 25556d672..db0fb6324 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -72,6 +72,8 @@ mod catalog;
pub use catalog::*;
mod snapshot;
pub use snapshot::*;
+mod helper;
+pub(crate) use helper::*;
pub mod key;
pub mod namespace;
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 05/58] client: helper: add method for split archive name mapping
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Add a helper method that takes an archive name as input and checks
whether the given archive is present in the manifest, taking possible
split archive extensions into account.
It returns the pxar archive name if found, or the split archive names
if the split archive variant is present in the manifest.
If neither matches, an error is returned, signaling that no entry in
the manifest matched.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- move method to pbs-client's tools, so it can be reused from within
the crate
- lookup archive name in manifest, return with error if not present
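For illustration, the suffix mapping alone (without the manifest
lookup the real helper performs) boils down to this standalone sketch:
```rust
/// Map a `.pxar` archive name to its split archive variants; returns
/// `None` if the name does not carry the `.pxar` suffix.
fn split_archive_names(archive_name: &str) -> Option<(String, String)> {
    archive_name
        .strip_suffix(".pxar")
        .map(|base| (format!("{base}.mpxar"), format!("{base}.ppxar")))
}

fn main() {
    assert_eq!(
        split_archive_names("root.pxar"),
        Some(("root.mpxar".to_string(), "root.ppxar".to_string()))
    );
}
```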
pbs-client/src/tools/mod.rs | 50 +++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/pbs-client/src/tools/mod.rs b/pbs-client/src/tools/mod.rs
index 1b0123a39..fdce33914 100644
--- a/pbs-client/src/tools/mod.rs
+++ b/pbs-client/src/tools/mod.rs
@@ -16,6 +16,7 @@ use proxmox_schema::*;
use proxmox_sys::fs::file_get_json;
use pbs_api_types::{Authid, BackupNamespace, RateLimitConfig, UserWithTokens, BACKUP_REPO_URL};
+use pbs_datastore::BackupManifest;
use crate::{BackupRepository, HttpClient, HttpClientOptions};
@@ -526,3 +527,52 @@ pub fn place_xdg_file(
.and_then(|base| base.place_config_file(file_name).map_err(Error::from))
.with_context(|| format!("failed to place {} in xdg home", description))
}
+
+pub fn get_pxar_archive_names(
+ archive_name: &str,
+ manifest: &BackupManifest,
+) -> Result<(String, Option<String>), Error> {
+ let filename = archive_name.strip_suffix(".didx").unwrap_or(archive_name);
+
+ // Check if archive with given extension is present, otherwise fallback to split archive naming
+ if manifest
+ .files()
+ .iter()
+ .any(|fileinfo| fileinfo.filename == format!("{filename}.didx"))
+ {
+ // check if already given as one of split archive name variants
+ if let Some(base) = filename
+ .strip_suffix(".mpxar")
+ .or_else(|| filename.strip_suffix(".ppxar"))
+ {
+ if archive_name.ends_with(".didx") {
+ return Ok((
+ format!("{base}.mpxar.didx"),
+ Some(format!("{base}.ppxar.didx")),
+ ));
+ }
+ return Ok((format!("{base}.mpxar"), Some(format!("{base}.ppxar"))));
+ }
+ return Ok((archive_name.to_owned(), None));
+ }
+
+ if let Some(base) = filename.strip_suffix(".pxar") {
+ let filename = format!("{base}.mpxar");
+ // check if present with split archive name variant
+ if manifest
+ .files()
+ .iter()
+ .any(|fileinfo| fileinfo.filename == format!("{filename}.didx"))
+ {
+ if archive_name.ends_with(".didx") {
+ return Ok((
+ format!("{base}.mpxar.didx"),
+ Some(format!("{base}.ppxar.didx")),
+ ));
+ }
+ return Ok((format!("{base}.mpxar"), Some(format!("{base}.ppxar"))));
+ }
+ }
+
+ bail!("archive not found in manifest");
+}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 06/58] client: tools: helper to check pxar filename extensions
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
With the introduction of split pxar archives, the allowed extensions
are now `.pxar`, `.mpxar` and `.ppxar`. Add a helper function to
check for all valid variants, including the optional additional
`.didx` extension in the case of a server archive name.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/tools/mod.rs | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/pbs-client/src/tools/mod.rs b/pbs-client/src/tools/mod.rs
index fdce33914..67768fa5c 100644
--- a/pbs-client/src/tools/mod.rs
+++ b/pbs-client/src/tools/mod.rs
@@ -576,3 +576,16 @@ pub fn get_pxar_archive_names(
bail!("archive not found in manifest");
}
+
+/// Check if the given filename has a valid pxar filename extension variant
+///
+/// If `with_didx_extension` is `true`, check the additional `.didx` ending.
+pub fn has_pxar_filename_extension(name: &str, with_didx_extension: bool) -> bool {
+ if with_didx_extension {
+ name.ends_with(".pxar.didx")
+ || name.ends_with(".mpxar.didx")
+ || name.ends_with(".ppxar.didx")
+ } else {
+ name.ends_with(".pxar") || name.ends_with(".mpxar") || name.ends_with(".ppxar")
+ }
+}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 07/58] client: restore: read payload from dedicated index
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Whenever a split pxar archive is encountered, instantiate and attach
the required dedicated reader instance to the decoder instance on
restore.
Piping the output to stdout is not possible for these archives, as
this would require a decoder instance which can decode the input
stream while maintaining the pxar stream format as output.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
proxmox-backup-client/src/main.rs | 45 ++++++++++++++++++++-----------
1 file changed, 29 insertions(+), 16 deletions(-)
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index db0fb6324..ba2c5fa59 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -35,7 +35,7 @@ use pbs_client::tools::{
complete_archive_name, complete_auth_id, complete_backup_group, complete_backup_snapshot,
complete_backup_source, complete_chunk_size, complete_group_or_snapshot,
complete_img_archive_name, complete_namespace, complete_pxar_archive_name, complete_repository,
- connect, connect_rate_limited, extract_repository_from_value,
+ connect, connect_rate_limited, extract_repository_from_value, has_pxar_filename_extension,
key_source::{
crypto_parameters, format_key_source, get_encryption_key_password, KEYFD_SCHEMA,
KEYFILE_SCHEMA, MASTER_PUBKEY_FD_SCHEMA, MASTER_PUBKEY_FILE_SCHEMA,
@@ -1216,7 +1216,7 @@ async fn dump_image<W: Write>(
fn parse_archive_type(name: &str) -> (String, ArchiveType) {
if name.ends_with(".didx") || name.ends_with(".fidx") || name.ends_with(".blob") {
(name.into(), archive_type(name).unwrap())
- } else if name.ends_with(".pxar") {
+ } else if has_pxar_filename_extension(name, false) {
(format!("{}.didx", name), ArchiveType::DynamicIndex)
} else if name.ends_with(".img") {
(format!("{}.fidx", name), ArchiveType::FixedIndex)
@@ -1400,6 +1400,9 @@ async fn restore(
let (manifest, backup_index_data) = client.download_manifest().await?;
+ let (archive_name, payload_archive_name) =
+ pbs_client::tools::get_pxar_archive_names(&archive_name, &manifest)?;
+
if archive_name == ENCRYPTED_KEY_BLOB_NAME && crypt_config.is_none() {
log::info!("Restoring encrypted key blob without original key - skipping manifest fingerprint check!")
} else {
@@ -1450,20 +1453,13 @@ async fn restore(
.map_err(|err| format_err!("unable to pipe data - {}", err))?;
}
} else if archive_type == ArchiveType::DynamicIndex {
- let index = client
- .download_dynamic_index(&manifest, &archive_name)
- .await?;
-
- let most_used = index.find_most_used_chunks(8);
-
- let chunk_reader = RemoteChunkReader::new(
+ let mut reader = get_buffered_pxar_reader(
+ &archive_name,
client.clone(),
- crypt_config,
- file_info.chunk_crypt_mode(),
- most_used,
- );
-
- let mut reader = BufferedDynamicReader::new(index, chunk_reader);
+ &manifest,
+ crypt_config.clone(),
+ )
+ .await?;
let on_error = if ignore_extract_device_errors {
let handler: PxarErrorHandler = Box::new(move |err: Error| {
@@ -1518,8 +1514,22 @@ async fn restore(
}
if let Some(target) = target {
+ let reader = if let Some(payload_archive_name) = payload_archive_name {
+ let payload_reader = get_buffered_pxar_reader(
+ &payload_archive_name,
+ client.clone(),
+ &manifest,
+ crypt_config.clone(),
+ )
+ .await?;
+ pxar::PxarVariant::Split(reader, payload_reader)
+ } else {
+ pxar::PxarVariant::Unified(reader)
+ };
+ let decoder = pxar::decoder::Decoder::from_std(reader)?;
+
pbs_client::pxar::extract_archive(
- pxar::decoder::Decoder::from_std(pxar::PxarVariant::Unified(reader))?,
+ decoder,
Path::new(target),
feature_flags,
|path| {
@@ -1529,6 +1539,9 @@ async fn restore(
)
.map_err(|err| format_err!("error extracting archive - {:#}", err))?;
} else {
+ if archive_name.ends_with(".mpxar.didx") || archive_name.ends_with(".ppxar.didx") {
+ bail!("unable to pipe split archive");
+ }
let mut writer = std::fs::OpenOptions::new()
.write(true)
.open("/dev/stdout")
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 08/58] client: tools: cover extension for split pxar archives
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Cover the additional `.mpxar` extension for the metadata archive and
`.ppxar` for the payload data file in the CLI parameter completion
callback.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/tools/mod.rs | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/pbs-client/src/tools/mod.rs b/pbs-client/src/tools/mod.rs
index 67768fa5c..6680dc475 100644
--- a/pbs-client/src/tools/mod.rs
+++ b/pbs-client/src/tools/mod.rs
@@ -338,7 +338,7 @@ pub fn complete_pxar_archive_name(arg: &str, param: &HashMap<String, String>) ->
complete_server_file_name(arg, param)
.iter()
.filter_map(|name| {
- if name.ends_with(".pxar.didx") {
+ if has_pxar_filename_extension(name, true) {
Some(pbs_tools::format::strip_server_file_extension(name).to_owned())
} else {
None
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 09/58] client: mount: make split pxar archives mountable
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (7 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 08/58] client: tools: cover extension for split pxar archives Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 10/58] api: datastore: attach split archive payload chunk reader Christian Ebner
` (48 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Cover the cases where the pxar archive was uploaded as split payload
data and metadata streams. Instantiate the required reader and
decoder instances to access the metadata and payload data archives.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- move pbs_client use statement to correct position
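As a rough sketch of the mapping a helper like `get_pxar_archive_names`
performs (simplified and standalone; the real helper consults the
manifest, so treat the exact rules here as assumptions):

    fn pxar_archive_names(
        requested: &str,
        manifest_files: &[&str],
    ) -> (String, Option<String>) {
        // Split archives are stored as a `.mpxar.didx` metadata index plus
        // a `.ppxar.didx` payload index; map a plain `.pxar.didx` request
        // onto them if present in the manifest.
        if let Some(base) = requested.strip_suffix(".pxar.didx") {
            let meta = format!("{base}.mpxar.didx");
            if manifest_files.contains(&meta.as_str()) {
                return (meta, Some(format!("{base}.ppxar.didx")));
            }
        }
        if let Some(base) = requested.strip_suffix(".mpxar.didx") {
            return (requested.to_string(), Some(format!("{base}.ppxar.didx")));
        }
        (requested.to_string(), None)
    }

    fn main() {
        let manifest = ["root.mpxar.didx", "root.ppxar.didx"];
        let (meta, payload) = pxar_archive_names("root.pxar.didx", &manifest);
        assert_eq!(meta, "root.mpxar.didx");
        assert_eq!(payload.as_deref(), Some("root.ppxar.didx"));
    }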
proxmox-backup-client/src/mount.rs | 34 +++++++++++++-----------------
proxmox-file-restore/src/main.rs | 1 +
2 files changed, 16 insertions(+), 19 deletions(-)
diff --git a/proxmox-backup-client/src/mount.rs b/proxmox-backup-client/src/mount.rs
index 4d352b6e4..249f6e4b7 100644
--- a/proxmox-backup-client/src/mount.rs
+++ b/proxmox-backup-client/src/mount.rs
@@ -18,20 +18,20 @@ use proxmox_schema::*;
use proxmox_sortable_macro::sortable;
use pbs_api_types::BackupNamespace;
+use pbs_client::tools::has_pxar_filename_extension;
use pbs_client::tools::key_source::get_encryption_key_password;
use pbs_client::{BackupReader, RemoteChunkReader};
use pbs_datastore::cached_chunk_reader::CachedChunkReader;
-use pbs_datastore::dynamic_index::BufferedDynamicReader;
use pbs_datastore::index::IndexFile;
use pbs_key_config::load_and_decrypt_key;
use pbs_tools::crypt_config::CryptConfig;
use pbs_tools::json::required_string_param;
+use crate::helper;
use crate::{
complete_group_or_snapshot, complete_img_archive_name, complete_namespace,
complete_pxar_archive_name, complete_repository, connect, dir_or_last_from_group,
- extract_repository_from_value, optional_ns_param, record_repository, BufferedDynamicReadAt,
- REPO_URL_SCHEMA,
+ extract_repository_from_value, optional_ns_param, record_repository, REPO_URL_SCHEMA,
};
#[sortable]
@@ -219,7 +219,7 @@ async fn mount_do(param: Value, pipe: Option<OwnedFd>) -> Result<Value, Error> {
}
};
- let server_archive_name = if archive_name.ends_with(".pxar") {
+ let server_archive_name = if has_pxar_filename_extension(archive_name, false) {
if target.is_none() {
bail!("use the 'mount' command to mount pxar archives");
}
@@ -246,7 +246,10 @@ async fn mount_do(param: Value, pipe: Option<OwnedFd>) -> Result<Value, Error> {
let (manifest, _) = client.download_manifest().await?;
manifest.check_fingerprint(crypt_config.as_ref().map(Arc::as_ref))?;
- let file_info = manifest.lookup_file_info(&server_archive_name)?;
+ let (archive_name, payload_archive_name) =
+ pbs_client::tools::get_pxar_archive_names(&server_archive_name, &manifest)?;
+
+ let file_info = manifest.lookup_file_info(&archive_name)?;
let daemonize = || -> Result<(), Error> {
if let Some(pipe) = pipe {
@@ -283,21 +286,14 @@ async fn mount_do(param: Value, pipe: Option<OwnedFd>) -> Result<Value, Error> {
futures::future::select(interrupt_int.recv().boxed(), interrupt_term.recv().boxed());
if server_archive_name.ends_with(".didx") {
- let index = client
- .download_dynamic_index(&manifest, &server_archive_name)
- .await?;
- let most_used = index.find_most_used_chunks(8);
- let chunk_reader = RemoteChunkReader::new(
+ let decoder = helper::get_pxar_fuse_accessor(
+ &archive_name,
+ payload_archive_name.as_deref(),
client.clone(),
- crypt_config,
- file_info.chunk_crypt_mode(),
- most_used,
- );
- let reader = BufferedDynamicReader::new(index, chunk_reader);
- let archive_size = reader.archive_size();
- let reader: pbs_pxar_fuse::Reader = Arc::new(BufferedDynamicReadAt::new(reader));
- let decoder =
- pbs_pxar_fuse::Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
+ &manifest,
+ crypt_config.clone(),
+ )
+ .await?;
let session =
pbs_pxar_fuse::Session::mount(decoder, options, false, Path::new(target.unwrap()))
diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index 6a6379f27..9be52e8b1 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -24,6 +24,7 @@ use pbs_api_types::{file_restore::FileRestoreFormat, BackupDir, BackupNamespace,
use pbs_client::pxar::{create_tar, create_zip, extract_sub_dir, extract_sub_dir_seq};
use pbs_client::tools::{
complete_group_or_snapshot, complete_repository, connect, extract_repository_from_value,
+ has_pxar_filename_extension,
key_source::{
crypto_parameters_keep_fd, format_key_source, get_encryption_key_password, KEYFD_SCHEMA,
KEYFILE_SCHEMA,
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 10/58] api: datastore: attach split archive payload chunk reader
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (8 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 09/58] client: mount: make split pxar archives mountable Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 11/58] catalog: shell: make split pxar archives accessible Christian Ebner
` (47 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Attach the payload chunk reader for pxar archives which have been
uploaded using split streams for metadata and payload data.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- use get_pxar_archive_names helper
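The selection between unified and split inputs follows the same shape
in all call sites; a minimal stand-alone sketch (stand-in enum, not the
actual pxar crate type):

    enum PxarVariant<R, P> {
        Unified(R),
        Split(R, P),
    }

    fn select_input<R, P>(reader: R, payload: Option<P>) -> PxarVariant<R, P> {
        match payload {
            Some(payload_input) => PxarVariant::Split(reader, payload_input),
            None => PxarVariant::Unified(reader),
        }
    }

    fn main() {
        // unified archive: only the combined metadata/payload reader
        let _unified = select_input("root.pxar", None::<&str>);
        // split archive: metadata reader plus dedicated payload input
        let _split = select_input("root.mpxar", Some("root.ppxar"));
    }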
src/api2/admin/datastore.rs | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index af1c12cc0..34a9105dd 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1810,10 +1810,20 @@ pub fn pxar_file_download(
}
}
+ let (pxar_name, payload_archive_name) =
+ pbs_client::tools::get_pxar_archive_names(pxar_name, &manifest)?;
let (reader, archive_size) =
- get_local_pxar_reader(datastore.clone(), &manifest, &backup_dir, pxar_name)?;
+ get_local_pxar_reader(datastore.clone(), &manifest, &backup_dir, &pxar_name)?;
+
+ let reader = if let Some(payload_archive_name) = payload_archive_name {
+ let payload_input =
+ get_local_pxar_reader(datastore, &manifest, &backup_dir, &payload_archive_name)?;
+ pxar::PxarVariant::Split(reader, payload_input)
+ } else {
+ pxar::PxarVariant::Unified(reader)
+ };
+ let decoder = Accessor::new(reader, archive_size).await?;
- let decoder = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
let root = decoder.open_root().await?;
let path = OsStr::from_bytes(file_path).to_os_string();
let file = root
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 11/58] catalog: shell: make split pxar archives accessible
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (9 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 10/58] api: datastore: attach split archive payload chunk reader Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 12/58] www: cover metadata extension for pxar archives Christian Ebner
` (46 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Cover the cases where the pxar archive was uploaded as split payload
data and metadata streams. Instantiate the required reader and
decoder instances to access the metadata and payload data archives,
using the corresponding helper methods.
This allows restoring split metadata and payload stream pxar archives
via the catalog shell.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
proxmox-backup-client/src/catalog.rs | 30 ++++++++++++----------------
1 file changed, 13 insertions(+), 17 deletions(-)
diff --git a/proxmox-backup-client/src/catalog.rs b/proxmox-backup-client/src/catalog.rs
index e72b6a1e0..b827c18a9 100644
--- a/proxmox-backup-client/src/catalog.rs
+++ b/proxmox-backup-client/src/catalog.rs
@@ -9,17 +9,19 @@ use proxmox_router::cli::*;
use proxmox_schema::api;
use pbs_api_types::BackupNamespace;
+use pbs_client::tools::has_pxar_filename_extension;
use pbs_client::tools::key_source::get_encryption_key_password;
use pbs_client::{BackupReader, RemoteChunkReader};
use pbs_tools::crypt_config::CryptConfig;
use pbs_tools::json::required_string_param;
+use crate::helper;
use crate::{
complete_backup_snapshot, complete_group_or_snapshot, complete_namespace,
complete_pxar_archive_name, complete_repository, connect, crypto_parameters, decrypt_key,
dir_or_last_from_group, extract_repository_from_value, format_key_source, optional_ns_param,
- record_repository, BackupDir, BufferedDynamicReadAt, BufferedDynamicReader, CatalogReader,
- DynamicIndexReader, IndexFile, Shell, CATALOG_NAME, KEYFD_SCHEMA, REPO_URL_SCHEMA,
+ record_repository, BackupDir, BufferedDynamicReader, CatalogReader, DynamicIndexReader,
+ IndexFile, Shell, CATALOG_NAME, KEYFD_SCHEMA, REPO_URL_SCHEMA,
};
#[api(
@@ -180,7 +182,7 @@ async fn catalog_shell(param: Value) -> Result<(), Error> {
}
};
- let server_archive_name = if archive_name.ends_with(".pxar") {
+ let server_archive_name = if has_pxar_filename_extension(archive_name, false) {
format!("{}.didx", archive_name)
} else {
bail!("Can only mount pxar archives.");
@@ -205,23 +207,17 @@ async fn catalog_shell(param: Value) -> Result<(), Error> {
let (manifest, _) = client.download_manifest().await?;
manifest.check_fingerprint(crypt_config.as_ref().map(Arc::as_ref))?;
- let index = client
- .download_dynamic_index(&manifest, &server_archive_name)
- .await?;
- let most_used = index.find_most_used_chunks(8);
+ let (archive_name, payload_archive_name) =
+ pbs_client::tools::get_pxar_archive_names(&server_archive_name, &manifest)?;
- let file_info = manifest.lookup_file_info(&server_archive_name)?;
- let chunk_reader = RemoteChunkReader::new(
+ let decoder = helper::get_pxar_fuse_accessor(
+ &archive_name,
+ payload_archive_name.as_deref(),
client.clone(),
+ &manifest,
crypt_config.clone(),
- file_info.chunk_crypt_mode(),
- most_used,
- );
- let reader = BufferedDynamicReader::new(index, chunk_reader);
- let archive_size = reader.archive_size();
- let reader: pbs_pxar_fuse::Reader = Arc::new(BufferedDynamicReadAt::new(reader));
- let decoder =
- pbs_pxar_fuse::Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
+ )
+ .await?;
client.download(CATALOG_NAME, &mut tmpfile).await?;
let index = DynamicIndexReader::new(tmpfile)
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 12/58] www: cover metadata extension for pxar archives
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (10 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 11/58] catalog: shell: make split pxar archives accessible Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 13/58] file restore: cover extension for split " Christian Ebner
` (45 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Allow accessing the pxar metadata archives for navigation and
download via the Proxmox Backup Server web UI.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
www/datastore/Content.js | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/www/datastore/Content.js b/www/datastore/Content.js
index c2403ff9c..6dd1ab319 100644
--- a/www/datastore/Content.js
+++ b/www/datastore/Content.js
@@ -1050,7 +1050,7 @@ Ext.define('PBS.DataStoreContent', {
tooltip: gettext('Browse'),
getClass: (v, m, { data }) => {
if (
- (data.ty === 'file' && data.filename.endsWith('pxar.didx')) ||
+ (data.ty === 'file' && (data.filename.endsWith('.pxar.didx') || data.filename.endsWith('.mpxar.didx'))) ||
(data.ty === 'ns' && !data.root)
) {
return 'fa fa-folder-open-o';
@@ -1058,7 +1058,9 @@ Ext.define('PBS.DataStoreContent', {
return 'pmx-hidden';
},
isActionDisabled: (v, r, c, i, { data }) =>
- !(data.ty === 'file' && data.filename.endsWith('pxar.didx') && data['crypt-mode'] < 3) && data.ty !== 'ns',
+ !(data.ty === 'file' &&
+ (data.filename.endsWith('.pxar.didx') || data.filename.endsWith('.mpxar.didx')) &&
+ data['crypt-mode'] < 3) && data.ty !== 'ns',
},
],
},
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 13/58] file restore: cover extension for split pxar archives
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (11 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 12/58] www: cover metadata extension for pxar archives Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 14/58] file restore: factor out getting pxar reader Christian Ebner
` (44 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Cover the additional `.mpxar` extension for the metadata archive and
`.ppxar` extension for the payload data of pxar archives written as
split archives.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- prefix patch subject with `file restore` instead of `restore`
proxmox-file-restore/src/main.rs | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index 9be52e8b1..61dece97d 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -76,7 +76,7 @@ fn parse_path(path: String, base64: bool) -> Result<ExtractPath, Error> {
(file, path)
};
- if file.ends_with(".pxar.didx") {
+ if has_pxar_filename_extension(&file, true) {
Ok(ExtractPath::Pxar(file, path))
} else if file.ends_with(".img.fidx") {
Ok(ExtractPath::VM(file, path))
@@ -124,11 +124,13 @@ async fn list_files(
ExtractPath::ListArchives => {
let mut entries = vec![];
for file in manifest.files() {
- if !file.filename.ends_with(".pxar.didx") && !file.filename.ends_with(".img.fidx") {
+ if !has_pxar_filename_extension(&file.filename, true)
+ && !file.filename.ends_with(".img.fidx")
+ {
continue;
}
let path = format!("/{}", file.filename);
- let attr = if file.filename.ends_with(".pxar.didx") {
+ let attr = if has_pxar_filename_extension(&file.filename, true) {
// a pxar file is a file archive, so it's root is also a directory root
Some(&DirEntryAttribute::Directory { start: 0 })
} else {
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 14/58] file restore: factor out getting pxar reader
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (12 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 13/58] file restore: cover extension for split " Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 15/58] file restore: cover split metadata and payload archives Christian Ebner
` (43 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Factor out the logic to get the pxar reader into a dedicated function
so it can be reused to get the payload data archive reader instance.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
proxmox-file-restore/src/main.rs | 44 ++++++++++++++++++++------------
1 file changed, 28 insertions(+), 16 deletions(-)
diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index 61dece97d..c9a545677 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -35,7 +35,7 @@ use pbs_client::{BackupReader, BackupRepository, RemoteChunkReader};
use pbs_datastore::catalog::{ArchiveEntry, CatalogReader, DirEntryAttribute};
use pbs_datastore::dynamic_index::{BufferedDynamicReader, LocalDynamicReadAt};
use pbs_datastore::index::IndexFile;
-use pbs_datastore::CATALOG_NAME;
+use pbs_datastore::{BackupManifest, CATALOG_NAME};
use pbs_key_config::decrypt_key;
use pbs_tools::crypt_config::CryptConfig;
@@ -328,6 +328,31 @@ async fn list(
Ok(())
}
+async fn get_remote_pxar_reader(
+ archive_name: &str,
+ client: Arc<BackupReader>,
+ manifest: &BackupManifest,
+ crypt_config: Option<Arc<CryptConfig>>,
+) -> Result<(LocalDynamicReadAt<RemoteChunkReader>, u64), Error> {
+ let index = client
+ .download_dynamic_index(&manifest, &archive_name)
+ .await?;
+ let most_used = index.find_most_used_chunks(8);
+
+ let file_info = manifest.lookup_file_info(&archive_name)?;
+ let chunk_reader = RemoteChunkReader::new(
+ client.clone(),
+ crypt_config,
+ file_info.chunk_crypt_mode(),
+ most_used,
+ );
+
+ let reader = BufferedDynamicReader::new(index, chunk_reader);
+ let archive_size = reader.archive_size();
+
+ Ok((LocalDynamicReadAt::new(reader), archive_size))
+}
+
#[api(
input: {
properties: {
@@ -445,21 +470,8 @@ async fn extract(
match path {
ExtractPath::Pxar(archive_name, path) => {
- let file_info = manifest.lookup_file_info(&archive_name)?;
- let index = client
- .download_dynamic_index(&manifest, &archive_name)
- .await?;
- let most_used = index.find_most_used_chunks(8);
- let chunk_reader = RemoteChunkReader::new(
- client.clone(),
- crypt_config,
- file_info.chunk_crypt_mode(),
- most_used,
- );
- let reader = BufferedDynamicReader::new(index, chunk_reader);
-
- let archive_size = reader.archive_size();
- let reader = LocalDynamicReadAt::new(reader);
+ let (reader, archive_size) =
+ get_remote_pxar_reader(&archive_name, client, &manifest, crypt_config).await?;
let decoder = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
extract_to_target(decoder, &path, target, format, zstd).await?;
}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 15/58] file restore: cover split metadata and payload archives
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (13 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 14/58] file restore: factor out getting pxar reader Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 16/58] file restore: show more error context when extraction fails Christian Ebner
` (42 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Attach the payload data archive as input stream to the decoder
and accessor instances for split archives.
This allows restoring contents from split archives via the
`proxmox-file-restore extract` command by passing the metadata
archive name.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- use get_pxar_archive_names helper function
proxmox-file-restore/src/main.rs | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index c9a545677..e6a21afa2 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -470,9 +470,26 @@ async fn extract(
match path {
ExtractPath::Pxar(archive_name, path) => {
- let (reader, archive_size) =
- get_remote_pxar_reader(&archive_name, client, &manifest, crypt_config).await?;
- let decoder = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
+ let (archive_name, payload_archive_name) =
+ pbs_client::tools::get_pxar_archive_names(&archive_name, &manifest)?;
+ let (reader, archive_size) = get_remote_pxar_reader(
+ &archive_name,
+ client.clone(),
+ &manifest,
+ crypt_config.clone(),
+ )
+ .await?;
+
+ let reader = if let Some(payload_archive_name) = payload_archive_name {
+ let (payload_reader, payload_size) =
+ get_remote_pxar_reader(&payload_archive_name, client, &manifest, crypt_config)
+ .await?;
+ pxar::PxarVariant::Split(reader, (payload_reader, payload_size))
+ } else {
+ pxar::PxarVariant::Unified(reader)
+ };
+ let decoder = Accessor::new(reader, archive_size).await?;
+
extract_to_target(decoder, &path, target, format, zstd).await?;
}
ExtractPath::VM(file, path) => {
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 16/58] file restore: show more error context when extraction fails
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (14 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 15/58] file restore: cover split metadata and payload archives Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 17/58] pxar: bin: add optional payload input for archive restore Christian Ebner
` (41 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Otherwise the context swallows the actual, underlying error message.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
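A small self-contained demonstration of the difference the alternate
formatting makes with anyhow-style error chains (hypothetical error
setup, for illustration only):

    use anyhow::{Context, Result};

    fn read_archive() -> Result<Vec<u8>> {
        std::fs::read("/nonexistent/archive.mpxar").context("unable to read archive")
    }

    fn main() {
        if let Err(err) = read_archive() {
            // `{err}` prints only the outermost context message;
            // `{err:#}` also appends the underlying cause(s).
            eprintln!("error extracting archive - {err:#}");
        }
    }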
proxmox-file-restore/src/main.rs | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index e6a21afa2..38cc1ce85 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -490,7 +490,9 @@ async fn extract(
};
let decoder = Accessor::new(reader, archive_size).await?;
- extract_to_target(decoder, &path, target, format, zstd).await?;
+ extract_to_target(decoder, &path, target, format, zstd)
+ .await
+ .map_err(|err| format_err!("error extracting archive - {err:#}"))?;
}
ExtractPath::VM(file, path) => {
let details = SnapRestoreDetails {
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 17/58] pxar: bin: add optional payload input for archive restore
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (15 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 16/58] file restore: show more error context when extraction fails Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 18/58] pxar: bin: cover listing for split archives Christian Ebner
` (40 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Allow passing an optional payload input to the restore command for
cases where the regular file payloads are stored in a split archive.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- prefix patch subject with `pxar: bin` instead of `pxar` only
- add missing file completion function
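Hypothetical invocation once both halves of a split archive are
available locally, using the `--payload-input` option introduced here:

    pxar extract root.mpxar /restore/target --payload-input root.ppxar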
pxar-bin/src/main.rs | 32 ++++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 8108ec0fb..fe5c91c97 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -25,9 +25,15 @@ fn extract_archive_from_reader<R: std::io::Read>(
target: &str,
feature_flags: Flags,
options: PxarExtractOptions,
+ payload_reader: Option<&mut R>,
) -> Result<(), Error> {
+ let reader = if let Some(payload_reader) = payload_reader {
+ pxar::PxarVariant::Split(reader, payload_reader)
+ } else {
+ pxar::PxarVariant::Unified(reader)
+ };
pbs_client::pxar::extract_archive(
- pxar::decoder::Decoder::from_std(pxar::PxarVariant::Unified(reader))?,
+ pxar::decoder::Decoder::from_std(reader)?,
Path::new(target),
feature_flags,
|path| {
@@ -120,6 +126,10 @@ fn extract_archive_from_reader<R: std::io::Read>(
optional: true,
default: false,
},
+ "payload-input": {
+ description: "'ppxar' payload input data file to restore split archive.",
+ optional: true,
+ },
},
},
)]
@@ -142,6 +152,7 @@ fn extract_archive(
no_fifos: bool,
no_sockets: bool,
strict: bool,
+ payload_input: Option<String>,
) -> Result<(), Error> {
let mut feature_flags = Flags::DEFAULT;
if no_xattrs {
@@ -220,12 +231,24 @@ fn extract_archive(
if archive == "-" {
let stdin = std::io::stdin();
let mut reader = stdin.lock();
- extract_archive_from_reader(&mut reader, target, feature_flags, options)?;
+ extract_archive_from_reader(&mut reader, target, feature_flags, options, None)?;
} else {
log::debug!("PXAR extract: {}", archive);
let file = std::fs::File::open(archive)?;
let mut reader = std::io::BufReader::new(file);
- extract_archive_from_reader(&mut reader, target, feature_flags, options)?;
+ let mut payload_reader = if let Some(payload_input) = payload_input {
+ let file = std::fs::File::open(payload_input)?;
+ Some(std::io::BufReader::new(file))
+ } else {
+ None
+ };
+ extract_archive_from_reader(
+ &mut reader,
+ target,
+ feature_flags,
+ options,
+ payload_reader.as_mut(),
+ )?;
}
if !was_ok.load(Ordering::Acquire) {
@@ -465,7 +488,8 @@ fn main() {
.arg_param(&["archive", "target"])
.completion_cb("archive", complete_file_name)
.completion_cb("target", complete_file_name)
- .completion_cb("files-from", complete_file_name),
+ .completion_cb("files-from", complete_file_name)
+ .completion_cb("payload-input", complete_file_name),
)
.insert(
"mount",
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 18/58] pxar: bin: cover listing for split archives
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (16 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 17/58] pxar: bin: add optional payload input for archive restore Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 19/58] pxar: bin: add more context to extraction error Christian Ebner
` (39 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Allow listing the entries of split pxar archives. As the decoder skips
over the file payloads, the corresponding payload file has to be
provided; otherwise the decoder would skip within the metadata
archive, leading to incorrect decoding.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- prefix patch subject with `pxar: bin` instead of `pxar` only
- add missing file completion
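Correspondingly, a hypothetical listing invocation for a split archive:

    pxar list root.mpxar --payload-input root.ppxar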
pxar-bin/src/main.rs | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index fe5c91c97..2657577dc 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -454,12 +454,26 @@ async fn mount_archive(archive: String, mountpoint: String, verbose: bool) -> Re
archive: {
description: "Archive name.",
},
+ "payload-input": {
+ description: "'ppxar' payload input data file for split archive.",
+ optional: true,
+ },
},
},
)]
/// List the contents of an archive.
-fn dump_archive(archive: String) -> Result<(), Error> {
- for entry in pxar::decoder::Decoder::open(pxar::PxarVariant::Unified(archive))? {
+fn dump_archive(archive: String, payload_input: Option<String>) -> Result<(), Error> {
+ if archive.ends_with(".mpxar") && payload_input.is_none() {
+ bail!("Payload input required for split pxar archives");
+ }
+
+ let input = if let Some(payload_input) = payload_input {
+ pxar::PxarVariant::Split(archive, payload_input)
+ } else {
+ pxar::PxarVariant::Unified(archive)
+ };
+
+ for entry in pxar::decoder::Decoder::open(input)? {
let entry = entry?;
if log::log_enabled!(log::Level::Debug) {
@@ -502,7 +516,8 @@ fn main() {
"list",
CliCommand::new(&API_METHOD_DUMP_ARCHIVE)
.arg_param(&["archive"])
- .completion_cb("archive", complete_file_name),
+ .completion_cb("archive", complete_file_name)
+ .completion_cb("payload-input", complete_file_name),
);
let rpcenv = CliEnvironment::new();
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 19/58] pxar: bin: add more context to extraction error
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (17 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 18/58] pxar: bin: cover listing for split archives Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 20/58] client: pxar: include payload offset in entry listing Christian Ebner
` (38 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Show more of the extraction error context provided by the pxar decoder.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- prefix patch subject with `pxar: bin` instead of `pxar` only
pxar-bin/src/main.rs | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 2657577dc..b4c8f0626 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -231,7 +231,8 @@ fn extract_archive(
if archive == "-" {
let stdin = std::io::stdin();
let mut reader = stdin.lock();
- extract_archive_from_reader(&mut reader, target, feature_flags, options, None)?;
+ extract_archive_from_reader(&mut reader, target, feature_flags, options, None)
+ .map_err(|err| format_err!("error extracting archive - {err:#}"))?;
} else {
log::debug!("PXAR extract: {}", archive);
let file = std::fs::File::open(archive)?;
@@ -248,7 +249,8 @@ fn extract_archive(
feature_flags,
options,
payload_reader.as_mut(),
- )?;
+ )
+ .map_err(|err| format_err!("error extracting archive - {err:#}"))?
}
if !was_ok.load(Ordering::Acquire) {
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 20/58] client: pxar: include payload offset in entry listing
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (18 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 19/58] pxar: bin: add more context to extraction error Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 21/58] client: pxar: helper for lookup of reusable dynamic entries Christian Ebner
` (37 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Also display the payload offset in the listing output when the regular
file entry has a payload reference rather than the payload encoded in
the archive. This allows for debugging by inspecting the raw payload
data file at the given offset.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
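For a file entry with a payload reference, the single-line listing then
ends with the payload offset; a hypothetical output line (values made
up for illustration):

    -rw-r--r-- 1000/1000     2024-06-05 10:53:00  1048576 "/data/file.bin" 4194304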
pbs-client/src/pxar/tools.rs | 116 ++++++++++++++++++++++++-----------
1 file changed, 80 insertions(+), 36 deletions(-)
diff --git a/pbs-client/src/pxar/tools.rs b/pbs-client/src/pxar/tools.rs
index 0cfbaf5b9..459951d50 100644
--- a/pbs-client/src/pxar/tools.rs
+++ b/pbs-client/src/pxar/tools.rs
@@ -128,25 +128,42 @@ pub fn format_single_line_entry(entry: &Entry) -> String {
let meta = entry.metadata();
- let (size, link) = match entry.kind() {
- EntryKind::File { size, .. } => (format!("{}", *size), String::new()),
- EntryKind::Symlink(link) => ("0".to_string(), format!(" -> {:?}", link.as_os_str())),
- EntryKind::Hardlink(link) => ("0".to_string(), format!(" -> {:?}", link.as_os_str())),
- EntryKind::Device(dev) => (format!("{},{}", dev.major, dev.minor), String::new()),
- _ => ("0".to_string(), String::new()),
+ let (size, link, payload_offset) = match entry.kind() {
+ EntryKind::File {
+ size,
+ payload_offset,
+ ..
+ } => (format!("{}", *size), String::new(), *payload_offset),
+ EntryKind::Symlink(link) => ("0".to_string(), format!(" -> {:?}", link.as_os_str()), None),
+ EntryKind::Hardlink(link) => ("0".to_string(), format!(" -> {:?}", link.as_os_str()), None),
+ EntryKind::Device(dev) => (format!("{},{}", dev.major, dev.minor), String::new(), None),
+ _ => ("0".to_string(), String::new(), None),
};
let owner_string = format!("{}/{}", meta.stat.uid, meta.stat.gid);
- format!(
- "{} {:<13} {} {:>8} {:?}{}",
- mode_string,
- owner_string,
- format_mtime(&meta.stat.mtime),
- size,
- entry.path(),
- link,
- )
+ if let Some(offset) = payload_offset {
+ format!(
+ "{} {:<13} {} {:>8} {:?}{} {}",
+ mode_string,
+ owner_string,
+ format_mtime(&meta.stat.mtime),
+ size,
+ entry.path(),
+ link,
+ offset,
+ )
+ } else {
+ format!(
+ "{} {:<13} {} {:>8} {:?}{}",
+ mode_string,
+ owner_string,
+ format_mtime(&meta.stat.mtime),
+ size,
+ entry.path(),
+ link,
+ )
+ }
}
pub fn format_multi_line_entry(entry: &Entry) -> String {
@@ -154,17 +171,23 @@ pub fn format_multi_line_entry(entry: &Entry) -> String {
let meta = entry.metadata();
- let (size, link, type_name) = match entry.kind() {
- EntryKind::File { size, .. } => (format!("{}", *size), String::new(), "file"),
+ let (size, link, type_name, payload_offset) = match entry.kind() {
+ EntryKind::File {
+ size,
+ payload_offset,
+ ..
+ } => (format!("{}", *size), String::new(), "file", *payload_offset),
EntryKind::Symlink(link) => (
"0".to_string(),
format!(" -> {:?}", link.as_os_str()),
"symlink",
+ None,
),
EntryKind::Hardlink(link) => (
"0".to_string(),
format!(" -> {:?}", link.as_os_str()),
"symlink",
+ None,
),
EntryKind::Device(dev) => (
format!("{},{}", dev.major, dev.minor),
@@ -176,11 +199,12 @@ pub fn format_multi_line_entry(entry: &Entry) -> String {
} else {
"device"
},
+ None,
),
- EntryKind::Socket => ("0".to_string(), String::new(), "socket"),
- EntryKind::Fifo => ("0".to_string(), String::new(), "fifo"),
- EntryKind::Directory => ("0".to_string(), String::new(), "directory"),
- EntryKind::GoodbyeTable => ("0".to_string(), String::new(), "bad entry"),
+ EntryKind::Socket => ("0".to_string(), String::new(), "socket", None),
+ EntryKind::Fifo => ("0".to_string(), String::new(), "fifo", None),
+ EntryKind::Directory => ("0".to_string(), String::new(), "directory", None),
+ EntryKind::GoodbyeTable => ("0".to_string(), String::new(), "bad entry", None),
};
let file_name = match std::str::from_utf8(entry.path().as_os_str().as_bytes()) {
@@ -188,19 +212,39 @@ pub fn format_multi_line_entry(entry: &Entry) -> String {
Err(_) => std::borrow::Cow::Owned(format!("{:?}", entry.path())),
};
- format!(
- " File: {}{}\n \
- Size: {:<13} Type: {}\n\
- Access: ({:o}/{}) Uid: {:<5} Gid: {:<5}\n\
- Modify: {}\n",
- file_name,
- link,
- size,
- type_name,
- meta.file_mode(),
- mode_string,
- meta.stat.uid,
- meta.stat.gid,
- format_mtime(&meta.stat.mtime),
- )
+ if let Some(offset) = payload_offset {
+ format!(
+ " File: {}{}\n \
+ Size: {:<13} Type: {}\n\
+ Access: ({:o}/{}) Uid: {:<5} Gid: {:<5}\n\
+ Modify: {}\n
+ PayloadOffset: {}\n",
+ file_name,
+ link,
+ size,
+ type_name,
+ meta.file_mode(),
+ mode_string,
+ meta.stat.uid,
+ meta.stat.gid,
+ format_mtime(&meta.stat.mtime),
+ offset,
+ )
+ } else {
+ format!(
+ " File: {}{}\n \
+ Size: {:<13} Type: {}\n\
+ Access: ({:o}/{}) Uid: {:<5} Gid: {:<5}\n\
+ Modify: {}\n",
+ file_name,
+ link,
+ size,
+ type_name,
+ meta.file_mode(),
+ mode_string,
+ meta.stat.uid,
+ meta.stat.gid,
+ format_mtime(&meta.stat.mtime),
+ )
+ }
}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 21/58] client: pxar: helper for lookup of reusable dynamic entries
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (19 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 20/58] client: pxar: include payload offset in entry listing Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 22/58] upload stream: implement reused chunk injector Christian Ebner
` (36 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
The helper method allows looking up the entries of a dynamic index
which fully cover a given offset range. Further, the helper returns
the start padding, i.e. the gap from the start offset of the first
covering dynamic index entry to the start offset of the given range,
as well as the corresponding end padding.
This will be used to look up size and digest for chunks covering the
payload range of a regular file, in order to re-use found chunks by
indexing them in the archive's index file instead of re-encoding the
payload.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
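To make the padding terms concrete, a small stand-alone example with a
made-up chunk layout: three chunks ending at offsets 1000, 2500 and
4000 cover a payload range 1200..3800; the start padding is the slice
of the first chunk before the range, the end padding the slice of the
last chunk after it.

    fn main() {
        let chunk_ends = [1000u64, 2500, 4000];
        let range = 1200u64..3800;

        // the chunk containing range.start is the second one (1000..2500),
        // so the previous chunk end is 1000
        let prev_end = chunk_ends[0];
        let padding_start = range.start - prev_end;
        // the chunk containing range.end is the third one (2500..4000)
        let padding_end = chunk_ends[2] - range.end;

        assert_eq!(padding_start, 200);
        assert_eq!(padding_end, 200);
        println!("start padding: {padding_start}, end padding: {padding_end}");
    }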
pbs-client/src/pxar/create.rs | 70 +++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index cc75f0262..bcf4fb328 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -2,6 +2,7 @@ use std::collections::{HashMap, HashSet};
use std::ffi::{CStr, CString, OsStr};
use std::fmt;
use std::io::{self, Read};
+use std::ops::Range;
use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd, OwnedFd, RawFd};
use std::path::{Path, PathBuf};
@@ -25,6 +26,8 @@ use proxmox_lang::c_str;
use proxmox_sys::fs::{self, acl, xattr};
use pbs_datastore::catalog::BackupCatalogWriter;
+use pbs_datastore::dynamic_index::DynamicIndexReader;
+use pbs_datastore::index::IndexFile;
use crate::pxar::metadata::errno_is_unsupported;
use crate::pxar::tools::assert_single_path_component;
@@ -780,6 +783,73 @@ impl Archiver {
}
}
+/// Dynamic entry reusable by payload references
+#[derive(Clone, Debug)]
+#[repr(C)]
+pub struct ReusableDynamicEntry {
+ size: u64,
+ padding: u64,
+ digest: [u8; 32],
+}
+
+impl ReusableDynamicEntry {
+ #[inline]
+ pub fn size(&self) -> u64 {
+ self.size
+ }
+
+ #[inline]
+ pub fn digest(&self) -> [u8; 32] {
+ self.digest
+ }
+}
+
+/// List of dynamic entries containing the data given by an offset range
+fn lookup_dynamic_entries(
+ index: &DynamicIndexReader,
+ range: &Range<u64>,
+) -> Result<(Vec<ReusableDynamicEntry>, u64, u64), Error> {
+ let end_idx = index.index_count() - 1;
+ let chunk_end = index.chunk_end(end_idx);
+ let start = index.binary_search(0, 0, end_idx, chunk_end, range.start)?;
+
+ let mut prev_end = if start == 0 {
+ 0
+ } else {
+ index.chunk_end(start - 1)
+ };
+ let padding_start = range.start - prev_end;
+ let mut padding_end = 0;
+
+ let mut indices = Vec::new();
+ for dynamic_entry in &index.index()[start..] {
+ let end = dynamic_entry.end();
+
+ let reusable_dynamic_entry = ReusableDynamicEntry {
+ size: (end - prev_end),
+ padding: 0,
+ digest: dynamic_entry.digest(),
+ };
+ indices.push(reusable_dynamic_entry);
+
+ if range.end < end {
+ padding_end = end - range.end;
+ break;
+ }
+ prev_end = end;
+ }
+
+ if let Some(first) = indices.first_mut() {
+ first.padding += padding_start;
+ }
+
+ if let Some(last) = indices.last_mut() {
+ last.padding += padding_end;
+ }
+
+ Ok((indices, padding_start, padding_end))
+}
+
fn get_metadata(
fd: RawFd,
stat: &FileStat,
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 22/58] upload stream: implement reused chunk injector
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (20 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 21/58] client: pxar: helper for lookup of reusable dynamic entries Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 23/58] client: chunk stream: add struct to hold injection state Christian Ebner
` (35 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
In order to be included in the backup's index file, reused payload
chunks have to be injected into the payload upload stream at a
forced boundary. The chunker forces a chunk boundary and sends the
list of reusable dynamic entries to be uploaded.
This implements the logic to receive these dynamic entries via the
corresponding communication channel from the chunker and to inject the
entries into the backup upload stream by looking for the matching
chunk boundary already forced by the chunker.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- got rid of unused, leftover buffer
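A toy model of the boundary matching the injector performs (stand-in
types and a synchronous loop; the real implementation below is an
async stream adapter):

    use std::sync::mpsc;

    struct InjectChunks {
        boundary: u64,         // stream offset of the forced chunk boundary
        chunks: Vec<[u8; 32]>, // digests of the reused chunks to index
    }

    fn main() {
        let (tx, rx) = mpsc::channel();
        // chunker side: announce reused chunks to inject at offset 4096
        tx.send(InjectChunks {
            boundary: 4096,
            chunks: vec![[0u8; 32]],
        })
        .unwrap();

        // upload stream side: hold on to the next injection until the
        // consumed stream length reaches its boundary, then emit the
        // known chunks instead of re-uploading payload data
        let mut stream_len = 0u64;
        let mut pending: Option<InjectChunks> = None;
        for chunk_len in [1024u64, 3072] {
            stream_len += chunk_len;
            if pending.is_none() {
                pending = rx.try_recv().ok();
            }
            if pending.as_ref().map_or(false, |i| i.boundary == stream_len) {
                let inject = pending.take().unwrap();
                println!("injecting {} reused chunk(s) at {stream_len}", inject.chunks.len());
            }
        }
    }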
pbs-client/src/inject_reused_chunks.rs | 127 +++++++++++++++++++++++++
pbs-client/src/lib.rs | 1 +
2 files changed, 128 insertions(+)
create mode 100644 pbs-client/src/inject_reused_chunks.rs
diff --git a/pbs-client/src/inject_reused_chunks.rs b/pbs-client/src/inject_reused_chunks.rs
new file mode 100644
index 000000000..4b2922012
--- /dev/null
+++ b/pbs-client/src/inject_reused_chunks.rs
@@ -0,0 +1,127 @@
+use std::cmp;
+use std::pin::Pin;
+use std::sync::atomic::{AtomicUsize, Ordering};
+use std::sync::{mpsc, Arc};
+use std::task::{Context, Poll};
+
+use anyhow::{anyhow, Error};
+use futures::{ready, Stream};
+use pin_project_lite::pin_project;
+
+use crate::pxar::create::ReusableDynamicEntry;
+
+pin_project! {
+ pub struct InjectReusedChunksQueue<S> {
+ #[pin]
+ input: S,
+ next_injection: Option<InjectChunks>,
+ injections: Option<mpsc::Receiver<InjectChunks>>,
+ stream_len: Arc<AtomicUsize>,
+ }
+}
+
+type StreamOffset = u64;
+#[derive(Debug)]
+/// Holds a list of chunks to inject at the given boundary by forcing a chunk boundary.
+pub struct InjectChunks {
+ /// Offset at which to force the boundary
+ pub boundary: StreamOffset,
+ /// List of chunks to inject
+ pub chunks: Vec<ReusableDynamicEntry>,
+ /// Cumulative size of the chunks in the list
+ pub size: usize,
+}
+
+/// Variants for stream consumer to distinguish between raw data chunks and injected ones.
+pub enum InjectedChunksInfo {
+ Known(Vec<ReusableDynamicEntry>),
+ Raw(bytes::BytesMut),
+}
+
+pub trait InjectReusedChunks: Sized {
+ fn inject_reused_chunks(
+ self,
+ injections: Option<mpsc::Receiver<InjectChunks>>,
+ stream_len: Arc<AtomicUsize>,
+ ) -> InjectReusedChunksQueue<Self>;
+}
+
+impl<S> InjectReusedChunks for S
+where
+ S: Stream<Item = Result<bytes::BytesMut, Error>>,
+{
+ fn inject_reused_chunks(
+ self,
+ injections: Option<mpsc::Receiver<InjectChunks>>,
+ stream_len: Arc<AtomicUsize>,
+ ) -> InjectReusedChunksQueue<Self> {
+ InjectReusedChunksQueue {
+ input: self,
+ next_injection: None,
+ injections,
+ stream_len,
+ }
+ }
+}
+
+impl<S> Stream for InjectReusedChunksQueue<S>
+where
+ S: Stream<Item = Result<bytes::BytesMut, Error>>,
+{
+ type Item = Result<InjectedChunksInfo, Error>;
+
+ fn poll_next(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Option<Self::Item>> {
+ let mut this = self.project();
+
+ // loop to skip over possible empty chunks
+ loop {
+ if this.next_injection.is_none() {
+ if let Some(injections) = this.injections.as_mut() {
+ if let Ok(injection) = injections.try_recv() {
+ *this.next_injection = Some(injection);
+ }
+ }
+ }
+
+ if let Some(inject) = this.next_injection.take() {
+ // got reusable dynamic entries to inject
+ let offset = this.stream_len.load(Ordering::SeqCst) as u64;
+
+ match inject.boundary.cmp(&offset) {
+ // inject now
+ cmp::Ordering::Equal => {
+ let chunk_info = InjectedChunksInfo::Known(inject.chunks);
+ return Poll::Ready(Some(Ok(chunk_info)));
+ }
+ // inject later
+ cmp::Ordering::Greater => *this.next_injection = Some(inject),
+ // incoming new chunks and injections didn't line up?
+ cmp::Ordering::Less => {
+ return Poll::Ready(Some(Err(anyhow!("invalid injection boundary"))))
+ }
+ }
+ }
+
+ // nothing to inject now, await further input
+ match ready!(this.input.as_mut().poll_next(cx)) {
+ None => {
+ if let Some(injections) = this.injections.as_mut() {
+ if this.next_injection.is_some() || injections.try_recv().is_ok() {
+ // stream finished, but remaining dynamic entries to inject
+ return Poll::Ready(Some(Err(anyhow!(
+ "injection queue not fully consumed"
+ ))));
+ }
+ }
+ // stream finished and all dynamic entries already injected
+ return Poll::Ready(None);
+ }
+ Some(Err(err)) => return Poll::Ready(Some(Err(err))),
+ // ignore empty chunks, injected chunks from queue at forced boundary, but boundary
+ // did not require splitting of the raw stream buffer to force the boundary
+ Some(Ok(raw)) if raw.is_empty() => continue,
+ Some(Ok(raw)) => return Poll::Ready(Some(Ok(InjectedChunksInfo::Raw(raw)))),
+ }
+ }
+ }
+}
diff --git a/pbs-client/src/lib.rs b/pbs-client/src/lib.rs
index 21cf8556b..3e7bd2a8b 100644
--- a/pbs-client/src/lib.rs
+++ b/pbs-client/src/lib.rs
@@ -7,6 +7,7 @@ pub mod catalog_shell;
pub mod pxar;
pub mod tools;
+mod inject_reused_chunks;
mod merge_known_chunks;
pub mod pipe_to_stream;
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 23/58] client: chunk stream: add struct to hold injection state
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (21 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 22/58] upload stream: implement reused chunk injector Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 24/58] chunker: add method to reset chunker state Christian Ebner
` (34 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Add a dedicated structure to hold the optional sender and receiver
instances and the state for injecting reused dynamic entries into the
payload stream of split stream pxar archives.
The asynchronous channels must only be attached to the payload
archive, keeping the current behavior for the metadata archive and
the current default encoding, which do not reuse payload chunks of
previous snapshots.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
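A sketch of which side is meant to hold which channel endpoint
(variable names are assumptions; `InjectChunks` as in the previous
patch, `consumed` omitted for brevity):

    use std::sync::mpsc;

    struct InjectChunks; // stand-in for the type from the previous patch

    // mirrors the struct added by this patch, simplified
    struct InjectionData {
        boundaries: mpsc::Receiver<InjectChunks>,
        injections: mpsc::Sender<InjectChunks>,
    }

    fn main() {
        // archiver -> chunk stream queue
        let (boundary_tx, boundary_rx) = mpsc::channel();
        // chunk stream -> upload stream queue
        let (injection_tx, injection_rx) = mpsc::channel();

        // only the payload stream is handed an InjectionData; the metadata
        // stream passes None and is encoded exactly as before
        let payload_injection = InjectionData {
            boundaries: boundary_rx,
            injections: injection_tx,
        };
        let _ = (boundary_tx, injection_rx, payload_injection);
    }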
pbs-client/src/chunk_stream.rs | 23 +++++++++++++++++++++++
pbs-client/src/lib.rs | 2 +-
2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/pbs-client/src/chunk_stream.rs b/pbs-client/src/chunk_stream.rs
index 895f6eae2..83c75ba28 100644
--- a/pbs-client/src/chunk_stream.rs
+++ b/pbs-client/src/chunk_stream.rs
@@ -1,4 +1,5 @@
use std::pin::Pin;
+use std::sync::mpsc;
use std::task::{Context, Poll};
use anyhow::Error;
@@ -8,6 +9,28 @@ use futures::stream::{Stream, TryStream};
use pbs_datastore::Chunker;
+use crate::inject_reused_chunks::InjectChunks;
+
+/// Holds the queues for optional injection of reused dynamic index entries
+pub struct InjectionData {
+ boundaries: mpsc::Receiver<InjectChunks>,
+ injections: mpsc::Sender<InjectChunks>,
+ consumed: u64,
+}
+
+impl InjectionData {
+ pub fn new(
+ boundaries: mpsc::Receiver<InjectChunks>,
+ injections: mpsc::Sender<InjectChunks>,
+ ) -> Self {
+ Self {
+ boundaries,
+ injections,
+ consumed: 0,
+ }
+ }
+}
+
/// Split input stream into dynamic sized chunks
pub struct ChunkStream<S: Unpin> {
input: S,
diff --git a/pbs-client/src/lib.rs b/pbs-client/src/lib.rs
index 3e7bd2a8b..3d2da27b9 100644
--- a/pbs-client/src/lib.rs
+++ b/pbs-client/src/lib.rs
@@ -39,6 +39,6 @@ mod backup_specification;
pub use backup_specification::*;
mod chunk_stream;
-pub use chunk_stream::{ChunkStream, FixedChunkStream};
+pub use chunk_stream::{ChunkStream, FixedChunkStream, InjectionData};
pub const PROXMOX_BACKUP_TCP_KEEPALIVE_TIME: u32 = 120;
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 24/58] chunker: add method to reset chunker state
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (22 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 23/58] client: chunk stream: add struct to hold injection state Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 25/58] client: streams: add channels for dynamic entry injection Christian Ebner
` (33 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
When a boundary is forced, the internal chunker state is no longer in
sync with the chunk stream. The reset method therefore allows
resetting the internal state.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-datastore/src/chunker.rs | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/pbs-datastore/src/chunker.rs b/pbs-datastore/src/chunker.rs
index 712751829..253d2cf4c 100644
--- a/pbs-datastore/src/chunker.rs
+++ b/pbs-datastore/src/chunker.rs
@@ -167,6 +167,12 @@ impl Chunker {
0
}
+ pub fn reset(&mut self) {
+ self.h = 0;
+ self.chunk_size = 0;
+ self.window_size = 0;
+ }
+
// fast implementation avoiding modulo
// #[inline(always)]
fn shall_break(&self) -> bool {
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 25/58] client: streams: add channels for dynamic entry injection
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (23 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 24/58] chunker: add method to reset chunker state Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 26/58] specs: add backup detection mode specification Christian Ebner
` (32 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Add non-blocking channels between the pxar archiver and the chunk
stream, as well as between the chunk stream and the backup writer, in
order to reuse dynamic entries of a previous backup run and index them
for the new snapshot.
The archiver sends forced boundary positions and the dynamic
entries to inject into the chunk stream following this boundary.
The chunk stream consumes these channel inputs as receiver whenever a
new chunk is requested by the upload stream, forcing a non-regular
chunk boundary in the pxar stream at the requested positions.
The dynamic entries to inject and the boundary are then sent via the
second asynchronous channel to the backup writer's upload stream,
which indexes them by inserting the dynamic entries as known chunks
into the upload stream.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
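A toy end-to-end model of the two-stage queue described above
(simplified stand-ins, synchronous instead of stream-driven):

    use std::sync::mpsc;

    struct InjectChunks {
        boundary: u64, // offset at which the chunk stream must force a cut
    }

    fn main() {
        // archiver -> chunk stream
        let (archiver_tx, chunker_rx) = mpsc::channel::<InjectChunks>();
        // chunk stream -> backup writer upload stream
        let (chunker_tx, upload_rx) = mpsc::channel::<InjectChunks>();

        // archiver: request a forced boundary at offset 8192
        archiver_tx.send(InjectChunks { boundary: 8192 }).unwrap();

        // chunk stream: force the boundary on the next chunk request and
        // forward the entries to the upload stream
        if let Ok(inject) = chunker_rx.try_recv() {
            chunker_tx.send(inject).unwrap();
        }

        // upload stream: index the reused chunks at the forced boundary
        let inject = upload_rx.try_recv().unwrap();
        println!("indexing reused chunks at boundary {}", inject.boundary);
    }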
examples/test_chunk_speed2.rs | 2 +-
pbs-client/src/backup_writer.rs | 98 ++++++++++++-------
pbs-client/src/chunk_stream.rs | 78 ++++++++++++++-
pbs-client/src/pxar/create.rs | 6 +-
pbs-client/src/pxar_backup_stream.rs | 8 +-
proxmox-backup-client/src/main.rs | 28 ++++--
.../src/proxmox_restore_daemon/api.rs | 2 +-
pxar-bin/src/main.rs | 1 +
tests/catar.rs | 1 +
9 files changed, 171 insertions(+), 53 deletions(-)
diff --git a/examples/test_chunk_speed2.rs b/examples/test_chunk_speed2.rs
index 3f69b436d..22dd14ce2 100644
--- a/examples/test_chunk_speed2.rs
+++ b/examples/test_chunk_speed2.rs
@@ -26,7 +26,7 @@ async fn run() -> Result<(), Error> {
.map_err(Error::from);
//let chunk_stream = FixedChunkStream::new(stream, 4*1024*1024);
- let mut chunk_stream = ChunkStream::new(stream, None);
+ let mut chunk_stream = ChunkStream::new(stream, None, None);
let start_time = std::time::Instant::now();
diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index dc9aa569f..b2ada85cd 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -23,6 +23,7 @@ use pbs_tools::crypt_config::CryptConfig;
use proxmox_human_byte::HumanByte;
+use super::inject_reused_chunks::{InjectChunks, InjectReusedChunks, InjectedChunksInfo};
use super::merge_known_chunks::{MergeKnownChunks, MergedChunkInfo};
use super::{H2Client, HttpClient};
@@ -265,6 +266,7 @@ impl BackupWriter {
archive_name: &str,
stream: impl Stream<Item = Result<bytes::BytesMut, Error>>,
options: UploadOptions,
+ injections: Option<std::sync::mpsc::Receiver<InjectChunks>>,
) -> Result<BackupStats, Error> {
let known_chunks = Arc::new(Mutex::new(HashSet::new()));
@@ -341,6 +343,7 @@ impl BackupWriter {
None
},
options.compress,
+ injections,
)
.await?;
@@ -636,6 +639,7 @@ impl BackupWriter {
known_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
crypt_config: Option<Arc<CryptConfig>>,
compress: bool,
+ injections: Option<std::sync::mpsc::Receiver<InjectChunks>>,
) -> impl Future<Output = Result<UploadStats, Error>> {
let total_chunks = Arc::new(AtomicUsize::new(0));
let total_chunks2 = total_chunks.clone();
@@ -662,48 +666,72 @@ impl BackupWriter {
let index_csum_2 = index_csum.clone();
stream
- .and_then(move |data| {
- let chunk_len = data.len();
+ .inject_reused_chunks(injections, stream_len.clone())
+ .and_then(move |chunk_info| match chunk_info {
+ InjectedChunksInfo::Known(chunks) => {
+ // account for injected chunks
+ let count = chunks.len();
+ total_chunks.fetch_add(count, Ordering::SeqCst);
+
+ let mut known = Vec::new();
+ let mut guard = index_csum.lock().unwrap();
+ let csum = guard.as_mut().unwrap();
+ for chunk in chunks {
+ let offset =
+ stream_len.fetch_add(chunk.size() as usize, Ordering::SeqCst) as u64;
+ reused_len.fetch_add(chunk.size() as usize, Ordering::SeqCst);
+ let digest = chunk.digest();
+ known.push((offset, digest));
+ let end_offset = offset + chunk.size();
+ csum.update(&end_offset.to_le_bytes());
+ csum.update(&digest);
+ }
+ future::ok(MergedChunkInfo::Known(known))
+ }
+ InjectedChunksInfo::Raw(data) => {
+ // account for non-injected chunks (new and known)
+ let chunk_len = data.len();
- total_chunks.fetch_add(1, Ordering::SeqCst);
- let offset = stream_len.fetch_add(chunk_len, Ordering::SeqCst) as u64;
+ total_chunks.fetch_add(1, Ordering::SeqCst);
+ let offset = stream_len.fetch_add(chunk_len, Ordering::SeqCst) as u64;
- let mut chunk_builder = DataChunkBuilder::new(data.as_ref()).compress(compress);
+ let mut chunk_builder = DataChunkBuilder::new(data.as_ref()).compress(compress);
- if let Some(ref crypt_config) = crypt_config {
- chunk_builder = chunk_builder.crypt_config(crypt_config);
- }
+ if let Some(ref crypt_config) = crypt_config {
+ chunk_builder = chunk_builder.crypt_config(crypt_config);
+ }
- let mut known_chunks = known_chunks.lock().unwrap();
- let digest = chunk_builder.digest();
+ let mut known_chunks = known_chunks.lock().unwrap();
+ let digest = chunk_builder.digest();
- let mut guard = index_csum.lock().unwrap();
- let csum = guard.as_mut().unwrap();
+ let mut guard = index_csum.lock().unwrap();
+ let csum = guard.as_mut().unwrap();
- let chunk_end = offset + chunk_len as u64;
+ let chunk_end = offset + chunk_len as u64;
- if !is_fixed_chunk_size {
- csum.update(&chunk_end.to_le_bytes());
- }
- csum.update(digest);
-
- let chunk_is_known = known_chunks.contains(digest);
- if chunk_is_known {
- known_chunk_count.fetch_add(1, Ordering::SeqCst);
- reused_len.fetch_add(chunk_len, Ordering::SeqCst);
- future::ok(MergedChunkInfo::Known(vec![(offset, *digest)]))
- } else {
- let compressed_stream_len2 = compressed_stream_len.clone();
- known_chunks.insert(*digest);
- future::ready(chunk_builder.build().map(move |(chunk, digest)| {
- compressed_stream_len2.fetch_add(chunk.raw_size(), Ordering::SeqCst);
- MergedChunkInfo::New(ChunkInfo {
- chunk,
- digest,
- chunk_len: chunk_len as u64,
- offset,
- })
- }))
+ if !is_fixed_chunk_size {
+ csum.update(&chunk_end.to_le_bytes());
+ }
+ csum.update(digest);
+
+ let chunk_is_known = known_chunks.contains(digest);
+ if chunk_is_known {
+ known_chunk_count.fetch_add(1, Ordering::SeqCst);
+ reused_len.fetch_add(chunk_len, Ordering::SeqCst);
+ future::ok(MergedChunkInfo::Known(vec![(offset, *digest)]))
+ } else {
+ let compressed_stream_len2 = compressed_stream_len.clone();
+ known_chunks.insert(*digest);
+ future::ready(chunk_builder.build().map(move |(chunk, digest)| {
+ compressed_stream_len2.fetch_add(chunk.raw_size(), Ordering::SeqCst);
+ MergedChunkInfo::New(ChunkInfo {
+ chunk,
+ digest,
+ chunk_len: chunk_len as u64,
+ offset,
+ })
+ }))
+ }
}
})
.merge_known_chunks()
diff --git a/pbs-client/src/chunk_stream.rs b/pbs-client/src/chunk_stream.rs
index 83c75ba28..87a018d50 100644
--- a/pbs-client/src/chunk_stream.rs
+++ b/pbs-client/src/chunk_stream.rs
@@ -14,6 +14,7 @@ use crate::inject_reused_chunks::InjectChunks;
/// Holds the queues for optional injection of reused dynamic index entries
pub struct InjectionData {
boundaries: mpsc::Receiver<InjectChunks>,
+ next_boundary: Option<InjectChunks>,
injections: mpsc::Sender<InjectChunks>,
consumed: u64,
}
@@ -25,6 +26,7 @@ impl InjectionData {
) -> Self {
Self {
boundaries,
+ next_boundary: None,
injections,
consumed: 0,
}
@@ -37,15 +39,17 @@ pub struct ChunkStream<S: Unpin> {
chunker: Chunker,
buffer: BytesMut,
scan_pos: usize,
+ injection_data: Option<InjectionData>,
}
impl<S: Unpin> ChunkStream<S> {
- pub fn new(input: S, chunk_size: Option<usize>) -> Self {
+ pub fn new(input: S, chunk_size: Option<usize>, injection_data: Option<InjectionData>) -> Self {
Self {
input,
chunker: Chunker::new(chunk_size.unwrap_or(4 * 1024 * 1024)),
buffer: BytesMut::new(),
scan_pos: 0,
+ injection_data,
}
}
}
@@ -62,7 +66,70 @@ where
fn poll_next(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Option<Self::Item>> {
let this = self.get_mut();
+
loop {
+ if let Some(InjectionData {
+ boundaries,
+ next_boundary,
+ injections,
+ consumed,
+ }) = this.injection_data.as_mut()
+ {
+ if next_boundary.is_none() {
+ if let Ok(boundary) = boundaries.try_recv() {
+ *next_boundary = Some(boundary);
+ }
+ }
+
+ if let Some(inject) = next_boundary.take() {
+ // forced boundary pending, look up next regular boundary
+ let pos = if this.scan_pos < this.buffer.len() {
+ this.chunker.scan(&this.buffer[this.scan_pos..])
+ } else {
+ 0
+ };
+
+ let chunk_boundary = if pos == 0 {
+ *consumed + this.buffer.len() as u64
+ } else {
+ *consumed + (this.scan_pos + pos) as u64
+ };
+
+ if inject.boundary <= chunk_boundary {
+ // forced boundary is before next boundary, force within current buffer
+ let chunk_size = (inject.boundary - *consumed) as usize;
+ let raw_chunk = this.buffer.split_to(chunk_size);
+ this.chunker.reset();
+ this.scan_pos = 0;
+
+ *consumed += chunk_size as u64;
+
+ // add the size of the injected chunks to consumed, so chunk stream offsets
+ // are in sync with the rest of the archive.
+ *consumed += inject.size as u64;
+
+ injections.send(inject).unwrap();
+
+ // the chunk can be empty, return nevertheless to allow the caller to
+ // make progress by consuming from the injection queue
+ return Poll::Ready(Some(Ok(raw_chunk)));
+ } else if pos != 0 {
+ *next_boundary = Some(inject);
+ // forced boundary is after next boundary, split off chunk from buffer
+ let chunk_size = this.scan_pos + pos;
+ let raw_chunk = this.buffer.split_to(chunk_size);
+ *consumed += chunk_size as u64;
+ this.scan_pos = 0;
+
+ return Poll::Ready(Some(Ok(raw_chunk)));
+ } else {
+ // forced boundary is after current buffer length, continue reading
+ *next_boundary = Some(inject);
+ this.scan_pos = this.buffer.len();
+ }
+ }
+ }
+
if this.scan_pos < this.buffer.len() {
let boundary = this.chunker.scan(&this.buffer[this.scan_pos..]);
@@ -70,11 +137,14 @@ where
if boundary == 0 {
this.scan_pos = this.buffer.len();
- // continue poll
} else if chunk_size <= this.buffer.len() {
- let result = this.buffer.split_to(chunk_size);
+ // found new chunk boundary inside buffer, split off chunk from buffer
+ let raw_chunk = this.buffer.split_to(chunk_size);
+ if let Some(InjectionData { consumed, .. }) = this.injection_data.as_mut() {
+ *consumed += chunk_size as u64;
+ }
this.scan_pos = 0;
- return Poll::Ready(Some(Ok(result)));
+ return Poll::Ready(Some(Ok(raw_chunk)));
} else {
panic!("got unexpected chunk boundary from chunker");
}
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index bcf4fb328..b4ea2ae46 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -6,7 +6,7 @@ use std::ops::Range;
use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd, OwnedFd, RawFd};
use std::path::{Path, PathBuf};
-use std::sync::{Arc, Mutex};
+use std::sync::{mpsc, Arc, Mutex};
use anyhow::{bail, Context, Error};
use futures::future::BoxFuture;
@@ -29,6 +29,7 @@ use pbs_datastore::catalog::BackupCatalogWriter;
use pbs_datastore::dynamic_index::DynamicIndexReader;
use pbs_datastore::index::IndexFile;
+use crate::inject_reused_chunks::InjectChunks;
use crate::pxar::metadata::errno_is_unsupported;
use crate::pxar::tools::assert_single_path_component;
use crate::pxar::Flags;
@@ -134,6 +135,7 @@ struct Archiver {
hardlinks: HashMap<HardLinkInfo, (PathBuf, LinkOffset)>,
file_copy_buffer: Vec<u8>,
skip_e2big_xattr: bool,
+ forced_boundaries: Option<mpsc::Sender<InjectChunks>>,
}
type Encoder<'a, T> = pxar::encoder::aio::Encoder<'a, T>;
@@ -158,6 +160,7 @@ pub async fn create_archive<T, F>(
feature_flags: Flags,
callback: F,
options: PxarCreateOptions,
+ forced_boundaries: Option<mpsc::Sender<InjectChunks>>,
) -> Result<(), Error>
where
T: SeqWrite + Send,
@@ -213,6 +216,7 @@ where
hardlinks: HashMap::new(),
file_copy_buffer: vec::undefined(4 * 1024 * 1024),
skip_e2big_xattr: options.skip_e2big_xattr,
+ forced_boundaries,
};
archiver
diff --git a/pbs-client/src/pxar_backup_stream.rs b/pbs-client/src/pxar_backup_stream.rs
index 3541eddb5..fb6d063f2 100644
--- a/pbs-client/src/pxar_backup_stream.rs
+++ b/pbs-client/src/pxar_backup_stream.rs
@@ -2,7 +2,7 @@ use std::io::Write;
//use std::os::unix::io::FromRawFd;
use std::path::Path;
use std::pin::Pin;
-use std::sync::{Arc, Mutex};
+use std::sync::{mpsc, Arc, Mutex};
use std::task::{Context, Poll};
use anyhow::{format_err, Error};
@@ -17,6 +17,7 @@ use proxmox_io::StdChannelWriter;
use pbs_datastore::catalog::CatalogWriter;
+use crate::inject_reused_chunks::InjectChunks;
use crate::pxar::create::PxarWriters;
/// Stream implementation to encode and upload .pxar archives.
@@ -42,6 +43,7 @@ impl PxarBackupStream {
dir: Dir,
catalog: Arc<Mutex<CatalogWriter<W>>>,
options: crate::pxar::PxarCreateOptions,
+ boundaries: Option<mpsc::Sender<InjectChunks>>,
separate_payload_stream: bool,
) -> Result<(Self, Option<Self>), Error> {
let buffer_size = 256 * 1024;
@@ -82,6 +84,7 @@ impl PxarBackupStream {
Ok(())
},
options,
+ boundaries,
)
.await
{
@@ -113,11 +116,12 @@ impl PxarBackupStream {
dirname: &Path,
catalog: Arc<Mutex<CatalogWriter<W>>>,
options: crate::pxar::PxarCreateOptions,
+ boundaries: Option<mpsc::Sender<InjectChunks>>,
separate_payload_stream: bool,
) -> Result<(Self, Option<Self>), Error> {
let dir = nix::dir::Dir::open(dirname, OFlag::O_DIRECTORY, Mode::empty())?;
- Self::new(dir, catalog, options, separate_payload_stream)
+ Self::new(dir, catalog, options, boundaries, separate_payload_stream)
}
}
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index ba2c5fa59..75227b3e6 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -45,8 +45,8 @@ use pbs_client::tools::{
use pbs_client::{
delete_ticket_info, parse_backup_specification, view_task_result, BackupReader,
BackupRepository, BackupSpecificationType, BackupStats, BackupWriter, ChunkStream,
- FixedChunkStream, HttpClient, PxarBackupStream, RemoteChunkReader, UploadOptions,
- BACKUP_SOURCE_SCHEMA,
+ FixedChunkStream, HttpClient, InjectionData, PxarBackupStream, RemoteChunkReader,
+ UploadOptions, BACKUP_SOURCE_SCHEMA,
};
use pbs_datastore::catalog::{BackupCatalogWriter, CatalogReader, CatalogWriter};
use pbs_datastore::chunk_store::verify_chunk_size;
@@ -199,14 +199,16 @@ async fn backup_directory<P: AsRef<Path>>(
bail!("cannot backup directory with fixed chunk size!");
}
+ let (payload_boundaries_tx, payload_boundaries_rx) = std::sync::mpsc::channel();
let (pxar_stream, payload_stream) = PxarBackupStream::open(
dir_path.as_ref(),
catalog,
pxar_create_options,
+ Some(payload_boundaries_tx),
payload_target.is_some(),
)?;
- let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size);
+ let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size, None);
let (tx, rx) = mpsc::channel(10); // allow to buffer 10 chunks
let stream = ReceiverStream::new(rx).map_err(Error::from);
@@ -218,13 +220,16 @@ async fn backup_directory<P: AsRef<Path>>(
}
});
- let stats = client.upload_stream(archive_name, stream, upload_options.clone());
+ let stats = client.upload_stream(archive_name, stream, upload_options.clone(), None);
if let Some(payload_stream) = payload_stream {
let payload_target = payload_target
.ok_or_else(|| format_err!("got payload stream, but no target archive name"))?;
- let mut payload_chunk_stream = ChunkStream::new(payload_stream, chunk_size);
+ let (payload_injections_tx, payload_injections_rx) = std::sync::mpsc::channel();
+ let injection_data = InjectionData::new(payload_boundaries_rx, payload_injections_tx);
+ let mut payload_chunk_stream =
+ ChunkStream::new(payload_stream, chunk_size, Some(injection_data));
let (payload_tx, payload_rx) = mpsc::channel(10); // allow to buffer 10 chunks
let stream = ReceiverStream::new(payload_rx).map_err(Error::from);
@@ -235,7 +240,12 @@ async fn backup_directory<P: AsRef<Path>>(
}
});
- let payload_stats = client.upload_stream(&payload_target, stream, upload_options);
+ let payload_stats = client.upload_stream(
+ &payload_target,
+ stream,
+ upload_options,
+ Some(payload_injections_rx),
+ );
match futures::join!(stats, payload_stats) {
(Ok(stats), Ok(payload_stats)) => Ok((stats, Some(payload_stats))),
@@ -271,7 +281,7 @@ async fn backup_image<P: AsRef<Path>>(
}
let stats = client
- .upload_stream(archive_name, stream, upload_options)
+ .upload_stream(archive_name, stream, upload_options, None)
.await?;
Ok(stats)
@@ -562,7 +572,7 @@ fn spawn_catalog_upload(
let (catalog_tx, catalog_rx) = std::sync::mpsc::sync_channel(10); // allow to buffer 10 writes
let catalog_stream = proxmox_async::blocking::StdChannelStream(catalog_rx);
let catalog_chunk_size = 512 * 1024;
- let catalog_chunk_stream = ChunkStream::new(catalog_stream, Some(catalog_chunk_size));
+ let catalog_chunk_stream = ChunkStream::new(catalog_stream, Some(catalog_chunk_size), None);
let catalog_writer = Arc::new(Mutex::new(CatalogWriter::new(TokioWriterAdapter::new(
StdChannelWriter::new(catalog_tx),
@@ -578,7 +588,7 @@ fn spawn_catalog_upload(
tokio::spawn(async move {
let catalog_upload_result = client
- .upload_stream(CATALOG_NAME, catalog_chunk_stream, upload_options)
+ .upload_stream(CATALOG_NAME, catalog_chunk_stream, upload_options, None)
.await;
if let Err(ref err) = catalog_upload_result {
diff --git a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
index 95c9f4619..f7fbae093 100644
--- a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
+++ b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
@@ -363,7 +363,7 @@ fn extract(
};
let pxar_writer = pxar::PxarVariant::Unified(TokioWriter::new(writer));
- create_archive(dir, PxarWriters::new(pxar_writer, None), Flags::DEFAULT, |_| Ok(()), options)
+ create_archive(dir, PxarWriters::new(pxar_writer, None), Flags::DEFAULT, |_| Ok(()), options, None)
.await
}
.await;
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index b4c8f0626..a9a5fccdc 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -409,6 +409,7 @@ async fn create_archive(
Ok(())
},
options,
+ None,
)
.await?;
diff --git a/tests/catar.rs b/tests/catar.rs
index 932df61a9..9f83b4cc2 100644
--- a/tests/catar.rs
+++ b/tests/catar.rs
@@ -39,6 +39,7 @@ fn run_test(dir_name: &str) -> Result<(), Error> {
Flags::DEFAULT,
|_| Ok(()),
options,
+ None,
))?;
Command::new("cmp")
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 26/58] specs: add backup detection mode specification
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (24 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 25/58] client: streams: add channels for dynamic entry injection Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 27/58] client: implement prepare reference method Christian Ebner
` (31 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Adds the specification for switching the detection mode used to
identify regular files which changed since a reference backup run.
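As a minimal sketch of the intended usage (mirroring the enum added
below, without the `#[api]` schema annotation), both split modes use
separate metadata/payload archives, while only the metadata mode
compares against the previous snapshot:

    #[derive(Default)]
    enum BackupDetectionMode {
        #[default]
        Default,
        Data,
        Metadata,
    }

    impl BackupDetectionMode {
        fn is_data(&self) -> bool { matches!(self, Self::Data) }
        fn is_metadata(&self) -> bool { matches!(self, Self::Metadata) }
    }

    fn main() {
        let mode = BackupDetectionMode::Metadata;
        // split modes write `.mpxar`/`.ppxar` instead of a unified `.pxar`
        let split_archive = mode.is_data() || mode.is_metadata();
        println!("split: {split_archive}, reuse payload: {}", mode.is_metadata());
    }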
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/backup_specification.rs | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/pbs-client/src/backup_specification.rs b/pbs-client/src/backup_specification.rs
index 619a3a9da..66bf71965 100644
--- a/pbs-client/src/backup_specification.rs
+++ b/pbs-client/src/backup_specification.rs
@@ -1,4 +1,5 @@
use anyhow::{bail, Error};
+use serde::{Deserialize, Serialize};
use proxmox_schema::*;
@@ -45,3 +46,28 @@ pub fn parse_backup_specification(value: &str) -> Result<BackupSpecification, Er
bail!("unable to parse backup source specification '{}'", value);
}
+
+#[api]
+#[derive(Default, Deserialize, Serialize)]
+#[serde(rename_all = "lowercase")]
+/// Mode to detect file changes since last backup run
+pub enum BackupDetectionMode {
+ /// Encode backup as self contained pxar archive
+ #[default]
+ Default,
+ /// Split backup mode, re-encode payload data
+ Data,
+ /// Compare metadata, reuse payload chunks if metadata unchanged
+ Metadata,
+}
+
+impl BackupDetectionMode {
+ /// Selected mode is data based file change detection with split meta/payload streams
+ pub fn is_data(&self) -> bool {
+ matches!(self, Self::Data)
+ }
+ /// Selected mode is metadata based file change detection
+ pub fn is_metadata(&self) -> bool {
+ matches!(self, Self::Metadata)
+ }
+}
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 27/58] client: implement prepare reference method
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (25 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 26/58] specs: add backup detection mode specification Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 28/58] client: pxar: add method for metadata comparison Christian Ebner
` (30 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Implement a method that prepares the decoder instance to access a
previous snapshot's metadata index and payload index in order to
pass them to the pxar archiver. The archiver can then utilize these
to compare the metadata of files to the previous state and gather
reusable chunks.
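The method falls back to a full re-encode whenever a prerequisite is
missing; a condensed sketch of that decision flow, with hypothetical
booleans replacing the actual manifest and index lookups:

    // Condensed decision flow; the real method returns Option<PxarPrevRef>.
    fn prepare_reference(
        have_metadata_index: bool, // previous metadata index downloadable?
        have_payload_info: bool,   // payload archive listed in the manifest?
        crypt_mode_matches: bool,  // CryptMode unchanged since previous run?
    ) -> Option<()> {
        if !have_metadata_index {
            println!("No previous metadata index, continue without reference");
            return None;
        }
        if !have_payload_info {
            println!("No previous payload index found in manifest, continue without reference");
            return None;
        }
        if !crypt_mode_matches {
            println!("Crypt mode mismatch, continue without reference");
            return None;
        }
        Some(()) // accessor and payload index would be assembled here
    }

    fn main() {
        assert!(prepare_reference(true, true, true).is_some());
        assert!(prepare_reference(true, true, false).is_none());
    }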
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- add check for CryptMode changes between current and previous backup
run
pbs-client/src/pxar/create.rs | 67 ++++++++-
pbs-client/src/pxar/mod.rs | 4 +-
proxmox-backup-client/src/main.rs | 135 +++++++++++++++---
.../src/proxmox_restore_daemon/api.rs | 1 +
pxar-bin/src/main.rs | 1 +
5 files changed, 184 insertions(+), 24 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index b4ea2ae46..d183d3f6b 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -18,6 +18,8 @@ use nix::sys::stat::{FileStat, Mode};
use pathpatterns::{MatchEntry, MatchFlag, MatchList, MatchType, PatternFlag};
use proxmox_sys::error::SysError;
+use pxar::accessor::aio::{Accessor, Directory};
+use pxar::accessor::ReadAt;
use pxar::encoder::{LinkOffset, SeqWrite};
use pxar::{Metadata, PxarVariant};
@@ -35,7 +37,7 @@ use crate::pxar::tools::assert_single_path_component;
use crate::pxar::Flags;
/// Pxar options for creating a pxar archive/stream
-#[derive(Default, Clone)]
+#[derive(Default)]
pub struct PxarCreateOptions {
/// Device/mountpoint st_dev numbers that should be included. None for no limitation.
pub device_set: Option<HashSet<u64>>,
@@ -47,6 +49,20 @@ pub struct PxarCreateOptions {
pub skip_lost_and_found: bool,
/// Skip xattrs of files that return E2BIG error
pub skip_e2big_xattr: bool,
+ /// Reference state for partial backups
+ pub previous_ref: Option<PxarPrevRef>,
+}
+
+pub type MetadataArchiveReader = Arc<dyn ReadAt + Send + Sync + 'static>;
+
+/// Stateful information of previous backup snapshots for partial backups
+pub struct PxarPrevRef {
+ /// Reference accessor for metadata comparison
+ pub accessor: Accessor<MetadataArchiveReader>,
+ /// Reference index for reusing payload chunks
+ pub payload_index: DynamicIndexReader,
+ /// Reference archive name for partial backups
+ pub archive_name: String,
}
fn detect_fs_type(fd: RawFd) -> Result<i64, Error> {
@@ -136,6 +152,7 @@ struct Archiver {
file_copy_buffer: Vec<u8>,
skip_e2big_xattr: bool,
forced_boundaries: Option<mpsc::Sender<InjectChunks>>,
+ previous_payload_index: Option<DynamicIndexReader>,
}
type Encoder<'a, T> = pxar::encoder::aio::Encoder<'a, T>;
@@ -200,6 +217,15 @@ where
MatchType::Exclude,
)?);
}
+ let (previous_payload_index, previous_metadata_accessor) =
+ if let Some(refs) = options.previous_ref {
+ (
+ Some(refs.payload_index),
+ refs.accessor.open_root().await.ok(),
+ )
+ } else {
+ (None, None)
+ };
let mut archiver = Archiver {
feature_flags,
@@ -217,10 +243,11 @@ where
file_copy_buffer: vec::undefined(4 * 1024 * 1024),
skip_e2big_xattr: options.skip_e2big_xattr,
forced_boundaries,
+ previous_payload_index,
};
archiver
- .archive_dir_contents(&mut encoder, source_dir, true)
+ .archive_dir_contents(&mut encoder, previous_metadata_accessor, source_dir, true)
.await?;
encoder.finish().await?;
encoder.close().await?;
@@ -252,6 +279,7 @@ impl Archiver {
fn archive_dir_contents<'a, T: SeqWrite + Send>(
&'a mut self,
encoder: &'a mut Encoder<'_, T>,
+ mut previous_metadata_accessor: Option<Directory<MetadataArchiveReader>>,
mut dir: Dir,
is_root: bool,
) -> BoxFuture<'a, Result<(), Error>> {
@@ -286,9 +314,15 @@ impl Archiver {
(self.callback)(&file_entry.path)?;
self.path = file_entry.path;
- self.add_entry(encoder, dir_fd, &file_entry.name, &file_entry.stat)
- .await
- .map_err(|err| self.wrap_err(err))?;
+ self.add_entry(
+ encoder,
+ &mut previous_metadata_accessor,
+ dir_fd,
+ &file_entry.name,
+ &file_entry.stat,
+ )
+ .await
+ .map_err(|err| self.wrap_err(err))?;
}
self.path = old_path;
self.entry_counter = entry_counter;
@@ -536,6 +570,7 @@ impl Archiver {
async fn add_entry<T: SeqWrite + Send>(
&mut self,
encoder: &mut Encoder<'_, T>,
+ previous_metadata: &mut Option<Directory<MetadataArchiveReader>>,
parent: RawFd,
c_file_name: &CStr,
stat: &FileStat,
@@ -625,7 +660,14 @@ impl Archiver {
catalog.lock().unwrap().start_directory(c_file_name)?;
}
let result = self
- .add_directory(encoder, dir, c_file_name, &metadata, stat)
+ .add_directory(
+ encoder,
+ previous_metadata,
+ dir,
+ c_file_name,
+ &metadata,
+ stat,
+ )
.await;
if let Some(ref catalog) = self.catalog {
catalog.lock().unwrap().end_directory()?;
@@ -678,6 +720,7 @@ impl Archiver {
async fn add_directory<T: SeqWrite + Send>(
&mut self,
encoder: &mut Encoder<'_, T>,
+ previous_metadata_accessor: &mut Option<Directory<MetadataArchiveReader>>,
dir: Dir,
dir_name: &CStr,
metadata: &Metadata,
@@ -708,7 +751,17 @@ impl Archiver {
log::info!("skipping mount point: {:?}", self.path);
Ok(())
} else {
- self.archive_dir_contents(encoder, dir, false).await
+ let mut dir_accessor = None;
+ if let Some(accessor) = previous_metadata_accessor.as_mut() {
+ if let Some(file_entry) = accessor.lookup(dir_name).await? {
+ if file_entry.entry().is_dir() {
+ let dir = file_entry.enter_directory().await?;
+ dir_accessor = Some(dir);
+ }
+ }
+ }
+ self.archive_dir_contents(encoder, dir_accessor, dir, false)
+ .await
};
self.fs_magic = old_fs_magic;
diff --git a/pbs-client/src/pxar/mod.rs b/pbs-client/src/pxar/mod.rs
index b7dcf8362..5248a1956 100644
--- a/pbs-client/src/pxar/mod.rs
+++ b/pbs-client/src/pxar/mod.rs
@@ -56,7 +56,9 @@ pub(crate) mod tools;
mod flags;
pub use flags::Flags;
-pub use create::{create_archive, PxarCreateOptions, PxarWriters};
+pub use create::{
+ create_archive, MetadataArchiveReader, PxarCreateOptions, PxarPrevRef, PxarWriters,
+};
pub use extract::{
create_tar, create_zip, extract_archive, extract_sub_dir, extract_sub_dir_seq, ErrorHandler,
OverwriteFlags, PxarExtractContext, PxarExtractOptions,
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 75227b3e6..37412b154 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -21,6 +21,7 @@ use proxmox_router::{cli::*, ApiMethod, RpcEnvironment};
use proxmox_schema::api;
use proxmox_sys::fs::{file_get_json, image_size, replace_file, CreateOptions};
use proxmox_time::{epoch_i64, strftime_local};
+use pxar::accessor::aio::Accessor;
use pxar::accessor::{MaybeReady, ReadAt, ReadAtOperation};
use pbs_api_types::{
@@ -30,7 +31,7 @@ use pbs_api_types::{
BACKUP_TYPE_SCHEMA, TRAFFIC_CONTROL_BURST_SCHEMA, TRAFFIC_CONTROL_RATE_SCHEMA,
};
use pbs_client::catalog_shell::Shell;
-use pbs_client::pxar::ErrorHandler as PxarErrorHandler;
+use pbs_client::pxar::{ErrorHandler as PxarErrorHandler, MetadataArchiveReader, PxarPrevRef};
use pbs_client::tools::{
complete_archive_name, complete_auth_id, complete_backup_group, complete_backup_snapshot,
complete_backup_source, complete_chunk_size, complete_group_or_snapshot,
@@ -43,14 +44,14 @@ use pbs_client::tools::{
CHUNK_SIZE_SCHEMA, REPO_URL_SCHEMA,
};
use pbs_client::{
- delete_ticket_info, parse_backup_specification, view_task_result, BackupReader,
- BackupRepository, BackupSpecificationType, BackupStats, BackupWriter, ChunkStream,
- FixedChunkStream, HttpClient, InjectionData, PxarBackupStream, RemoteChunkReader,
+ delete_ticket_info, parse_backup_specification, view_task_result, BackupDetectionMode,
+ BackupReader, BackupRepository, BackupSpecificationType, BackupStats, BackupWriter,
+ ChunkStream, FixedChunkStream, HttpClient, InjectionData, PxarBackupStream, RemoteChunkReader,
UploadOptions, BACKUP_SOURCE_SCHEMA,
};
use pbs_datastore::catalog::{BackupCatalogWriter, CatalogReader, CatalogWriter};
use pbs_datastore::chunk_store::verify_chunk_size;
-use pbs_datastore::dynamic_index::{BufferedDynamicReader, DynamicIndexReader};
+use pbs_datastore::dynamic_index::{BufferedDynamicReader, DynamicIndexReader, LocalDynamicReadAt};
use pbs_datastore::fixed_index::FixedIndexReader;
use pbs_datastore::index::IndexFile;
use pbs_datastore::manifest::{
@@ -687,6 +688,10 @@ fn spawn_catalog_upload(
schema: TRAFFIC_CONTROL_BURST_SCHEMA,
optional: true,
},
+ "change-detection-mode": {
+ type: BackupDetectionMode,
+ optional: true,
+ },
"exclude": {
type: Array,
description: "List of paths or patterns for matching files to exclude.",
@@ -722,6 +727,7 @@ async fn create_backup(
param: Value,
all_file_systems: bool,
skip_lost_and_found: bool,
+ change_detection_mode: Option<BackupDetectionMode>,
dry_run: bool,
skip_e2big_xattr: bool,
_info: &ApiMethod,
@@ -881,6 +887,8 @@ async fn create_backup(
let backup_time = backup_time_opt.unwrap_or_else(epoch_i64);
+ let detection_mode = change_detection_mode.unwrap_or_default();
+
let http_client = connect_rate_limited(&repo, rate_limit)?;
record_repository(&repo);
@@ -981,7 +989,7 @@ async fn create_backup(
None
};
- let mut manifest = BackupManifest::new(snapshot);
+ let mut manifest = BackupManifest::new(snapshot.clone());
let mut catalog = None;
let mut catalog_result_rx = None;
@@ -1028,22 +1036,21 @@ async fn create_backup(
manifest.add_file(target, stats.size, stats.csum, crypto.mode)?;
}
(BackupSpecificationType::PXAR, false) => {
- let metadata_mode = false; // Until enabled via param
-
let target_base = if let Some(base) = target_base.strip_suffix(".pxar") {
base.to_string()
} else {
bail!("unexpected suffix in target: {target_base}");
};
- let (target, payload_target) = if metadata_mode {
- (
- format!("{target_base}.mpxar.{extension}"),
- Some(format!("{target_base}.ppxar.{extension}")),
- )
- } else {
- (target, None)
- };
+ let (target, payload_target) =
+ if detection_mode.is_metadata() || detection_mode.is_data() {
+ (
+ format!("{target_base}.mpxar.{extension}"),
+ Some(format!("{target_base}.ppxar.{extension}")),
+ )
+ } else {
+ (target, None)
+ };
// start catalog upload on first use
if catalog.is_none() {
@@ -1060,12 +1067,42 @@ async fn create_backup(
.unwrap()
.start_directory(std::ffi::CString::new(target.as_str())?.as_c_str())?;
+ let mut previous_ref = None;
+ if detection_mode.is_metadata() {
+ if let Some(ref manifest) = previous_manifest {
+ // BackupWriter::start created a new snapshot, get the one before
+ if let Some(backup_time) = client.previous_backup_time().await? {
+ let backup_dir: BackupDir =
+ (snapshot.group.clone(), backup_time).into();
+ let backup_reader = BackupReader::start(
+ &http_client,
+ crypt_config.clone(),
+ repo.store(),
+ &backup_ns,
+ &backup_dir,
+ true,
+ )
+ .await?;
+ previous_ref = prepare_reference(
+ &target,
+ manifest.clone(),
+ &client,
+ backup_reader.clone(),
+ crypt_config.clone(),
+ crypto.mode,
+ )
+ .await?
+ }
+ }
+ }
+
let pxar_options = pbs_client::pxar::PxarCreateOptions {
device_set: devices.clone(),
patterns: pattern_list.clone(),
entries_max: entries_max as usize,
skip_lost_and_found,
skip_e2big_xattr,
+ previous_ref,
};
let upload_options = UploadOptions {
@@ -1177,6 +1214,72 @@ async fn create_backup(
Ok(Value::Null)
}
+async fn prepare_reference(
+ target: &str,
+ manifest: Arc<BackupManifest>,
+ backup_writer: &BackupWriter,
+ backup_reader: Arc<BackupReader>,
+ crypt_config: Option<Arc<CryptConfig>>,
+ crypt_mode: CryptMode,
+) -> Result<Option<PxarPrevRef>, Error> {
+ let (target, payload_target) =
+ match pbs_client::tools::get_pxar_archive_names(target, &manifest) {
+ Ok((target, payload_target)) => (target, payload_target),
+ Err(_) => return Ok(None),
+ };
+ let payload_target = payload_target.unwrap_or_default();
+
+ let metadata_ref_index = if let Ok(index) = backup_reader
+ .download_dynamic_index(&manifest, &target)
+ .await
+ {
+ index
+ } else {
+ log::info!("No previous metadata index, continue without reference");
+ return Ok(None);
+ };
+
+ let file_info = match manifest.lookup_file_info(&payload_target) {
+ Ok(file_info) => file_info,
+ Err(_) => {
+ log::info!("No previous payload index found in manifest, continue without reference");
+ return Ok(None);
+ }
+ };
+
+ if file_info.crypt_mode != crypt_mode {
+ log::info!("Crypt mode mismatch, continue without reference");
+ return Ok(None);
+ }
+
+ let known_payload_chunks = Arc::new(Mutex::new(HashSet::new()));
+ let payload_ref_index = backup_writer
+ .download_previous_dynamic_index(&payload_target, &manifest, known_payload_chunks)
+ .await?;
+
+ log::info!("Using previous index as metadata reference for '{target}'");
+
+ let most_used = metadata_ref_index.find_most_used_chunks(8);
+ let file_info = manifest.lookup_file_info(&target)?;
+ let chunk_reader = RemoteChunkReader::new(
+ backup_reader.clone(),
+ crypt_config.clone(),
+ file_info.chunk_crypt_mode(),
+ most_used,
+ );
+ let reader = BufferedDynamicReader::new(metadata_ref_index, chunk_reader);
+ let archive_size = reader.archive_size();
+ let reader: MetadataArchiveReader = Arc::new(LocalDynamicReadAt::new(reader));
+ // only care about the metadata, therefore do not attach payload reader
+ let accessor = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?;
+
+ Ok(Some(pbs_client::pxar::PxarPrevRef {
+ accessor,
+ payload_index: payload_ref_index,
+ archive_name: target,
+ }))
+}
+
async fn dump_image<W: Write>(
client: Arc<BackupReader>,
crypt_config: Option<Arc<CryptConfig>>,
diff --git a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
index f7fbae093..681fa6db9 100644
--- a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
+++ b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
@@ -360,6 +360,7 @@ fn extract(
patterns,
skip_lost_and_found: false,
skip_e2big_xattr: false,
+ previous_ref: None,
};
let pxar_writer = pxar::PxarVariant::Unified(TokioWriter::new(writer));
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index a9a5fccdc..ecb617d65 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -362,6 +362,7 @@ async fn create_archive(
patterns,
skip_lost_and_found: false,
skip_e2big_xattr: false,
+ previous_ref: None,
};
let source = PathBuf::from(source);
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 28/58] client: pxar: add method for metadata comparison
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (26 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 27/58] client: implement prepare reference method Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 29/58] pxar: caching: add look-ahead cache Christian Ebner
` (29 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Add a method to compare the metadata of the current file entry
against the metadata of the entry looked up in the previous backup
snapshot. If the metadata matches, the start offset pointing to the
file's payload header in the payload stream is returned.
This is in preparation for reusing payload chunks for unchanged files.
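A worked sketch of the returned range, assuming a 16-byte pxar
payload header (the real code takes the exact value from
`size_of::<pxar::format::Header>()`):

    use std::ops::Range;

    // Range covering the payload header plus payload of a reusable file.
    fn reusable_range(payload_offset: u64, file_size: u64) -> Range<u64> {
        const HEADER_SIZE: u64 = 16; // assumed header size for illustration
        payload_offset..payload_offset + file_size + HEADER_SIZE
    }

    fn main() {
        // payload header at offset 4096, file payload of 1 MiB
        let range = reusable_range(4096, 1024 * 1024);
        assert_eq!(range, 4096..1052688);
        println!("reusable payload range: {range:?}");
    }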
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/pxar/create.rs | 37 ++++++++++++++++++++++++++++++++++-
1 file changed, 36 insertions(+), 1 deletion(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index d183d3f6b..1cf11fc08 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -2,6 +2,7 @@ use std::collections::{HashMap, HashSet};
use std::ffi::{CStr, CString, OsStr};
use std::fmt;
use std::io::{self, Read};
+use std::mem::size_of;
use std::ops::Range;
use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd, OwnedFd, RawFd};
@@ -21,7 +22,7 @@ use proxmox_sys::error::SysError;
use pxar::accessor::aio::{Accessor, Directory};
use pxar::accessor::ReadAt;
use pxar::encoder::{LinkOffset, SeqWrite};
-use pxar::{Metadata, PxarVariant};
+use pxar::{EntryKind, Metadata, PxarVariant};
use proxmox_io::vec;
use proxmox_lang::c_str;
@@ -333,6 +334,40 @@ impl Archiver {
.boxed()
}
+ async fn is_reusable_entry(
+ &mut self,
+ previous_metadata_accessor: &Option<Directory<MetadataArchiveReader>>,
+ file_name: &Path,
+ metadata: &Metadata,
+ ) -> Result<Option<Range<u64>>, Error> {
+ if let Some(previous_metadata_accessor) = previous_metadata_accessor {
+ if let Some(file_entry) = previous_metadata_accessor.lookup(file_name).await? {
+ if metadata == file_entry.metadata() {
+ if let EntryKind::File {
+ payload_offset: Some(offset),
+ size,
+ ..
+ } = file_entry.entry().kind()
+ {
+ let range =
+ *offset..*offset + size + size_of::<pxar::format::Header>() as u64;
+ log::debug!(
+ "reusable: {file_name:?} at range {range:?} has unchanged metadata."
+ );
+ return Ok(Some(range));
+ }
+ log::debug!("reencode: {file_name:?} not a regular file.");
+ return Ok(None);
+ }
+ log::debug!("reencode: {file_name:?} metadata did not match.");
+ return Ok(None);
+ }
+ log::debug!("reencode: {file_name:?} not found in previous archive.");
+ }
+
+ Ok(None)
+ }
+
/// openat() wrapper which allows but logs `EACCES` and turns `ENOENT` into `None`.
///
/// The `existed` flag is set when iterating through a directory to note that we know the file
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 29/58] pxar: caching: add look-ahead cache
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (27 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 28/58] client: pxar: add method for metadata comparison Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 30/58] client: pxar: refactor catalog encoding for directories Christian Ebner
` (28 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Add a look-ahead cache and the necessary types to store the required
data and keep track of directory boundaries while traversing the
filesystem tree, in order to postpone the decision whether to reuse
or re-encode a given regular file with unchanged metadata.
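The central invariant is that the payload ranges of the cached
entries must be contiguous; a hole forces a flush. A simplified,
self-contained sketch of that continuation check (modeled after
`try_extend_range` below):

    use std::ops::Range;

    // Contiguous ranges extend the covered span, a hole makes the
    // caller flush the cache and restart the range.
    struct RangeTracker {
        range: Range<u64>,
    }

    impl RangeTracker {
        fn try_extend(&mut self, new: Range<u64>) -> bool {
            if self.range.end == 0 {
                // first range: initialize start and end to the new range's start
                self.range = new.start..new.start;
            }
            if self.range.end == new.start {
                self.range.end = new.end; // contiguous, extend
                return true;
            }
            false
        }
    }

    fn main() {
        let mut tracker = RangeTracker { range: 0..0 };
        assert!(tracker.try_extend(100..200)); // first file
        assert!(tracker.try_extend(200..350)); // contiguous follow-up
        assert!(!tracker.try_extend(500..600)); // hole -> flush required
        println!("covered range: {:?}", tracker.range);
    }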
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- inline clear_range
- return start path on `take_and_reset`, removing `start_path`
pbs-client/src/pxar/create.rs | 2 +-
pbs-client/src/pxar/look_ahead_cache.rs | 162 ++++++++++++++++++++++++
pbs-client/src/pxar/mod.rs | 1 +
3 files changed, 164 insertions(+), 1 deletion(-)
create mode 100644 pbs-client/src/pxar/look_ahead_cache.rs
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 1cf11fc08..1961b9b54 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -131,7 +131,7 @@ impl fmt::Display for ArchiveError {
}
#[derive(Eq, PartialEq, Hash)]
-struct HardLinkInfo {
+pub(crate) struct HardLinkInfo {
st_dev: u64,
st_ino: u64,
}
diff --git a/pbs-client/src/pxar/look_ahead_cache.rs b/pbs-client/src/pxar/look_ahead_cache.rs
new file mode 100644
index 000000000..37c07a9bc
--- /dev/null
+++ b/pbs-client/src/pxar/look_ahead_cache.rs
@@ -0,0 +1,162 @@
+use std::collections::HashSet;
+use std::ffi::CString;
+use std::ops::Range;
+use std::os::unix::io::OwnedFd;
+use std::path::PathBuf;
+
+use nix::sys::stat::FileStat;
+
+use pxar::encoder::PayloadOffset;
+use pxar::Metadata;
+
+use super::create::*;
+
+const DEFAULT_CACHE_SIZE: usize = 512;
+
+pub(crate) struct CacheEntryData {
+ pub(crate) fd: OwnedFd,
+ pub(crate) c_file_name: CString,
+ pub(crate) stat: FileStat,
+ pub(crate) metadata: Metadata,
+ pub(crate) payload_offset: PayloadOffset,
+}
+
+pub(crate) enum CacheEntry {
+ RegEntry(CacheEntryData),
+ DirEntry(CacheEntryData),
+ DirEnd,
+}
+
+pub(crate) struct PxarLookaheadCache {
+ // Current state of the cache
+ enabled: bool,
+ // Cached entries
+ entries: Vec<CacheEntry>,
+ // Entries encountered having more than one link given by stat
+ hardlinks: HashSet<HardLinkInfo>,
+ // Payload range covered by the currently cached entries
+ range: Range<u64>,
+ // Possibly held-back last chunk from the last flush, used for possible chunk continuation
+ last_chunk: Option<ReusableDynamicEntry>,
+ // Path when started caching
+ start_path: PathBuf,
+ // Number of entries with file descriptors
+ fd_entries: usize,
+ // Max number of entries with file descriptors
+ cache_size: usize,
+}
+
+impl PxarLookaheadCache {
+ pub(crate) fn new(size: Option<usize>) -> Self {
+ Self {
+ enabled: false,
+ entries: Vec::new(),
+ hardlinks: HashSet::new(),
+ range: 0..0,
+ last_chunk: None,
+ start_path: PathBuf::new(),
+ fd_entries: 0,
+ cache_size: size.unwrap_or(DEFAULT_CACHE_SIZE),
+ }
+ }
+
+ pub(crate) fn is_full(&self) -> bool {
+ self.fd_entries >= self.cache_size
+ }
+
+ pub(crate) fn caching_enabled(&self) -> bool {
+ self.enabled
+ }
+
+ pub(crate) fn insert(
+ &mut self,
+ fd: OwnedFd,
+ c_file_name: CString,
+ stat: FileStat,
+ metadata: Metadata,
+ payload_offset: PayloadOffset,
+ path: PathBuf,
+ ) {
+ if !self.enabled {
+ self.start_path = path;
+ if !metadata.is_dir() {
+ self.start_path.pop();
+ }
+ }
+ self.enabled = true;
+ self.fd_entries += 1;
+ if metadata.is_dir() {
+ self.entries.push(CacheEntry::DirEntry(CacheEntryData {
+ fd,
+ c_file_name,
+ stat,
+ metadata,
+ payload_offset,
+ }))
+ } else {
+ self.entries.push(CacheEntry::RegEntry(CacheEntryData {
+ fd,
+ c_file_name,
+ stat,
+ metadata,
+ payload_offset,
+ }))
+ }
+ }
+
+ pub(crate) fn insert_dir_end(&mut self) {
+ self.entries.push(CacheEntry::DirEnd);
+ }
+
+ pub(crate) fn take_and_reset(&mut self) -> (Vec<CacheEntry>, PathBuf) {
+ self.fd_entries = 0;
+ self.enabled = false;
+ // keep end for possible continuation if cache has been cleared because
+ // it was full, but further caching would be fine
+ self.range = self.range.end..self.range.end;
+ (
+ std::mem::take(&mut self.entries),
+ std::mem::take(&mut self.start_path),
+ )
+ }
+
+ pub(crate) fn contains_hardlink(&self, info: &HardLinkInfo) -> bool {
+ self.hardlinks.contains(info)
+ }
+
+ pub(crate) fn insert_hardlink(&mut self, info: HardLinkInfo) -> bool {
+ self.hardlinks.insert(info)
+ }
+
+ pub(crate) fn range(&self) -> &Range<u64> {
+ &self.range
+ }
+
+ pub(crate) fn update_range(&mut self, range: Range<u64>) {
+ self.range = range;
+ }
+
+ pub(crate) fn try_extend_range(&mut self, range: Range<u64>) -> bool {
+ if self.range.end == 0 {
+ // initialize first range to start and end with start of new range
+ self.range.start = range.start;
+ self.range.end = range.start;
+ }
+
+ // range continued, update end
+ if self.range.end == range.start {
+ self.range.end = range.end;
+ return true;
+ }
+
+ false
+ }
+
+ pub(crate) fn take_last_chunk(&mut self) -> Option<ReusableDynamicEntry> {
+ self.last_chunk.take()
+ }
+
+ pub(crate) fn update_last_chunk(&mut self, chunk: Option<ReusableDynamicEntry>) {
+ self.last_chunk = chunk;
+ }
+}
diff --git a/pbs-client/src/pxar/mod.rs b/pbs-client/src/pxar/mod.rs
index 5248a1956..334759df6 100644
--- a/pbs-client/src/pxar/mod.rs
+++ b/pbs-client/src/pxar/mod.rs
@@ -50,6 +50,7 @@
pub(crate) mod create;
pub(crate) mod dir_stack;
pub(crate) mod extract;
+pub(crate) mod look_ahead_cache;
pub(crate) mod metadata;
pub(crate) mod tools;
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 30/58] client: pxar: refactor catalog encoding for directories
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (28 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 29/58] pxar: caching: add look-ahead cache Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 31/58] fix #3174: client: pxar: enable caching and meta comparison Christian Ebner
` (27 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Move the catalog directory start and end encoding from `add_entry`
to `add_directory`, the latter being called by the former.
This allows the `add_entry` method to be reused to walk the
filesystem tree in the context of an enabled look-ahead cache
without encoding anything.
No functional change intended.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/pxar/create.rs | 38 +++++++++++++++++------------------
1 file changed, 18 insertions(+), 20 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 1961b9b54..d126b2777 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -690,24 +690,15 @@ impl Archiver {
}
mode::IFDIR => {
let dir = Dir::from_fd(fd.into_raw_fd())?;
-
- if let Some(ref catalog) = self.catalog {
- catalog.lock().unwrap().start_directory(c_file_name)?;
- }
- let result = self
- .add_directory(
- encoder,
- previous_metadata,
- dir,
- c_file_name,
- &metadata,
- stat,
- )
- .await;
- if let Some(ref catalog) = self.catalog {
- catalog.lock().unwrap().end_directory()?;
- }
- result
+ self.add_directory(
+ encoder,
+ previous_metadata,
+ dir,
+ c_file_name,
+ &metadata,
+ stat,
+ )
+ .await
}
mode::IFSOCK => {
if let Some(ref catalog) = self.catalog {
@@ -757,12 +748,15 @@ impl Archiver {
encoder: &mut Encoder<'_, T>,
previous_metadata_accessor: &mut Option<Directory<MetadataArchiveReader>>,
dir: Dir,
- dir_name: &CStr,
+ c_dir_name: &CStr,
metadata: &Metadata,
stat: &FileStat,
) -> Result<(), Error> {
- let dir_name = OsStr::from_bytes(dir_name.to_bytes());
+ let dir_name = OsStr::from_bytes(c_dir_name.to_bytes());
+ if let Some(ref catalog) = self.catalog {
+ catalog.lock().unwrap().start_directory(c_dir_name)?;
+ }
encoder.create_directory(dir_name, metadata).await?;
let old_fs_magic = self.fs_magic;
@@ -804,6 +798,10 @@ impl Archiver {
self.current_st_dev = old_st_dev;
encoder.finish().await?;
+ if let Some(ref catalog) = self.catalog {
+ catalog.lock().unwrap().end_directory()?;
+ }
+
result
}
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 31/58] fix #3174: client: pxar: enable caching and meta comparison
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (29 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 30/58] client: pxar: refactor catalog encoding for directories Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 32/58] client: backup writer: add injected chunk count to stats Christian Ebner
` (26 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
When walking the file system tree, check for each entry if it is
reusable, meaning that the metadata did not change and the payload
chunks can be re-indexed instead of re-encoding the whole data.
If the metadata matches, the range of the dynamic index entries for
that file is looked up in the previous payload data index.
Use the range and the possible padding introduced by partial reuse
of chunks to decide whether to reuse the dynamic entries and encode
the file payloads as payload references right away, or cache the
entry for now and keep looking ahead.
If however a non-reusable (because changed) entry is encountered
before the padding threshold is reached, the entries in the cache
are flushed to the archive by re-encoding them, resetting the cached
state.
Reusable chunk digests and sizes as well as reference offsets to the
start of regular files' payloads within the payload stream are
injected into the backup stream by sending them to the chunker via a
dedicated channel, forcing a chunk boundary and inserting the chunks.
If the threshold value for reuse is reached, the chunks are injected
into the payload stream and the references with the corresponding
offsets encoded in the metadata stream.
Since multiple files might be contained within a single chunk, chunk
deduplication is ensured by keeping back the last chunk, so that
following files can reuse that same chunk without indexing it twice.
This chunk is injected into the stream even if the following lookups
lead to a cache clear and re-encoding.
Directory boundaries are cached as well, and written as part of the
encoding when flushing.
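As a worked example of the reuse decision with the 10%
`CHUNK_PADDING_THRESHOLD` introduced by this patch (sizes chosen for
illustration only):

    const CHUNK_PADDING_THRESHOLD: f64 = 0.1;

    // Decide whether reusing the looked-up chunks introduces acceptable padding.
    fn reuse_chunks(range_len: u64, start_padding: u64, end_padding: u64) -> bool {
        let padding = start_padding + end_padding;
        let total_size = range_len + padding;
        let ratio = padding as f64 / total_size as f64;
        ratio <= CHUNK_PADDING_THRESHOLD
    }

    fn main() {
        // 4 MiB of reusable payload, 128 KiB padding at either end:
        // ratio ~ 0.059 -> reuse and inject the chunks
        assert!(reuse_chunks(4 * 1024 * 1024, 128 * 1024, 128 * 1024));
        // only 256 KiB payload with the same padding: ratio = 0.5 -> re-encode
        assert!(!reuse_chunks(256 * 1024, 128 * 1024, 128 * 1024));
    }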
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- fix issue with incorrect reencoding of reusable entries
- adapt to changes in lookahead cache
- pass current path on cache insert, which might update start_path
- rearrange `prev_last_chunk` and payload range
- avoid to push to chunk_list, create by to_vec instead
pbs-client/src/pxar/create.rs | 384 +++++++++++++++++++++++++++++++---
1 file changed, 357 insertions(+), 27 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index d126b2777..6e6ce1a2b 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -21,9 +21,10 @@ use pathpatterns::{MatchEntry, MatchFlag, MatchList, MatchType, PatternFlag};
use proxmox_sys::error::SysError;
use pxar::accessor::aio::{Accessor, Directory};
use pxar::accessor::ReadAt;
-use pxar::encoder::{LinkOffset, SeqWrite};
+use pxar::encoder::{LinkOffset, PayloadOffset, SeqWrite};
use pxar::{EntryKind, Metadata, PxarVariant};
+use proxmox_human_byte::HumanByte;
use proxmox_io::vec;
use proxmox_lang::c_str;
use proxmox_sys::fs::{self, acl, xattr};
@@ -33,10 +34,13 @@ use pbs_datastore::dynamic_index::DynamicIndexReader;
use pbs_datastore::index::IndexFile;
use crate::inject_reused_chunks::InjectChunks;
+use crate::pxar::look_ahead_cache::{CacheEntry, CacheEntryData, PxarLookaheadCache};
use crate::pxar::metadata::errno_is_unsupported;
use crate::pxar::tools::assert_single_path_component;
use crate::pxar::Flags;
+const CHUNK_PADDING_THRESHOLD: f64 = 0.1;
+
/// Pxar options for creating a pxar archive/stream
#[derive(Default)]
pub struct PxarCreateOptions {
@@ -154,6 +158,7 @@ struct Archiver {
skip_e2big_xattr: bool,
forced_boundaries: Option<mpsc::Sender<InjectChunks>>,
previous_payload_index: Option<DynamicIndexReader>,
+ cache: PxarLookaheadCache,
}
type Encoder<'a, T> = pxar::encoder::aio::Encoder<'a, T>;
@@ -207,6 +212,7 @@ where
set.insert(stat.st_dev);
}
+ let metadata_mode = options.previous_ref.is_some() && writers.archive.payload().is_some();
let mut encoder = Encoder::new(writers.archive, &metadata).await?;
let mut patterns = options.patterns;
@@ -245,11 +251,19 @@ where
skip_e2big_xattr: options.skip_e2big_xattr,
forced_boundaries,
previous_payload_index,
+ cache: PxarLookaheadCache::new(None),
};
archiver
.archive_dir_contents(&mut encoder, previous_metadata_accessor, source_dir, true)
.await?;
+
+ if metadata_mode {
+ archiver
+ .flush_cached_reusing_if_below_threshold(&mut encoder, false)
+ .await?;
+ }
+
encoder.finish().await?;
encoder.close().await?;
@@ -307,7 +321,10 @@ impl Archiver {
for file_entry in file_list {
let file_name = file_entry.name.to_bytes();
- if is_root && file_name == b".pxarexclude-cli" {
+ if is_root
+ && file_name == b".pxarexclude-cli"
+ && previous_metadata_accessor.is_none()
+ {
self.encode_pxarexclude_cli(encoder, &file_entry.name, old_patterns_count)
.await?;
continue;
@@ -610,8 +627,6 @@ impl Archiver {
c_file_name: &CStr,
stat: &FileStat,
) -> Result<(), Error> {
- use pxar::format::mode;
-
let file_mode = stat.st_mode & libc::S_IFMT;
let open_mode = if file_mode == libc::S_IFREG || file_mode == libc::S_IFDIR {
OFlag::empty()
@@ -649,6 +664,126 @@ impl Archiver {
self.skip_e2big_xattr,
)?;
+ if self.previous_payload_index.is_none() {
+ return self
+ .add_entry_to_archive(encoder, &mut None, c_file_name, stat, fd, &metadata, None)
+ .await;
+ }
+
+ // Avoid having too many open file handles in cached entries
+ if self.cache.is_full() {
+ log::debug!("Max cache size reached, reuse cached entries");
+ self.flush_cached_reusing_if_below_threshold(encoder, true)
+ .await?;
+ }
+
+ if metadata.is_regular_file() {
+ if stat.st_nlink > 1 {
+ let link_info = HardLinkInfo {
+ st_dev: stat.st_dev,
+ st_ino: stat.st_ino,
+ };
+ if self.cache.contains_hardlink(&link_info) {
+ // This hardlink has been seen by the lookahead cache already, put it in the cache
+ // with a dummy offset and continue without lookup and chunk injection.
+ // On flushing or re-encoding, the logic there will store the actual hardlink with
+ // offset.
+ self.cache.insert(
+ fd,
+ c_file_name.into(),
+ *stat,
+ metadata.clone(),
+ PayloadOffset::default(),
+ self.path.clone(),
+ );
+ return Ok(());
+ } else {
+ // mark this hardlink as seen by the lookahead cache
+ self.cache.insert_hardlink(link_info);
+ }
+ }
+
+ let file_name: &Path = OsStr::from_bytes(c_file_name.to_bytes()).as_ref();
+ if let Some(payload_range) = self
+ .is_reusable_entry(previous_metadata, file_name, &metadata)
+ .await?
+ {
+ if !self.cache.try_extend_range(payload_range.clone()) {
+ log::debug!("Cache range has hole, new range: {payload_range:?}");
+ self.flush_cached_reusing_if_below_threshold(encoder, true)
+ .await?;
+ // range has to be set after flushing of cached entries, which resets the range
+ self.cache.update_range(payload_range.clone());
+ }
+
+ // offset relative to start of current range, does not include possible padding of
+ // actual chunks, which needs to be added before encoding the payload reference
+ let offset =
+ PayloadOffset::default().add(payload_range.start - self.cache.range().start);
+ log::debug!("Offset relative to range start: {offset:?}");
+
+ self.cache.insert(
+ fd,
+ c_file_name.into(),
+ *stat,
+ metadata.clone(),
+ offset,
+ self.path.clone(),
+ );
+ return Ok(());
+ } else {
+ self.flush_cached_reusing_if_below_threshold(encoder, false)
+ .await?;
+ }
+ } else if self.cache.caching_enabled() {
+ self.cache.insert(
+ fd.try_clone()?,
+ c_file_name.into(),
+ *stat,
+ metadata.clone(),
+ PayloadOffset::default(),
+ self.path.clone(),
+ );
+
+ if metadata.is_dir() {
+ self.add_directory(
+ encoder,
+ previous_metadata,
+ Dir::from_fd(fd.into_raw_fd())?,
+ c_file_name,
+ &metadata,
+ stat,
+ )
+ .await?;
+ }
+ return Ok(());
+ }
+
+ self.encode_entries_to_archive(encoder, None).await?;
+ self.add_entry_to_archive(
+ encoder,
+ previous_metadata,
+ c_file_name,
+ stat,
+ fd,
+ &metadata,
+ None,
+ )
+ .await
+ }
+
+ async fn add_entry_to_archive<T: SeqWrite + Send>(
+ &mut self,
+ encoder: &mut Encoder<'_, T>,
+ previous_metadata: &mut Option<Directory<MetadataArchiveReader>>,
+ c_file_name: &CStr,
+ stat: &FileStat,
+ fd: OwnedFd,
+ metadata: &Metadata,
+ payload_offset: Option<PayloadOffset>,
+ ) -> Result<(), Error> {
+ use pxar::format::mode;
+
let file_name: &Path = OsStr::from_bytes(c_file_name.to_bytes()).as_ref();
match metadata.file_type() {
mode::IFREG => {
@@ -677,9 +812,14 @@ impl Archiver {
.add_file(c_file_name, file_size, stat.st_mtime)?;
}
- let offset: LinkOffset = self
- .add_regular_file(encoder, fd, file_name, &metadata, file_size)
- .await?;
+ let offset: LinkOffset = if let Some(payload_offset) = payload_offset {
+ encoder
+ .add_payload_ref(metadata, file_name, file_size, payload_offset)
+ .await?
+ } else {
+ self.add_regular_file(encoder, fd, file_name, metadata, file_size)
+ .await?
+ };
if stat.st_nlink > 1 {
self.hardlinks
@@ -690,50 +830,43 @@ impl Archiver {
}
mode::IFDIR => {
let dir = Dir::from_fd(fd.into_raw_fd())?;
- self.add_directory(
- encoder,
- previous_metadata,
- dir,
- c_file_name,
- &metadata,
- stat,
- )
- .await
+ self.add_directory(encoder, previous_metadata, dir, c_file_name, metadata, stat)
+ .await
}
mode::IFSOCK => {
if let Some(ref catalog) = self.catalog {
catalog.lock().unwrap().add_socket(c_file_name)?;
}
- Ok(encoder.add_socket(&metadata, file_name).await?)
+ Ok(encoder.add_socket(metadata, file_name).await?)
}
mode::IFIFO => {
if let Some(ref catalog) = self.catalog {
catalog.lock().unwrap().add_fifo(c_file_name)?;
}
- Ok(encoder.add_fifo(&metadata, file_name).await?)
+ Ok(encoder.add_fifo(metadata, file_name).await?)
}
mode::IFLNK => {
if let Some(ref catalog) = self.catalog {
catalog.lock().unwrap().add_symlink(c_file_name)?;
}
- self.add_symlink(encoder, fd, file_name, &metadata).await
+ self.add_symlink(encoder, fd, file_name, metadata).await
}
mode::IFBLK => {
if let Some(ref catalog) = self.catalog {
catalog.lock().unwrap().add_block_device(c_file_name)?;
}
- self.add_device(encoder, file_name, &metadata, stat).await
+ self.add_device(encoder, file_name, metadata, stat).await
}
mode::IFCHR => {
if let Some(ref catalog) = self.catalog {
catalog.lock().unwrap().add_char_device(c_file_name)?;
}
- self.add_device(encoder, file_name, &metadata, stat).await
+ self.add_device(encoder, file_name, metadata, stat).await
}
other => bail!(
"encountered unknown file type: 0x{:x} (0o{:o})",
@@ -743,6 +876,197 @@ impl Archiver {
}
}
+ async fn flush_cached_reusing_if_below_threshold<T: SeqWrite + Send>(
+ &mut self,
+ encoder: &mut Encoder<'_, T>,
+ keep_last_chunk: bool,
+ ) -> Result<(), Error> {
+ if self.cache.range().is_empty() {
+ // only non-regular file entries (e.g. directories) in cache, allows regular encoding
+ self.encode_entries_to_archive(encoder, None).await?;
+ return Ok(());
+ }
+
+ if let Some(ref ref_payload_index) = self.previous_payload_index {
+ // Take ownership of previous last chunk, only update where it must be injected
+ let prev_last_chunk = self.cache.take_last_chunk();
+ let range = self.cache.range();
+ let (mut indices, start_padding, end_padding) =
+ lookup_dynamic_entries(ref_payload_index, range)?;
+ let mut padding = start_padding + end_padding;
+ let total_size = (range.end - range.start) + padding;
+
+ // take into account used bytes of kept back chunk for padding
+ if let (Some(first), Some(last)) = (indices.first(), prev_last_chunk.as_ref()) {
+ if last.digest() == first.digest() {
+ // Update padding used for threshold calculation only
+ let used = last.size() - last.padding;
+ padding -= used;
+ }
+ }
+
+ let ratio = padding as f64 / total_size as f64;
+
+ // do not reuse chunks if the introduced padding is higher than the threshold,
+ // opt for re-encoding in that case
+ if ratio > CHUNK_PADDING_THRESHOLD {
+ log::debug!(
+ "Padding ratio: {ratio} > {CHUNK_PADDING_THRESHOLD}, padding: {}, total {}, chunks: {}",
+ HumanByte::from(padding),
+ HumanByte::from(total_size),
+ indices.len(),
+ );
+ self.cache.update_last_chunk(prev_last_chunk);
+ self.encode_entries_to_archive(encoder, None).await?;
+ } else {
+ log::debug!(
+ "Padding ratio: {ratio} < {CHUNK_PADDING_THRESHOLD}, padding: {}, total {}, chunks: {}",
+ HumanByte::from(padding),
+ HumanByte::from(total_size),
+ indices.len(),
+ );
+
+ // check for cases where the kept back last chunk is not equal to the first chunk because
+ // the range end aligned with a chunk boundary, and the chunk therefore needs to be injected
+ if let (Some(first), Some(last)) = (indices.first_mut(), prev_last_chunk) {
+ if last.digest() != first.digest() {
+ // make sure to inject previous last chunk before encoding entries
+ self.inject_chunks_at_current_payload_position(encoder, vec![last])?;
+ } else {
+ let used = last.size() - last.padding;
+ first.padding -= used;
+ }
+ }
+
+ let base_offset = Some(encoder.payload_position()?.add(start_padding));
+ self.encode_entries_to_archive(encoder, base_offset).await?;
+
+ if keep_last_chunk {
+ self.cache.update_last_chunk(indices.pop());
+ }
+
+ self.inject_chunks_at_current_payload_position(encoder, indices)?;
+ }
+
+ Ok(())
+ } else {
+ bail!("cannot reuse chunks without previous index reader");
+ }
+ }
+
+ // Take ownership of cached entries and encode them to the archive.
+ // Encode with reused payload chunks when the base offset is `Some`, reencode otherwise.
+ async fn encode_entries_to_archive<T: SeqWrite + Send>(
+ &mut self,
+ encoder: &mut Encoder<'_, T>,
+ base_offset: Option<PayloadOffset>,
+ ) -> Result<(), Error> {
+ if let Some(prev) = self.cache.take_last_chunk() {
+ // make sure to inject previous last chunk before encoding entries
+ self.inject_chunks_at_current_payload_position(encoder, vec![prev])?;
+ }
+
+ // take ownership of cached entries and reset caching state
+ let (entries, start_path) = self.cache.take_and_reset();
+ let old_path = self.path.clone();
+ self.path = start_path;
+ log::debug!(
+ "Got {} cache entries to encode: reuse is {}",
+ entries.len(),
+ base_offset.is_some()
+ );
+
+ for entry in entries {
+ match entry {
+ CacheEntry::RegEntry(CacheEntryData {
+ fd,
+ c_file_name,
+ stat,
+ metadata,
+ payload_offset,
+ }) => {
+ let file_name = OsStr::from_bytes(c_file_name.to_bytes());
+ self.path.push(file_name);
+ self.add_entry_to_archive(
+ encoder,
+ &mut None,
+ &c_file_name,
+ &stat,
+ fd,
+ &metadata,
+ base_offset.map(|base_offset| payload_offset.add(base_offset.raw())),
+ )
+ .await?;
+ self.path.pop();
+ }
+ CacheEntry::DirEntry(CacheEntryData {
+ c_file_name,
+ metadata,
+ ..
+ }) => {
+ let file_name = OsStr::from_bytes(c_file_name.to_bytes());
+ self.path.push(file_name);
+ if let Some(ref catalog) = self.catalog {
+ catalog.lock().unwrap().start_directory(&c_file_name)?;
+ }
+ let dir_name = OsStr::from_bytes(c_file_name.to_bytes());
+ encoder.create_directory(dir_name, &metadata).await?;
+ }
+ CacheEntry::DirEnd => {
+ encoder.finish().await?;
+ if let Some(ref catalog) = self.catalog {
+ catalog.lock().unwrap().end_directory()?;
+ }
+ self.path.pop();
+ }
+ }
+ }
+
+ self.path = old_path;
+
+ Ok(())
+ }
+
+ fn inject_chunks_at_current_payload_position<T: SeqWrite + Send>(
+ &mut self,
+ encoder: &mut Encoder<'_, T>,
+ reused_chunks: Vec<ReusableDynamicEntry>,
+ ) -> Result<(), Error> {
+ let mut injection_boundary = encoder.payload_position()?;
+
+ for chunks in reused_chunks.chunks(128) {
+ let chunks = chunks.to_vec();
+ let mut size = PayloadOffset::default();
+
+ for chunk in chunks.iter() {
+ log::debug!(
+ "Injecting chunk with {} padding (chunk size {})",
+ HumanByte::from(chunk.padding),
+ HumanByte::from(chunk.size()),
+ );
+ size = size.add(chunk.size());
+ }
+
+ let inject_chunks = InjectChunks {
+ boundary: injection_boundary.raw(),
+ chunks,
+ size: size.raw() as usize,
+ };
+
+ if let Some(sender) = self.forced_boundaries.as_mut() {
+ sender.send(inject_chunks)?;
+ } else {
+ bail!("missing injection queue");
+ };
+
+ injection_boundary = injection_boundary.add(size.raw());
+ log::debug!("Advance payload position by: {size:?}");
+ encoder.advance(size)?;
+ }
+
+ Ok(())
+ }
+
async fn add_directory<T: SeqWrite + Send>(
&mut self,
encoder: &mut Encoder<'_, T>,
@@ -754,10 +1078,12 @@ impl Archiver {
) -> Result<(), Error> {
let dir_name = OsStr::from_bytes(c_dir_name.to_bytes());
- if let Some(ref catalog) = self.catalog {
- catalog.lock().unwrap().start_directory(c_dir_name)?;
+ if !self.cache.caching_enabled() {
+ if let Some(ref catalog) = self.catalog {
+ catalog.lock().unwrap().start_directory(c_dir_name)?;
+ }
+ encoder.create_directory(dir_name, metadata).await?;
}
- encoder.create_directory(dir_name, metadata).await?;
let old_fs_magic = self.fs_magic;
let old_fs_feature_flags = self.fs_feature_flags;
@@ -797,9 +1123,13 @@ impl Archiver {
self.fs_feature_flags = old_fs_feature_flags;
self.current_st_dev = old_st_dev;
- encoder.finish().await?;
- if let Some(ref catalog) = self.catalog {
- catalog.lock().unwrap().end_directory()?;
+ if !self.cache.caching_enabled() {
+ encoder.finish().await?;
+ if let Some(ref catalog) = self.catalog {
+ catalog.lock().unwrap().end_directory()?;
+ }
+ } else {
+ self.cache.insert_dir_end();
}
result
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 32/58] client: backup writer: add injected chunk count to stats
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (30 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 31/58] fix #3174: client: pxar: enable caching and meta comparison Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 33/58] pxar: create: keep track of reused chunks and files Christian Ebner
` (25 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Track the number of injected chunks and show them in the debug output.
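The counters follow the shared atomic pattern already used for the other
upload statistics: one Arc handle is updated from the stream closure, a
clone is read back when assembling the stats. A minimal sketch of that
pattern (names shortened, not the actual backup writer code):
```
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

fn main() {
    let injected_chunk_count = Arc::new(AtomicUsize::new(0));
    let injected_len = Arc::new(AtomicUsize::new(0));
    let (count2, len2) = (injected_chunk_count.clone(), injected_len.clone());

    // in the upload stream: account for each injected chunk
    for chunk_size in [4096usize, 8192] {
        injected_chunk_count.fetch_add(1, Ordering::SeqCst);
        injected_len.fetch_add(chunk_size, Ordering::SeqCst);
    }

    // when assembling the upload stats: read the totals back
    let chunk_injected = count2.load(Ordering::SeqCst);
    let size_injected = len2.load(Ordering::SeqCst);
    println!("injected {chunk_injected} chunks, {size_injected} bytes");
}
```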
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/backup_writer.rs | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index b2ada85cd..c22978096 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -57,8 +57,10 @@ pub struct UploadOptions {
struct UploadStats {
chunk_count: usize,
chunk_reused: usize,
+ chunk_injected: usize,
size: usize,
size_reused: usize,
+ size_injected: usize,
size_compressed: usize,
duration: std::time::Duration,
csum: [u8; 32],
@@ -355,6 +357,14 @@ impl BackupWriter {
pbs_tools::format::strip_server_file_extension(archive_name)
};
+ if upload_stats.chunk_injected > 0 {
+ log::info!(
+ "{archive}: reused {} from previous snapshot for unchanged files ({} chunks)",
+ HumanByte::from(upload_stats.size_injected),
+ upload_stats.chunk_injected,
+ );
+ }
+
if archive_name != CATALOG_NAME {
let speed: HumanByte =
((size_dirty * 1_000_000) / (upload_stats.duration.as_micros() as usize)).into();
@@ -645,6 +655,8 @@ impl BackupWriter {
let total_chunks2 = total_chunks.clone();
let known_chunk_count = Arc::new(AtomicUsize::new(0));
let known_chunk_count2 = known_chunk_count.clone();
+ let injected_chunk_count = Arc::new(AtomicUsize::new(0));
+ let injected_chunk_count2 = injected_chunk_count.clone();
let stream_len = Arc::new(AtomicUsize::new(0));
let stream_len2 = stream_len.clone();
@@ -652,6 +664,8 @@ impl BackupWriter {
let compressed_stream_len2 = compressed_stream_len.clone();
let reused_len = Arc::new(AtomicUsize::new(0));
let reused_len2 = reused_len.clone();
+ let injected_len = Arc::new(AtomicUsize::new(0));
+ let injected_len2 = injected_len.clone();
let append_chunk_path = format!("{}_index", prefix);
let upload_chunk_path = format!("{}_chunk", prefix);
@@ -672,6 +686,7 @@ impl BackupWriter {
// account for injected chunks
let count = chunks.len();
total_chunks.fetch_add(count, Ordering::SeqCst);
+ injected_chunk_count.fetch_add(count, Ordering::SeqCst);
let mut known = Vec::new();
let mut guard = index_csum.lock().unwrap();
@@ -680,6 +695,7 @@ impl BackupWriter {
let offset =
stream_len.fetch_add(chunk.size() as usize, Ordering::SeqCst) as u64;
reused_len.fetch_add(chunk.size() as usize, Ordering::SeqCst);
+ injected_len.fetch_add(chunk.size() as usize, Ordering::SeqCst);
let digest = chunk.digest();
known.push((offset, digest));
let end_offset = offset + chunk.size();
@@ -795,8 +811,10 @@ impl BackupWriter {
let duration = start_time.elapsed();
let chunk_count = total_chunks2.load(Ordering::SeqCst);
let chunk_reused = known_chunk_count2.load(Ordering::SeqCst);
+ let chunk_injected = injected_chunk_count2.load(Ordering::SeqCst);
let size = stream_len2.load(Ordering::SeqCst);
let size_reused = reused_len2.load(Ordering::SeqCst);
+ let size_injected = injected_len2.load(Ordering::SeqCst);
let size_compressed = compressed_stream_len2.load(Ordering::SeqCst) as usize;
let mut guard = index_csum_2.lock().unwrap();
@@ -805,8 +823,10 @@ impl BackupWriter {
futures::future::ok(UploadStats {
chunk_count,
chunk_reused,
+ chunk_injected,
size,
size_reused,
+ size_injected,
size_compressed,
duration,
csum,
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 33/58] pxar: create: keep track of reused chunks and files
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (31 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 32/58] client: backup writer: add injected chunk count to stats Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 34/58] pxar: create: show chunk injection stats info output Christian Ebner
` (24 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Track and log reused or reencoded files as well as the reused chunks
and their padding.
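Every regular file entry bumps one of two buckets, depending on whether
its payload could be referenced from the previous snapshot; a rough
sketch of the accounting (simplified, HEADER_SIZE is an assumption
standing in for size_of::<pxar::format::Header>()):
```
#[derive(Default, Debug)]
struct ReuseStats {
    files_reused_count: u64,
    files_reencoded_count: u64,
    total_reused_payload_size: u64,
    total_reencoded_size: u64,
}

// assumption: pxar entry header of two u64 fields
const HEADER_SIZE: u64 = 16;

fn account(stats: &mut ReuseStats, file_size: u64, payload_offset: Option<u64>) {
    if payload_offset.is_some() {
        // payload referenced from the previous snapshot
        stats.files_reused_count += 1;
        stats.total_reused_payload_size += file_size + HEADER_SIZE;
    } else {
        // payload had to be reencoded
        stats.files_reencoded_count += 1;
        stats.total_reencoded_size += file_size + HEADER_SIZE;
    }
}

fn main() {
    let mut stats = ReuseStats::default();
    account(&mut stats, 4096, Some(0)); // reused
    account(&mut stats, 1024, None); // reencoded
    println!("{stats:?}");
}
```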
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- remove chunk_list push occurrence, adapting to previous patch change
pbs-client/src/pxar/create.rs | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 6e6ce1a2b..704f58e86 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -140,6 +140,18 @@ pub(crate) struct HardLinkInfo {
st_ino: u64,
}
+#[derive(Default)]
+struct ReuseStats {
+ files_reused_count: u64,
+ files_hardlink_count: u64,
+ files_reencoded_count: u64,
+ total_injected_count: u64,
+ partial_chunks_count: u64,
+ total_injected_size: u64,
+ total_reused_payload_size: u64,
+ total_reencoded_size: u64,
+}
+
struct Archiver {
feature_flags: Flags,
fs_feature_flags: Flags,
@@ -159,6 +171,7 @@ struct Archiver {
forced_boundaries: Option<mpsc::Sender<InjectChunks>>,
previous_payload_index: Option<DynamicIndexReader>,
cache: PxarLookaheadCache,
+ reuse_stats: ReuseStats,
}
type Encoder<'a, T> = pxar::encoder::aio::Encoder<'a, T>;
@@ -252,6 +265,7 @@ where
forced_boundaries,
previous_payload_index,
cache: PxarLookaheadCache::new(None),
+ reuse_stats: ReuseStats::default(),
};
archiver
@@ -813,15 +827,24 @@ impl Archiver {
}
let offset: LinkOffset = if let Some(payload_offset) = payload_offset {
+ self.reuse_stats.total_reused_payload_size +=
+ file_size + size_of::<pxar::format::Header>() as u64;
+ self.reuse_stats.files_reused_count += 1;
+
encoder
.add_payload_ref(metadata, file_name, file_size, payload_offset)
.await?
} else {
+ self.reuse_stats.total_reencoded_size +=
+ file_size + size_of::<pxar::format::Header>() as u64;
+ self.reuse_stats.files_reencoded_count += 1;
+
self.add_regular_file(encoder, fd, file_name, metadata, file_size)
.await?
};
if stat.st_nlink > 1 {
+ self.reuse_stats.files_hardlink_count += 1;
self.hardlinks
.insert(link_info, (self.path.clone(), offset));
}
@@ -1044,6 +1067,13 @@ impl Archiver {
HumanByte::from(chunk.padding),
HumanByte::from(chunk.size()),
);
+ self.reuse_stats.total_injected_size += chunk.size();
+ self.reuse_stats.total_injected_count += 1;
+
+ if chunk.padding > 0 {
+ self.reuse_stats.partial_chunks_count += 1;
+ }
+
size = size.add(chunk.size());
}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 34/58] pxar: create: show chunk injection stats info output
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (32 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 33/58] pxar: create: keep track of reused chunks and files Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 35/58] client: backup writer: make backup info output more concise Christian Ebner
` (23 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- fix subject as the output is logged as info, not debug
- reword partial chunks to partially reused chunks
pbs-client/src/pxar/create.rs | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 704f58e86..53c8a5307 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -281,6 +281,34 @@ where
encoder.finish().await?;
encoder.close().await?;
+ if metadata_mode {
+ log::info!("Change detection summary:");
+ log::info!(
+ " - {} total files ({} hardlinks)",
+ archiver.reuse_stats.files_reused_count
+ + archiver.reuse_stats.files_reencoded_count
+ + archiver.reuse_stats.files_hardlink_count,
+ archiver.reuse_stats.files_hardlink_count,
+ );
+ log::info!(
+ " - {} unchanged, reusable files with {} data",
+ archiver.reuse_stats.files_reused_count,
+ HumanByte::from(archiver.reuse_stats.total_reused_payload_size),
+ );
+ log::info!(
+ " - {} changed or non-reusable files with {} data",
+ archiver.reuse_stats.files_reencoded_count,
+ HumanByte::from(archiver.reuse_stats.total_reencoded_size),
+ );
+ log::info!(
+ " - {} padding in {} partially reused chunks",
+ HumanByte::from(
+ archiver.reuse_stats.total_injected_size
+ - archiver.reuse_stats.total_reused_payload_size
+ ),
+ archiver.reuse_stats.partial_chunks_count,
+ );
+ }
Ok(())
}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 35/58] client: backup writer: make backup info output more concise
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (33 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 34/58] pxar: create: show chunk injection stats info output Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 36/58] client: pxar: add helper to handle optional preludes Christian Ebner
` (22 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
With the additional output in case of split pxar archives, the upload
statistics logged by the backup writer after a backup are crowded and
hard to read.
Make the output more concise by merging the current two lines per
upload stream, shown as e.g.:
```
data.ppxar: had to backup 4 MiB of 10.943 GiB (compressed 159 B) in 49.30s
data.ppxar: average backup speed: 83.09 KiB/s
```
into a single line, shown as e.g.:
```
data.ppxar: had to backup 4 MiB of 10.943 GiB (compressed 159 B) in 49.30 s (average 83.09 KiB/s)
```
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- not present in previous version
pbs-client/src/backup_writer.rs | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index c22978096..813c8d602 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -371,14 +371,9 @@ impl BackupWriter {
let size_dirty: HumanByte = size_dirty.into();
let size_compressed: HumanByte = upload_stats.size_compressed.into();
log::info!(
- "{}: had to backup {} of {} (compressed {}) in {:.2}s",
- archive,
- size_dirty,
- size,
- size_compressed,
+ "{archive}: had to backup {size_dirty} of {size} (compressed {size_compressed}) in {:.2} s (average {speed}/s)",
upload_stats.duration.as_secs_f64()
);
- log::info!("{}: average backup speed: {}/s", archive, speed);
} else {
log::info!("Uploaded backup catalog ({})", size);
}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 36/58] client: pxar: add helper to handle optional preludes
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (34 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 35/58] client: backup writer: make backup info output more concise Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 37/58] client: pxar: opt encode cli exclude patterns as Prelude Christian Ebner
` (21 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Pxar archives with format version 2 allow storing an optional file
format version entry and a prelude entry.
Cover the case for these entries: the file format version entry is
introduced to distinguish between the different file formats used for
encoding, while the prelude entry stores optional metadata such as
the pxar cli exclude parameters.
Add the logic to accept and decode these prelude entries when
accessing the archive via a decoder instance.
For now, simply ignore them.
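A call site then only has to ask the helper for the root entry and may
inspect the prelude if one was returned; a minimal usage sketch (error
handling trimmed, helper assumed in scope):
```
fn read_root<R: pxar::decoder::SeqRead>(
    decoder: &mut pxar::decoder::sync::Decoder<R>,
) -> Result<(), anyhow::Error> {
    let (root, prelude) = handle_root_with_optional_format_version_prelude(decoder)?;
    if !root.is_dir() {
        anyhow::bail!("pxar archive does not start with a directory entry");
    }
    if let Some(entry) = prelude {
        if let pxar::EntryKind::Prelude(prelude) = entry.kind() {
            log::debug!("prelude of {} bytes", prelude.data.len());
        }
    }
    Ok(())
}
```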
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- add debug log output of encountered pxar format version
pbs-client/src/pxar/create.rs | 2 +-
pbs-client/src/pxar/extract.rs | 7 +++--
pbs-client/src/pxar/tools.rs | 7 +++++
pbs-client/src/tools/mod.rs | 36 +++++++++++++++++++++++
src/api2/tape/restore.rs | 17 ++++-------
src/tape/file_formats/snapshot_archive.rs | 1 +
6 files changed, 55 insertions(+), 15 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 53c8a5307..56931dad7 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -226,7 +226,7 @@ where
}
let metadata_mode = options.previous_ref.is_some() && writers.archive.payload().is_some();
- let mut encoder = Encoder::new(writers.archive, &metadata).await?;
+ let mut encoder = Encoder::new(writers.archive, &metadata, None).await?;
let mut patterns = options.patterns;
diff --git a/pbs-client/src/pxar/extract.rs b/pbs-client/src/pxar/extract.rs
index 5f5ac6188..e22390606 100644
--- a/pbs-client/src/pxar/extract.rs
+++ b/pbs-client/src/pxar/extract.rs
@@ -29,6 +29,7 @@ use proxmox_compression::zip::{ZipEncoder, ZipEntry};
use crate::pxar::dir_stack::PxarDirStack;
use crate::pxar::metadata;
use crate::pxar::Flags;
+use crate::tools::handle_root_with_optional_format_version_prelude;
pub struct PxarExtractOptions<'a> {
pub match_list: &'a [MatchEntry],
@@ -124,9 +125,7 @@ where
// we use this to keep track of our directory-traversal
decoder.enable_goodbye_entries(true);
- let root = decoder
- .next()
- .context("found empty pxar archive")?
+ let (root, _) = handle_root_with_optional_format_version_prelude(&mut decoder)
.context("error reading pxar archive")?;
if !root.is_dir() {
@@ -267,6 +266,8 @@ where
};
let extract_res = match (did_match, entry.kind()) {
+ (_, EntryKind::Version(_version)) => Ok(()),
+ (_, EntryKind::Prelude(_prelude)) => Ok(()),
(_, EntryKind::Directory) => {
self.callback(entry.path());
diff --git a/pbs-client/src/pxar/tools.rs b/pbs-client/src/pxar/tools.rs
index 459951d50..27e5185a3 100644
--- a/pbs-client/src/pxar/tools.rs
+++ b/pbs-client/src/pxar/tools.rs
@@ -172,6 +172,13 @@ pub fn format_multi_line_entry(entry: &Entry) -> String {
let meta = entry.metadata();
let (size, link, type_name, payload_offset) = match entry.kind() {
+ EntryKind::Version(version) => (format!("{version:?}"), String::new(), "version", None),
+ EntryKind::Prelude(prelude) => (
+ "0".to_string(),
+ format!("raw data: {:?} bytes", prelude.data.len()),
+ "prelude",
+ None,
+ ),
EntryKind::File {
size,
payload_offset,
diff --git a/pbs-client/src/tools/mod.rs b/pbs-client/src/tools/mod.rs
index 6680dc475..d62b651ee 100644
--- a/pbs-client/src/tools/mod.rs
+++ b/pbs-client/src/tools/mod.rs
@@ -589,3 +589,39 @@ pub fn has_pxar_filename_extension(name: &str, with_didx_extension: bool) -> boo
name.ends_with(".pxar") || name.ends_with(".mpxar") || name.ends_with(".ppxar")
}
}
+
+/// Decode possible format version and prelude entries before getting the root directory
+/// entry.
+///
+/// Returns the root directory entry and, if present, the prelude entry
+pub fn handle_root_with_optional_format_version_prelude<R: pxar::decoder::SeqRead>(
+ decoder: &mut pxar::decoder::sync::Decoder<R>,
+) -> Result<(pxar::Entry, Option<pxar::Entry>), Error> {
+ let first = decoder
+ .next()
+ .ok_or_else(|| format_err!("missing root entry"))??;
+ match first.kind() {
+ pxar::EntryKind::Directory => {
+ let version = pxar::format::FormatVersion::Version1;
+ log::debug!("pxar format version '{version:?}'");
+ Ok((first, None))
+ }
+ pxar::EntryKind::Version(version) => {
+ log::debug!("pxar format version '{version:?}'");
+ let second = decoder
+ .next()
+ .ok_or_else(|| format_err!("missing root entry"))??;
+ match second.kind() {
+ pxar::EntryKind::Directory => Ok((second, None)),
+ pxar::EntryKind::Prelude(_prelude) => {
+ let third = decoder
+ .next()
+ .ok_or_else(|| format_err!("missing root entry"))??;
+ Ok((third, Some(second)))
+ }
+ _ => bail!("unexpected entry kind {:?}", second.kind()),
+ }
+ }
+ _ => bail!("unexpected entry kind {:?}", first.kind()),
+ }
+}
diff --git a/src/api2/tape/restore.rs b/src/api2/tape/restore.rs
index 9184ff934..382909647 100644
--- a/src/api2/tape/restore.rs
+++ b/src/api2/tape/restore.rs
@@ -23,6 +23,7 @@ use pbs_api_types::{
PRIV_DATASTORE_MODIFY, PRIV_TAPE_READ, TAPE_RESTORE_NAMESPACE_SCHEMA,
TAPE_RESTORE_SNAPSHOT_SCHEMA, UPID_SCHEMA,
};
+use pbs_client::tools::handle_root_with_optional_format_version_prelude;
use pbs_config::CachedUserInfo;
use pbs_datastore::dynamic_index::DynamicIndexReader;
use pbs_datastore::fixed_index::FixedIndexReader;
@@ -1713,17 +1714,11 @@ fn try_restore_snapshot_archive<R: pxar::decoder::SeqRead>(
decoder: &mut pxar::decoder::sync::Decoder<R>,
snapshot_path: &Path,
) -> Result<BackupManifest, Error> {
- let _root = match decoder.next() {
- None => bail!("missing root entry"),
- Some(root) => {
- let root = root?;
- match root.kind() {
- pxar::EntryKind::Directory => { /* Ok */ }
- _ => bail!("wrong root entry type"),
- }
- root
- }
- };
+ let (root, _) = handle_root_with_optional_format_version_prelude(decoder)?;
+ match root.kind() {
+ pxar::EntryKind::Directory => { /* Ok */ }
+ _ => bail!("wrong root entry type"),
+ }
let root_path = Path::new("/");
let manifest_file_name = OsStr::new(MANIFEST_BLOB_NAME);
diff --git a/src/tape/file_formats/snapshot_archive.rs b/src/tape/file_formats/snapshot_archive.rs
index 82f466980..f5a588f4e 100644
--- a/src/tape/file_formats/snapshot_archive.rs
+++ b/src/tape/file_formats/snapshot_archive.rs
@@ -61,6 +61,7 @@ pub fn tape_write_snapshot_archive<'a>(
let mut encoder = pxar::encoder::sync::Encoder::new(
pxar::PxarVariant::Unified(PxarTapeWriter::new(writer)),
&root_metadata,
+ None,
)?;
for filename in file_list.iter() {
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 37/58] client: pxar: opt encode cli exclude patterns as Prelude
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (35 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 36/58] client: pxar: add helper to handle optional preludes Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 38/58] client: pxar: allow to restore prelude to optional path Christian Ebner
` (20 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Instead of encoding the pxar cli exclude patterns as a regular file
within the root directory of an archive, store this information
directly after the pxar format version entry in the entry of kind
Prelude.
This behavior is currently exclusive to archives written with format
version 2 in the split metadata and payload case.
This is a breaking change for the encoding of new cli exclude
parameters: new exclude parameters are no longer added to an already
present .pxarexclude-cli file, and the file is no longer created if
not present.
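A decoder of such a split archive consequently sees the entries in the
order version, prelude, root directory; a small illustration of
classifying them (not part of this patch):
```
fn describe(entry: &pxar::Entry) -> &'static str {
    match entry.kind() {
        // first entry of a format version 2 archive
        pxar::EntryKind::Version(_) => "format version",
        // carries the pxar cli exclude parameters
        pxar::EntryKind::Prelude(_) => "prelude",
        // directories, including the archive root
        pxar::EntryKind::Directory => "directory",
        _ => "other archive entry",
    }
}
```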
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/pxar/create.rs | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 56931dad7..eadd670df 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -225,9 +225,6 @@ where
set.insert(stat.st_dev);
}
- let metadata_mode = options.previous_ref.is_some() && writers.archive.payload().is_some();
- let mut encoder = Encoder::new(writers.archive, &metadata, None).await?;
-
let mut patterns = options.patterns;
if options.skip_lost_and_found {
@@ -237,6 +234,15 @@ where
MatchType::Exclude,
)?);
}
+
+ let cli_params_content = generate_pxar_excludes_cli(&patterns[..]);
+ let cli_params = if options.previous_ref.is_some() {
+ Some(cli_params_content.as_slice())
+ } else {
+ None
+ };
+
+ let metadata_mode = options.previous_ref.is_some() && writers.archive.payload().is_some();
let (previous_payload_index, previous_metadata_accessor) =
if let Some(refs) = options.previous_ref {
(
@@ -247,6 +253,8 @@ where
(None, None)
};
+ let mut encoder = Encoder::new(writers.archive, &metadata, cli_params).await?;
+
let mut archiver = Archiver {
feature_flags,
fs_feature_flags,
@@ -348,7 +356,7 @@ impl Archiver {
let mut file_list = self.generate_directory_file_list(&mut dir, is_root)?;
- if is_root && old_patterns_count > 0 {
+ if is_root && old_patterns_count > 0 && previous_metadata_accessor.is_none() {
file_list.push(FileListEntry {
name: CString::new(".pxarexclude-cli").unwrap(),
path: PathBuf::new(),
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 38/58] client: pxar: allow to restore prelude to optional path
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (36 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 37/58] client: pxar: opt encode cli exclude patterns as Prelude Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 39/58] pxar: bin: show padding in debug output on archive list Christian Ebner
` (19 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Since pxar format version 2, pxar archives allow storing additional
information in a prelude entry.
Add an optional parameter to `pxar` and `proxmox-backup-client` to
specify the path to restore the prelude to and pass this on to the
archive extraction by extending `PxarExtractOptions` with a
corresponding field. If none is given, the prelude is simply skipped
during restore.
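Example invocation (snapshot and paths hypothetical):
`proxmox-backup-client restore host/backup/2024-06-05T10:53:00Z root.mpxar /restore/target --prelude-target /restore/pxarexclude-cli`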
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- add missing file completion
pbs-client/src/pxar/extract.rs | 23 +++++++++++++++++++++--
proxmox-backup-client/src/main.rs | 15 +++++++++++++--
pxar-bin/src/main.rs | 9 ++++++++-
3 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/pbs-client/src/pxar/extract.rs b/pbs-client/src/pxar/extract.rs
index e22390606..99c0d0e10 100644
--- a/pbs-client/src/pxar/extract.rs
+++ b/pbs-client/src/pxar/extract.rs
@@ -2,7 +2,8 @@
use std::collections::HashMap;
use std::ffi::{CStr, CString, OsStr, OsString};
-use std::io;
+use std::fs::OpenOptions;
+use std::io::{self, Write};
use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::{AsRawFd, FromRawFd, RawFd};
use std::path::{Path, PathBuf};
@@ -37,6 +38,7 @@ pub struct PxarExtractOptions<'a> {
pub allow_existing_dirs: bool,
pub overwrite_flags: OverwriteFlags,
pub on_error: Option<ErrorHandler>,
+ pub prelude_path: Option<PathBuf>,
}
bitflags! {
@@ -125,9 +127,26 @@ where
// we use this to keep track of our directory-traversal
decoder.enable_goodbye_entries(true);
- let (root, _) = handle_root_with_optional_format_version_prelude(&mut decoder)
+ let (root, prelude) = handle_root_with_optional_format_version_prelude(&mut decoder)
.context("error reading pxar archive")?;
+ if let Some(ref path) = options.prelude_path {
+ if let Some(entry) = prelude {
+ let mut prelude_file = OpenOptions::new()
+ .create(true)
+ .write(true)
+ .open(path)
+ .with_context(|| format!("error creating prelude file '{path:?}'"))?;
+ if let pxar::EntryKind::Prelude(ref prelude) = entry.kind() {
+ prelude_file.write_all(prelude.as_os_str().as_bytes())?;
+ } else {
+ log::info!("unexpected entry kind for prelude");
+ }
+ } else {
+ log::info!("No prelude entry found, skip prelude restore.");
+ }
+ }
+
if !root.is_dir() {
bail!("pxar archive does not start with a directory entry!");
}
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 37412b154..b4d01ed3f 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -1441,7 +1441,12 @@ We do not extract '.pxar' archives when writing to standard output.
description: "ignore errors that occur during device node extraction",
optional: true,
default: false,
- }
+ },
+ "prelude-target": {
+ description: "Path to restore prelude to, (pxar v2 archives only).",
+ type: String,
+ optional: true,
+ },
}
}
)]
@@ -1603,12 +1608,17 @@ async fn restore(
overwrite_flags.insert(pbs_client::pxar::OverwriteFlags::all());
}
+ let prelude_path = param["prelude-target"]
+ .as_str()
+ .map(|path| PathBuf::from(path));
+
let options = pbs_client::pxar::PxarExtractOptions {
match_list: &[],
extract_match_default: true,
allow_existing_dirs,
overwrite_flags,
on_error,
+ prelude_path,
};
let mut feature_flags = pbs_client::pxar::Flags::DEFAULT;
@@ -1936,7 +1946,8 @@ fn main() {
.completion_cb("ns", complete_namespace)
.completion_cb("snapshot", complete_group_or_snapshot)
.completion_cb("archive-name", complete_archive_name)
- .completion_cb("target", complete_file_name);
+ .completion_cb("target", complete_file_name)
+ .completion_cb("prelude-target", complete_file_name);
let prune_cmd_def = CliCommand::new(&API_METHOD_PRUNE)
.arg_param(&["group"])
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index ecb617d65..bb57cf374 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -130,6 +130,10 @@ fn extract_archive_from_reader<R: std::io::Read>(
description: "'ppxar' payload input data file to restore split archive.",
optional: true,
},
+ "prelude-target": {
+ description: "Path to restore pxar archive prelude to.",
+ optional: true,
+ },
},
},
)]
@@ -153,6 +157,7 @@ fn extract_archive(
no_sockets: bool,
strict: bool,
payload_input: Option<String>,
+ prelude_target: Option<String>,
) -> Result<(), Error> {
let mut feature_flags = Flags::DEFAULT;
if no_xattrs {
@@ -226,6 +231,7 @@ fn extract_archive(
overwrite_flags,
extract_match_default,
on_error,
+ prelude_path: prelude_target.map(|path| PathBuf::from(path)),
};
if archive == "-" {
@@ -507,7 +513,8 @@ fn main() {
.completion_cb("archive", complete_file_name)
.completion_cb("target", complete_file_name)
.completion_cb("files-from", complete_file_name)
- .completion_cb("payload-input", complete_file_name),
+ .completion_cb("payload-input", complete_file_name)
+ .completion_cb("prelude-target", complete_file_name),
)
.insert(
"mount",
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 39/58] pxar: bin: show padding in debug output on archive list
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (37 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 38/58] client: pxar: allow to restore prelude to optional path Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 40/58] pxar: bin: ignore version and prelude entries in listing Christian Ebner
` (18 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
In addition to the entries, also show the padding encountered between
referenced payloads.
Example invocation: `PXAR_LOG=debug pxar list archive.mpxar`
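The padding is derived from consecutive payload references: the next
expected offset is the previous offset plus payload size plus one entry
header, anything beyond that was skipped. A self-contained sketch of
the calculation (HEADER_SIZE is an assumption standing in for
size_of::<pxar::format::Header>()):
```
// assumption: pxar entry header of two u64 fields
const HEADER_SIZE: u64 = 16;

fn main() {
    // (payload_offset, size) pairs as found in a metadata archive
    let entries = [(0u64, 100u64), (200, 50)];
    let mut last = None;
    for (offset, size) in entries {
        if let Some(last) = last {
            let skipped = offset - last;
            if skipped > 0 {
                println!("Encountered padding of {skipped} bytes");
            }
        }
        last = Some(offset + size + HEADER_SIZE);
    }
    // prints: Encountered padding of 84 bytes (200 - (0 + 100 + 16))
}
```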
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- prefix subject with `pxar: bin` instead of `pxar` only
- move use statement to correct location
pxar-bin/src/main.rs | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index bb57cf374..52bb1ca97 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -16,6 +16,7 @@ use pbs_client::pxar::{
format_single_line_entry, Flags, OverwriteFlags, PxarExtractOptions, PxarWriters,
ENCODER_MAX_ENTRIES,
};
+use pxar::EntryKind;
use proxmox_router::cli::*;
use proxmox_schema::api;
@@ -483,10 +484,28 @@ fn dump_archive(archive: String, payload_input: Option<String>) -> Result<(), Er
pxar::PxarVariant::Unified(archive)
};
+ let mut last = None;
for entry in pxar::decoder::Decoder::open(input)? {
let entry = entry?;
if log::log_enabled!(log::Level::Debug) {
+ match entry.kind() {
+ EntryKind::File {
+ payload_offset: Some(offset),
+ size,
+ ..
+ } => {
+ if let Some(last) = last {
+ let skipped = offset - last;
+ if skipped > 0 {
+ log::debug!("Encountered padding of {skipped} bytes");
+ }
+ }
+ last = Some(offset + size + std::mem::size_of::<pxar::format::Header>() as u64);
+ }
+ _ => (),
+ }
+
log::debug!("{}", format_single_line_entry(&entry));
} else {
log::info!("{:?}", entry.path());
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 40/58] pxar: bin: ignore version and prelude entries in listing
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (38 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 39/58] pxar: bin: show padding in debug output on archive list Christian Ebner
@ 2024-06-05 10:53 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 42/58] pxar: bin: support creation of split pxar archives via cli Christian Ebner
` (17 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:53 UTC (permalink / raw)
To: pbs-devel
Do not list the pxar format version and the prelude entries in the
output of `pxar list`, as these are not regular entries. Do include
them, however, when dumping with the debug environment variable set.
Since the prelude is arbitrary in size, only show the content size.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- prefix subject with `pxar: bin` instead of `pxar` only
- do show entries when the debug environment variable is set
pxar-bin/Cargo.toml | 1 +
pxar-bin/src/main.rs | 14 +++++++++++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/pxar-bin/Cargo.toml b/pxar-bin/Cargo.toml
index d91c03d3e..bb010ff78 100644
--- a/pxar-bin/Cargo.toml
+++ b/pxar-bin/Cargo.toml
@@ -20,6 +20,7 @@ pathpatterns.workspace = true
pxar.workspace = true
proxmox-async.workspace = true
+proxmox-human-byte.workspace = true
proxmox-router = { workspace = true, features = ["cli", "server"] }
proxmox-schema = { workspace = true, features = [ "api-macro" ] }
proxmox-sys.workspace = true
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 52bb1ca97..7ea5b114a 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -18,6 +18,7 @@ use pbs_client::pxar::{
};
use pxar::EntryKind;
+use proxmox_human_byte::HumanByte;
use proxmox_router::cli::*;
use proxmox_schema::api;
@@ -490,6 +491,14 @@ fn dump_archive(archive: String, payload_input: Option<String>) -> Result<(), Er
if log::log_enabled!(log::Level::Debug) {
match entry.kind() {
+ EntryKind::Version(version) => {
+ log::debug!("pxar format version '{version:?}'");
+ continue;
+ }
+ EntryKind::Prelude(prelude) => {
+ log::debug!("prelude of size {}", HumanByte::from(prelude.data.len()));
+ continue;
+ }
EntryKind::File {
payload_offset: Some(offset),
size,
@@ -508,7 +517,10 @@ fn dump_archive(archive: String, payload_input: Option<String>) -> Result<(), Er
log::debug!("{}", format_single_line_entry(&entry));
} else {
- log::info!("{:?}", entry.path());
+ match entry.kind() {
+ EntryKind::Version(_) | EntryKind::Prelude(_) => continue,
+ _ => log::info!("{:?}", entry.path()),
+ }
}
}
Ok(())
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 42/58] pxar: bin: support creation of split pxar archives via cli
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (39 preceding siblings ...)
2024-06-05 10:53 ` [pbs-devel] [PATCH v9 proxmox-backup 40/58] pxar: bin: ignore version and prelude entries in listing Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 43/58] pxar: add optional payload input to mount archive Christian Ebner
` (16 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Add support for creating split pxar archives by redirecting the
payload output to a dedicated file.
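Example invocation (file names hypothetical):
`pxar create archive.mpxar /path/to/source --payload-output archive.ppxar`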
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- not present in previous version
pxar-bin/src/main.rs | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 7ea5b114a..638ac00b6 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -328,6 +328,10 @@ fn extract_archive(
minimum: 0,
maximum: isize::MAX,
},
+ "payload-output": {
+ description: "'ppxar' payload output data file to create split archive.",
+ optional: true,
+ },
},
},
)]
@@ -345,6 +349,7 @@ async fn create_archive(
no_sockets: bool,
exclude: Option<Vec<String>>,
entries_max: isize,
+ payload_output: Option<String>,
) -> Result<(), Error> {
let patterns = {
let input = exclude.unwrap_or_default();
@@ -387,6 +392,16 @@ async fn create_archive(
.mode(0o640)
.open(archive)?;
+ let payload_file = payload_output
+ .map(|payload_output| {
+ OpenOptions::new()
+ .create_new(true)
+ .write(true)
+ .mode(0o640)
+ .open(payload_output)
+ })
+ .transpose()?;
+
let writer = std::io::BufWriter::with_capacity(1024 * 1024, file);
let mut feature_flags = Flags::DEFAULT;
if no_xattrs {
@@ -408,7 +423,15 @@ async fn create_archive(
feature_flags.remove(Flags::WITH_SOCKETS);
}
- let writer = pxar::PxarVariant::Unified(pxar::encoder::sync::StandardWriter::new(writer));
+ let writer = if let Some(payload_file) = payload_file {
+ let payload_writer = std::io::BufWriter::with_capacity(1024 * 1024, payload_file);
+ pxar::PxarVariant::Split(
+ pxar::encoder::sync::StandardWriter::new(writer),
+ pxar::encoder::sync::StandardWriter::new(payload_writer),
+ )
+ } else {
+ pxar::PxarVariant::Unified(pxar::encoder::sync::StandardWriter::new(writer))
+ };
pbs_client::pxar::create_archive(
dir,
PxarWriters::new(writer, None),
@@ -535,7 +558,8 @@ fn main() {
CliCommand::new(&API_METHOD_CREATE_ARCHIVE)
.arg_param(&["archive", "source"])
.completion_cb("archive", complete_file_name)
- .completion_cb("source", complete_file_name),
+ .completion_cb("source", complete_file_name)
+ .completion_cb("payload-output", complete_file_name),
)
.insert(
"extract",
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 43/58] pxar: add optional payload input to mount archive
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (40 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 42/58] pxar: bin: support creation of split pxar archives via cli Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 44/58] datastore: chunker: add Chunker trait Christian Ebner
` (15 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Allow passing an optional input path to mount a split pxar archive
with a dedicated payload data file.
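Example invocation (file names hypothetical):
`pxar mount archive.mpxar /mnt/archive --payload-input archive.ppxar`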
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- not present in previous version
pbs-pxar-fuse/src/lib.rs | 14 +++++++++++++-
pxar-bin/src/main.rs | 27 ++++++++++++++++++++++-----
2 files changed, 35 insertions(+), 6 deletions(-)
diff --git a/pbs-pxar-fuse/src/lib.rs b/pbs-pxar-fuse/src/lib.rs
index 377635b2a..780a4ddbe 100644
--- a/pbs-pxar-fuse/src/lib.rs
+++ b/pbs-pxar-fuse/src/lib.rs
@@ -61,12 +61,24 @@ impl Session {
options: &OsStr,
verbose: bool,
mountpoint: &Path,
+ payload_input_path: Option<&Path>,
) -> Result<Self, Error> {
// TODO: Add a buffered/caching ReadAt layer?
let file = std::fs::File::open(archive_path)?;
let file_size = file.metadata()?.len();
let reader: Reader = Arc::new(accessor::sync::FileReader::new(file));
- let accessor = Accessor::new(pxar::PxarVariant::Unified(reader), file_size).await?;
+ let accessor = if let Some(payload_input) = payload_input_path {
+ let payload_file = std::fs::File::open(payload_input)?;
+ let payload_size = payload_file.metadata()?.len();
+ let payload_reader: Reader = Arc::new(accessor::sync::FileReader::new(payload_file));
+ Accessor::new(
+ pxar::PxarVariant::Split(reader, (payload_reader, payload_size)),
+ file_size,
+ )
+ .await?
+ } else {
+ Accessor::new(pxar::PxarVariant::Unified(reader), file_size).await?
+ };
Self::mount(accessor, options, verbose, mountpoint)
}
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 638ac00b6..85887a8ed 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -458,18 +458,34 @@ async fn create_archive(
optional: true,
default: false,
},
+ "payload-input": {
+ description: "'ppxar' payload input data file to restore split archive.",
+ optional: true,
+ },
},
},
)]
/// Mount the archive to the provided mountpoint via FUSE.
-async fn mount_archive(archive: String, mountpoint: String, verbose: bool) -> Result<(), Error> {
+async fn mount_archive(
+ archive: String,
+ mountpoint: String,
+ verbose: bool,
+ payload_input: Option<String>,
+) -> Result<(), Error> {
let archive = Path::new(&archive);
let mountpoint = Path::new(&mountpoint);
let options = OsStr::new("ro,default_permissions");
+ let payload_input = payload_input.map(|payload_input| PathBuf::from(payload_input));
- let session = pbs_pxar_fuse::Session::mount_path(archive, options, verbose, mountpoint)
- .await
- .map_err(|err| format_err!("pxar mount failed: {}", err))?;
+ let session = pbs_pxar_fuse::Session::mount_path(
+ archive,
+ options,
+ verbose,
+ mountpoint,
+ payload_input.as_deref(),
+ )
+ .await
+ .map_err(|err| format_err!("pxar mount failed: {}", err))?;
let mut interrupt = signal(SignalKind::interrupt())?;
@@ -576,7 +592,8 @@ fn main() {
CliCommand::new(&API_METHOD_MOUNT_ARCHIVE)
.arg_param(&["archive", "mountpoint"])
.completion_cb("archive", complete_file_name)
- .completion_cb("mountpoint", complete_file_name),
+ .completion_cb("mountpoint", complete_file_name)
+ .completion_cb("payload-input", complete_file_name),
)
.insert(
"list",
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 44/58] datastore: chunker: add Chunker trait
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (41 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 43/58] pxar: add optional payload input to mount archive Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 45/58] datastore: chunker: implement chunker for payload stream Christian Ebner
` (14 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Add the Chunker trait and rename the current Chunker to ChunkerImpl,
which implements the trait instead. This allows using different
chunker implementations via dynamic dispatch and is in preparation
for a dedicated payload chunker.
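With the trait in place, a consumer such as the chunk stream can hold a
`Box<dyn Chunker + Send>` and swap implementations freely; a toy
fixed-size implementation as an illustration (trait and context as
introduced by this patch, the toy chunker is not part of it):
```
#[derive(Default)]
pub struct Context {
    pub base: u64,
    pub total: u64,
}

pub trait Chunker {
    fn scan(&mut self, data: &[u8], ctx: &Context) -> usize;
    fn reset(&mut self);
}

// toy illustration: emit a boundary every `size` bytes, ignoring the context
struct FixedSizeChunker {
    size: usize,
    filled: usize,
}

impl Chunker for FixedSizeChunker {
    fn scan(&mut self, data: &[u8], _ctx: &Context) -> usize {
        let missing = self.size - self.filled;
        if data.len() >= missing {
            self.filled = 0;
            missing // boundary position within `data`
        } else {
            self.filled += data.len();
            0 // no boundary yet, feed more data
        }
    }

    fn reset(&mut self) {
        self.filled = 0;
    }
}

fn main() {
    let mut chunker: Box<dyn Chunker + Send> =
        Box::new(FixedSizeChunker { size: 4, filled: 0 });
    let ctx = Context::default();
    assert_eq!(chunker.scan(&[0u8; 3], &ctx), 0);
    assert_eq!(chunker.scan(&[0u8; 3], &ctx), 1); // boundary after 4th byte
}
```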
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
examples/test_chunk_size.rs | 9 +--
examples/test_chunk_speed.rs | 7 ++-
pbs-client/src/chunk_stream.rs | 37 ++++++------
pbs-datastore/src/chunker.rs | 95 ++++++++++++++++++------------
pbs-datastore/src/dynamic_index.rs | 9 +--
pbs-datastore/src/lib.rs | 2 +-
6 files changed, 91 insertions(+), 68 deletions(-)
diff --git a/examples/test_chunk_size.rs b/examples/test_chunk_size.rs
index a01a5e640..2ebc22f64 100644
--- a/examples/test_chunk_size.rs
+++ b/examples/test_chunk_size.rs
@@ -5,10 +5,10 @@ extern crate proxmox_backup;
use anyhow::Error;
use std::io::{Read, Write};
-use pbs_datastore::Chunker;
+use pbs_datastore::{Chunker, ChunkerImpl};
struct ChunkWriter {
- chunker: Chunker,
+ chunker: ChunkerImpl,
last_chunk: usize,
chunk_offset: usize,
@@ -23,7 +23,7 @@ struct ChunkWriter {
impl ChunkWriter {
fn new(chunk_size: usize) -> Self {
ChunkWriter {
- chunker: Chunker::new(chunk_size),
+ chunker: ChunkerImpl::new(chunk_size),
last_chunk: 0,
chunk_offset: 0,
chunk_count: 0,
@@ -69,7 +69,8 @@ impl Write for ChunkWriter {
fn write(&mut self, data: &[u8]) -> std::result::Result<usize, std::io::Error> {
let chunker = &mut self.chunker;
- let pos = chunker.scan(data);
+ let ctx = pbs_datastore::chunker::Context::default();
+ let pos = chunker.scan(data, &ctx);
if pos > 0 {
self.chunk_offset += pos;
diff --git a/examples/test_chunk_speed.rs b/examples/test_chunk_speed.rs
index 37e13e0de..2d79604ab 100644
--- a/examples/test_chunk_speed.rs
+++ b/examples/test_chunk_speed.rs
@@ -1,6 +1,6 @@
extern crate proxmox_backup;
-use pbs_datastore::Chunker;
+use pbs_datastore::{Chunker, ChunkerImpl};
fn main() {
let mut buffer = Vec::new();
@@ -12,7 +12,7 @@ fn main() {
buffer.push(byte);
}
}
- let mut chunker = Chunker::new(64 * 1024);
+ let mut chunker = ChunkerImpl::new(64 * 1024);
let count = 5;
@@ -23,8 +23,9 @@ fn main() {
for _i in 0..count {
let mut pos = 0;
let mut _last = 0;
+ let ctx = pbs_datastore::chunker::Context::default();
while pos < buffer.len() {
- let k = chunker.scan(&buffer[pos..]);
+ let k = chunker.scan(&buffer[pos..], &ctx);
if k == 0 {
//println!("LAST {}", pos);
break;
diff --git a/pbs-client/src/chunk_stream.rs b/pbs-client/src/chunk_stream.rs
index 87a018d50..84158a2c9 100644
--- a/pbs-client/src/chunk_stream.rs
+++ b/pbs-client/src/chunk_stream.rs
@@ -7,7 +7,7 @@ use bytes::BytesMut;
use futures::ready;
use futures::stream::{Stream, TryStream};
-use pbs_datastore::Chunker;
+use pbs_datastore::{Chunker, ChunkerImpl};
use crate::inject_reused_chunks::InjectChunks;
@@ -16,7 +16,6 @@ pub struct InjectionData {
boundaries: mpsc::Receiver<InjectChunks>,
next_boundary: Option<InjectChunks>,
injections: mpsc::Sender<InjectChunks>,
- consumed: u64,
}
impl InjectionData {
@@ -28,7 +27,6 @@ impl InjectionData {
boundaries,
next_boundary: None,
injections,
- consumed: 0,
}
}
}
@@ -36,19 +34,22 @@ impl InjectionData {
/// Split input stream into dynamic sized chunks
pub struct ChunkStream<S: Unpin> {
input: S,
- chunker: Chunker,
+ chunker: Box<dyn Chunker + Send>,
buffer: BytesMut,
scan_pos: usize,
+ consumed: u64,
injection_data: Option<InjectionData>,
}
impl<S: Unpin> ChunkStream<S> {
pub fn new(input: S, chunk_size: Option<usize>, injection_data: Option<InjectionData>) -> Self {
+ let chunk_size = chunk_size.unwrap_or(4 * 1024 * 1024);
Self {
input,
- chunker: Chunker::new(chunk_size.unwrap_or(4 * 1024 * 1024)),
+ chunker: Box::new(ChunkerImpl::new(chunk_size)),
buffer: BytesMut::new(),
scan_pos: 0,
+ consumed: 0,
injection_data,
}
}
@@ -68,11 +69,15 @@ where
let this = self.get_mut();
loop {
+ let ctx = pbs_datastore::chunker::Context {
+ base: this.consumed,
+ total: this.buffer.len() as u64,
+ };
+
if let Some(InjectionData {
boundaries,
next_boundary,
injections,
- consumed,
}) = this.injection_data.as_mut()
{
if next_boundary.is_none() {
@@ -84,29 +89,29 @@ where
if let Some(inject) = next_boundary.take() {
// require forced boundary, lookup next regular boundary
let pos = if this.scan_pos < this.buffer.len() {
- this.chunker.scan(&this.buffer[this.scan_pos..])
+ this.chunker.scan(&this.buffer[this.scan_pos..], &ctx)
} else {
0
};
let chunk_boundary = if pos == 0 {
- *consumed + this.buffer.len() as u64
+ this.consumed + this.buffer.len() as u64
} else {
- *consumed + (this.scan_pos + pos) as u64
+ this.consumed + (this.scan_pos + pos) as u64
};
if inject.boundary <= chunk_boundary {
// forced boundary is before next boundary, force within current buffer
- let chunk_size = (inject.boundary - *consumed) as usize;
+ let chunk_size = (inject.boundary - this.consumed) as usize;
let raw_chunk = this.buffer.split_to(chunk_size);
this.chunker.reset();
this.scan_pos = 0;
- *consumed += chunk_size as u64;
+ this.consumed += chunk_size as u64;
// add the size of the injected chunks to consumed, so chunk stream offsets
// are in sync with the rest of the archive.
- *consumed += inject.size as u64;
+ this.consumed += inject.size as u64;
injections.send(inject).unwrap();
@@ -118,7 +123,7 @@ where
// forced boundary is after next boundary, split off chunk from buffer
let chunk_size = this.scan_pos + pos;
let raw_chunk = this.buffer.split_to(chunk_size);
- *consumed += chunk_size as u64;
+ this.consumed += chunk_size as u64;
this.scan_pos = 0;
return Poll::Ready(Some(Ok(raw_chunk)));
@@ -131,7 +136,7 @@ where
}
if this.scan_pos < this.buffer.len() {
- let boundary = this.chunker.scan(&this.buffer[this.scan_pos..]);
+ let boundary = this.chunker.scan(&this.buffer[this.scan_pos..], &ctx);
let chunk_size = this.scan_pos + boundary;
@@ -140,9 +145,7 @@ where
} else if chunk_size <= this.buffer.len() {
// found new chunk boundary inside buffer, split off chunk from buffer
let raw_chunk = this.buffer.split_to(chunk_size);
- if let Some(InjectionData { consumed, .. }) = this.injection_data.as_mut() {
- *consumed += chunk_size as u64;
- }
+ this.consumed += chunk_size as u64;
this.scan_pos = 0;
return Poll::Ready(Some(Ok(raw_chunk)));
} else {
diff --git a/pbs-datastore/src/chunker.rs b/pbs-datastore/src/chunker.rs
index 253d2cf4c..d75e63fa8 100644
--- a/pbs-datastore/src/chunker.rs
+++ b/pbs-datastore/src/chunker.rs
@@ -5,6 +5,20 @@
/// use hash value 0 to detect a boundary.
const CA_CHUNKER_WINDOW_SIZE: usize = 64;
+/// Additional context for chunker to find possible boundaries in payload streams
+#[derive(Default)]
+pub struct Context {
+ /// Already consumed bytes of the chunk stream consumer
+ pub base: u64,
+ /// Total size currently buffered
+ pub total: u64,
+}
+
+pub trait Chunker {
+ fn scan(&mut self, data: &[u8], ctx: &Context) -> usize;
+ fn reset(&mut self);
+}
+
/// Sliding window chunker (Buzhash)
///
/// This is a rewrite of *casync* chunker (cachunker.h) in rust.
@@ -15,7 +29,7 @@ const CA_CHUNKER_WINDOW_SIZE: usize = 64;
/// Hash](https://en.wikipedia.org/wiki/Rolling_hash) article from
/// Wikipedia.
-pub struct Chunker {
+pub struct ChunkerImpl {
h: u32,
window_size: usize,
chunk_size: usize,
@@ -67,7 +81,7 @@ const BUZHASH_TABLE: [u32; 256] = [
0x5eff22f4, 0x6027f4cc, 0x77178b3c, 0xae507131, 0x7bf7cabc, 0xf9c18d66, 0x593ade65, 0xd95ddf11,
];
-impl Chunker {
+impl ChunkerImpl {
/// Create a new Chunker instance, which produces and average
/// chunk size of `chunk_size_avg` (need to be a power of two). We
/// allow variation from `chunk_size_avg/4` up to a maximum of
@@ -105,11 +119,44 @@ impl Chunker {
}
}
+ // fast implementation avoiding modulo
+ // #[inline(always)]
+ fn shall_break(&self) -> bool {
+ if self.chunk_size >= self.chunk_size_max {
+ return true;
+ }
+
+ if self.chunk_size < self.chunk_size_min {
+ return false;
+ }
+
+ //(self.h & 0x1ffff) <= 2 //THIS IS SLOW!!!
+
+ //(self.h & self.break_test_mask) <= 2 // Bad on 0 streams
+
+ (self.h & self.break_test_mask) >= self.break_test_minimum
+ }
+
+ // This is the original implementation from casync
+ /*
+ #[inline(always)]
+ fn shall_break_orig(&self) -> bool {
+
+ if self.chunk_size >= self.chunk_size_max { return true; }
+
+ if self.chunk_size < self.chunk_size_min { return false; }
+
+ (self.h % self.discriminator) == (self.discriminator - 1)
+ }
+ */
+}
+
+impl Chunker for ChunkerImpl {
/// Scans the specified data for a chunk border. Returns 0 if none
/// was found (and the function should be called with more data
/// later on), or another value indicating the position of a
/// border.
- pub fn scan(&mut self, data: &[u8]) -> usize {
+ fn scan(&mut self, data: &[u8], _ctx: &Context) -> usize {
let window_len = self.window.len();
let data_len = data.len();
@@ -167,42 +214,11 @@ impl Chunker {
0
}
- pub fn reset(&mut self) {
+ fn reset(&mut self) {
self.h = 0;
self.chunk_size = 0;
self.window_size = 0;
}
-
- // fast implementation avoiding modulo
- // #[inline(always)]
- fn shall_break(&self) -> bool {
- if self.chunk_size >= self.chunk_size_max {
- return true;
- }
-
- if self.chunk_size < self.chunk_size_min {
- return false;
- }
-
- //(self.h & 0x1ffff) <= 2 //THIS IS SLOW!!!
-
- //(self.h & self.break_test_mask) <= 2 // Bad on 0 streams
-
- (self.h & self.break_test_mask) >= self.break_test_minimum
- }
-
- // This is the original implementation from casync
- /*
- #[inline(always)]
- fn shall_break_orig(&self) -> bool {
-
- if self.chunk_size >= self.chunk_size_max { return true; }
-
- if self.chunk_size < self.chunk_size_min { return false; }
-
- (self.h % self.discriminator) == (self.discriminator - 1)
- }
- */
}
#[test]
@@ -215,17 +231,18 @@ fn test_chunker1() {
buffer.push(byte);
}
}
- let mut chunker = Chunker::new(64 * 1024);
+ let mut chunker = ChunkerImpl::new(64 * 1024);
let mut pos = 0;
let mut last = 0;
let mut chunks1: Vec<(usize, usize)> = vec![];
let mut chunks2: Vec<(usize, usize)> = vec![];
+ let ctx = Context::default();
// test1: feed single bytes
while pos < buffer.len() {
- let k = chunker.scan(&buffer[pos..pos + 1]);
+ let k = chunker.scan(&buffer[pos..pos + 1], &ctx);
pos += 1;
if k != 0 {
let prev = last;
@@ -235,13 +252,13 @@ fn test_chunker1() {
}
chunks1.push((last, buffer.len() - last));
- let mut chunker = Chunker::new(64 * 1024);
+ let mut chunker = ChunkerImpl::new(64 * 1024);
let mut pos = 0;
// test2: feed with whole buffer
while pos < buffer.len() {
- let k = chunker.scan(&buffer[pos..]);
+ let k = chunker.scan(&buffer[pos..], &ctx);
if k != 0 {
chunks2.push((pos, k));
pos += k;
diff --git a/pbs-datastore/src/dynamic_index.rs b/pbs-datastore/src/dynamic_index.rs
index b8047b5b1..dc9eee050 100644
--- a/pbs-datastore/src/dynamic_index.rs
+++ b/pbs-datastore/src/dynamic_index.rs
@@ -23,7 +23,7 @@ use crate::data_blob::{DataBlob, DataChunkBuilder};
use crate::file_formats;
use crate::index::{ChunkReadInfo, IndexFile};
use crate::read_chunk::ReadChunk;
-use crate::Chunker;
+use crate::{Chunker, ChunkerImpl};
/// Header format definition for dynamic index files (`.dixd`)
#[repr(C)]
@@ -397,7 +397,7 @@ impl DynamicIndexWriter {
pub struct DynamicChunkWriter {
index: DynamicIndexWriter,
closed: bool,
- chunker: Chunker,
+ chunker: ChunkerImpl,
stat: ChunkStat,
chunk_offset: usize,
last_chunk: usize,
@@ -409,7 +409,7 @@ impl DynamicChunkWriter {
Self {
index,
closed: false,
- chunker: Chunker::new(chunk_size),
+ chunker: ChunkerImpl::new(chunk_size),
stat: ChunkStat::new(0),
chunk_offset: 0,
last_chunk: 0,
@@ -494,7 +494,8 @@ impl Write for DynamicChunkWriter {
fn write(&mut self, data: &[u8]) -> std::result::Result<usize, std::io::Error> {
let chunker = &mut self.chunker;
- let pos = chunker.scan(data);
+ let ctx = crate::chunker::Context::default();
+ let pos = chunker.scan(data, &ctx);
if pos > 0 {
self.chunk_buffer.extend_from_slice(&data[0..pos]);
diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
index 43050162f..24429626c 100644
--- a/pbs-datastore/src/lib.rs
+++ b/pbs-datastore/src/lib.rs
@@ -196,7 +196,7 @@ pub use backup_info::{BackupDir, BackupGroup, BackupInfo};
pub use checksum_reader::ChecksumReader;
pub use checksum_writer::ChecksumWriter;
pub use chunk_store::ChunkStore;
-pub use chunker::Chunker;
+pub use chunker::{Chunker, ChunkerImpl};
pub use crypt_reader::CryptReader;
pub use crypt_writer::CryptWriter;
pub use data_blob::DataBlob;
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 45/58] datastore: chunker: implement chunker for payload stream
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (42 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 44/58] datastore: chunker: add Chunker trait Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 46/58] chunker: tests: add regression tests for payload chunker Christian Ebner
` (13 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Implement the Chunker trait for a dedicated payload stream chunker,
which extends the regular chunker with the option to suggest
boundaries to be used instead of the hash based boundaries whenever
possible.
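To illustrate the intended use, a minimal sketch of a consumer driving
the new chunker (module paths and the crate-internal API are assumed
from the hunks below; the suggested offset and buffer are made up):

    use std::sync::mpsc;

    use pbs_datastore::chunker::{Chunker, Context, PayloadChunker};

    fn sketch(buffer: &[u8]) {
        let (tx, rx) = mpsc::channel();
        // target an average chunk size of 4 MiB, like the regular chunker
        let mut chunker = PayloadChunker::new(4 * 1024 * 1024, rx);

        // the producer side suggests an absolute stream offset as boundary
        tx.send(123_456u64).unwrap();

        // the caller keeps the context in sync with its stream position
        let ctx = Context {
            base: 0,                    // bytes already split off the stream
            total: buffer.len() as u64, // bytes currently buffered
        };
        if chunker.scan(buffer, &ctx) > 0 {
            // boundary found: split that many bytes off the buffer
        }
    }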
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-datastore/src/chunker.rs | 90 ++++++++++++++++++++++++++++++++++++
pbs-datastore/src/lib.rs | 2 +-
2 files changed, 91 insertions(+), 1 deletion(-)
diff --git a/pbs-datastore/src/chunker.rs b/pbs-datastore/src/chunker.rs
index d75e63fa8..d0543bca0 100644
--- a/pbs-datastore/src/chunker.rs
+++ b/pbs-datastore/src/chunker.rs
@@ -1,3 +1,5 @@
+use std::sync::mpsc::Receiver;
+
/// Note: window size 32 or 64, is faster because we can
/// speedup modulo operations, but always computes hash 0
/// for constant data streams .. 0,0,0,0,0,0
@@ -46,6 +48,16 @@ pub struct ChunkerImpl {
window: [u8; CA_CHUNKER_WINDOW_SIZE],
}
+/// Sliding window chunker (Buzhash) with boundary suggestions
+///
+/// Suggests chunking at a given boundary instead of the regular chunk boundary for better alignment
+/// with file payload boundaries.
+pub struct PayloadChunker {
+ chunker: ChunkerImpl,
+ current_suggested: Option<u64>,
+ suggested_boundaries: Receiver<u64>,
+}
+
const BUZHASH_TABLE: [u32; 256] = [
0x458be752, 0xc10748cc, 0xfbbcdbb8, 0x6ded5b68, 0xb10a82b5, 0x20d75648, 0xdfc5665f, 0xa8428801,
0x7ebf5191, 0x841135c7, 0x65cc53b3, 0x280a597c, 0x16f60255, 0xc78cbc3e, 0x294415f5, 0xb938d494,
@@ -221,6 +233,84 @@ impl Chunker for ChunkerImpl {
}
}
+impl PayloadChunker {
+ /// Create a new PayloadChunker instance, which produces an average
+ /// chunk size of `chunk_size_avg` (needs to be a power of two), if no
+ /// suggested boundaries are provided.
+ /// Use suggested boundaries instead whenever the chunk size is within
+ /// the min - max range.
+ pub fn new(chunk_size_avg: usize, suggested_boundaries: Receiver<u64>) -> Self {
+ Self {
+ chunker: ChunkerImpl::new(chunk_size_avg),
+ current_suggested: None,
+ suggested_boundaries,
+ }
+ }
+}
+
+impl Chunker for PayloadChunker {
+ fn scan(&mut self, data: &[u8], ctx: &Context) -> usize {
+ assert!(ctx.total >= data.len() as u64);
+ let pos = ctx.total - data.len() as u64;
+
+ loop {
+ if let Some(boundary) = self.current_suggested {
+ if boundary < ctx.base + pos {
+ log::debug!("Boundary {boundary} in past");
+ // ignore passed boundaries
+ self.current_suggested = None;
+ continue;
+ }
+
+ if boundary > ctx.base + ctx.total {
+ log::debug!("Boundary {boundary} in future");
+ // boundary in future, cannot decide yet
+ return self.chunker.scan(data, ctx);
+ }
+
+ let chunk_size = (boundary - ctx.base) as usize;
+ if chunk_size < self.chunker.chunk_size_min {
+ log::debug!("Chunk size {chunk_size} below minimum chunk size");
+ // chunk too small, ignore boundary
+ self.current_suggested = None;
+ continue;
+ }
+
+ if chunk_size <= self.chunker.chunk_size_max {
+ self.current_suggested = None;
+ // calculate boundary relative to start of given data buffer
+ let len = chunk_size - pos as usize;
+ if len == 0 {
+ // passed this one, previous scan did not know about boundary just yet
+ return self.chunker.scan(data, ctx);
+ }
+ self.chunker.reset();
+ log::debug!(
+ "Chunk at suggested boundary: {boundary}, chunk size: {chunk_size}"
+ );
+ return len;
+ }
+
+ log::debug!("Chunk {chunk_size} to big, regular scan");
+ // chunk to big, cannot decide yet
+ // scan for hash based chunk boundary instead
+ return self.chunker.scan(data, ctx);
+ }
+
+ if let Ok(boundary) = self.suggested_boundaries.try_recv() {
+ self.current_suggested = Some(boundary);
+ } else {
+ log::debug!("No suggested boundary, regular scan");
+ return self.chunker.scan(data, ctx);
+ }
+ }
+ }
+
+ fn reset(&mut self) {
+ self.chunker.reset();
+ }
+}
+
#[test]
fn test_chunker1() {
let mut buffer = Vec::new();
diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
index 24429626c..3e4aa34c2 100644
--- a/pbs-datastore/src/lib.rs
+++ b/pbs-datastore/src/lib.rs
@@ -196,7 +196,7 @@ pub use backup_info::{BackupDir, BackupGroup, BackupInfo};
pub use checksum_reader::ChecksumReader;
pub use checksum_writer::ChecksumWriter;
pub use chunk_store::ChunkStore;
-pub use chunker::{Chunker, ChunkerImpl};
+pub use chunker::{Chunker, ChunkerImpl, PayloadChunker};
pub use crypt_reader::CryptReader;
pub use crypt_writer::CryptWriter;
pub use data_blob::DataBlob;
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 46/58] chunker: tests: add regression tests for payload chunker
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (43 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 45/58] datastore: chunker: implement chunker for payload stream Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 47/58] chunk stream: " Christian Ebner
` (12 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Test chunking of a payload stream with suggested chunk boundaries.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-datastore/src/chunker.rs | 94 ++++++++++++++++++++++++++++++++++++
1 file changed, 94 insertions(+)
diff --git a/pbs-datastore/src/chunker.rs b/pbs-datastore/src/chunker.rs
index d0543bca0..ecdbca296 100644
--- a/pbs-datastore/src/chunker.rs
+++ b/pbs-datastore/src/chunker.rs
@@ -382,3 +382,97 @@ fn test_chunker1() {
panic!("got different chunks");
}
}
+
+#[test]
+fn test_suggested_boundary() {
+ let mut buffer = Vec::new();
+
+ for i in 0..(256 * 1024) {
+ for j in 0..4 {
+ let byte = ((i >> (j << 3)) & 0xff) as u8;
+ buffer.push(byte);
+ }
+ }
+ let (tx, rx) = std::sync::mpsc::channel();
+ let mut chunker = PayloadChunker::new(64 * 1024, rx);
+
+ // Suggest chunk boundary within regular chunk
+ tx.send(32 * 1024).unwrap();
+ // Suggest chunk boundary within regular chunk, resulting chunk being 0
+ tx.send(32 * 1024).unwrap();
+ // Suggest chunk boundary in the past, must be ignored
+ tx.send(0).unwrap();
+ // Suggest chunk boundary aligned with regular boundary
+ tx.send(405521).unwrap();
+
+ let mut pos = 0;
+ let mut last = 0;
+
+ let mut chunks1: Vec<(usize, usize)> = vec![];
+ let mut chunks2: Vec<(usize, usize)> = vec![];
+ let mut ctx = Context::default();
+
+ // test1: feed single bytes with suggested boundaries
+ while pos < buffer.len() {
+ ctx.total += 1;
+ let k = chunker.scan(&buffer[pos..pos + 1], &ctx);
+ pos += 1;
+ if k != 0 {
+ let prev = last;
+ last = pos;
+ ctx.base += pos as u64;
+ ctx.total = 0;
+ chunks1.push((prev, pos - prev));
+ }
+ }
+ chunks1.push((last, buffer.len() - last));
+
+ let mut pos = 0;
+ let mut ctx = Context::default();
+ ctx.total = buffer.len() as u64;
+ chunker.reset();
+ // Suggest chunk boundary within regular chunk
+ tx.send(32 * 1024).unwrap();
+ // Suggest chunk boundary within regular chunk,
+ // resulting chunk being too small and therefore ignored
+ tx.send(32 * 1024).unwrap();
+ // Suggest chunk boundary in the past, must be ignored
+ tx.send(0).unwrap();
+ // Suggest chunk boundary aligned with regular boundary
+ tx.send(405521).unwrap();
+
+ while pos < buffer.len() {
+ let k = chunker.scan(&buffer[pos..], &ctx);
+ if k != 0 {
+ chunks2.push((pos, k));
+ pos += k;
+ ctx.base += pos as u64;
+ ctx.total = (buffer.len() - pos) as u64;
+ } else {
+ break;
+ }
+ }
+
+ chunks2.push((pos, buffer.len() - pos));
+
+ if chunks1 != chunks2 {
+ let mut size1 = 0;
+ for (_offset, len) in &chunks1 {
+ size1 += len;
+ }
+ println!("Chunks1: {size1}\n{chunks1:?}\n");
+
+ let mut size2 = 0;
+ for (_offset, len) in &chunks2 {
+ size2 += len;
+ }
+ println!("Chunks2: {size2}\n{chunks2:?}\n");
+
+ panic!("got different chunks");
+ }
+
+ let expected_sizes = [32768, 110609, 229376, 32768, 262144, 262144, 118767];
+ for ((_, chunk_size), expected) in chunks1.iter().zip(expected_sizes.iter()) {
+ assert_eq!(chunk_size, expected);
+ }
+}
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 47/58] chunk stream: tests: add regression tests for payload chunker
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (44 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 46/58] chunker: tests: add regression tests for payload chunker Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 48/58] client: chunk stream: switch payload stream chunker Christian Ebner
` (11 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Regression tests to cover suggested and forced boundaries as well as
chunk injection.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/chunk_stream.rs | 117 +++++++++++++++++++++++++++++++++
1 file changed, 117 insertions(+)
diff --git a/pbs-client/src/chunk_stream.rs b/pbs-client/src/chunk_stream.rs
index 84158a2c9..070a10c17 100644
--- a/pbs-client/src/chunk_stream.rs
+++ b/pbs-client/src/chunk_stream.rs
@@ -228,3 +228,120 @@ where
}
}
}
+
+#[cfg(test)]
+mod test {
+ use futures::stream::StreamExt;
+
+ use super::*;
+
+ struct DummyInput {
+ data: Vec<u8>,
+ }
+
+ impl DummyInput {
+ fn new(data: Vec<u8>) -> Self {
+ Self { data }
+ }
+ }
+
+ impl Stream for DummyInput {
+ type Item = Result<Vec<u8>, Error>;
+
+ fn poll_next(self: Pin<&mut Self>, _cx: &mut Context) -> Poll<Option<Self::Item>> {
+ let this = self.get_mut();
+ match this.data.len() {
+ 0 => Poll::Ready(None),
+ size if size > 10 => Poll::Ready(Some(Ok(this.data.split_off(10)))),
+ _ => Poll::Ready(Some(Ok(std::mem::take(&mut this.data)))),
+ }
+ }
+ }
+
+ #[test]
+ fn test_chunk_stream_forced_boundaries() {
+ let mut data = Vec::new();
+ for i in 0..(256 * 1024) {
+ for j in 0..4 {
+ let byte = ((i >> (j << 3)) & 0xff) as u8;
+ data.push(byte);
+ }
+ }
+
+ let mut input = DummyInput::new(data);
+ let input = Pin::new(&mut input);
+
+ let (injections_tx, injections_rx) = mpsc::channel();
+ let (boundaries_tx, boundaries_rx) = mpsc::channel();
+ let (suggested_tx, suggested_rx) = mpsc::channel();
+ let injection_data = InjectionData::new(boundaries_rx, injections_tx);
+
+ let mut chunk_stream = ChunkStream::new(
+ input,
+ Some(64 * 1024),
+ Some(injection_data),
+ Some(suggested_rx),
+ );
+ let chunks = std::sync::Arc::new(std::sync::Mutex::new(Vec::new()));
+ let chunks_clone = chunks.clone();
+
+ // Suggested boundary matching forced boundary
+ suggested_tx.send(32 * 1024).unwrap();
+ // Suggested boundary not matching forced boundary
+ suggested_tx.send(64 * 1024).unwrap();
+ // Force chunk boundary at suggested boundary
+ boundaries_tx
+ .send(InjectChunks {
+ boundary: 32 * 1024,
+ chunks: Vec::new(),
+ size: 1024,
+ })
+ .unwrap();
+ // Force chunk boundary within regular chunk
+ boundaries_tx
+ .send(InjectChunks {
+ boundary: 128 * 1024,
+ chunks: Vec::new(),
+ size: 2048,
+ })
+ .unwrap();
+ // Force chunk boundary aligned with regular boundary
+ boundaries_tx
+ .send(InjectChunks {
+ boundary: 657408,
+ chunks: Vec::new(),
+ size: 512,
+ })
+ .unwrap();
+ // Force chunk boundary within regular chunk, without injecting data
+ boundaries_tx
+ .send(InjectChunks {
+ boundary: 657408 + 1024,
+ chunks: Vec::new(),
+ size: 0,
+ })
+ .unwrap();
+
+ let rt = tokio::runtime::Runtime::new().unwrap();
+ rt.block_on(async move {
+ while let Some(chunk) = chunk_stream.next().await {
+ let chunk = chunk.unwrap();
+ let mut chunks = chunks.lock().unwrap();
+ chunks.push(chunk);
+ }
+ });
+
+ let mut total = 0;
+ let chunks = chunks_clone.lock().unwrap();
+ let expected = [32768, 31744, 65536, 262144, 262144, 512, 262144, 131584];
+ for (chunk, expected) in chunks.as_slice().iter().zip(expected.iter()) {
+ assert_eq!(chunk.len(), *expected);
+ total += chunk.len();
+ }
+ while let Ok(injection) = injections_rx.recv() {
+ total += injection.size;
+ }
+
+ assert_eq!(total, 4 * 256 * 1024 + 1024 + 2048 + 512);
+ }
+}
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 48/58] client: chunk stream: switch payload stream chunker
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (45 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 47/58] chunk stream: " Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 49/58] client: pxar: add archive creation with reference test Christian Ebner
` (10 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Use the dedicated chunker with boundary suggestions for the payload
stream by attaching the channel sender to the archiver and the
channel receiver to the payload stream chunker.
The archiver sends the file boundaries for the chunker to consume.
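The underlying pattern is a plain std mpsc channel polled
non-blockingly by the chunker; a self-contained sketch (names and
offsets illustrative, the real sender value is
`encoder.payload_position()?.raw()` as in the create.rs hunk below):

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (suggested_tx, suggested_rx) = mpsc::channel::<u64>();

        // archiver side: sends the payload offset of each file it encodes
        let archiver = thread::spawn(move || {
            for file_start in [4_096u64, 123_456, 987_654] {
                suggested_tx.send(file_start).unwrap();
            }
        });
        archiver.join().unwrap();

        // chunker side: drains suggestions without blocking, as
        // PayloadChunker::scan does via try_recv()
        while let Ok(boundary) = suggested_rx.try_recv() {
            println!("suggested boundary at offset {boundary}");
        }
    }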
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
examples/test_chunk_speed2.rs | 2 +-
pbs-client/src/chunk_stream.rs | 15 +++++--
pbs-client/src/pxar/create.rs | 8 ++++
pbs-client/src/pxar_backup_stream.rs | 40 +++++++++++--------
proxmox-backup-client/src/main.rs | 16 +++++---
.../src/proxmox_restore_daemon/api.rs | 12 +++++-
pxar-bin/src/main.rs | 1 +
tests/catar.rs | 1 +
8 files changed, 68 insertions(+), 27 deletions(-)
diff --git a/examples/test_chunk_speed2.rs b/examples/test_chunk_speed2.rs
index 22dd14ce2..f2963746a 100644
--- a/examples/test_chunk_speed2.rs
+++ b/examples/test_chunk_speed2.rs
@@ -26,7 +26,7 @@ async fn run() -> Result<(), Error> {
.map_err(Error::from);
//let chunk_stream = FixedChunkStream::new(stream, 4*1024*1024);
- let mut chunk_stream = ChunkStream::new(stream, None, None);
+ let mut chunk_stream = ChunkStream::new(stream, None, None, None);
let start_time = std::time::Instant::now();
diff --git a/pbs-client/src/chunk_stream.rs b/pbs-client/src/chunk_stream.rs
index 070a10c17..e3f0980c6 100644
--- a/pbs-client/src/chunk_stream.rs
+++ b/pbs-client/src/chunk_stream.rs
@@ -7,7 +7,7 @@ use bytes::BytesMut;
use futures::ready;
use futures::stream::{Stream, TryStream};
-use pbs_datastore::{Chunker, ChunkerImpl};
+use pbs_datastore::{Chunker, ChunkerImpl, PayloadChunker};
use crate::inject_reused_chunks::InjectChunks;
@@ -42,11 +42,20 @@ pub struct ChunkStream<S: Unpin> {
}
impl<S: Unpin> ChunkStream<S> {
- pub fn new(input: S, chunk_size: Option<usize>, injection_data: Option<InjectionData>) -> Self {
+ pub fn new(
+ input: S,
+ chunk_size: Option<usize>,
+ injection_data: Option<InjectionData>,
+ suggested_boundaries: Option<mpsc::Receiver<u64>>,
+ ) -> Self {
let chunk_size = chunk_size.unwrap_or(4 * 1024 * 1024);
Self {
input,
- chunker: Box::new(ChunkerImpl::new(chunk_size)),
+ chunker: if let Some(suggested) = suggested_boundaries {
+ Box::new(PayloadChunker::new(chunk_size, suggested))
+ } else {
+ Box::new(ChunkerImpl::new(chunk_size))
+ },
buffer: BytesMut::new(),
scan_pos: 0,
consumed: 0,
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index eadd670df..03a6a1448 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -169,6 +169,7 @@ struct Archiver {
file_copy_buffer: Vec<u8>,
skip_e2big_xattr: bool,
forced_boundaries: Option<mpsc::Sender<InjectChunks>>,
+ suggested_boundaries: Option<mpsc::Sender<u64>>,
previous_payload_index: Option<DynamicIndexReader>,
cache: PxarLookaheadCache,
reuse_stats: ReuseStats,
@@ -197,6 +198,7 @@ pub async fn create_archive<T, F>(
callback: F,
options: PxarCreateOptions,
forced_boundaries: Option<mpsc::Sender<InjectChunks>>,
+ suggested_boundaries: Option<mpsc::Sender<u64>>,
) -> Result<(), Error>
where
T: SeqWrite + Send,
@@ -271,6 +273,7 @@ where
file_copy_buffer: vec::undefined(4 * 1024 * 1024),
skip_e2big_xattr: options.skip_e2big_xattr,
forced_boundaries,
+ suggested_boundaries,
previous_payload_index,
cache: PxarLookaheadCache::new(None),
reuse_stats: ReuseStats::default(),
@@ -862,6 +865,11 @@ impl Archiver {
.add_file(c_file_name, file_size, stat.st_mtime)?;
}
+ if let Some(sender) = self.suggested_boundaries.as_mut() {
+ let offset = encoder.payload_position()?.raw();
+ sender.send(offset)?;
+ }
+
let offset: LinkOffset = if let Some(payload_offset) = payload_offset {
self.reuse_stats.total_reused_payload_size +=
file_size + size_of::<pxar::format::Header>() as u64;
diff --git a/pbs-client/src/pxar_backup_stream.rs b/pbs-client/src/pxar_backup_stream.rs
index fb6d063f2..f322566f0 100644
--- a/pbs-client/src/pxar_backup_stream.rs
+++ b/pbs-client/src/pxar_backup_stream.rs
@@ -27,6 +27,7 @@ use crate::pxar::create::PxarWriters;
/// consumer.
pub struct PxarBackupStream {
rx: Option<std::sync::mpsc::Receiver<Result<Vec<u8>, Error>>>,
+ pub suggested_boundaries: Option<std::sync::mpsc::Receiver<u64>>,
handle: Option<AbortHandle>,
error: Arc<Mutex<Option<String>>>,
}
@@ -55,22 +56,26 @@ impl PxarBackupStream {
));
let writer = pxar::encoder::sync::StandardWriter::new(writer);
- let (writer, payload_rx) = if separate_payload_stream {
- let (tx, rx) = std::sync::mpsc::sync_channel(10);
- let payload_writer = TokioWriterAdapter::new(std::io::BufWriter::with_capacity(
- buffer_size,
- StdChannelWriter::new(tx),
- ));
- (
- pxar::PxarVariant::Split(
- writer,
- pxar::encoder::sync::StandardWriter::new(payload_writer),
- ),
- Some(rx),
- )
- } else {
- (pxar::PxarVariant::Unified(writer), None)
- };
+ let (writer, payload_rx, suggested_boundaries_tx, suggested_boundaries_rx) =
+ if separate_payload_stream {
+ let (tx, rx) = std::sync::mpsc::sync_channel(10);
+ let (suggested_boundaries_tx, suggested_boundaries_rx) = std::sync::mpsc::channel();
+ let payload_writer = TokioWriterAdapter::new(std::io::BufWriter::with_capacity(
+ buffer_size,
+ StdChannelWriter::new(tx),
+ ));
+ (
+ pxar::PxarVariant::Split(
+ writer,
+ pxar::encoder::sync::StandardWriter::new(payload_writer),
+ ),
+ Some(rx),
+ Some(suggested_boundaries_tx),
+ Some(suggested_boundaries_rx),
+ )
+ } else {
+ (pxar::PxarVariant::Unified(writer), None, None, None)
+ };
let error = Arc::new(Mutex::new(None));
let error2 = Arc::clone(&error);
@@ -85,6 +90,7 @@ impl PxarBackupStream {
},
options,
boundaries,
+ suggested_boundaries_tx,
)
.await
{
@@ -99,12 +105,14 @@ impl PxarBackupStream {
let backup_stream = Self {
rx: Some(rx),
+ suggested_boundaries: None,
handle: Some(handle.clone()),
error: Arc::clone(&error),
};
let backup_payload_stream = payload_rx.map(|rx| Self {
rx: Some(rx),
+ suggested_boundaries: suggested_boundaries_rx,
handle: Some(handle),
error,
});
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index b4d01ed3f..a17588edf 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -209,7 +209,7 @@ async fn backup_directory<P: AsRef<Path>>(
payload_target.is_some(),
)?;
- let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size, None);
+ let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size, None, None);
let (tx, rx) = mpsc::channel(10); // allow to buffer 10 chunks
let stream = ReceiverStream::new(rx).map_err(Error::from);
@@ -223,14 +223,19 @@ async fn backup_directory<P: AsRef<Path>>(
let stats = client.upload_stream(archive_name, stream, upload_options.clone(), None);
- if let Some(payload_stream) = payload_stream {
+ if let Some(mut payload_stream) = payload_stream {
let payload_target = payload_target
.ok_or_else(|| format_err!("got payload stream, but no target archive name"))?;
let (payload_injections_tx, payload_injections_rx) = std::sync::mpsc::channel();
let injection_data = InjectionData::new(payload_boundaries_rx, payload_injections_tx);
- let mut payload_chunk_stream =
- ChunkStream::new(payload_stream, chunk_size, Some(injection_data));
+ let suggested_boundaries = payload_stream.suggested_boundaries.take();
+ let mut payload_chunk_stream = ChunkStream::new(
+ payload_stream,
+ chunk_size,
+ Some(injection_data),
+ suggested_boundaries,
+ );
let (payload_tx, payload_rx) = mpsc::channel(10); // allow to buffer 10 chunks
let stream = ReceiverStream::new(payload_rx).map_err(Error::from);
@@ -573,7 +578,8 @@ fn spawn_catalog_upload(
let (catalog_tx, catalog_rx) = std::sync::mpsc::sync_channel(10); // allow to buffer 10 writes
let catalog_stream = proxmox_async::blocking::StdChannelStream(catalog_rx);
let catalog_chunk_size = 512 * 1024;
- let catalog_chunk_stream = ChunkStream::new(catalog_stream, Some(catalog_chunk_size), None);
+ let catalog_chunk_stream =
+ ChunkStream::new(catalog_stream, Some(catalog_chunk_size), None, None);
let catalog_writer = Arc::new(Mutex::new(CatalogWriter::new(TokioWriterAdapter::new(
StdChannelWriter::new(catalog_tx),
diff --git a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
index 681fa6db9..80af5011e 100644
--- a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
+++ b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
@@ -364,8 +364,16 @@ fn extract(
};
let pxar_writer = pxar::PxarVariant::Unified(TokioWriter::new(writer));
- create_archive(dir, PxarWriters::new(pxar_writer, None), Flags::DEFAULT, |_| Ok(()), options, None)
- .await
+ create_archive(
+ dir,
+ PxarWriters::new(pxar_writer, None),
+ Flags::DEFAULT,
+ |_| Ok(()),
+ options,
+ None,
+ None,
+ )
+ .await
}
.await;
if let Err(err) = result {
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index 85887a8ed..fa584b4e8 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -442,6 +442,7 @@ async fn create_archive(
},
options,
None,
+ None,
)
.await?;
diff --git a/tests/catar.rs b/tests/catar.rs
index 9f83b4cc2..94c565012 100644
--- a/tests/catar.rs
+++ b/tests/catar.rs
@@ -40,6 +40,7 @@ fn run_test(dir_name: &str) -> Result<(), Error> {
|_| Ok(()),
options,
None,
+ None,
))?;
Command::new("cmp")
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 49/58] client: pxar: add archive creation with reference test
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (46 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 48/58] client: chunk stream: switch payload stream chunker Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 50/58] client: tools: add helper to raise nofile rlimit Christian Ebner
` (9 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Add a basic regression test for archive creation with a reference
metadata archive and index.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/pxar/create.rs | 243 ++++++++++++++++++
tests/pxar/backup-client-pxar-data.mpxar | Bin 0 -> 15070 bytes
tests/pxar/backup-client-pxar-data.ppxar.didx | Bin 0 -> 8096 bytes
tests/pxar/backup-client-pxar-expected.mpxar | Bin 0 -> 15086 bytes
4 files changed, 243 insertions(+)
create mode 100644 tests/pxar/backup-client-pxar-data.mpxar
create mode 100644 tests/pxar/backup-client-pxar-data.ppxar.didx
create mode 100644 tests/pxar/backup-client-pxar-expected.mpxar
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 03a6a1448..42e4dc502 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -1715,3 +1715,246 @@ fn generate_pxar_excludes_cli(patterns: &[MatchEntry]) -> Vec<u8> {
content
}
+
+#[cfg(test)]
+mod tests {
+ use std::ffi::OsString;
+ use std::fs::File;
+ use std::fs::OpenOptions;
+ use std::io::{self, BufReader, Seek, SeekFrom, Write};
+ use std::pin::Pin;
+ use std::process::Command;
+ use std::sync::mpsc;
+ use std::task::{Context, Poll};
+
+ use pbs_datastore::dynamic_index::DynamicIndexReader;
+ use pxar::accessor::sync::FileReader;
+ use pxar::encoder::SeqWrite;
+
+ use crate::pxar::extract::Extractor;
+ use crate::pxar::OverwriteFlags;
+
+ use super::*;
+
+ struct DummyWriter {
+ file: Option<File>,
+ }
+
+ impl DummyWriter {
+ fn new<P: AsRef<Path>>(path: Option<P>) -> Result<Self, Error> {
+ let file = if let Some(path) = path {
+ Some(
+ OpenOptions::new()
+ .read(true)
+ .write(true)
+ .truncate(true)
+ .create(true)
+ .open(path)?,
+ )
+ } else {
+ None
+ };
+ Ok(Self { file })
+ }
+ }
+
+ impl Write for DummyWriter {
+ fn write(&mut self, data: &[u8]) -> io::Result<usize> {
+ if let Some(file) = self.file.as_mut() {
+ file.write_all(data)?;
+ }
+ Ok(data.len())
+ }
+
+ fn flush(&mut self) -> io::Result<()> {
+ if let Some(file) = self.file.as_mut() {
+ file.flush()?;
+ }
+ Ok(())
+ }
+ }
+
+ impl SeqWrite for DummyWriter {
+ fn poll_seq_write(
+ mut self: Pin<&mut Self>,
+ _cx: &mut Context,
+ buf: &[u8],
+ ) -> Poll<io::Result<usize>> {
+ Poll::Ready(self.as_mut().write(buf))
+ }
+
+ fn poll_flush(mut self: Pin<&mut Self>, _cx: &mut Context) -> Poll<Result<(), io::Error>> {
+ Poll::Ready(self.as_mut().flush())
+ }
+ }
+
+ fn prepare<P: AsRef<Path>>(dir_path: P) -> Result<(), Error> {
+ let dir = nix::dir::Dir::open(dir_path.as_ref(), OFlag::O_DIRECTORY, Mode::empty())?;
+
+ let fs_magic = detect_fs_type(dir.as_raw_fd()).unwrap();
+ let stat = nix::sys::stat::fstat(dir.as_raw_fd()).unwrap();
+ let mut fs_feature_flags = Flags::from_magic(fs_magic);
+ let metadata = get_metadata(
+ dir.as_raw_fd(),
+ &stat,
+ fs_feature_flags,
+ fs_magic,
+ &mut fs_feature_flags,
+ false,
+ )?;
+
+ let mut extractor = Extractor::new(
+ dir,
+ metadata.clone(),
+ true,
+ OverwriteFlags::empty(),
+ fs_feature_flags,
+ );
+
+ let dir_metadata = Metadata {
+ stat: pxar::Stat::default().mode(0o777u64).set_dir().gid(0).uid(0),
+ ..Default::default()
+ };
+
+ let file_metadata = Metadata {
+ stat: pxar::Stat::default()
+ .mode(0o777u64)
+ .set_regular_file()
+ .gid(0)
+ .uid(0),
+ ..Default::default()
+ };
+
+ extractor.enter_directory(
+ OsString::from(format!("testdir")),
+ dir_metadata.clone(),
+ true,
+ )?;
+
+ let size = 1024 * 1024;
+ let mut cursor = BufReader::new(std::io::Cursor::new(vec![0u8; size]));
+ for i in 0..10 {
+ extractor.enter_directory(
+ OsString::from(format!("folder_{i}")),
+ dir_metadata.clone(),
+ true,
+ )?;
+ for j in 0..10 {
+ cursor.seek(SeekFrom::Start(0))?;
+ extractor.extract_file(
+ CString::new(format!("file_{j}").as_str())?.as_c_str(),
+ &file_metadata,
+ size as u64,
+ &mut cursor,
+ true,
+ )?;
+ }
+ extractor.leave_directory()?;
+ }
+
+ extractor.leave_directory()?;
+
+ Ok(())
+ }
+
+ #[test]
+ fn test_create_archive_with_reference() -> Result<(), Error> {
+ let mut testdir = PathBuf::from("./target/testout");
+ testdir.push(std::module_path!());
+
+ let _ = std::fs::remove_dir_all(&testdir);
+ let _ = std::fs::create_dir_all(&testdir);
+
+ prepare(testdir.as_path())?;
+
+ let previous_payload_index = Some(DynamicIndexReader::new(File::open(
+ "../tests/pxar/backup-client-pxar-data.ppxar.didx",
+ )?)?);
+ let metadata_archive = File::open("../tests/pxar/backup-client-pxar-data.mpxar").unwrap();
+ let metadata_size = metadata_archive.metadata()?.len();
+ let reader: MetadataArchiveReader = Arc::new(FileReader::new(metadata_archive));
+
+ let rt = tokio::runtime::Runtime::new().unwrap();
+ let (suggested_boundaries, _rx) = mpsc::channel();
+ let (forced_boundaries, _rx) = mpsc::channel();
+
+ rt.block_on(async move {
+ testdir.push("testdir");
+ let source_dir =
+ nix::dir::Dir::open(testdir.as_path(), OFlag::O_DIRECTORY, Mode::empty()).unwrap();
+
+ let fs_magic = detect_fs_type(source_dir.as_raw_fd()).unwrap();
+ let stat = nix::sys::stat::fstat(source_dir.as_raw_fd()).unwrap();
+ let mut fs_feature_flags = Flags::from_magic(fs_magic);
+
+ let metadata = get_metadata(
+ source_dir.as_raw_fd(),
+ &stat,
+ fs_feature_flags,
+ fs_magic,
+ &mut fs_feature_flags,
+ false,
+ )?;
+
+ let writer = DummyWriter::new(Some("./target/backup-client-pxar-run.mpxar")).unwrap();
+ let payload_writer = DummyWriter::new::<PathBuf>(None).unwrap();
+
+ let mut encoder = Encoder::new(
+ pxar::PxarVariant::Split(writer, payload_writer),
+ &metadata,
+ Some(&[]),
+ )
+ .await?;
+
+ let mut archiver = Archiver {
+ feature_flags: Flags::from_magic(fs_magic),
+ fs_feature_flags: Flags::from_magic(fs_magic),
+ fs_magic,
+ callback: Box::new(|_| Ok(())),
+ patterns: Vec::new(),
+ catalog: None,
+ path: PathBuf::new(),
+ entry_counter: 0,
+ entry_limit: 1024,
+ current_st_dev: stat.st_dev,
+ device_set: None,
+ hardlinks: HashMap::new(),
+ file_copy_buffer: vec::undefined(4 * 1024 * 1024),
+ skip_e2big_xattr: false,
+ forced_boundaries: Some(forced_boundaries),
+ previous_payload_index,
+ suggested_boundaries: Some(suggested_boundaries),
+ cache: PxarLookaheadCache::new(),
+ reuse_stats: ReuseStats::default(),
+ };
+
+ let accessor = Accessor::new(pxar::PxarVariant::Unified(reader), metadata_size)
+ .await
+ .unwrap();
+ let root = accessor.open_root().await.ok();
+ archiver
+ .archive_dir_contents(&mut encoder, root, source_dir, true)
+ .await
+ .unwrap();
+
+ archiver
+ .flush_cached_reusing_if_below_threshold(&mut encoder, false)
+ .await
+ .unwrap();
+
+ encoder.finish().await.unwrap();
+ encoder.close().await.unwrap();
+
+ let status = Command::new("diff")
+ .args([
+ "../tests/pxar/backup-client-pxar-expected.mpxar",
+ "./target/backup-client-pxar-run.mpxar",
+ ])
+ .status()
+ .expect("failed to execute diff");
+ assert!(status.success());
+
+ Ok::<(), Error>(())
+ })
+ }
+}
diff --git a/tests/pxar/backup-client-pxar-data.mpxar b/tests/pxar/backup-client-pxar-data.mpxar
new file mode 100644
index 0000000000000000000000000000000000000000..00f3dc295fb38062c23e6cf7cac9ae110beb0a65
GIT binary patch
literal 15070
zcmeI3ZD<@t7{_Pd4&n>F7EIfqb#1^FO6}HK%_)tWO2s0r+iLp7VpnN`+SqK3@uiTk
z0g)mo3n~RgSX7jP#Z`-;wUEWA!B4JW5kF|x5580@T|bDmXzQ8GN@tzklRN*x`)~`#
z+|A9+Z+4!U-!r*zm%i41e0X5q&>}W-sk}V(=Du$q+4;h;F8=yl4}ZdoA2i1PeiW~F
z7gkDF&G*_D^Edhj2X^*7yu)JuwZnyZhYt+&$+{a8M{=R@jqX44;jVQr_n5qS`Ja!?
zJj=%~;8y>8^bO)nmIG_xu7%+&Cf=v??$*F?HnW6jmEx|0;T&euxV12x%N!baJq+hD
zm&V-y!}-jkaa}N6z<e54f#E_H2)HYTj;(+Fj+3hvDKpg*+uC;ZZa5IH;Qkxr`*hRW
zPwf8mu1mHb<?ZtNpKkkPvV3Urmo~1zy#C9{-_I`n@$iyOh4%etZrOKo^U622=`*~%
zeP!$T7i*WVKk;~>pO3D*w{v9smfybSqt4rptjHd`|MLm<Vqu)Wo?ABc{O-rP2Mg^x
z7k_iMt1TS)zR-W~W!=Mf|9sD>XZd*YdB}HcLEjPq_HYs}F67(1L&2w#Y%n&v?uz=3
zSjazEo-U<0$><xz#Vn$6IDIE9rg1oZr!1jyIDKa<rExfYGbN*OIDMBD#vM>&W#aU0
zDplb0RRf39x22dg4ySKhu>@R8-*xF*Vx%6v7kKeM>Dy6kA+B?*Z&z_>oMf`bW;a>I
z<m4$Xjl=2NS3DYr(|4fwG!CclPzh)pPT!Fd(m0&HV<n<-IDIEdOyh9+PL)K!we($=
zz9ow2nVpfOKE<8BGbI(`D#hVW-%QPD98TY5mGQr_Y8<H~u^F3PY>L^!RI9-0s|F6I
zZ%Z|498TZ1YSB2Hz8%%3aX5Xuszc*&`u0?p#^LnstDb;s>ANm{OZIGY=sQq-A+B?*
z?@$eB98TYn8qzqNzGF3_agwFbV75rqn8xAsovI0q!|6LyQyPcUH`6j2htqdiWBlvb
z8kruaZ&RxR&pTMO^j(*}C7Y-@^lfRT5Z5`@x2;(;4ySKNvuPYo->&A+IGnyc&82aY
zmDgal@HLOd;q)D7K8?faJJbRihtqeYg)|PQ?^ufjTua||>07d@n?v7;77KBmV|}Mu
zLgR4y&a{-q;q=Y)jK<;gUDg@@$9att98TY+UIm_af|D*4$wF^1TUfeD<8b=6b&JN~
z^zG<2jl=2N)g1xX(sy0@mMpX8(6^_%LR_VL68GJ=uX{8Or|&@bX&g@9p&rmUoW3JH
zq;WWX$9hELaQaU4n8r!=RfE|g)e{<r(|4w)G!Cb4W@G}crSH1*Es1+`=(}t%gFI5<
z^lchdAa#Pn>Dw|)8i&)jZCEr8r*FrwX&g@9uHn!)oW4E7rExfY`-Vs3B-^;bY!Mhf
zjl=0XGy(zF(sy0@mIR_X^c@+Y5Z5_AeaA*b<8b;;jF`sZ^qm?Bjl=0XGg2Cd(>E(+
zG!Ccla*375OpnvIS*il5g9T3CR>`Ds5^FS=E$osd;9B~Y>$^BF%I$l%bRK>#eY7&O
zHYWHE=v;8~)Sh?xU(H|VW$)?(9aB$LPP~7)*uLYlq5Vhi+jHX|?PC2`OI{f&Z9MeB
z>6K!Ahu7bI_2$0s#@C4T7d>;$s;8fP>(9z^v3}X<gBuoTx3=wFD%QVullIW@VSRMn
ee6jw{9WR|3pSSVg=*41v{&S{}`TgcUXZj0DmM>rc
literal 0
HcmV?d00001
diff --git a/tests/pxar/backup-client-pxar-data.ppxar.didx b/tests/pxar/backup-client-pxar-data.ppxar.didx
new file mode 100644
index 0000000000000000000000000000000000000000..a646218b5d504196443b17d62f3b22d171f011b8
GIT binary patch
literal 8096
zcmeIw&x=k`9LMqR`E|?Ga2Ha_;uI^9yBJNz!Z9f+>6R!pO*b*jh^~^|JnlRqSshV|
z%`G8TSEgiYa;C7OyU<J|9b=l(l;<{OBpc7nKj5>pIN$Ya@$KDb%dI01H%~o(*K{tQ
zJ~q6+^=PT==^y&xJ9`F3sC!@cUYwa2-S=W{^1`mZr(YJX_P=dkztmp8Zd>P0&yH!m
zYQlvAp+G1Q3WNfoKqwFjgaV;JC=d#S0-?bFT|iU3_TbPp*KU10`+DQ}SnEW!^~!_6
zmE|WN7T?dES$;Kh@A!s<^qNZtPc0p`(~IYOPkMW9;>Of`Ykp<tpHIL1?CtT1yY~$x
zkW0xxE~6B3Ic1P5D2JS-0&*o;$W>HA&QS%qnjGXj)sSn*LylMjxtI}Kh5y=%W?c!m
zglWhbmOw6L267ooA(yiZas|sFXITNcl3B=Atc09n736B>Am>>PxrTYj5pN(DbK=OZ
zH1A4ee_TV(@C0%xH;~JC3b~wTkSll&Im-*kmE1zE;w9u9uOL@*2RYAc$Ti$Ujzj~w
zSdc(=rA1dF`x6>+MkJ6+g@IfqQpn{ZgIpnU$XQW9t`rt>l_(+SL<PB8ILLWXL#`1X
zawHqb#gZhlD=oVc*`L&qGcti(Dh=c^nL;j?8RQC?L(a+qa;3D8t7Hi|Co9O+(m~G4
z8gh;FkR#PVE>@(FU1`;o$o`auoKXqnQe_~QsT6X#${<&$9CB6_kSmpiT%}6LIaNWf
zRt|Dr)sSnHha5!><l=}TWLG-sN@RbLhMb8K$YqgPb4Mq~o{fC&+#LA+d)TeK+0;Jx
Tczf@GpWkK|cE3#e4vqc=((xEu
literal 0
HcmV?d00001
diff --git a/tests/pxar/backup-client-pxar-expected.mpxar b/tests/pxar/backup-client-pxar-expected.mpxar
new file mode 100644
index 0000000000000000000000000000000000000000..ae4a18c89749f3d7ec82623e84509df19943d03e
GIT binary patch
literal 15086
zcmeI3Z-^9S9LJyew{ZQzQRvjGZ1W%mF~`ihExcw8BMEJ+&NoR;;T@Hij$M~!+%X3c
z5)=a!LLm(mg^)C*cv!*>U2*iP2{P$LIT8J_45t^7Nokw=%!_Ax+~4i?J=zyLa6C89
zJ@<T`XP)Qz{C>O3UiwDo@!`Q)L-SbmQh9m#&Zl18d#vMIli#0ud-r#bZF%Wv55GTG
z=D+abM~$(6erm4+b4!J*XM3IV`5y+h4{qsybhE|&Yln054j&rqmvuKLj^sk)8{PB%
zM_X6zEf;z7ykx98^L+dQZu!4Q-z3iBn7X*@U^tuQ^Q$wv6)>E`EdE&Q;I4<^TxQd_
zl`x#g92$264CgbK#@z_R1<a#yJuqCzd>U7R;UX3YxGRT_u72~*lgs8Q)#{0j9b5a>
z?2DIhA8zNZ*S-7XwomW5WYZDeF0cRj_D?3wgOk5@a0TY|UrzpUcHvKl7p$vkKXB&O
z-6z*CeQTp$?Kp2=x@-K{%EhZsJW<on$5-9oJ+f)T?_cwA<n2e6WDh_1`2>5pW}LsB
zTQv3Jww=9syS(h4|IOK+j&S6Mn*RGP>m9!Lm-|jV&&QKLhg^R(`j!Z=%tywH3;8zh
zQ1GcF8jMY^yIOt6Ead-K$2gMFH;GGFMB{M!PFYOjaQe<zLgR4yW=cxqaQZftjK<;g
zT~ru%K%Je5)3>FVG!Cb4TdB<N{8eXmIDI>cCE(inZb;t}BbE7C;Kl!>Z&$H}b(Ka7
zoW4E9p>dLjH8#D6RU4dq#iemLeFut1<8b;86`#i8^c^Vyjl=0XRzezw(|4joG!Ccl
zREcREPT!f52)MSs8`8H#5#{L_N$OKv_RZ8(SXU_yr*BiuXdF)8MV0YaS#@$8$=Zxf
zZ*6L$g{7J_4ySKht<3NIRcCfMeLJc}<8b<RRh!1)^zEq*jl=2NS6v#1(|4eH0<Nv^
zhV(5tv#p`;Q1yj%ond`PYCz*~`i|9*#^Lmxs1c2mY=tJHMXJU$4yW%-O=uiW-%Lws
z98TY+meDwzzKa^;Z^zaNy*PbanknGg`ff<yl0(!Q`nI*oMxrt}T=wl~7LCK{+tq9u
zhts#GIW!KZZ(nn1oMh)U87%_Mqj5NWhni30aQcq4fX3nU9cv+t!|6NGA_3RdcSHJ?
z?CRFgcdEt0y3TO+ooNY;!|9vpDUHMF+tf1}htqdaXZ(ZnIvE^J-<EFDILStDGFsSr
zWqwp*fz!96TQm-*Z&$Zz98TY!?g+THz8liFWSg~yzJ1*l)^&#U9q1m7!|6NJeHw?;
zccce24yW%}4{01u--#a4IGnyyJ*IJzbJb+D$n=E9;q=Xnl*ZxoZ5o+?YwNoqeM{E8
zHS}FHm_g<^xHnHM!=!OIecMK5epCjB)3;+-G!Cb4*RW|EPT!v4&^VmFeZ!@3IDH3(
zN8=>NxXEY{8a|D~={qt40oT@dL;991L~H0fHbP-tXE^&#jEKhJ^qm?pjl=0XGZGqy
z(>E)mG!Cb4vyjm^oW6?%Rv<$!PTy9+q;WWX+l9*fsKi2IjV7aoQ?LYFTi<eh*FG2J
zj$IqN55JH;UaBtE1U~`Yb8ea1@!r7e`F&pYE#KEQ^-Sr+2Um#gyFMG*bL4>?H~rZu
z)_=9&wV}e=gCCw=D%N*-1HIR*@Be;$g;;;lbJs3=_UU*2DlHc47oFa}W{!4S$F7B9
o{h^z+M~)BcqpN0%^>=T6<;?i3wfjde7VGn`GkwA5n}40@Z-J;Sk^lez
literal 0
HcmV?d00001
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 50/58] client: tools: add helper to raise nofile rlimit
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (47 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 49/58] client: pxar: add archive creation with reference test Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 51/58] client: pxar: set cache limit based on " Christian Ebner
` (8 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
The default soft limit for open file handles is rather low, as some
APIs (e.g. the POSIX `select(2)` syscall) do not work with higher
file descriptor values [0].
The lookahead cache used during the backup client's metadata
comparison to reuse unchanged files, however, requires much higher
limits to work effectively.
This helper function allows raising the soft limit to the hard
limit, as provided by the `getrlimit(2)` syscall.
[0] https://0pointer.net/blog/file-descriptor-limits.html
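Usage is a single call; a short sketch of how a caller might log the
returned old limits (error type as used by the helper itself):

    use anyhow::Error;

    use pbs_client::tools::raise_nofile_limit;

    fn bump_fd_limit() -> Result<(), Error> {
        let old = raise_nofile_limit()?;
        log::info!(
            "nofile soft limit raised from {} to hard limit {}",
            old.rlim_cur,
            old.rlim_max,
        );
        Ok(())
    }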
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/tools/mod.rs | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/pbs-client/src/tools/mod.rs b/pbs-client/src/tools/mod.rs
index d62b651ee..7fdd99a6a 100644
--- a/pbs-client/src/tools/mod.rs
+++ b/pbs-client/src/tools/mod.rs
@@ -625,3 +625,26 @@ pub fn handle_root_with_optional_format_version_prelude<R: pxar::decoder::SeqRea
_ => bail!("unexpected entry kind {:?}", first.kind()),
}
}
+
+/// Raise the soft limit for open file handles to the hard limit
+///
+/// Returns the values set before raising the limit as libc::rlimit64
+pub fn raise_nofile_limit() -> Result<libc::rlimit64, Error> {
+ let mut old = libc::rlimit64 {
+ rlim_cur: 0,
+ rlim_max: 0,
+ };
+ if 0 != unsafe { libc::getrlimit64(libc::RLIMIT_NOFILE, &mut old as *mut libc::rlimit64) } {
+ bail!("Failed to get nofile rlimit");
+ }
+
+ let mut new = libc::rlimit64 {
+ rlim_cur: old.rlim_max,
+ rlim_max: old.rlim_max,
+ };
+ if 0 != unsafe { libc::setrlimit64(libc::RLIMIT_NOFILE, &mut new as *mut libc::rlimit64) } {
+ bail!("Failed to set nofile rlimit");
+ }
+
+ Ok(old)
+}
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 51/58] client: pxar: set cache limit based on nofile rlimit
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (48 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 50/58] client: tools: add helper to raise nofile rlimit Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 52/58] api: datastore: add endpoint to lookup entries via pxar archive Christian Ebner
` (7 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
The lookahead cache size requires the resource limit for open file
handles to be high in order to allow for efficient reuse of unchanged
file payloads.
Increase the nofile soft limit to the hard limit and dynamically
adapt the cache size to the new soft limit minus half of the
previous soft limit.
The `PxarCreateOptions` and the `Archiver` are therefore extended by
an additional field to store the maximum cache size, with a fallback
to a default of 512 entries.
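As a concrete example with made-up limits, the resulting cache size
computes as follows (mirroring the main.rs hunk below, where division
binds tighter than subtraction):

    let old_soft: u64 = 1024; // previous RLIMIT_NOFILE soft limit
    let old_hard: u64 = 1_048_576; // hard limit, now also the new soft limit
    let max_cache_size = old_hard - old_soft / 2;
    assert_eq!(max_cache_size, 1_048_064);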
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
pbs-client/src/pxar/create.rs | 6 ++++--
proxmox-backup-client/src/main.rs | 21 ++++++++++++++++---
.../src/proxmox_restore_daemon/api.rs | 1 +
pxar-bin/src/main.rs | 1 +
4 files changed, 24 insertions(+), 5 deletions(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 42e4dc502..fcfb9a09c 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -56,6 +56,8 @@ pub struct PxarCreateOptions {
pub skip_e2big_xattr: bool,
/// Reference state for partial backups
pub previous_ref: Option<PxarPrevRef>,
+ /// Maximum number of lookahead cache entries
+ pub max_cache_size: Option<usize>,
}
pub type MetadataArchiveReader = Arc<dyn ReadAt + Send + Sync + 'static>;
@@ -275,7 +277,7 @@ where
forced_boundaries,
suggested_boundaries,
previous_payload_index,
- cache: PxarLookaheadCache::new(None),
+ cache: PxarLookaheadCache::new(options.max_cache_size),
reuse_stats: ReuseStats::default(),
};
@@ -1924,7 +1926,7 @@ mod tests {
forced_boundaries: Some(forced_boundaries),
previous_payload_index,
suggested_boundaries: Some(suggested_boundaries),
- cache: PxarLookaheadCache::new(),
+ cache: PxarLookaheadCache::new(None),
reuse_stats: ReuseStats::default(),
};
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index a17588edf..e01da26fc 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -41,7 +41,7 @@ use pbs_client::tools::{
crypto_parameters, format_key_source, get_encryption_key_password, KEYFD_SCHEMA,
KEYFILE_SCHEMA, MASTER_PUBKEY_FD_SCHEMA, MASTER_PUBKEY_FILE_SCHEMA,
},
- CHUNK_SIZE_SCHEMA, REPO_URL_SCHEMA,
+ raise_nofile_limit, CHUNK_SIZE_SCHEMA, REPO_URL_SCHEMA,
};
use pbs_client::{
delete_ticket_info, parse_backup_specification, view_task_result, BackupDetectionMode,
@@ -1074,7 +1074,8 @@ async fn create_backup(
.start_directory(std::ffi::CString::new(target.as_str())?.as_c_str())?;
let mut previous_ref = None;
- if detection_mode.is_metadata() {
+ let max_cache_size = if detection_mode.is_metadata() {
+ let old_rlimit = raise_nofile_limit()?;
if let Some(ref manifest) = previous_manifest {
// BackupWriter::start created a new snapshot, get the one before
if let Some(backup_time) = client.previous_backup_time().await? {
@@ -1100,7 +1101,20 @@ async fn create_backup(
.await?
}
}
- }
+
+ if old_rlimit.rlim_max <= 4096 {
+ log::info!(
+ "resource limit for open file handles low: {}",
+ old_rlimit.rlim_max,
+ );
+ }
+
+ Some(usize::try_from(
+ old_rlimit.rlim_max - old_rlimit.rlim_cur / 2,
+ )?)
+ } else {
+ None
+ };
let pxar_options = pbs_client::pxar::PxarCreateOptions {
device_set: devices.clone(),
@@ -1109,6 +1123,7 @@ async fn create_backup(
skip_lost_and_found,
skip_e2big_xattr,
previous_ref,
+ max_cache_size,
};
let upload_options = UploadOptions {
diff --git a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
index 80af5011e..0a535b7a7 100644
--- a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
+++ b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
@@ -361,6 +361,7 @@ fn extract(
skip_lost_and_found: false,
skip_e2big_xattr: false,
previous_ref: None,
+ max_cache_size: None,
};
let pxar_writer = pxar::PxarVariant::Unified(TokioWriter::new(writer));
diff --git a/pxar-bin/src/main.rs b/pxar-bin/src/main.rs
index fa584b4e8..e62348e25 100644
--- a/pxar-bin/src/main.rs
+++ b/pxar-bin/src/main.rs
@@ -376,6 +376,7 @@ async fn create_archive(
skip_lost_and_found: false,
skip_e2big_xattr: false,
previous_ref: None,
+ max_cache_size: None,
};
let source = PathBuf::from(source);
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 52/58] api: datastore: add endpoint to lookup entries via pxar archive
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (49 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 51/58] client: pxar: set cache limit based on " Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 53/58] api: datastore: add optional archive-name to file-restore Christian Ebner
` (6 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Add an API endpoint `pxar-lookup` to access the contents of a pxar
archive via a server side pxar accessor, providing the same response
as the current `catalog` lookup. The intention is to fully replace
the catalog for split pxar archives, accessing all entries via the
metadata archive instead of the catalog.
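Both `archive-name` and `filepath` are expected base64 encoded; an
illustrative parameter set (archive and path names made up, `base64`
crate as already used by the endpoint, request path abbreviated):

    // GET .../admin/datastore/{store}/pxar-lookup plus the usual
    // backup-type/backup-id/backup-time snapshot parameters
    let archive_name = base64::encode("root.mpxar.didx");
    let filepath = base64::encode("/etc"); // or literally "root" / "/" for the root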
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- not present in previous version
src/api2/admin/datastore.rs | 149 +++++++++++++++++++++++++++++++++++-
1 file changed, 147 insertions(+), 2 deletions(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 34a9105dd..abc4a4fba 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -30,7 +30,8 @@ use proxmox_sys::{task_log, task_warn};
use proxmox_time::CalendarEvent;
use pxar::accessor::aio::Accessor;
-use pxar::EntryKind;
+use pxar::format::SignedDuration;
+use pxar::{mode, EntryKind};
use pbs_api_types::{
print_ns_and_snapshot, print_store_and_ns, Authid, BackupContent, BackupNamespace, BackupType,
@@ -47,7 +48,7 @@ use pbs_client::pxar::{create_tar, create_zip};
use pbs_config::CachedUserInfo;
use pbs_datastore::backup_info::BackupInfo;
use pbs_datastore::cached_chunk_reader::CachedChunkReader;
-use pbs_datastore::catalog::{ArchiveEntry, CatalogReader};
+use pbs_datastore::catalog::{ArchiveEntry, CatalogReader, DirEntryAttribute};
use pbs_datastore::data_blob::DataBlob;
use pbs_datastore::data_blob_reader::DataBlobReader;
use pbs_datastore::dynamic_index::{BufferedDynamicReader, DynamicIndexReader, LocalDynamicReadAt};
@@ -1720,6 +1721,149 @@ pub async fn catalog(
.await?
}
+#[api(
+ input: {
+ properties: {
+ store: { schema: DATASTORE_SCHEMA },
+ "archive-name": {
+ type: String,
+ description: "Name of the archive to lookup given filepath (base64 encoded)",
+ },
+ ns: {
+ type: BackupNamespace,
+ optional: true,
+ },
+ backup_dir: {
+ type: pbs_api_types::BackupDir,
+ flatten: true,
+ },
+ "filepath": {
+ description: "Base64 encoded path.",
+ type: String,
+ }
+ },
+ },
+ access: {
+ description: "Requires on /datastore/{store}[/{namespace}] either DATASTORE_READ for any or \
+ DATASTORE_BACKUP and being the owner of the group",
+ permission: &Permission::Anybody,
+ },
+)]
+/// Get the entries of the given path of the pxar (metadata) archive
+pub async fn pxar_lookup(
+ store: String,
+ archive_name: String,
+ ns: Option<BackupNamespace>,
+ backup_dir: pbs_api_types::BackupDir,
+ filepath: String,
+ rpcenv: &mut dyn RpcEnvironment,
+) -> Result<Vec<ArchiveEntry>, Error> {
+ let archive_name = base64::decode(archive_name)
+ .map_err(|err| format_err!("base64 decode of archive-name failed - {err}"))?;
+
+ let archive_name = std::str::from_utf8(&archive_name)?;
+
+ let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
+
+ let ns = ns.unwrap_or_default();
+
+ let datastore = check_privs_and_load_store(
+ &store,
+ &ns,
+ &auth_id,
+ PRIV_DATASTORE_READ,
+ PRIV_DATASTORE_BACKUP,
+ Some(Operation::Read),
+ &backup_dir.group,
+ )?;
+
+ let backup_dir = datastore.backup_dir(ns, backup_dir)?;
+
+ let file_path = if filepath != "root" && filepath != "/" {
+ base64::decode(filepath)
+ .map_err(|err| format_err!("base64 decode of filepath failed - {err}"))?
+ } else {
+ vec![b'/']
+ };
+
+ let (manifest, files) = read_backup_index(&backup_dir)?;
+ for file in files {
+ if file.filename == archive_name && file.crypt_mode == Some(CryptMode::Encrypt) {
+ bail!("cannot decode '{archive_name}' - is encrypted");
+ }
+ }
+
+ let (archive_name, payload_archive_name) =
+ pbs_client::tools::get_pxar_archive_names(archive_name, &manifest)?;
+ let (reader, archive_size) =
+ get_local_pxar_reader(datastore.clone(), &manifest, &backup_dir, &archive_name)?;
+
+ let reader = if let Some(payload_archive_name) = payload_archive_name {
+ let payload_input =
+ get_local_pxar_reader(datastore, &manifest, &backup_dir, &payload_archive_name)?;
+ pxar::PxarVariant::Split(reader, payload_input)
+ } else {
+ pxar::PxarVariant::Unified(reader)
+ };
+ let accessor = Accessor::new(reader, archive_size).await?;
+
+ let root = accessor.open_root().await?;
+ let path = OsStr::from_bytes(&file_path).to_os_string();
+ let dir_entry = root
+ .lookup(&path)
+ .await
+ .map_err(|err| format_err!("lookup failed - {err}"))?
+ .ok_or_else(|| format_err!("lookup failed - error opening '{path:?}'"))?;
+
+ let mut entries = Vec::new();
+ if let EntryKind::Directory = dir_entry.kind() {
+ let dir_entry = dir_entry
+ .enter_directory()
+ .map_err(|err| format_err!("failed to enter directory - {err}"))
+ .await?;
+
+ let mut entries_iter = dir_entry.read_dir();
+ while let Some(entry) = entries_iter.next().await {
+ let entry = entry?.decode_entry().await?;
+
+ let entry_attr = match entry.kind() {
+ EntryKind::Version(_) | EntryKind::Prelude(_) | EntryKind::GoodbyeTable => continue,
+ EntryKind::Directory => DirEntryAttribute::Directory {
+ start: entry.entry_range_info().entry_range.start,
+ },
+ EntryKind::File { size, .. } => {
+ let mtime = match entry.metadata().mtime_as_duration() {
+ SignedDuration::Positive(val) => i64::try_from(val.as_secs())?,
+ SignedDuration::Negative(val) => -1 * i64::try_from(val.as_secs())?,
+ };
+ DirEntryAttribute::File { size: *size, mtime }
+ }
+ EntryKind::Device(_) => match entry.metadata().file_type() {
+ mode::IFBLK => DirEntryAttribute::BlockDevice,
+ mode::IFCHR => DirEntryAttribute::CharDevice,
+ _ => bail!("encountered unknown device type"),
+ },
+ EntryKind::Symlink(_) => DirEntryAttribute::Symlink,
+ EntryKind::Hardlink(_) => DirEntryAttribute::Hardlink,
+ EntryKind::Fifo => DirEntryAttribute::Fifo,
+ EntryKind::Socket => DirEntryAttribute::Socket,
+ };
+
+ entries.push(ArchiveEntry::new(
+ entry.path().as_os_str().as_bytes(),
+ Some(&entry_attr),
+ ));
+ }
+ } else {
+ bail!(format!(
+ "expected directory entry, got entry kind '{:?}'",
+ dir_entry.kind()
+ ));
+ }
+
+ Ok(entries)
+}
+
#[sortable]
pub const API_METHOD_PXAR_FILE_DOWNLOAD: ApiMethod = ApiMethod::new(
&ApiHandler::AsyncHttp(&pxar_file_download),
@@ -2414,6 +2558,7 @@ const DATASTORE_INFO_SUBDIRS: SubdirMap = &[
"pxar-file-download",
&Router::new().download(&API_METHOD_PXAR_FILE_DOWNLOAD),
),
+ ("pxar-lookup", &Router::new().get(&API_METHOD_PXAR_LOOKUP)),
("rrd", &Router::new().get(&API_METHOD_GET_RRD_STATS)),
(
"snapshots",
--
2.39.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 59+ messages in thread
* [pbs-devel] [PATCH v9 proxmox-backup 53/58] api: datastore: add optional archive-name to file-restore
2024-06-05 10:53 [pbs-devel] [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup Christian Ebner
` (50 preceding siblings ...)
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 52/58] api: datastore: add endpoint to lookup entries via pxar archive Christian Ebner
@ 2024-06-05 10:54 ` Christian Ebner
2024-06-05 10:54 ` [pbs-devel] [PATCH v9 proxmox-backup 54/58] www: content: lookup via metadata archive instead of catalog Christian Ebner
` (5 subsequent siblings)
57 siblings, 0 replies; 59+ messages in thread
From: Christian Ebner @ 2024-06-05 10:54 UTC (permalink / raw)
To: pbs-devel
Allow passing the archive name as an optional API call parameter
instead of having it as a prefix to the path.
If this parameter is given, it is used directly instead of splitting
off the archive name from the path, leaving the path untouched.
This allows restoring single files from the archive without having to
artificially construct the path in the case of file restores for split
pxar archives, where the response path of the listing does not include
the archive, as opposed to the response provided by lookup via the
catalog.
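As a rough illustration, a caller could construct the two base64 encoded
parameters like this (a minimal sketch; the parameter names follow the diff
below, while the archive and path values are made up):

    // Minimal sketch: prepare request parameters for pxar-file-download
    // when passing the archive name separately (values are placeholders).
    fn main() {
        let archive_name = base64::encode("root.mpxar.didx");
        let filepath = base64::encode("/etc/hostname");
        println!("archive-name={archive_name}&filepath={filepath}");
    }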
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- not present in previous version
src/api2/admin/datastore.rs | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index abc4a4fba..cd014878e 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1877,6 +1877,7 @@ pub const API_METHOD_PXAR_FILE_DOWNLOAD: ApiMethod = ApiMethod::new(
("backup-time", false, &BACKUP_TIME_SCHEMA),
("filepath", false, &StringSchema::new("Base64 encoded path").schema()),
("tar", true, &BooleanSchema::new("Download as .tar.zst").schema()),
+ ("archive-name", true, &StringSchema::new("Base64 encoded archive name").schema()),
]),
)
).access(
@@ -1944,9 +1945,17 @@ pub fn pxar_file_download(
components.remove(0);
}
- let mut split = components.splitn(2, |c| *c == b'/');
- let pxar_name = std::str::from_utf8(split.next().unwrap())?;
- let file_path = split.next().unwrap_or(b"/");
+ let (pxar_name, file_path) = if let Some(archive_name) = param["archive-name"].as_str() {
+ let archive_name = base64::decode(archive_name)
+ .map_err(|err| format_err!("base64 decode of archive-name failed - {err}"))?;
+ (archive_name, base64::decode(&filepath)?)
+ } else {
+ let mut split = components.splitn(2, |c| *c == b'/');
+ let pxar_name = split.next().unwrap();
+ let file_path = split.next().unwrap_or(b"/");
+ (pxar_name.to_owned(), file_path.to_owned())
+ };
+ let pxar_name = std::str::from_utf8(&pxar_name)?;
let (manifest, files) = read_backup_index(&backup_dir)?;
for file in files {
if file.filename == pxar_name && file.crypt_mode == Some(CryptMode::Encrypt) {
@@ -1969,7 +1978,7 @@ pub fn pxar_file_download(
let decoder = Accessor::new(reader, archive_size).await?;
let root = decoder.open_root().await?;
- let path = OsStr::from_bytes(file_path).to_os_string();
+ let path = OsStr::from_bytes(&file_path).to_os_string();
let file = root
.lookup(&path)
.await?
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 54/58] www: content: lookup via metadata archive instead of catalog
From: Christian Ebner @ 2024-06-05 10:54 UTC
To: pbs-devel
In case of pxar archives with split metadata and payload data, the
metadata archive has to be used to look up entries for navigation
before performing a single file restore.
Decide whether to use the `catalog` or the `pxar-lookup` API endpoint
based on the archive's filename extension.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- not present in previous version
www/datastore/Content.js | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/www/datastore/Content.js b/www/datastore/Content.js
index 6dd1ab319..f3ff3998c 100644
--- a/www/datastore/Content.js
+++ b/www/datastore/Content.js
@@ -789,15 +789,28 @@ Ext.define('PBS.DataStoreContent', {
if (view.namespace && view.namespace !== '') {
extraParams.ns = view.namespace;
}
- Ext.create('Proxmox.window.FileBrowser', {
- title: `${type}/${id}/${timetext}`,
- listURL: `/api2/json/admin/datastore/${view.datastore}/catalog`,
- downloadURL: `/api2/json/admin/datastore/${view.datastore}/pxar-file-download`,
- extraParams,
- enableTar: true,
- downloadPrefix: `${type}-${id}-`,
- archive: rec.data.filename,
- }).show();
+
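+ // split pxar archives expose their entries via the pxar-lookup endpoint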
+ if (rec.data.filename.endsWith(".mpxar.didx")) {
+ extraParams['archive-name'] = btoa(rec.data.filename);
+ Ext.create('Proxmox.window.FileBrowser', {
+ title: `${type}/${id}/${timetext}`,
+ listURL: `/api2/json/admin/datastore/${view.datastore}/pxar-lookup`,
+ downloadURL: `/api2/json/admin/datastore/${view.datastore}/pxar-file-download`,
+ extraParams,
+ enableTar: true,
+ downloadPrefix: `${type}-${id}-`,
+ archive: rec.data.filename,
+ }).show();
+ } else {
+ Ext.create('Proxmox.window.FileBrowser', {
+ title: `${type}/${id}/${timetext}`,
+ listURL: `/api2/json/admin/datastore/${view.datastore}/catalog`,
+ downloadURL: `/api2/json/admin/datastore/${view.datastore}/pxar-file-download`,
+ extraParams,
+ enableTar: true,
+ downloadPrefix: `${type}-${id}-`,
+ }).show();
+ }
},
filter: function(item, value) {
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 55/58] docs: file formats: describe split pxar archive file layout
From: Christian Ebner @ 2024-06-05 10:54 UTC
To: pbs-devel
Describes the pxar metadata archive and the corresponding pxar payload
file-format layout.
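To make the header layout concrete, a minimal parsing sketch based on the
table in the diff below (assuming a 16 byte little-endian header made up of an
8 byte type hash followed by an 8 byte size) could look like:

    // Minimal sketch: split a [u8; 16] pxar payload header into its
    // little-endian type hash and size fields (layout assumed from the table).
    fn parse_payload_header(buf: &[u8; 16]) -> (u64, u64) {
        let type_hash = u64::from_le_bytes(buf[0..8].try_into().unwrap());
        let size = u64::from_le_bytes(buf[8..16].try_into().unwrap());
        (type_hash, size)
    }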
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
docs/file-formats.rst | 46 ++++++++++++++++++++++++++++++++
docs/meta-format-overview.dot | 50 +++++++++++++++++++++++++++++++++++
2 files changed, 96 insertions(+)
create mode 100644 docs/meta-format-overview.dot
diff --git a/docs/file-formats.rst b/docs/file-formats.rst
index 43ecfefce..77d55b5ef 100644
--- a/docs/file-formats.rst
+++ b/docs/file-formats.rst
@@ -8,7 +8,53 @@ Proxmox File Archive Format (``.pxar``)
.. graphviz:: pxar-format-overview.dot
+.. _pxar-meta-format:
+Proxmox File Archive Format - Meta (``.mpxar``)
+-----------------------------------------------
+
+Pxar metadata archive with the same structure as a regular pxar archive, except
+that regular file payloads are not contained within the archive itself, but are
+instead stored as payload references to the corresponding pxar payload
+(``.ppxar``) file.
+
+It can be used to look up all archive entries and metadata without the size
+overhead introduced by the file payloads.
+
+.. graphviz:: meta-format-overview.dot
+
+.. _ppxar-format:
+
+Proxmox File Archive Format - Payload (``.ppxar``)
+--------------------------------------------------
+
+Pxar payload file storing the regular file payloads to be referenced and
+accessed by the corresponding pxar metadata (``.mpxar``) archive. It contains a
+concatenation of regular file payloads, each prefixed by a `PAYLOAD` header.
+Further, the referenced payload entries might be separated by padding
+(full/partial payloads that are not referenced), introduced when reusing chunks
+of a previous backup run whose chunk boundaries did not align with the payload
+entry offsets.
+
+All headers are stored as little-endian.
+
+.. list-table::
+ :widths: auto
+
+ * - ``PAYLOAD_START_MARKER``
+ - header of ``[u8; 16]`` consisting of type hash and size;
+ marks start
+ * - ``PAYLOAD``
+ - header of ``[u8; 16]`` consisting of type hash and size;
+ referenced by metadata archive
+ * - Payload
+ - raw regular file payload
+ * - Padding
+ - partial/full unreferenced payloads, caused by unaligned chunk boundary
+ * - ...
+ - further concatenation of payload header, payload and padding
+ * - ``PAYLOAD_TAIL_MARKER``
+ - header of ``[u8; 16]`` consisting of type hash and size;
+ marks end
.. _data-blob-format:
Data Blob Format (``.blob``)
diff --git a/docs/meta-format-overview.dot b/docs/meta-format-overview.dot
new file mode 100644
index 000000000..7eea4b55b
--- /dev/null
+++ b/docs/meta-format-overview.dot
@@ -0,0 +1,50 @@
+digraph g {
+graph [
+rankdir = "LR"
+fontname="Helvetica"
+];
+node [
+fontsize = "16"
+shape = "record"
+];
+edge [
+];
+
+"archive" [
+label = "archive.mpxar"
+shape = "record"
+];
+
+"rootdir" [
+label = "<fv>FORMAT_VERSION\l|PRELUDE\l|<f0>ENTRY\l|\{XATTR\}\* extended attribute list\l|\{ACL_USER\}\* USER ACL entries\l|\{ACL_GROUP\}\* GROUP ACL entries\l|\[ACL_GROUP_OBJ\] the ACL_GROUP_OBJ \l|\[ACL_DEFAULT\] the various default ACL fields\l|\{ACL_DEFAULT_USER\}\* USER ACL entries\l|\{ACL_DEFAULT_GROUP\}\* GROUP ACL entries\l|\[FCAPS\] file capability in Linux disk format\l|\[QUOTA_PROJECT_ID\] the ext4/xfs quota project ID\l|{<pl> PAYLOAD_REF|SYMLINK|DEVICE|{<de> \{DirectoryEntries\}\*|GOODBYE}}"
+shape = "record"
+];
+
+
+"entry" [
+label = "<f0> size: u64 = 64\l|type: u64 = ENTRY\l|feature_flags: u64\l|mode: u64\l|flags: u64\l|uid: u64\l|gid: u64\l|mtime: u64\l"
+labeljust = "l"
+shape = "record"
+];
+
+
+
+"direntry" [
+label = "<f0> FILENAME\l|{ENTRY\l|HARDLINK\l}"
+shape = "record"
+];
+
+"payloadrefentry" [
+label = "<f0> offset: u64\l|size: u64\l"
+shape = "record"
+];
+
+"archive" -> "rootdir":fv
+
+"rootdir":f0 -> "entry":f0
+
+"rootdir":de -> "direntry":f0
+
+"rootdir":pl -> "payloadrefentry":f0
+
+}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 56/58] docs: add section describing change detection mode
From: Christian Ebner @ 2024-06-05 10:54 UTC
To: pbs-devel
Describe the motivation and basic principle of the client's change
detection mode and show an example invocation.
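The re-use decision sketched in the added section can be illustrated roughly
as follows (names and the threshold handling are made up for illustration; the
actual client gathers this information through a look-ahead cache spanning
multiple entries):

    // Minimal sketch: re-use cached chunks only while the padding
    // (unreferenced bytes pulled in by chunk boundaries) stays below a
    // threshold ratio of the total re-used range.
    struct CachedRange {
        reused_bytes: u64,  // payload bytes actually referenced
        padding_bytes: u64, // unreferenced bytes within the re-used chunks
    }

    fn should_reuse(range: &CachedRange, max_padding_ratio: f64) -> bool {
        let total = (range.reused_bytes + range.padding_bytes) as f64;
        total > 0.0 && (range.padding_bytes as f64) / total < max_padding_ratio
    }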
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- adapted to suggested rewording
docs/backup-client.rst | 47 +++++++++++++++++++++++++++++++++++++
docs/technical-overview.rst | 3 +++
2 files changed, 50 insertions(+)
diff --git a/docs/backup-client.rst b/docs/backup-client.rst
index 00a1abbb3..e541c5537 100644
--- a/docs/backup-client.rst
+++ b/docs/backup-client.rst
@@ -280,6 +280,53 @@ Multiple paths can be excluded like this:
# proxmox-backup-client backup.pxar:./linux --exclude=/usr --exclude=/rust
+.. _client_change_detection_mode:
+
+Change Detection Mode
+~~~~~~~~~~~~~~~~~~~~~
+
+File-based backups containing a lot of data can take a long time, as the default
+behavior for the Proxmox backup client is to read all data and encode it into a
+pxar archive.
+The encoded stream is split into variable-sized chunks. For each chunk, a digest
+is calculated and used to decide whether the chunk needs to be uploaded or can
+be indexed without upload, as it is already available on the server (and
+therefore deduplicated). If the backed-up files are largely unchanged,
+re-reading them only to detect that the corresponding chunks do not need to be
+uploaded after all is time-consuming and undesirable.
+
+The backup client's `change-detection-mode` can be switched from the default to
+`metadata` based detection to avoid the limitation described above, instructing
+the client to avoid re-reading files with unchanged metadata whenever possible.
+When using this mode, instead of the regular pxar archive, the backup snapshot
+is stored into two separate files: the `mpxar`, containing the archive's
+metadata, and the `ppxar`, containing a concatenation of the file contents. This
+splitting allows for efficient metadata lookups.
+
+Setting the `change-detection-mode` to `data` creates the same split archive as
+the `metadata` mode, but without using a previous reference, therefore
+re-encoding all file payloads.
+When creating the backup archives, the current file metadata is compared to the
+metadata looked up in the previous `mpxar` archive.
+The metadata comparison includes file size, file type, ownership and permission
+information, as well as ACLs and extended attributes and, most importantly, the
+file's mtime; for details see the
+:ref:`pxar metadata archive format <pxar-meta-format>`.
+
+If unchanged, the entry is cached for possible re-use of its content chunks
+without re-reading, by indexing the already present chunks containing the
+contents from the previous backup snapshot. Since a file might only partially
+re-use chunks (thereby introducing wasted space in the form of padding), the
+decision whether to re-use or re-encode the currently cached entries is
+postponed until enough information is available, comparing the possible padding
+against a threshold value.
+
+The following shows an example of the client invocation with the `metadata`
+mode:
+
+.. code-block:: console
+
+ # proxmox-backup-client backup.pxar:./linux --change-detection-mode=metadata
+
.. _client_encryption:
Encryption
diff --git a/docs/technical-overview.rst b/docs/technical-overview.rst
index 89835a7cc..a8b1c7268 100644
--- a/docs/technical-overview.rst
+++ b/docs/technical-overview.rst
@@ -28,6 +28,9 @@ which are not chunked, e.g. the client log), or one or more indexes
When uploading an index, the client first has to read the source data, chunk it
and send the data as chunks with their identifying checksum to the server.
+When using the :ref:`change detection mode <client_change_detection_mode>`,
+payload chunks for unchanged files are reused from the previous snapshot
+instead of reading the source data again.
If there is a previous Snapshot in the backup group, the client can first
download the chunk list of the previous Snapshot. If it detects a chunk that
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 57/58] test-suite: add detection mode change benchmark
From: Christian Ebner @ 2024-06-05 10:54 UTC
To: pbs-devel
Introduces the proxmox-backup-test-suite crate, intended for
benchmarking and high-level, user-facing testing.
The initial code includes a benchmark intended for regression testing
of the proxmox-backup-client when using the different file change
detection modes during backup.
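For illustration, an invocation could look like the following (paths are
placeholders; the subcommands and the `--number-of-runs` option are the ones
defined in the code below):

    # proxmox-backup-test-suite detection-mode-bench prepare --target /path/to/testdata
    # proxmox-backup-test-suite detection-mode-bench run testdata.pxar:/path/to/testdata --number-of-runs 5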
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
Cargo.toml | 1 +
proxmox-backup-test-suite/Cargo.toml | 18 ++
.../src/detection_mode_bench.rs | 294 ++++++++++++++++++
proxmox-backup-test-suite/src/main.rs | 17 +
4 files changed, 330 insertions(+)
create mode 100644 proxmox-backup-test-suite/Cargo.toml
create mode 100644 proxmox-backup-test-suite/src/detection_mode_bench.rs
create mode 100644 proxmox-backup-test-suite/src/main.rs
diff --git a/Cargo.toml b/Cargo.toml
index 4119b3cac..e83c65b60 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -45,6 +45,7 @@ members = [
"proxmox-restore-daemon",
"pxar-bin",
+ "proxmox-backup-test-suite",
]
[lib]
diff --git a/proxmox-backup-test-suite/Cargo.toml b/proxmox-backup-test-suite/Cargo.toml
new file mode 100644
index 000000000..3f899e9bc
--- /dev/null
+++ b/proxmox-backup-test-suite/Cargo.toml
@@ -0,0 +1,18 @@
+[package]
+name = "proxmox-backup-test-suite"
+version = "0.1.0"
+authors.workspace = true
+edition.workspace = true
+
+[dependencies]
+anyhow.workspace = true
+futures.workspace = true
+serde.workspace = true
+serde_json.workspace = true
+
+pbs-client.workspace = true
+pbs-key-config.workspace = true
+pbs-tools.workspace = true
+proxmox-async.workspace = true
+proxmox-router = { workspace = true, features = ["cli"] }
+proxmox-schema = { workspace = true, features = [ "api-macro" ] }
diff --git a/proxmox-backup-test-suite/src/detection_mode_bench.rs b/proxmox-backup-test-suite/src/detection_mode_bench.rs
new file mode 100644
index 000000000..9a3c76802
--- /dev/null
+++ b/proxmox-backup-test-suite/src/detection_mode_bench.rs
@@ -0,0 +1,294 @@
+use std::path::Path;
+use std::process::Command;
+use std::{thread, time};
+
+use anyhow::{bail, format_err, Error};
+use serde_json::Value;
+
+use pbs_client::{
+ tools::{complete_repository, key_source::KEYFILE_SCHEMA, REPO_URL_SCHEMA},
+ BACKUP_SOURCE_SCHEMA,
+};
+use pbs_tools::json;
+use proxmox_router::cli::*;
+use proxmox_schema::api;
+
+const DEFAULT_NUMBER_OF_RUNS: u64 = 5;
+// Homepage https://cocodataset.org/
+const COCO_DATASET_SRC_URL: &str = "http://images.cocodataset.org/zips/unlabeled2017.zip";
+// Homepage https://kernel.org/
+const LINUX_GIT_REPOSITORY: &str =
+ "git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git";
+const LINUX_GIT_TAG: &str = "v6.5.5";
+
+pub(crate) fn detection_mode_bench_mgmt_cli() -> CliCommandMap {
+ let run_cmd_def = CliCommand::new(&API_METHOD_DETECTION_MODE_BENCH_RUN)
+ .arg_param(&["backupspec"])
+ .completion_cb("repository", complete_repository)
+ .completion_cb("keyfile", complete_file_name);
+
+ let prepare_cmd_def = CliCommand::new(&API_METHOD_DETECTION_MODE_BENCH_PREPARE);
+ CliCommandMap::new()
+ .insert("prepare", prepare_cmd_def)
+ .insert("run", run_cmd_def)
+}
+
+#[api(
+ input: {
+ properties: {
+ backupspec: {
+ type: Array,
+ description: "List of backup source specifications ([<label.ext>:<path>] ...)",
+ items: {
+ schema: BACKUP_SOURCE_SCHEMA,
+ }
+ },
+ repository: {
+ schema: REPO_URL_SCHEMA,
+ optional: true,
+ },
+ keyfile: {
+ schema: KEYFILE_SCHEMA,
+ optional: true,
+ },
+ "number-of-runs": {
+ description: "Number of times to repeat the run",
+ type: Integer,
+ optional: true,
+ },
+ }
+ }
+)]
+/// Run benchmark to compare performance for backups using different change detection modes.
+fn detection_mode_bench_run(param: Value) -> Result<(), Error> {
+ let mut pbc = Command::new("proxmox-backup-client");
+ pbc.arg("backup");
+
+ let backupspec_list = json::required_array_param(&param, "backupspec")?;
+ for backupspec in backupspec_list {
+ let arg = backupspec
+ .as_str()
+ .ok_or_else(|| format_err!("failed to parse backupspec"))?;
+ pbc.arg(arg);
+ }
+
+ if let Some(repo) = param["repository"].as_str() {
+ pbc.arg("--repository");
+ pbc.arg::<&str>(repo);
+ }
+
+ if let Some(keyfile) = param["keyfile"].as_str() {
+ pbc.arg("--keyfile");
+ pbc.arg::<&str>(keyfile);
+ }
+
+ let number_of_runs = param["number-of-runs"]
+ .as_u64()
+ .unwrap_or(DEFAULT_NUMBER_OF_RUNS);
+ if number_of_runs < 1 {
+ bail!("Number of runs must be at least 1, aborting.");
+ }
+
+ // First run is an initial run to make sure all chunks are present already, reducing side
+ // effects from filesystem caches etc.
+ let _stats_initial = do_run(&mut pbc, 1)?;
+
+ println!("\nStarting benchmarking backups with regular detection mode...\n");
+ let stats_reg = do_run(&mut pbc, number_of_runs)?;
+
+ // Make sure to have a valid reference with catalog format version 2
+ pbc.arg("--change-detection-mode=metadata");
+ let _stats_initial = do_run(&mut pbc, 1)?;
+
+ println!("\nStarting benchmarking backups with metadata detection mode...\n");
+ let stats_meta = do_run(&mut pbc, number_of_runs)?;
+
+ println!("\nCompleted benchmark with {number_of_runs} runs for each tested mode.");
+ println!("\nCompleted regular backup with:");
+ println!("Total runtime: {:.2} s", stats_reg.total);
+ println!("Average: {:.2} ± {:.2} s", stats_reg.avg, stats_reg.stddev);
+ println!("Min: {:.2} s", stats_reg.min);
+ println!("Max: {:.2} s", stats_reg.max);
+
+ println!("\nCompleted metadata detection mode backup with:");
+ println!("Total runtime: {:.2} s", stats_meta.total);
+ println!(
+ "Average: {:.2} ± {:.2} s",
+ stats_meta.avg, stats_meta.stddev
+ );
+ println!("Min: {:.2} s", stats_meta.min);
+ println!("Max: {:.2} s", stats_meta.max);
+
+ let diff_stddev =
+ ((stats_meta.stddev * stats_meta.stddev) + (stats_reg.stddev * stats_reg.stddev)).sqrt();
+ println!("\nDifferences (metadata based - regular):");
+ println!(
+ "Delta total runtime: {:.2} s ({:.2} %)",
+ stats_meta.total - stats_reg.total,
+ 100.0 * (stats_meta.total / stats_reg.total - 1.0),
+ );
+ println!(
+ "Delta average: {:.2} ± {:.2} s ({:.2} %)",
+ stats_meta.avg - stats_reg.avg,
+ diff_stddev,
+ 100.0 * (stats_meta.avg / stats_reg.avg - 1.0),
+ );
+ println!(
+ "Delta min: {:.2} s ({:.2} %)",
+ stats_meta.min - stats_reg.min,
+ 100.0 * (stats_meta.min / stats_reg.min - 1.0),
+ );
+ println!(
+ "Delta max: {:.2} s ({:.2} %)",
+ stats_meta.max - stats_reg.max,
+ 100.0 * (stats_meta.max / stats_reg.max - 1.0),
+ );
+
+ Ok(())
+}
+
+fn do_run(cmd: &mut Command, n_runs: u64) -> Result<Statistics, Error> {
+ // Avoid consecutive snapshot timestamps collision
+ thread::sleep(time::Duration::from_millis(1000));
+ let mut timings = Vec::with_capacity(n_runs as usize);
+ for iteration in 1..n_runs + 1 {
+ let start = std::time::SystemTime::now();
+ let mut child = cmd.spawn()?;
+ let exit_code = child.wait()?;
+ let elapsed = start.elapsed()?;
+ timings.push(elapsed);
+ if !exit_code.success() {
+ bail!("Run number {iteration} of {n_runs} failed, aborting.");
+ }
+ }
+
+ Ok(statistics(timings))
+}
+
+struct Statistics {
+ total: f64,
+ avg: f64,
+ stddev: f64,
+ min: f64,
+ max: f64,
+}
+
+fn statistics(timings: Vec<std::time::Duration>) -> Statistics {
+ let total = timings
+ .iter()
+ .fold(0f64, |sum, time| sum + time.as_secs_f64());
+ let avg = total / timings.len() as f64;
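+ // sample variance with Bessel's correction (n - 1 in the denominator)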
+ let var = 1f64 / (timings.len() - 1) as f64
+ * timings.iter().fold(0f64, |sq_sum, time| {
+ let diff = time.as_secs_f64() - avg;
+ sq_sum + diff * diff
+ });
+ let stddev = var.sqrt();
+ let min = timings.iter().min().unwrap().as_secs_f64();
+ let max = timings.iter().max().unwrap().as_secs_f64();
+
+ Statistics {
+ total,
+ avg,
+ stddev,
+ min,
+ max,
+ }
+}
+
+#[api(
+ input: {
+ properties: {
+ target: {
+ description: "target path to prepare test data.",
+ },
+ },
+ },
+)]
+/// Prepare files required for detection mode backup benchmarks.
+fn detection_mode_bench_prepare(target: String) -> Result<(), Error> {
+ let linux_repo_target = format!("{target}/linux");
+ let coco_dataset_target = format!("{target}/coco");
+ git_clone(LINUX_GIT_REPOSITORY, linux_repo_target.as_str())?;
+ git_checkout(LINUX_GIT_TAG, linux_repo_target.as_str())?;
+ wget_download(COCO_DATASET_SRC_URL, coco_dataset_target.as_str())?;
+
+ Ok(())
+}
+
+fn git_clone(repo: &str, target: &str) -> Result<(), Error> {
+ println!("Calling git clone for '{repo}'.");
+ let target_git = format!("{target}/.git");
+ let path = Path::new(&target_git);
+ if let Ok(true) = path.try_exists() {
+ println!("Target '{target}' already contains a git repository, skip.");
+ return Ok(());
+ }
+
+ let mut git = Command::new("git");
+ git.args(["clone", repo, target]);
+
+ let mut child = git.spawn()?;
+ let exit_code = child.wait()?;
+ if exit_code.success() {
+ println!("git clone finished with success.");
+ } else {
+ bail!("git clone failed for '{target}'.");
+ }
+
+ Ok(())
+}
+
+fn git_checkout(checkout_target: &str, target: &str) -> Result<(), Error> {
+ println!("Calling git checkout '{checkout_target}'.");
+ let mut git = Command::new("git");
+ git.args(["-C", target, "checkout", checkout_target]);
+
+ let mut child = git.spawn()?;
+ let exit_code = child.wait()?;
+ if exit_code.success() {
+ println!("git checkout finished with success.");
+ } else {
+ bail!("git checkout '{checkout_target}' failed for '{target}'.");
+ }
+ Ok(())
+}
+
+fn wget_download(source_url: &str, target: &str) -> Result<(), Error> {
+ let path = Path::new(&target);
+ if let Ok(true) = path.try_exists() {
+ println!("Target '{target}' already exists, skip.");
+ return Ok(());
+ }
+ let zip = format!("{target}/unlabeled2017.zip");
+ let path = Path::new(&zip);
+ if !path.try_exists()? {
+ println!("Download archive using wget from '{source_url}' to '{target}'.");
+ let mut wget = Command::new("wget");
+ wget.args(["-P", target, source_url]);
+
+ let mut child = wget.spawn()?;
+ let exit_code = child.wait()?;
+ if exit_code.success() {
+ println!("Download finished with success.");
+ } else {
+ bail!("Failed to download '{source_url}' to '{target}'.");
+ }
+ } else {
+ println!("Target '{target}' already contains download, skip download.");
+ }
+
+ let mut unzip = Command::new("unzip");
+ unzip.args([&zip, "-d", target]);
+
+ let mut child = unzip.spawn()?;
+ let exit_code = child.wait()?;
+ if exit_code.success() {
+ println!("Extracting zip archive finished with success.");
+ } else {
+ bail!("Failed to extract zip archive '{zip}' to '{target}'.");
+ }
+ Ok(())
+}
diff --git a/proxmox-backup-test-suite/src/main.rs b/proxmox-backup-test-suite/src/main.rs
new file mode 100644
index 000000000..0a5b436a8
--- /dev/null
+++ b/proxmox-backup-test-suite/src/main.rs
@@ -0,0 +1,17 @@
+use proxmox_router::cli::*;
+
+mod detection_mode_bench;
+
+fn main() {
+ let cmd_def = CliCommandMap::new().insert(
+ "detection-mode-bench",
+ detection_mode_bench::detection_mode_bench_mgmt_cli(),
+ );
+
+ let rpcenv = CliEnvironment::new();
+ run_cli_command(
+ cmd_def,
+ rpcenv,
+ Some(|future| proxmox_async::runtime::main(future)),
+ );
+}
--
2.39.2
* [pbs-devel] [PATCH v9 proxmox-backup 58/58] test-suite: Makefile: add debian package and related files
From: Christian Ebner @ 2024-06-05 10:54 UTC
To: pbs-devel
Adds the required Makefile and Debian packaging entries to package the
test suite binary as a standalone Debian package.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 8:
- no changes
Makefile | 18 +++++++++++-------
debian/control | 7 +++++++
debian/proxmox-backup-client.bash-completion | 1 +
debian/proxmox-backup-test-suite.bc | 8 ++++++++
debian/proxmox-backup-test-suite.install | 3 +++
docs/Makefile | 2 ++
docs/command-line-tools.rst | 5 +++++
docs/command-syntax.rst | 4 ++++
docs/conf.py | 1 +
docs/proxmox-backup-test-suite/description.rst | 2 ++
docs/proxmox-backup-test-suite/man1.rst | 17 +++++++++++++++++
zsh-completions/_proxmox-backup-test-suite | 13 +++++++++++++
12 files changed, 74 insertions(+), 7 deletions(-)
create mode 100644 debian/proxmox-backup-test-suite.bc
create mode 100644 debian/proxmox-backup-test-suite.install
create mode 100644 docs/proxmox-backup-test-suite/description.rst
create mode 100644 docs/proxmox-backup-test-suite/man1.rst
create mode 100644 zsh-completions/_proxmox-backup-test-suite
diff --git a/Makefile b/Makefile
index 03e938767..8529363ce 100644
--- a/Makefile
+++ b/Makefile
@@ -8,11 +8,12 @@ SUBDIRS := etc www docs templates
# Binaries usable by users
USR_BIN := \
- proxmox-backup-client \
- proxmox-file-restore \
- pxar \
- proxmox-tape \
- pmtx \
+ proxmox-backup-client \
+ proxmox-backup-test-suite \
+ proxmox-file-restore \
+ pxar \
+ proxmox-tape \
+ pmtx \
pmt
# Binaries usable by admins
@@ -60,9 +61,10 @@ CLIENT_DBG_DEB=$(PACKAGE)-client-dbgsym_$(DEB_VERSION)_$(ARCH).deb
RESTORE_DEB=proxmox-backup-file-restore_$(DEB_VERSION)_$(ARCH).deb
RESTORE_DBG_DEB=proxmox-backup-file-restore-dbgsym_$(DEB_VERSION)_$(ARCH).deb
DOC_DEB=$(PACKAGE)-docs_$(DEB_VERSION)_all.deb
+TEST_SUITE_DEB=$(PACKAGE)-test-suite_$(DEB_VERSION)_$(ARCH).deb
DEBS=$(SERVER_DEB) $(SERVER_DBG_DEB) $(CLIENT_DEB) $(CLIENT_DBG_DEB) \
- $(RESTORE_DEB) $(RESTORE_DBG_DEB)
+ $(RESTORE_DEB) $(RESTORE_DBG_DEB) $(TEST_SUITE_DEB)
DSC = rust-$(PACKAGE)_$(DEB_VERSION).dsc
@@ -165,6 +167,8 @@ $(COMPILED_BINS) $(COMPILEDIR)/dump-catalog-shell-cli $(COMPILEDIR)/docgen: .do-
--bin proxmox-backup-client \
--bin dump-catalog-shell-cli \
--bin proxmox-backup-debug \
+ --package proxmox-backup-test-suite \
+ --bin proxmox-backup-test-suite \
--package proxmox-file-restore \
--bin proxmox-file-restore \
--package pxar-bin \
@@ -218,7 +222,7 @@ upload: UPLOAD_DIST ?= $(DEB_DISTRIBUTION)
upload: $(SERVER_DEB) $(CLIENT_DEB) $(RESTORE_DEB) $(DOC_DEB)
# check if working directory is clean
git diff --exit-code --stat && git diff --exit-code --stat --staged
- tar cf - $(SERVER_DEB) $(SERVER_DBG_DEB) $(DOC_DEB) $(CLIENT_DEB) $(CLIENT_DBG_DEB) \
+ tar cf - $(SERVER_DEB) $(SERVER_DBG_DEB) $(DOC_DEB) $(CLIENT_DEB) $(CLIENT_DBG_DEB) $(TEST_SUITE_DEB) \
| ssh -X repoman@repo.proxmox.com upload --product pbs --dist $(UPLOAD_DIST)
tar cf - $(CLIENT_DEB) $(CLIENT_DBG_DEB) | ssh -X repoman@repo.proxmox.com upload --product "pve,pmg,pbs-client" --dist $(UPLOAD_DIST)
tar cf - $(RESTORE_DEB) $(RESTORE_DBG_DEB) | ssh -X repoman@repo.proxmox.com upload --product "pve" --dist $(UPLOAD_DIST)
diff --git a/debian/control b/debian/control
index 60fdabd5f..bbf6d2e8a 100644
--- a/debian/control
+++ b/debian/control
@@ -216,3 +216,10 @@ Description: Proxmox Backup single file restore tools for pxar and block device
This package contains the Proxmox Backup single file restore client for
restoring individual files and folders from both host/container and VM/block
device backups. It includes a block device restore driver using QEMU.
+
+Package: proxmox-backup-test-suite
+Architecture: any
+Depends: proxmox-backup-client, ${shlibs:Depends}
+Description: Proxmox Backup Test Suite tool
+ This package contains the Proxmox Backup Test Suite, which provides a cli tool
+ to run performance tests.
diff --git a/debian/proxmox-backup-client.bash-completion b/debian/proxmox-backup-client.bash-completion
index 437360175..c4ff02ae6 100644
--- a/debian/proxmox-backup-client.bash-completion
+++ b/debian/proxmox-backup-client.bash-completion
@@ -1,2 +1,3 @@
debian/proxmox-backup-client.bc proxmox-backup-client
+debian/proxmox-backup-test-suite.bc proxmox-backup-test-suite
debian/pxar.bc pxar
diff --git a/debian/proxmox-backup-test-suite.bc b/debian/proxmox-backup-test-suite.bc
new file mode 100644
index 000000000..2686d7eaa
--- /dev/null
+++ b/debian/proxmox-backup-test-suite.bc
@@ -0,0 +1,8 @@
+# proxmox-backup-test-suite bash completion
+
+# see http://tiswww.case.edu/php/chet/bash/FAQ
+# and __ltrim_colon_completions() in /usr/share/bash-completion/bash_completion
+# this modifies global var, but I found no better way
+COMP_WORDBREAKS=${COMP_WORDBREAKS//:}
+
+complete -C 'proxmox-backup-test-suite bashcomplete' proxmox-backup-test-suite
diff --git a/debian/proxmox-backup-test-suite.install b/debian/proxmox-backup-test-suite.install
new file mode 100644
index 000000000..e0cb31ac6
--- /dev/null
+++ b/debian/proxmox-backup-test-suite.install
@@ -0,0 +1,3 @@
+usr/bin/proxmox-backup-test-suite
+usr/share/man/man1/proxmox-backup-test-suite.1
+usr/share/zsh/vendor-completions/_proxmox-backup-test-suite
diff --git a/docs/Makefile b/docs/Makefile
index d6c61c86e..014739f69 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -7,6 +7,7 @@ GENERATED_SYNOPSIS := \
proxmox-backup-manager/synopsis.rst \
proxmox-backup-debug/synopsis.rst \
proxmox-file-restore/synopsis.rst \
+ proxmox-backup-test-suite/synopsis.rst \
pxar/synopsis.rst \
pmtx/synopsis.rst \
pmt/synopsis.rst \
@@ -33,6 +34,7 @@ MAN1_PAGES := \
proxmox-backup-manager.1 \
proxmox-file-restore.1 \
proxmox-backup-debug.1 \
+ proxmox-backup-test-suite.1 \
pbs2to3.1 \
MAN5_PAGES := \
diff --git a/docs/command-line-tools.rst b/docs/command-line-tools.rst
index 0cac17c8b..3655b7c8c 100644
--- a/docs/command-line-tools.rst
+++ b/docs/command-line-tools.rst
@@ -40,3 +40,8 @@ Command-line Tools
~~~~~~~~~~~~~~~~~~~~~~~~
.. include:: proxmox-backup-debug/description.rst
+
+``proxmox-backup-test-suite``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. include:: proxmox-backup-test-suite/description.rst
diff --git a/docs/command-syntax.rst b/docs/command-syntax.rst
index 9657557d1..bfaf635a1 100644
--- a/docs/command-syntax.rst
+++ b/docs/command-syntax.rst
@@ -65,3 +65,7 @@ The following commands are available in an interactive restore shell:
``proxmox-backup-debug``
------------------------
.. include:: proxmox-backup-debug/synopsis.rst
+
+``proxmox-backup-test-suite``
+-----------------------------
+.. include:: proxmox-backup-test-suite/synopsis.rst
diff --git a/docs/conf.py b/docs/conf.py
index fba726295..876e53479 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -98,6 +98,7 @@ man_pages = [
('proxmox-backup-proxy/man1', 'proxmox-backup-proxy', 'Proxmox Backup Public API Server', [author], 1),
('proxmox-backup/man1', 'proxmox-backup', 'Proxmox Backup Local API Server', [author], 1),
('proxmox-file-restore/man1', 'proxmox-file-restore', 'CLI tool for restoring files and directories from Proxmox Backup Server archives', [author], 1),
+ ('proxmox-backup-test-suite/man1', 'proxmox-backup-test-suite', 'CLI tool for performing performance benchmarks', [author], 1),
('proxmox-tape/man1', 'proxmox-tape', 'Proxmox Tape Backup CLI Tool', [author], 1),
('pxar/man1', 'pxar', 'Proxmox File Archive CLI Tool', [author], 1),
('pmt/man1', 'pmt', 'Control Linux Tape Devices', [author], 1),
diff --git a/docs/proxmox-backup-test-suite/description.rst b/docs/proxmox-backup-test-suite/description.rst
new file mode 100644
index 000000000..b99c29adf
--- /dev/null
+++ b/docs/proxmox-backup-test-suite/description.rst
@@ -0,0 +1,2 @@
+Command-line tool for running performance benchmarks.
+
diff --git a/docs/proxmox-backup-test-suite/man1.rst b/docs/proxmox-backup-test-suite/man1.rst
new file mode 100644
index 000000000..2e57423c0
--- /dev/null
+++ b/docs/proxmox-backup-test-suite/man1.rst
@@ -0,0 +1,17 @@
+:orphan:
+
+=========================
+proxmox-backup-test-suite
+=========================
+
+Synopsis
+========
+
+.. include:: synopsis.rst
+
+Description
+============
+
+.. include:: description.rst
+
+.. include:: ../pbs-copyright.rst
diff --git a/zsh-completions/_proxmox-backup-test-suite b/zsh-completions/_proxmox-backup-test-suite
new file mode 100644
index 000000000..72ebcea5f
--- /dev/null
+++ b/zsh-completions/_proxmox-backup-test-suite
@@ -0,0 +1,13 @@
+#compdef _proxmox-backup-test-suite() proxmox-backup-test-suite
+
+function _proxmox-backup-test-suite() {
+ local cwords line point cmd curr prev
+ cwords=${#words[@]}
+ line=$words
+ point=${#line}
+ cmd=${words[1]}
+ curr=${words[cwords]}
+ prev=${words[cwords-1]}
+ compadd -- $(COMP_CWORD="$cwords" COMP_LINE="$line" COMP_POINT="$point" \
+ proxmox-backup-test-suite bashcomplete "$cmd" "$curr" "$prev")
+}
--
2.39.2
* [pbs-devel] partially-applied: [PATCH v9 proxmox-backup 00/58] fix #3174: improve file-level backup
From: Fabian Grünbichler @ 2024-06-06 6:47 UTC
To: Proxmox Backup Server development discussion
applied all (including the smaller fixups from your tree) but 57/58 (I
am still not sure whether we want this as a package here, or split out
somewhere else for CI purposes only) and 52-54, since as discussed
off-list, I think those can be merged into the existing catalog API
endpoint and be made compatible.
here's to finding all the remaining edge-cases, and congrats on pulling
this through! ;)