* [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores
@ 2025-07-15 12:52 Christian Ebner
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Disclaimer: These patches are still in an experimental state and not
intended for production use.
This patch series aims to add S3 compatible object stores as storage
backend for PBS datastores. A PBS local cache store using the regular
datastore layout is used for faster operation, bypassing requests to
the S3 API when possible. Further, the local cache store allows
keeping frequently used chunks and is used to avoid expensive metadata
updates on the object store, e.g. by using local marker files during
garbage collection.
Backups are created by uploading chunks to the corresponding S3
bucket, while the index files are kept in the local cache store; on
backup finish, the snapshot metadata is persisted to the S3 storage
backend.
Snapshot restores read chunks preferably from the local cache store,
downloading and inserting them from the S3 object store if not
present. Listing and snapshot metadata operations currently rely
solely on the local cache store.
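As a rough illustration of this cache-first read path (a minimal,
self-contained sketch with hypothetical stand-in types, not the actual
datastore API):

    use std::collections::HashMap;

    // Hypothetical stand-ins for the local cache store and the S3
    // client; only the control flow mirrors the description above.
    struct LocalCache(HashMap<[u8; 32], Vec<u8>>);
    struct S3Backend;

    impl S3Backend {
        fn get_object(&self, _digest: &[u8; 32]) -> Option<Vec<u8>> {
            Some(Vec::new()) // placeholder: download from object store
        }
    }

    impl LocalCache {
        // Prefer the local cache store, falling back to the S3 backend
        // and inserting the downloaded chunk for subsequent reads.
        fn fetch_chunk(&mut self, s3: &S3Backend, digest: [u8; 32]) -> Option<Vec<u8>> {
            if let Some(chunk) = self.0.get(&digest) {
                return Some(chunk.clone());
            }
            let chunk = s3.get_object(&digest)?;
            self.0.insert(digest, chunk.clone());
            Some(chunk)
        }
    }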
Currently chunks use a 1:1 mapping to S3 objects. An advanced packing
mechanism for chunks, significantly reducing the number of API
requests and thereby the cost, will be implemented in follow-up
patches.
Most notable changes since version 7 of the patches (thanks @Thomas
and @Lukas for feedback, testing and debugging):
- Improve the self-signed certificate fingerprint check, verifying
that a valid expected fingerprint is passed to the client on
instantiation.
- Rename previously missed host to endpoint in the s3 client selector
- Use the more specific `S3 Client ID` over the ambiguous `Unique
Identifier`
- Implement missing cli commands for s3 client manipulation
- Add an in-use marker to s3 object stores to avoid accidental reuse
of object stores which are already used as datastore by another
instance, also adding a flag to overwrite it.
- Automatically perform an s3-refresh when recreating a datastore,
pre-populating the contents without further user interaction.
- Add documentation
- Fix formatting issue in proxmox-s3-client
Most notable changes since version 6 of the patches (thanks @Thomas
for feedback):
- Reworked uri encoding logic: instead of doing this in the
S3ObjectKey, perform it in the build_uri helper used by all
client api requests.
- Add an optional cache-size parameter to the datastore backend
config, which allows defining the local datastore LRU cache capacity.
- Increase s3 client timeout, as otherwise delete objects operations
on Cloudflare R2 would run into a timeout error.
- Also upload client log, previously not uploaded to s3 backend.
- Add missing documentation to some pub types in the response reader
- Use s3 object key generation helper for index file upload, which
fixes the missing key prefix.
- Add basic regression tests for uri encoder and decoder helper
functions.
- Include some baseline performance tests for garbage collection as
well as chunk up-/download when caching.
Most notable changes since version 5 of the patches (thanks @Thomas
for feedback):
- Move s3 client into its own, dedicated crate in the proxmox repo
- Factor out any directly PBS-related code from the client
- Guard implementation behind feature cfg, so api types can be used
independently
- Add basic example and extend on crate documentation
Most notable changes since version 4 of the patches:
- Fix race between S3 backend upload and local cache store insert,
avoiding possible chunk loss for concurrent backups.
- Use the local datastore cache also for local chunk reader instances
- Fall back to fetching chunks from S3 backend if they should be cached
but the local chunk file is missing or empty, instead of failing
- Rename chunks detected as corrupt also on the S3 object store
- Retry chunk uploads via put object in case of errors.
- Add the possibility to set rate limits for the s3 client put
requests, as otherwise object stores can be overloaded.
- Allow for Cloudflare R2 compatible `auto` region, as otherwise AWS
sign v4 request authentication will fail
- Use `Async` instead of `Sync` variant for the api handler of the
s3-refresh command, as otherwise this fails.
- Take into account that some type folders might not be present when
performing an s3-refresh.
- Use `Local` instead of `Regular` to refer to normal datastores in the
creation window.
Most notable changes since version 3 of the patches:
- Rebased onto current master, fixed incompatibilities with upgraded
dependencies
- Added method to uri decode s3 object keys, as they are required in
order to download contents to a local store
- Added api endpoint to allow resyncing of the datastore contents to
the local cache store, introducing a new maintenance mode s3-refresh
to guarantee consistency.
Most notable changes since RFC version 2 of the patches (thanks
@Lukas for feedback):
- Extend S3 client implementation to also support path style bucket
addressing.
- Keep bucket name as config option for the datastore, allowing more
flexible reuse of a configured S3 client.
- Use the datastore name as additional object key prefix to allow for
multiple datastores on the same bucket.
- Allow bucket and region templating in S3 endpoint, making this more
flexible with respect to possible DNS records.
- Rework datastore create window to be less overloaded.
- Drop dead code in the S3 client implementation, since tagging and
object copying are currently not required.
- Fix missing locking when deleting chunks from s3 store during
garbage collection, avoiding possible chunk loss for concurrent
backups.
- Remove chunks from LRU cache when deleting chunks during garbage
collection, avoiding possible chunk loss for concurrent backups.
- Add dedicated types for object prefix and relative s3 key paths to
avoid misuse.
- Use more fitting icon for S3 client.
Link to the bugtracker issue:
https://bugzilla.proxmox.com/show_bug.cgi?id=2943
Steps to set up a local S3 object store using RADOS gateway or MinIO
can be found at (internal only, external users might use the steps
outlined in the cover letter and comments of RFC version 2):
https://wiki.intra.proxmox.com/PBS_Setup_S3_Object_Store
proxmox:
Christian Ebner (9):
s3 client: add crate for AWS s3 compatible object store client
s3 client: implement AWS signature v4 request authentication
s3 client: add dedicated type for s3 object keys
s3 client: add type for last modified timestamp in responses
s3 client: add helper to parse http date headers
s3 client: implement methods to operate on s3 objects in bucket
s3 client: add example usage for basic operations
pbs-api-types: extend datastore config by backend config enum
pbs-api-types: maintenance: add new maintenance mode S3 refresh
Cargo.toml | 7 +
pbs-api-types/Cargo.toml | 1 +
pbs-api-types/debian/control | 2 +
pbs-api-types/src/datastore.rs | 114 +++-
pbs-api-types/src/maintenance.rs | 4 +
proxmox-s3-client/Cargo.toml | 48 ++
proxmox-s3-client/debian/changelog | 5 +
proxmox-s3-client/debian/control | 111 ++++
proxmox-s3-client/debian/copyright | 18 +
proxmox-s3-client/debian/debcargo.toml | 7 +
proxmox-s3-client/examples/s3_client.rs | 69 +++
proxmox-s3-client/src/api_types.rs | 172 ++++++
proxmox-s3-client/src/aws_sign_v4.rs | 210 ++++++++
proxmox-s3-client/src/client.rs | 656 +++++++++++++++++++++++
proxmox-s3-client/src/lib.rs | 33 ++
proxmox-s3-client/src/object_key.rs | 86 +++
proxmox-s3-client/src/response_reader.rs | 376 +++++++++++++
proxmox-s3-client/src/timestamps.rs | 106 ++++
18 files changed, 2024 insertions(+), 1 deletion(-)
create mode 100644 proxmox-s3-client/Cargo.toml
create mode 100644 proxmox-s3-client/debian/changelog
create mode 100644 proxmox-s3-client/debian/control
create mode 100644 proxmox-s3-client/debian/copyright
create mode 100644 proxmox-s3-client/debian/debcargo.toml
create mode 100644 proxmox-s3-client/examples/s3_client.rs
create mode 100644 proxmox-s3-client/src/api_types.rs
create mode 100644 proxmox-s3-client/src/aws_sign_v4.rs
create mode 100644 proxmox-s3-client/src/client.rs
create mode 100644 proxmox-s3-client/src/lib.rs
create mode 100644 proxmox-s3-client/src/object_key.rs
create mode 100644 proxmox-s3-client/src/response_reader.rs
create mode 100644 proxmox-s3-client/src/timestamps.rs
proxmox-backup:
Christian Ebner (45):
datastore: add helpers for path/digest to s3 object key conversion
config: introduce s3 object store client configuration
api: config: implement endpoints to manipulate and list s3 configs
api: datastore: check s3 backend bucket access on datastore create
api/cli: add endpoint and command to check s3 client connection
datastore: allow to get the backend for a datastore
api: backup: store datastore backend in runtime environment
api: backup: conditionally upload chunks to s3 object store backend
api: backup: conditionally upload blobs to s3 object store backend
api: backup: conditionally upload indices to s3 object store backend
api: backup: conditionally upload manifest to s3 object store backend
api: datastore: conditionally upload client log to s3 backend
sync: pull: conditionally upload content to s3 backend
api: reader: fetch chunks based on datastore backend
datastore: local chunk reader: read chunks based on backend
verify worker: add datastore backend to verify worker
verify: implement chunk verification for stores with s3 backend
datastore: create namespace marker in s3 backend
datastore: create/delete protected marker file on s3 storage backend
datastore: prune groups/snapshots from s3 object store backend
datastore: get and set owner for s3 store backend
datastore: implement garbage collection for s3 backend
ui: add datastore type selector and reorganize component layout
ui: add s3 client edit window for configuration create/edit
ui: add s3 client view for configuration
ui: expose the s3 client view in the navigation tree
ui: add s3 client selector and bucket field for s3 backend setup
tools: lru cache: add removed callback for evicted cache nodes
tools: async lru cache: implement insert, remove and contains methods
datastore: add local datastore cache for network attached storages
api: backup: use local datastore cache on s3 backend chunk upload
api: reader: use local datastore cache on s3 backend chunk fetching
datastore: local chunk reader: get cached chunk from local cache store
api: backup: add no-cache flag to bypass local datastore cache
api/datastore: implement refresh endpoint for stores with s3 backend
cli: add dedicated subcommand for datastore s3 refresh
ui: render s3 refresh as valid maintenance type and task description
ui: expose s3 refresh button for datastores backed by object store
datastore: conditionally upload atime marker chunk to s3 backend
bin: implement client subcommands for s3 configuration manipulation
bin: expose reuse-datastore flag for proxmox-backup-manager
datastore: mark store as in-use by setting marker on s3 backend
datastore: run s3-refresh when reusing a datastore with s3 backend
api/ui: add flag to allow overwriting in-use marker for s3 backend
docs: Add section describing how to setup s3 backed datastore
Cargo.toml | 2 +
docs/storage.rst | 68 ++
examples/upload-speed.rs | 1 +
pbs-client/src/backup_writer.rs | 4 +-
pbs-config/Cargo.toml | 1 +
pbs-config/src/lib.rs | 1 +
pbs-config/src/s3.rs | 83 +++
pbs-datastore/Cargo.toml | 5 +
pbs-datastore/src/backup_info.rs | 63 +-
pbs-datastore/src/cached_chunk_reader.rs | 6 +-
pbs-datastore/src/chunk_store.rs | 30 +-
pbs-datastore/src/datastore.rs | 631 ++++++++++++++++--
pbs-datastore/src/dynamic_index.rs | 1 +
pbs-datastore/src/lib.rs | 5 +
pbs-datastore/src/local_chunk_reader.rs | 61 +-
.../src/local_datastore_lru_cache.rs | 172 +++++
pbs-datastore/src/s3.rs | 49 ++
pbs-tools/src/async_lru_cache.rs | 46 +-
pbs-tools/src/lru_cache.rs | 42 +-
proxmox-backup-client/src/benchmark.rs | 1 +
proxmox-backup-client/src/main.rs | 8 +
src/api2/admin/datastore.rs | 85 ++-
src/api2/admin/mod.rs | 2 +
src/api2/admin/s3.rs | 80 +++
src/api2/backup/environment.rs | 82 ++-
src/api2/backup/mod.rs | 131 ++--
src/api2/backup/upload_chunk.rs | 114 +++-
src/api2/config/datastore.rs | 140 +++-
src/api2/config/mod.rs | 2 +
src/api2/config/s3.rs | 310 +++++++++
src/api2/node/disks/directory.rs | 2 +-
src/api2/node/disks/zfs.rs | 2 +-
src/api2/reader/environment.rs | 12 +-
src/api2/reader/mod.rs | 62 +-
src/backup/verify.rs | 105 ++-
src/bin/proxmox-backup-manager.rs | 1 +
src/bin/proxmox_backup_manager/datastore.rs | 42 ++
src/bin/proxmox_backup_manager/mod.rs | 2 +
src/bin/proxmox_backup_manager/s3.rs | 102 +++
src/server/pull.rs | 70 +-
src/server/push.rs | 1 +
src/server/verify_job.rs | 2 +-
www/Makefile | 3 +
www/NavigationTree.js | 6 +
www/Utils.js | 4 +
www/config/S3ClientView.js | 141 ++++
www/datastore/Summary.js | 44 ++
www/form/S3ClientSelector.js | 33 +
www/window/DataStoreEdit.js | 130 +++-
www/window/MaintenanceOptions.js | 6 +-
www/window/S3ClientEdit.js | 148 ++++
51 files changed, 2865 insertions(+), 279 deletions(-)
create mode 100644 pbs-config/src/s3.rs
create mode 100644 pbs-datastore/src/local_datastore_lru_cache.rs
create mode 100644 pbs-datastore/src/s3.rs
create mode 100644 src/api2/admin/s3.rs
create mode 100644 src/api2/config/s3.rs
create mode 100644 src/bin/proxmox_backup_manager/s3.rs
create mode 100644 www/config/S3ClientView.js
create mode 100644 www/form/S3ClientSelector.js
create mode 100644 www/window/S3ClientEdit.js
Summary over all repositories:
69 files changed, 4889 insertions(+), 280 deletions(-)
--
Generated by git-murpp 0.8.1
* [pbs-devel] [PATCH proxmox v8 1/9] s3 client: add crate for AWS s3 compatible object store client
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Adds the client to connect to an AWS S3 compatible object store REST
API. Force the use of a TLS encrypted connection as the communication
with the object store will contain sensitive information.
For self-signed certificates, check the fingerprint against the one
configured. This follows along the lines of the PBS client, used to
connect to the PBS server API.
The `S3Client` stores the client state and has to be configured upon
instantiation by providing `S3ClientOptions`.
Adds the new config types `S3ClientConfig` and `S3ClientSecretsConfig`
to be used for configuration of datastore backends.
Secrets are stored as a separate config so they are never returned on
api calls; only setting/updating the values is allowed.
Use a different name (`secrets_id`) for the unique identifier in case
of the secrets type, although the same id should be used for storing
and lookup. This avoids clashing property names when using flattened
types as api parameters.
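For illustration, instantiating the client from the options introduced
in this patch (with the `impl` feature enabled) could look like the
following sketch; endpoint, bucket, credentials and region are
placeholders:

    use proxmox_s3_client::{S3Client, S3ClientOptions};

    fn main() -> Result<(), anyhow::Error> {
        let options = S3ClientOptions {
            // `{{bucket}}`/`{{region}}` templates are expanded by the client
            endpoint: "{{bucket}}.s3.example.com".to_string(),
            port: None,
            bucket: "pbs-bucket".to_string(),
            common_prefix: "my-datastore".to_string(),
            path_style: false,
            access_key: "ACCESS_KEY".to_string(),
            secret_key: "SECRET_KEY".to_string(),
            region: "eu-central-1".to_string(),
            fingerprint: None, // only needed for self-signed certificates
            put_rate_limit: None,
        };
        let _client = S3Client::new(options)?;
        Ok(())
    }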
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- Improved fingerprint check logic
- Add additional comments for cert chain check
- Add todo for shared cert handling logic once that has landed
Cargo.toml | 1 +
proxmox-s3-client/Cargo.toml | 32 ++++
proxmox-s3-client/debian/changelog | 5 +
proxmox-s3-client/debian/control | 81 +++++++++++
proxmox-s3-client/debian/copyright | 18 +++
proxmox-s3-client/debian/debcargo.toml | 7 +
proxmox-s3-client/src/api_types.rs | 172 ++++++++++++++++++++++
proxmox-s3-client/src/client.rs | 193 +++++++++++++++++++++++++
proxmox-s3-client/src/lib.rs | 12 ++
9 files changed, 521 insertions(+)
create mode 100644 proxmox-s3-client/Cargo.toml
create mode 100644 proxmox-s3-client/debian/changelog
create mode 100644 proxmox-s3-client/debian/control
create mode 100644 proxmox-s3-client/debian/copyright
create mode 100644 proxmox-s3-client/debian/debcargo.toml
create mode 100644 proxmox-s3-client/src/api_types.rs
create mode 100644 proxmox-s3-client/src/client.rs
create mode 100644 proxmox-s3-client/src/lib.rs
diff --git a/Cargo.toml b/Cargo.toml
index 020e7497..dc021796 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -33,6 +33,7 @@ members = [
"proxmox-router",
"proxmox-rrd",
"proxmox-rrd-api-types",
+ "proxmox-s3-client",
"proxmox-schema",
"proxmox-section-config",
"proxmox-sendmail",
diff --git a/proxmox-s3-client/Cargo.toml b/proxmox-s3-client/Cargo.toml
new file mode 100644
index 00000000..f62fc190
--- /dev/null
+++ b/proxmox-s3-client/Cargo.toml
@@ -0,0 +1,32 @@
+[package]
+name = "proxmox-s3-client"
+description = "low level REST API client for AWS S3 compatible object stores"
+version = "1.0.0"
+
+authors.workspace = true
+edition.workspace = true
+exclude.workspace = true
+homepage.workspace = true
+license.workspace = true
+repository.workspace = true
+rust-version.workspace = true
+
+[dependencies]
+anyhow.workspace = true
+const_format.workspace = true
+hex = { workspace = true, features = [ "serde" ] }
+http-body-util.workspace = true
+hyper-util = { workspace = true, features = [ "client-legacy", "tokio", "http1" ] }
+hyper.workspace = true
+openssl.workspace = true
+regex.workspace = true
+serde.workspace = true
+tracing.workspace = true
+
+proxmox-base64.workspace = true
+proxmox-http = { workspace = true, features = [ "body", "client", "client-trait", "rate-limiter" ] }
+proxmox-schema = { workspace = true, features = [ "api-macro", "api-types" ] }
+
+[features]
+default = []
+impl = []
diff --git a/proxmox-s3-client/debian/changelog b/proxmox-s3-client/debian/changelog
new file mode 100644
index 00000000..b2696c33
--- /dev/null
+++ b/proxmox-s3-client/debian/changelog
@@ -0,0 +1,5 @@
+rust-proxmox-s3-client (1.0.0-1) bookworm; urgency=medium
+
+ * initial packaging
+
+ -- Proxmox Support Team <support@proxmox.com> Mon, 07 Jul 2025 09:33:10 +0200
diff --git a/proxmox-s3-client/debian/control b/proxmox-s3-client/debian/control
new file mode 100644
index 00000000..36b94143
--- /dev/null
+++ b/proxmox-s3-client/debian/control
@@ -0,0 +1,81 @@
+Source: rust-proxmox-s3-client
+Section: rust
+Priority: optional
+Build-Depends: debhelper-compat (= 13),
+ dh-sequence-cargo
+Build-Depends-Arch: cargo:native <!nocheck>,
+ rustc:native (>= 1.82) <!nocheck>,
+ libstd-rust-dev <!nocheck>,
+ librust-anyhow-1+default-dev <!nocheck>,
+ librust-const-format-0.2+default-dev <!nocheck>,
+ librust-hex-0.4+default-dev <!nocheck>,
+ librust-hex-0.4+serde-dev <!nocheck>,
+ librust-http-body-util-0.1+default-dev <!nocheck>,
+ librust-hyper-1+default-dev <!nocheck>,
+ librust-hyper-util-0.1+client-legacy-dev (>= 0.1.12-~~) <!nocheck>,
+ librust-hyper-util-0.1+default-dev (>= 0.1.12-~~) <!nocheck>,
+ librust-hyper-util-0.1+http1-dev (>= 0.1.12-~~) <!nocheck>,
+ librust-hyper-util-0.1+tokio-dev (>= 0.1.12-~~) <!nocheck>,
+ librust-openssl-0.10+default-dev <!nocheck>,
+ librust-proxmox-base64-1+default-dev <!nocheck>,
+ librust-proxmox-http-1+body-dev <!nocheck>,
+ librust-proxmox-http-1+client-dev <!nocheck>,
+ librust-proxmox-http-1+client-trait-dev <!nocheck>,
+ librust-proxmox-http-1+default-dev <!nocheck>,
+ librust-proxmox-http-1+rate-limiter-dev <!nocheck>,
+ librust-proxmox-schema-4+api-macro-dev (>= 4.1.0-~~) <!nocheck>,
+ librust-proxmox-schema-4+api-types-dev (>= 4.1.0-~~) <!nocheck>,
+ librust-proxmox-schema-4+default-dev (>= 4.1.0-~~) <!nocheck>,
+ librust-regex-1+default-dev (>= 1.5-~~) <!nocheck>,
+ librust-serde-1+default-dev <!nocheck>,
+ librust-tracing-0.1+default-dev <!nocheck>
+Maintainer: Proxmox Support Team <support@proxmox.com>
+Standards-Version: 4.7.0
+Vcs-Git: git://git.proxmox.com/git/proxmox.git
+Vcs-Browser: https://git.proxmox.com/?p=proxmox.git
+Homepage: https://proxmox.com
+X-Cargo-Crate: proxmox-s3-client
+Rules-Requires-Root: no
+
+Package: librust-proxmox-s3-client-dev
+Architecture: any
+Multi-Arch: same
+Depends:
+ ${misc:Depends},
+ librust-anyhow-1+default-dev,
+ librust-const-format-0.2+default-dev,
+ librust-hex-0.4+default-dev,
+ librust-hex-0.4+serde-dev,
+ librust-http-body-util-0.1+default-dev,
+ librust-hyper-1+default-dev,
+ librust-hyper-util-0.1+client-legacy-dev (>= 0.1.12-~~),
+ librust-hyper-util-0.1+default-dev (>= 0.1.12-~~),
+ librust-hyper-util-0.1+http1-dev (>= 0.1.12-~~),
+ librust-hyper-util-0.1+tokio-dev (>= 0.1.12-~~),
+ librust-openssl-0.10+default-dev,
+ librust-proxmox-base64-1+default-dev,
+ librust-proxmox-http-1+body-dev,
+ librust-proxmox-http-1+client-dev,
+ librust-proxmox-http-1+client-trait-dev,
+ librust-proxmox-http-1+default-dev,
+ librust-proxmox-http-1+rate-limiter-dev,
+ librust-proxmox-schema-4+api-macro-dev (>= 4.1.0-~~),
+ librust-proxmox-schema-4+api-types-dev (>= 4.1.0-~~),
+ librust-proxmox-schema-4+default-dev (>= 4.1.0-~~),
+ librust-regex-1+default-dev (>= 1.5-~~),
+ librust-serde-1+default-dev,
+ librust-tracing-0.1+default-dev
+Provides:
+ librust-proxmox-s3-client+default-dev (= ${binary:Version}),
+ librust-proxmox-s3-client+impl-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1+default-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1+impl-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1.0-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1.0+default-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1.0+impl-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1.0.0-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1.0.0+default-dev (= ${binary:Version}),
+ librust-proxmox-s3-client-1.0.0+impl-dev (= ${binary:Version})
+Description: Low level REST API client for AWS S3 compatible object stores - Rust source code
+ Source code for Debianized Rust crate "proxmox-s3-client"
diff --git a/proxmox-s3-client/debian/copyright b/proxmox-s3-client/debian/copyright
new file mode 100644
index 00000000..d6e3c304
--- /dev/null
+++ b/proxmox-s3-client/debian/copyright
@@ -0,0 +1,18 @@
+Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
+
+Files:
+ *
+Copyright: 2025 Proxmox Server Solutions GmbH <support@proxmox.com>
+License: AGPL-3.0-or-later
+ This program is free software: you can redistribute it and/or modify it under
+ the terms of the GNU Affero General Public License as published by the Free
+ Software Foundation, either version 3 of the License, or (at your option) any
+ later version.
+ .
+ This program is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
+ details.
+ .
+ You should have received a copy of the GNU Affero General Public License along
+ with this program. If not, see <https://www.gnu.org/licenses/>.
diff --git a/proxmox-s3-client/debian/debcargo.toml b/proxmox-s3-client/debian/debcargo.toml
new file mode 100644
index 00000000..b7864cdb
--- /dev/null
+++ b/proxmox-s3-client/debian/debcargo.toml
@@ -0,0 +1,7 @@
+overlay = "."
+crate_src_path = ".."
+maintainer = "Proxmox Support Team <support@proxmox.com>"
+
+[source]
+vcs_git = "git://git.proxmox.com/git/proxmox.git"
+vcs_browser = "https://git.proxmox.com/?p=proxmox.git"
diff --git a/proxmox-s3-client/src/api_types.rs b/proxmox-s3-client/src/api_types.rs
new file mode 100644
index 00000000..ab0c1ec1
--- /dev/null
+++ b/proxmox-s3-client/src/api_types.rs
@@ -0,0 +1,172 @@
+use anyhow::bail;
+use const_format::concatcp;
+use serde::{Deserialize, Serialize};
+
+use proxmox_schema::api_types::{
+ CERT_FINGERPRINT_SHA256_SCHEMA, DNS_LABEL_STR, IPRE_STR, SAFE_ID_FORMAT,
+};
+use proxmox_schema::{api, const_regex, ApiStringFormat, Schema, StringSchema, Updater};
+
+#[rustfmt::skip]
+/// Regex to match S3 endpoint full qualified domain names, including template patterns for bucket
+/// name or region.
+pub const S3_ENDPOINT_NAME_STR: &str = concatcp!(
+ r"(?:(?:(", DNS_LABEL_STR, r"|\{\{bucket\}\}|\{\{region\}\})\.)*", DNS_LABEL_STR, ")"
+);
+
+const_regex! {
+ /// Regex to match S3 bucket names.
+ ///
+ /// Be as strict as possible following the rules as described here:
+ /// https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html#general-purpose-bucket-names
+ pub S3_BUCKET_NAME_REGEX = r"^[a-z0-9]([a-z0-9\-]*[a-z0-9])?$";
+ /// Regex to match S3 endpoints including template patterns.
+ pub S3_ENDPOINT_REGEX = concatcp!(r"^(?:", S3_ENDPOINT_NAME_STR, "|", IPRE_STR, r")$");
+ /// Regex to match S3 regions.
+ pub S3_REGION_REGEX = r"(^auto$)|(^[a-z]{2,}(?:-[a-z\d]+)+$)";
+}
+
+/// S3 REST API endpoint format.
+pub const S3_ENDPOINT_FORMAT: ApiStringFormat = ApiStringFormat::Pattern(&S3_ENDPOINT_REGEX);
+/// S3 region format.
+pub const S3_REGION_FORMAT: ApiStringFormat = ApiStringFormat::Pattern(&S3_REGION_REGEX);
+
+/// ID to uniquely identify an S3 client config.
+pub const S3_CLIENT_ID_SCHEMA: Schema =
+ StringSchema::new("ID to uniquely identify s3 client config.")
+ .format(&SAFE_ID_FORMAT)
+ .min_length(3)
+ .max_length(32)
+ .schema();
+
+/// Endpoint to access S3 object store.
+pub const S3_ENDPOINT_SCHEMA: Schema = StringSchema::new("Endpoint to access S3 object store.")
+ .format(&S3_ENDPOINT_FORMAT)
+ .schema();
+
+/// Region to access S3 object store.
+pub const S3_REGION_SCHEMA: Schema = StringSchema::new("Region to access S3 object store.")
+ .format(&S3_REGION_FORMAT)
+ .min_length(3)
+ .max_length(32)
+ .schema();
+
+/// Bucket to access S3 object store.
+pub const S3_BUCKET_NAME_SCHEMA: Schema = StringSchema::new("Bucket name for S3 object store.")
+ .format(&ApiStringFormat::VerifyFn(|bucket_name| {
+ if !(S3_BUCKET_NAME_REGEX.regex_obj)().is_match(bucket_name) {
+ bail!("Bucket name does not match the regex pattern");
+ }
+
+ // Exclude pre- and postfixes described here:
+ // https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html#general-purpose-bucket-names
+ let forbidden_prefixes = ["xn--", "sthree-", "amzn-s3-demo-"];
+ for prefix in forbidden_prefixes {
+ if bucket_name.starts_with(prefix) {
+ bail!("Bucket name cannot start with '{prefix}'");
+ }
+ }
+
+ let forbidden_postfixes = ["--ol-s3", ".mrap", "--x-s3"];
+ for postfix in forbidden_postfixes {
+ if bucket_name.ends_with(postfix) {
+ bail!("Bucket name cannot end with '{postfix}'");
+ }
+ }
+
+ Ok(())
+ }))
+ .min_length(3)
+ .max_length(63)
+ .schema();
+
+#[api(
+ properties: {
+ id: {
+ schema: S3_CLIENT_ID_SCHEMA,
+ },
+ endpoint: {
+ schema: S3_ENDPOINT_SCHEMA,
+ },
+ port: {
+ type: u16,
+ optional: true,
+ },
+ region: {
+ schema: S3_REGION_SCHEMA,
+ optional: true,
+ },
+ fingerprint: {
+ schema: CERT_FINGERPRINT_SHA256_SCHEMA,
+ optional: true,
+ },
+ "access-key": {
+ type: String,
+ },
+ "path-style": {
+ type: bool,
+ optional: true,
+ default: false,
+ },
+ "put-rate-limit": {
+ type: u64,
+ optional: true,
+ },
+ }
+)]
+#[derive(Serialize, Deserialize, Updater, Clone, PartialEq)]
+#[serde(rename_all = "kebab-case")]
+/// S3 client configuration properties.
+pub struct S3ClientConfig {
+ /// ID to identify s3 client config.
+ #[updater(skip)]
+ pub id: String,
+ /// Endpoint to access S3 object store.
+ pub endpoint: String,
+ /// Port to access S3 object store.
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub port: Option<u16>,
+ /// Region to access S3 object store.
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub region: Option<String>,
+ /// Certificate fingerprint for self-signed certificates.
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub fingerprint: Option<String>,
+ /// Access key for S3 object store.
+ pub access_key: String,
+ /// Use path style bucket addressing over vhost style.
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub path_style: Option<bool>,
+ /// Rate limit for put requests given as #requests/s.
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub put_rate_limit: Option<u64>,
+}
+
+impl S3ClientConfig {
+ /// Helper method to get ACL path for S3 client config
+ pub fn acl_path(&self) -> Vec<&str> {
+ // Needs permissions on root path
+ Vec::new()
+ }
+}
+
+#[api(
+ properties: {
+ "secrets-id": {
+ type: String,
+ },
+ "secret-key": {
+ type: String,
+ },
+ }
+)]
+#[derive(Serialize, Deserialize, Updater, Clone, PartialEq)]
+#[serde(rename_all = "kebab-case")]
+/// S3 client secrets configuration properties.
+pub struct S3ClientSecretsConfig {
+ /// ID to identify s3 client secret config.
+ #[updater(skip)]
+ pub secrets_id: String,
+ /// Secret key for S3 object store.
+ pub secret_key: String,
+}
diff --git a/proxmox-s3-client/src/client.rs b/proxmox-s3-client/src/client.rs
new file mode 100644
index 00000000..1ba42a08
--- /dev/null
+++ b/proxmox-s3-client/src/client.rs
@@ -0,0 +1,193 @@
+use std::sync::{Arc, Mutex};
+use std::time::Duration;
+
+use anyhow::{bail, format_err, Context, Error};
+use hyper::http::uri::Authority;
+use hyper_util::client::legacy::connect::HttpConnector;
+use hyper_util::client::legacy::Client;
+use hyper_util::rt::TokioExecutor;
+use openssl::hash::MessageDigest;
+use openssl::ssl::{SslConnector, SslMethod, SslVerifyMode};
+use openssl::x509::X509StoreContextRef;
+use tracing::error;
+
+use proxmox_http::client::HttpsConnector;
+use proxmox_http::{Body, RateLimiter};
+use proxmox_schema::api_types::CERT_FINGERPRINT_SHA256_SCHEMA;
+
+use crate::api_types::{S3ClientConfig, S3ClientSecretsConfig};
+
+const S3_HTTP_CONNECT_TIMEOUT: Duration = Duration::from_secs(10);
+const S3_TCP_KEEPALIVE_TIME: u32 = 120;
+
+/// Configuration options for client
+pub struct S3ClientOptions {
+ /// Endpoint to access S3 object store.
+ pub endpoint: String,
+ /// Port to access S3 object store.
+ pub port: Option<u16>,
+ /// Bucket to access S3 object store.
+ pub bucket: String,
+ /// Common prefix within bucket to use for object keys for this client instance.
+ pub common_prefix: String,
+ /// Use path style bucket addressing over vhost style.
+ pub path_style: bool,
+ /// Secret key for S3 object store.
+ pub secret_key: String,
+ /// Access key for S3 object store.
+ pub access_key: String,
+ /// Region to access S3 object store.
+ pub region: String,
+ /// API certificate fingerprint for self signed certificates.
+ pub fingerprint: Option<String>,
+ /// Rate limit for put requests given as #requests/s.
+ pub put_rate_limit: Option<u64>,
+}
+
+impl S3ClientOptions {
+ /// Construct options for the S3 client given the provided configuration parameters.
+ pub fn from_config(
+ config: S3ClientConfig,
+ secrets: S3ClientSecretsConfig,
+ bucket: String,
+ common_prefix: String,
+ ) -> Self {
+ Self {
+ endpoint: config.endpoint,
+ port: config.port,
+ bucket,
+ common_prefix,
+ path_style: config.path_style.unwrap_or_default(),
+ region: config.region.unwrap_or("us-west-1".to_string()),
+ fingerprint: config.fingerprint,
+ access_key: config.access_key,
+ secret_key: secrets.secret_key,
+ put_rate_limit: config.put_rate_limit,
+ }
+ }
+}
+
+/// S3 client for object stores compatible with the AWS S3 API
+pub struct S3Client {
+ client: Client<HttpsConnector, Body>,
+ options: S3ClientOptions,
+ authority: Authority,
+ put_rate_limiter: Option<Arc<Mutex<RateLimiter>>>,
+}
+
+impl S3Client {
+ /// Creates a new S3 client instance, connecting to the provided endpoint using https given the
+ /// provided options.
+ pub fn new(options: S3ClientOptions) -> Result<Self, Error> {
+ let expected_fingerprint = if let Some(ref fingerprint) = options.fingerprint {
+ CERT_FINGERPRINT_SHA256_SCHEMA
+ .unwrap_string_schema()
+ .check_constraints(fingerprint)
+ .context("invalid fingerprint provided")?;
+ Some(fingerprint.to_lowercase())
+ } else {
+ None
+ };
+ let verified_fingerprint = Arc::new(Mutex::new(None));
+ let trust_openssl_valid = Arc::new(Mutex::new(true));
+ let mut ssl_connector_builder = SslConnector::builder(SslMethod::tls())?;
+ ssl_connector_builder.set_verify_callback(
+ SslVerifyMode::PEER,
+ move |openssl_valid, context| match Self::verify_certificate_fingerprint(
+ openssl_valid,
+ context,
+ expected_fingerprint.clone(),
+ trust_openssl_valid.clone(),
+ ) {
+ Ok(None) => true,
+ Ok(Some(fingerprint)) => {
+ *verified_fingerprint.lock().unwrap() = Some(fingerprint);
+ true
+ }
+ Err(err) => {
+ error!("certificate validation failed {err:#}");
+ false
+ }
+ },
+ );
+
+ let mut http_connector = HttpConnector::new();
+ // want communication to object store backend api to always use https
+ http_connector.enforce_http(false);
+ http_connector.set_connect_timeout(Some(S3_HTTP_CONNECT_TIMEOUT));
+ let https_connector = HttpsConnector::with_connector(
+ http_connector,
+ ssl_connector_builder.build(),
+ S3_TCP_KEEPALIVE_TIME,
+ );
+ let client = Client::builder(TokioExecutor::new()).build::<_, Body>(https_connector);
+
+ let authority_template = if let Some(port) = options.port {
+ format!("{}:{port}", options.endpoint)
+ } else {
+ options.endpoint.clone()
+ };
+ let authority = authority_template
+ .replace("{{bucket}}", &options.bucket)
+ .replace("{{region}}", &options.region);
+ let authority = Authority::try_from(authority)?;
+
+ let put_rate_limiter = options.put_rate_limit.map(|limit| {
+ let limiter = RateLimiter::new(limit, limit);
+ Arc::new(Mutex::new(limiter))
+ });
+
+ Ok(Self {
+ client,
+ options,
+ authority,
+ put_rate_limiter,
+ })
+ }
+
+ // TODO: replace with our shared TLS cert verification once available
+ fn verify_certificate_fingerprint(
+ openssl_valid: bool,
+ context: &mut X509StoreContextRef,
+ expected_fingerprint: Option<String>,
+ trust_openssl: Arc<Mutex<bool>>,
+ ) -> Result<Option<String>, Error> {
+ let mut trust_openssl_valid = trust_openssl.lock().unwrap();
+
+ // only rely on openssl prevalidation if trust was not revoked earlier
+ if openssl_valid && *trust_openssl_valid {
+ return Ok(None);
+ }
+
+ let certificate = match context.current_cert() {
+ Some(certificate) => certificate,
+ None => bail!("context lacks current certificate."),
+ };
+
+ // force trust in case of a chain, but set flag to no longer trust prevalidation by openssl
+ // see https://bugzilla.proxmox.com/show_bug.cgi?id=5248
+ if context.error_depth() > 0 {
+ *trust_openssl_valid = false;
+ return Ok(None);
+ }
+
+ let certificate_digest = certificate
+ .digest(MessageDigest::sha256())
+ .context("failed to calculate certificate digest")?;
+ let certificate_fingerprint = certificate_digest
+ .iter()
+ .map(|byte| format!("{byte:02x}"))
+ .collect::<Vec<String>>()
+ .join(":");
+
+ if let Some(expected_fingerprint) = expected_fingerprint {
+ if expected_fingerprint == certificate_fingerprint {
+ return Ok(Some(certificate_fingerprint));
+ }
+ }
+
+ Err(format_err!(
+ "unexpected certificate fingerprint {certificate_fingerprint}"
+ ))
+ }
+}
diff --git a/proxmox-s3-client/src/lib.rs b/proxmox-s3-client/src/lib.rs
new file mode 100644
index 00000000..e579ffbb
--- /dev/null
+++ b/proxmox-s3-client/src/lib.rs
@@ -0,0 +1,12 @@
+//! Low level REST API client for AWS S3 compatible object stores
+#![cfg_attr(docsrs, feature(doc_cfg, doc_auto_cfg))]
+#![deny(unsafe_op_in_unsafe_fn)]
+#![deny(missing_docs)]
+
+mod api_types;
+pub use api_types::*;
+
+#[cfg(feature = "impl")]
+mod client;
+#[cfg(feature = "impl")]
+pub use client::{S3Client, S3ClientOptions};
--
2.47.2
* [pbs-devel] [PATCH proxmox v8 2/9] s3 client: implement AWS signature v4 request authentication
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
The S3 API authenticates client requests by checking the
authentication signature provided in the request's `Authorization`
header. The latest AWS signature v4 is required for the newest AWS
regions [0] and the most widely adopted [1-4], so rely solely on that,
not implementing older versions.
Adds helper methods to sign client requests; this includes encoding
and normalization of the headers, digest calculation of the request
body (if any) and signature generation.
[0] https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html
[1] https://docs.ceph.com/en/reef/radosgw/s3/authentication/#aws-signature-v4
[2] https://cloud.google.com/storage/docs/interoperability
[3] https://docs.wasabi.com/v1/docs/how-do-i-use-aws-signature-version-4-with-wasabi
[4] https://min.io/product/s3-compatibility
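The signing key derivation follows the fixed HMAC-SHA256 chain from the
AWS docs; a standalone sketch of just that chain, using openssl as the
patch below does (secret, date and region values are placeholders):

    use anyhow::Error;
    use openssl::hash::MessageDigest;
    use openssl::pkey::{PKey, Private};
    use openssl::sign::Signer;

    fn hmac_sha256(key: &PKey<Private>, data: &[u8]) -> Result<Vec<u8>, Error> {
        let mut signer = Signer::new(MessageDigest::sha256(), key)?;
        signer.update(data)?;
        Ok(signer.sign_to_vec()?)
    }

    // Each HMAC output keys the next step, scoping the secret key to
    // date, region, service and the fixed request postfix.
    fn signing_key(secret: &str, date: &str, region: &str) -> Result<Vec<u8>, Error> {
        let date_key =
            hmac_sha256(&PKey::hmac(format!("AWS4{secret}").as_bytes())?, date.as_bytes())?;
        let region_key = hmac_sha256(&PKey::hmac(&date_key)?, region.as_bytes())?;
        let service_key = hmac_sha256(&PKey::hmac(&region_key)?, b"s3")?;
        hmac_sha256(&PKey::hmac(&service_key)?, b"aws4_request")
    }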
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
proxmox-s3-client/Cargo.toml | 3 +
proxmox-s3-client/debian/control | 12 +-
proxmox-s3-client/src/aws_sign_v4.rs | 210 +++++++++++++++++++++++++++
proxmox-s3-client/src/lib.rs | 4 +
4 files changed, 227 insertions(+), 2 deletions(-)
create mode 100644 proxmox-s3-client/src/aws_sign_v4.rs
diff --git a/proxmox-s3-client/Cargo.toml b/proxmox-s3-client/Cargo.toml
index f62fc190..31deca59 100644
--- a/proxmox-s3-client/Cargo.toml
+++ b/proxmox-s3-client/Cargo.toml
@@ -22,10 +22,13 @@ openssl.workspace = true
regex.workspace = true
serde.workspace = true
tracing.workspace = true
+url.workspace = true
proxmox-base64.workspace = true
proxmox-http = { workspace = true, features = [ "body", "client", "client-trait", "rate-limiter" ] }
proxmox-schema = { workspace = true, features = [ "api-macro", "api-types" ] }
+proxmox-serde.workspace = true
+proxmox-time.workspace = true
[features]
default = []
diff --git a/proxmox-s3-client/debian/control b/proxmox-s3-client/debian/control
index 36b94143..0efb54db 100644
--- a/proxmox-s3-client/debian/control
+++ b/proxmox-s3-client/debian/control
@@ -26,9 +26,13 @@ Build-Depends-Arch: cargo:native <!nocheck>,
librust-proxmox-schema-4+api-macro-dev (>= 4.1.0-~~) <!nocheck>,
librust-proxmox-schema-4+api-types-dev (>= 4.1.0-~~) <!nocheck>,
librust-proxmox-schema-4+default-dev (>= 4.1.0-~~) <!nocheck>,
+ librust-proxmox-serde-1+default-dev <!nocheck>,
+ librust-proxmox-serde-1+serde-json-dev <!nocheck>,
+ librust-proxmox-time-2+default-dev (>= 2.1.0-~~) <!nocheck>,
librust-regex-1+default-dev (>= 1.5-~~) <!nocheck>,
librust-serde-1+default-dev <!nocheck>,
- librust-tracing-0.1+default-dev <!nocheck>
+ librust-tracing-0.1+default-dev <!nocheck>,
+ librust-url-2+default-dev (>= 2.2-~~) <!nocheck>
Maintainer: Proxmox Support Team <support@proxmox.com>
Standards-Version: 4.7.0
Vcs-Git: git://git.proxmox.com/git/proxmox.git
@@ -62,9 +66,13 @@ Depends:
librust-proxmox-schema-4+api-macro-dev (>= 4.1.0-~~),
librust-proxmox-schema-4+api-types-dev (>= 4.1.0-~~),
librust-proxmox-schema-4+default-dev (>= 4.1.0-~~),
+ librust-proxmox-serde-1+default-dev,
+ librust-proxmox-serde-1+serde-json-dev,
+ librust-proxmox-time-2+default-dev (>= 2.1.0-~~),
librust-regex-1+default-dev (>= 1.5-~~),
librust-serde-1+default-dev,
- librust-tracing-0.1+default-dev
+ librust-tracing-0.1+default-dev,
+ librust-url-2+default-dev (>= 2.2-~~)
Provides:
librust-proxmox-s3-client+default-dev (= ${binary:Version}),
librust-proxmox-s3-client+impl-dev (= ${binary:Version}),
diff --git a/proxmox-s3-client/src/aws_sign_v4.rs b/proxmox-s3-client/src/aws_sign_v4.rs
new file mode 100644
index 00000000..7cbc7d1a
--- /dev/null
+++ b/proxmox-s3-client/src/aws_sign_v4.rs
@@ -0,0 +1,210 @@
+//! Helpers for request authentication using AWS signature version 4
+
+use anyhow::{bail, Error};
+use hyper::Request;
+use openssl::hash::MessageDigest;
+use openssl::pkey::{PKey, Private};
+use openssl::sha::sha256;
+use openssl::sign::Signer;
+use url::Url;
+
+use proxmox_http::Body;
+
+use super::client::S3ClientOptions;
+
+pub(crate) const AWS_SIGN_V4_DATETIME_FORMAT: &str = "%Y%m%dT%H%M%SZ";
+
+const AWS_SIGN_V4_DATE_FORMAT: &str = "%Y%m%d";
+const AWS_SIGN_V4_SERVICE_S3: &str = "s3";
+const AWS_SIGN_V4_REQUEST_POSTFIX: &str = "aws4_request";
+
+/// Generate signature for S3 request authentication using AWS signature version 4.
+/// See: https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html
+pub(crate) fn aws_sign_v4_signature(
+ request: &Request<Body>,
+ options: &S3ClientOptions,
+ epoch: i64,
+ payload_digest: &str,
+) -> Result<String, Error> {
+ // Include all headers in signature calculation since the reference docs note:
+ // "For the purpose of calculating an authorization signature, only the 'host' and any 'x-amz-*'
+ // headers are required. however, in order to prevent data tampering, you should consider
+ // including all the headers in the signature calculation."
+ // See https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html
+ let mut canonical_headers = Vec::new();
+ let mut signed_headers = Vec::new();
+ for (key, value) in request.headers() {
+ canonical_headers.push(format!(
+ "{}:{}",
+ // Header name has to be lower case, key.as_str() does guarantee that, see
+ // https://docs.rs/http/0.2.0/http/header/struct.HeaderName.html
+ key.as_str(),
+ // No need to trim since `HeaderValue` only allows visible UTF8 chars, see
+ // https://docs.rs/http/0.2.0/http/header/struct.HeaderValue.html
+ value.to_str()?,
+ ));
+ signed_headers.push(key.as_str());
+ }
+ canonical_headers.sort();
+ signed_headers.sort();
+ let signed_headers_string = signed_headers.join(";");
+
+ let mut canonical_queries = Url::parse(&request.uri().to_string())?
+ .query_pairs()
+ .map(|(key, value)| {
+ format!(
+ "{}={}",
+ aws_sign_v4_uri_encode(&key, false),
+ aws_sign_v4_uri_encode(&value, false),
+ )
+ })
+ .collect::<Vec<String>>();
+ canonical_queries.sort();
+
+ let canonical_request = format!(
+ "{}\n{}\n{}\n{}\n\n{}\n{}",
+ request.method().as_str(),
+ request.uri().path(),
+ canonical_queries.join("&"),
+ canonical_headers.join("\n"),
+ signed_headers_string,
+ payload_digest,
+ );
+
+ let date = proxmox_time::strftime_utc(AWS_SIGN_V4_DATE_FORMAT, epoch)?;
+ let datetime = proxmox_time::strftime_utc(AWS_SIGN_V4_DATETIME_FORMAT, epoch)?;
+
+ let credential_scope = format!(
+ "{date}/{}/{AWS_SIGN_V4_SERVICE_S3}/{AWS_SIGN_V4_REQUEST_POSTFIX}",
+ options.region,
+ );
+ let canonical_request_hash = hex::encode(sha256(canonical_request.as_bytes()));
+ let string_to_sign =
+ format!("AWS4-HMAC-SHA256\n{datetime}\n{credential_scope}\n{canonical_request_hash}");
+
+ let date_sign_key = PKey::hmac(format!("AWS4{}", options.secret_key).as_bytes())?;
+ let date_tag = hmac_sha256(&date_sign_key, date.as_bytes())?;
+
+ let region_sign_key = PKey::hmac(&date_tag)?;
+ let region_tag = hmac_sha256(&region_sign_key, options.region.as_bytes())?;
+
+ let service_sign_key = PKey::hmac(&region_tag)?;
+ let service_tag = hmac_sha256(&service_sign_key, AWS_SIGN_V4_SERVICE_S3.as_bytes())?;
+
+ let signing_key = PKey::hmac(&service_tag)?;
+ let signing_tag = hmac_sha256(&signing_key, AWS_SIGN_V4_REQUEST_POSTFIX.as_bytes())?;
+
+ let signature_key = PKey::hmac(&signing_tag)?;
+ let signature = hmac_sha256(&signature_key, string_to_sign.as_bytes())?;
+ let signature = hex::encode(&signature);
+
+ Ok(format!(
+ "AWS4-HMAC-SHA256 Credential={}/{credential_scope},SignedHeaders={signed_headers_string},Signature={signature}",
+ options.access_key,
+ ))
+}
+// Custom `uri_encode` implementation as recommended by the AWS docs, since
+// uri encoding libraries may have incompatible implementations.
+// See: https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html
+pub(crate) fn aws_sign_v4_uri_encode(input: &str, is_object_key_name: bool) -> String {
+ // Assume up to 2 bytes per char max in output
+ let mut accumulator = String::with_capacity(2 * input.len());
+
+ input.chars().for_each(|char| {
+ match char {
+ // Unreserved characters, do not uri encode these bytes
+ 'A'..='Z' | 'a'..='z' | '0'..='9' | '-' | '.' | '_' | '~' => accumulator.push(char),
+ // Space character is reserved, must be encoded as '%20', not '+'
+ ' ' => accumulator.push_str("%20"),
+ // Encode the forward slash character, '/', everywhere except in the object key name
+ '/' if !is_object_key_name => accumulator.push_str("%2F"),
+ '/' if is_object_key_name => accumulator.push(char),
+ // URI encoded byte is formed by a '%' and the two-digit hexadecimal value of the byte
+ // Letters in the hexadecimal value must be uppercase
+ _ => {
+ for byte in char.to_string().as_bytes() {
+ accumulator.push_str(&format!("%{byte:02X}"));
+ }
+ }
+ }
+ });
+
+ accumulator
+}
+
+// Helper for hmac sha256 calculation
+fn hmac_sha256(key: &PKey<Private>, data: &[u8]) -> Result<Vec<u8>, Error> {
+ let mut signer = Signer::new(MessageDigest::sha256(), key)?;
+ signer.update(data)?;
+ let hmac = signer.sign_to_vec()?;
+ Ok(hmac)
+}
+
+/// Custom `uri_decode` implementation
+pub fn uri_decode(input: &str) -> Result<String, Error> {
+ // Require full capacity if no characters are encoded, less otherwise
+ let mut accumulator = String::with_capacity(input.len());
+ let mut subslices_iter = input.split('%');
+ // First item present also when empty, nevertheless fallback to empty default
+ accumulator.push_str(subslices_iter.next().unwrap_or(""));
+
+ for subslice in subslices_iter {
+ if let Some((hex_digits, utf8_rest)) = subslice.as_bytes().split_at_checked(2) {
+ let mut ascii_code = 0u8;
+ for (pos, digit) in hex_digits.iter().enumerate().take(2) {
+ let val = match digit {
+ b'0'..=b'9' => digit - b'0',
+ b'A'..=b'F' => digit - b'A' + 10,
+ b'a'..=b'f' => digit - b'a' + 10,
+ _ => bail!("unexpected hex digit at %{subslice}"),
+ };
+ // Shift the first digit's value to the upper nibble
+ ascii_code += val << (4 * ((pos + 1) % 2));
+ }
+ accumulator.push(ascii_code as char);
+ // Started from valid utf-8 without modification
+ let rest = unsafe { std::str::from_utf8_unchecked(utf8_rest) };
+ accumulator.push_str(rest);
+ } else {
+ bail!("failed to decode string at subslice %{subslice}");
+ }
+ }
+
+ Ok(accumulator)
+}
+
+#[test]
+fn test_aws_sign_v4_uri_encode() {
+ assert_eq!(aws_sign_v4_uri_encode("AZaz09-._~", false), "AZaz09-._~");
+ assert_eq!(aws_sign_v4_uri_encode("a b", false), "a%20b");
+ assert_eq!(
+ aws_sign_v4_uri_encode("/path/to/object", false),
+ "%2Fpath%2Fto%2Fobject"
+ );
+ assert_eq!(
+ aws_sign_v4_uri_encode("/path/to/object", true),
+ "/path/to/object"
+ );
+ assert_eq!(
+ aws_sign_v4_uri_encode(" !\"#$%&'()*+,:;=?@[]", false),
+ "%20%21%22%23%24%25%26%27%28%29%2A%2B%2C%3A%3B%3D%3F%40%5B%5D"
+ );
+ assert_eq!(aws_sign_v4_uri_encode("", false), "");
+}
+
+#[test]
+fn test_uri_decode() {
+ assert_eq!(uri_decode("a%20b%2FC").unwrap(), "a b/C");
+ assert_eq!(uri_decode("a%20b%2fc").unwrap(), "a b/c");
+ assert_eq!(uri_decode("simple-string").unwrap(), "simple-string");
+ assert_eq!(uri_decode("").unwrap(), "");
+ assert!(
+ uri_decode("test%").is_err(),
+ "Incomplete escape sequence at end"
+ );
+ assert!(
+ uri_decode("test%F").is_err(),
+ "Incomplete two-digit escape sequence"
+ );
+ assert!(uri_decode("test%GZ").is_err(), "Invalid hex digit");
+}
diff --git a/proxmox-s3-client/src/lib.rs b/proxmox-s3-client/src/lib.rs
index e579ffbb..f65d123f 100644
--- a/proxmox-s3-client/src/lib.rs
+++ b/proxmox-s3-client/src/lib.rs
@@ -6,6 +6,10 @@
mod api_types;
pub use api_types::*;
+#[cfg(feature = "impl")]
+mod aws_sign_v4;
+#[cfg(feature = "impl")]
+pub use aws_sign_v4::uri_decode;
#[cfg(feature = "impl")]
mod client;
#[cfg(feature = "impl")]
--
2.47.2
* [pbs-devel] [PATCH proxmox v8 3/9] s3 client: add dedicated type for s3 object keys
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
S3 objects are uniquely identified within a bucket by their object
key [0].
Implements conversion and utility traits to easily convert a string
into the corresponding object key for the S3 storage backend. Keys might
either be full or relative. Full keys are not further expanded when
performing api requests, while relative keys are prefixed by the
common prefix as configured in the client. This allows for easy key
grouping based on client configuration and is used for PBS datastore
separation within the same bucket.
Further, this adds type checking for s3 client operations requiring
an object key.
[0] https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
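As a usage sketch of the conversion added below (with the `impl`
feature enabled; the key paths are only illustrative): a leading slash
marks a key as full and is stripped, anything else yields a relative
key that the client later expands with its configured common prefix:

    use proxmox_s3_client::S3ObjectKey;

    fn main() {
        // leading '/' => S3ObjectKey::Full, used verbatim by the client
        let full = S3ObjectKey::from("/datastore-a/.chunks/0000");
        // no leading '/' => S3ObjectKey::Relative, prefixed later on
        let relative = S3ObjectKey::from(".chunks/0000");
        assert!(matches!(full, S3ObjectKey::Full(_)));
        assert!(matches!(relative, S3ObjectKey::Relative(_)));
    }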
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
proxmox-s3-client/src/lib.rs | 4 ++
proxmox-s3-client/src/object_key.rs | 86 +++++++++++++++++++++++++++++
2 files changed, 90 insertions(+)
create mode 100644 proxmox-s3-client/src/object_key.rs
diff --git a/proxmox-s3-client/src/lib.rs b/proxmox-s3-client/src/lib.rs
index f65d123f..7286cbd1 100644
--- a/proxmox-s3-client/src/lib.rs
+++ b/proxmox-s3-client/src/lib.rs
@@ -14,3 +14,7 @@ pub use aws_sign_v4::uri_decode;
mod client;
#[cfg(feature = "impl")]
pub use client::{S3Client, S3ClientOptions};
+#[cfg(feature = "impl")]
+mod object_key;
+#[cfg(feature = "impl")]
+pub use object_key::S3ObjectKey;
diff --git a/proxmox-s3-client/src/object_key.rs b/proxmox-s3-client/src/object_key.rs
new file mode 100644
index 00000000..49959b6e
--- /dev/null
+++ b/proxmox-s3-client/src/object_key.rs
@@ -0,0 +1,86 @@
+use anyhow::Error;
+
+#[derive(Clone, Debug)]
+/// S3 Object Key
+pub enum S3ObjectKey {
+ /// Object key which will not be prefixed any further by the client
+ Full(String),
+ /// Object key which will be expanded by the client with its configured common prefix
+ Relative(String),
+}
+
+impl core::convert::From<&str> for S3ObjectKey {
+ fn from(s: &str) -> Self {
+ if let Some(s) = s.strip_prefix("/") {
+ Self::Full(s.to_string())
+ } else {
+ Self::Relative(s.to_string())
+ }
+ }
+}
+impl S3ObjectKey {
+ /// Convert the given object key to a full key by extending it via given prefix
+ /// If the object key is already a full key, the prefix is ignored.
+ pub(crate) fn to_full_key(&self, prefix: &str) -> Self {
+ match self {
+ Self::Full(ref key) => Self::Full(key.to_string()),
+ Self::Relative(ref key) => {
+ let prefix = prefix.strip_prefix("/").unwrap_or(&prefix);
+ Self::Full(format!("{prefix}/{key}"))
+ }
+ }
+ }
+}
+
+impl std::ops::Deref for S3ObjectKey {
+ type Target = str;
+
+ fn deref(&self) -> &Self::Target {
+ match self {
+ Self::Full(key) => key,
+ Self::Relative(key) => key,
+ }
+ }
+}
+
+impl std::fmt::Display for S3ObjectKey {
+ fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+ match self {
+ Self::Full(key) => write!(f, "{key}"),
+ Self::Relative(key) => write!(f, "{key}"),
+ }
+ }
+}
+
+impl std::str::FromStr for S3ObjectKey {
+ type Err = Error;
+
+ fn from_str(s: &str) -> Result<Self, Self::Err> {
+ Ok(Self::from(s))
+ }
+}
+
+// Do not mangle with prefixes when de-serializing
+impl<'de> serde::Deserialize<'de> for S3ObjectKey {
+ fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+ where
+ D: serde::Deserializer<'de>,
+ {
+ let object_key = std::borrow::Cow::<'de, str>::deserialize(deserializer)?.to_string();
+ Ok(Self::Full(object_key))
+ }
+}
+
+impl S3ObjectKey {
+ /// Generate source key for copy object operations given the source bucket.
+ /// Extends relative object key variants also by the given prefix.
+ pub fn to_copy_source_key(&self, source_bucket: &str, prefix: &str) -> Self {
+ match self {
+ Self::Full(key) => Self::Full(format!("{source_bucket}{key}")),
+ Self::Relative(key) => {
+ let prefix = prefix.strip_prefix("/").unwrap_or(&prefix);
+ Self::Full(format!("{source_bucket}/{prefix}/{key}"))
+ }
+ }
+ }
+}
--
2.47.2
* [pbs-devel] [PATCH proxmox v8 4/9] s3 client: add type for last modified timestamp in responses
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Adds a helper to parse last modified timestamps as encountered in s3
list objects v2 and copy object api calls.
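A minimal usage sketch of the type added below (the timestamp value is
a placeholder):

    use proxmox_s3_client::LastModifiedTimestamp;

    fn main() {
        // ISO 8601 timestamp as found in list objects v2 responses
        let timestamp: LastModifiedTimestamp =
            "2025-07-15T12:52:00Z".parse().unwrap();
        println!("{timestamp:?}");
    }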
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
Cargo.toml | 1 +
proxmox-s3-client/Cargo.toml | 2 ++
proxmox-s3-client/debian/control | 4 ++++
proxmox-s3-client/src/lib.rs | 4 ++++
proxmox-s3-client/src/timestamps.rs | 18 ++++++++++++++++++
5 files changed, 29 insertions(+)
create mode 100644 proxmox-s3-client/src/timestamps.rs
diff --git a/Cargo.toml b/Cargo.toml
index dc021796..cf0ef097 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -92,6 +92,7 @@ http-body = "1"
http-body-util = "0.1"
hyper = "1"
hyper-util = "0.1.12"
+iso8601 = "0.6.1"
ldap3 = { version = "0.11", default-features = false }
lettre = "0.11.1"
libc = "0.2.107"
diff --git a/proxmox-s3-client/Cargo.toml b/proxmox-s3-client/Cargo.toml
index 31deca59..f5b87493 100644
--- a/proxmox-s3-client/Cargo.toml
+++ b/proxmox-s3-client/Cargo.toml
@@ -18,9 +18,11 @@ hex = { workspace = true, features = [ "serde" ] }
http-body-util.workspace = true
hyper-util = { workspace = true, features = [ "client-legacy", "tokio", "http1" ] }
hyper.workspace = true
+iso8601.workspace = true
openssl.workspace = true
regex.workspace = true
serde.workspace = true
+serde_plain.workspace = true
tracing.workspace = true
url.workspace = true
diff --git a/proxmox-s3-client/debian/control b/proxmox-s3-client/debian/control
index 0efb54db..f3ffa2d9 100644
--- a/proxmox-s3-client/debian/control
+++ b/proxmox-s3-client/debian/control
@@ -16,6 +16,7 @@ Build-Depends-Arch: cargo:native <!nocheck>,
librust-hyper-util-0.1+default-dev (>= 0.1.12-~~) <!nocheck>,
librust-hyper-util-0.1+http1-dev (>= 0.1.12-~~) <!nocheck>,
librust-hyper-util-0.1+tokio-dev (>= 0.1.12-~~) <!nocheck>,
+ librust-iso8601-0.6+default-dev (>= 0.6.1-~~) <!nocheck>,
librust-openssl-0.10+default-dev <!nocheck>,
librust-proxmox-base64-1+default-dev <!nocheck>,
librust-proxmox-http-1+body-dev <!nocheck>,
@@ -31,6 +32,7 @@ Build-Depends-Arch: cargo:native <!nocheck>,
librust-proxmox-time-2+default-dev (>= 2.1.0-~~) <!nocheck>,
librust-regex-1+default-dev (>= 1.5-~~) <!nocheck>,
librust-serde-1+default-dev <!nocheck>,
+ librust-serde-plain-1+default-dev <!nocheck>,
librust-tracing-0.1+default-dev <!nocheck>,
librust-url-2+default-dev (>= 2.2-~~) <!nocheck>
Maintainer: Proxmox Support Team <support@proxmox.com>
@@ -56,6 +58,7 @@ Depends:
librust-hyper-util-0.1+default-dev (>= 0.1.12-~~),
librust-hyper-util-0.1+http1-dev (>= 0.1.12-~~),
librust-hyper-util-0.1+tokio-dev (>= 0.1.12-~~),
+ librust-iso8601-0.6+default-dev (>= 0.6.1-~~),
librust-openssl-0.10+default-dev,
librust-proxmox-base64-1+default-dev,
librust-proxmox-http-1+body-dev,
@@ -71,6 +74,7 @@ Depends:
librust-proxmox-time-2+default-dev (>= 2.1.0-~~),
librust-regex-1+default-dev (>= 1.5-~~),
librust-serde-1+default-dev,
+ librust-serde-plain-1+default-dev,
librust-tracing-0.1+default-dev,
librust-url-2+default-dev (>= 2.2-~~)
Provides:
diff --git a/proxmox-s3-client/src/lib.rs b/proxmox-s3-client/src/lib.rs
index 7286cbd1..fc314c42 100644
--- a/proxmox-s3-client/src/lib.rs
+++ b/proxmox-s3-client/src/lib.rs
@@ -15,6 +15,10 @@ mod client;
#[cfg(feature = "impl")]
pub use client::{S3Client, S3ClientOptions};
#[cfg(feature = "impl")]
+mod timestamps;
+#[cfg(feature = "impl")]
+pub use timestamps::*;
+#[cfg(feature = "impl")]
mod object_key;
#[cfg(feature = "impl")]
pub use object_key::S3ObjectKey;
diff --git a/proxmox-s3-client/src/timestamps.rs b/proxmox-s3-client/src/timestamps.rs
new file mode 100644
index 00000000..3713b6d0
--- /dev/null
+++ b/proxmox-s3-client/src/timestamps.rs
@@ -0,0 +1,18 @@
+use anyhow::{anyhow, Error};
+
+#[derive(Debug)]
+/// Last modified timestamp as obtained from API response http headers.
+pub struct LastModifiedTimestamp {
+ _datetime: iso8601::DateTime,
+}
+
+impl std::str::FromStr for LastModifiedTimestamp {
+ type Err = Error;
+
+ fn from_str(timestamp: &str) -> Result<Self, Self::Err> {
+ let _datetime = iso8601::datetime(timestamp).map_err(|err| anyhow!(err))?;
+ Ok(Self { _datetime })
+ }
+}
+
+serde_plain::derive_deserialize_from_fromstr!(LastModifiedTimestamp, "last modified timestamp");
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox v8 5/9] s3 client: add helper to parse http date headers
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (3 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 4/9] s3 client: add type for last modified timestamp in responses Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 6/9] s3 client: implement methods to operate on s3 objects in bucket Christian Ebner
` (50 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Add a helper to parse the preferred date/time format for http `Date`
headers as specified in RFC 2616 [0], which is a fixed-length subset
of the format specified in RFC 1123 [1], itself a followup to
RFC 822 [2]. The format described in the obsolete RFC 850 [3] is not
implemented.
This allows parsing the `Date` and `Last-Modified` headers of S3 API
responses; a short usage sketch follows the reference list below.
[0] https://datatracker.ietf.org/doc/html/rfc2616#section-3.3
[1] https://datatracker.ietf.org/doc/html/rfc1123#section-5.2.14
[2] https://datatracker.ietf.org/doc/html/rfc822#section-5
[3] https://datatracker.ietf.org/doc/html/rfc850
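A minimal sketch of how the new helper is used (the date string is a
made-up example in the fixed 29-character format):
```rust
use proxmox_s3_client::HttpDate;

fn example() -> Result<(), anyhow::Error> {
    // RFC 2616 preferred format: fixed length, always GMT
    let date: HttpDate = "Tue, 15 Jul 2025 12:52:00 GMT".parse()?;
    println!("{date:?}");
    Ok(())
}
```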
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
proxmox-s3-client/src/timestamps.rs | 90 ++++++++++++++++++++++++++++-
1 file changed, 89 insertions(+), 1 deletion(-)
diff --git a/proxmox-s3-client/src/timestamps.rs b/proxmox-s3-client/src/timestamps.rs
index 3713b6d0..22330966 100644
--- a/proxmox-s3-client/src/timestamps.rs
+++ b/proxmox-s3-client/src/timestamps.rs
@@ -1,4 +1,9 @@
-use anyhow::{anyhow, Error};
+use anyhow::{anyhow, bail, Context, Error};
+
+const VALID_DAYS_OF_WEEK: [&str; 7] = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"];
+const VALID_MONTHS: [&str; 12] = [
+ "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
+];
#[derive(Debug)]
/// Last modified timestamp as obtained from API response http headers.
@@ -16,3 +21,86 @@ impl std::str::FromStr for LastModifiedTimestamp {
}
serde_plain::derive_deserialize_from_fromstr!(LastModifiedTimestamp, "last modified timestamp");
+
+/// Preferred date format specified by RFC2616, given as fixed-length
+/// subset of RFC1123, which itself is a followup to RFC822.
+///
+/// https://datatracker.ietf.org/doc/html/rfc2616#section-3.3
+/// https://datatracker.ietf.org/doc/html/rfc1123#section-5.2.14
+/// https://datatracker.ietf.org/doc/html/rfc822#section-5
+#[derive(Debug)]
+pub struct HttpDate {
+ _epoch: i64,
+}
+
+impl std::str::FromStr for HttpDate {
+ type Err = Error;
+
+ fn from_str(timestamp: &str) -> Result<Self, Self::Err> {
+ let input = timestamp.as_bytes();
+ if input.len() != 29 {
+ bail!("unexpected length: got {}", input.len());
+ }
+
+ let expect = |pos: usize, c: u8| {
+ if input[pos] != c {
+ bail!("unexpected char at pos {pos}");
+ }
+ Ok(())
+ };
+
+ let digit = |pos: usize| -> Result<i32, Error> {
+ let digit = input[pos] as i32;
+ if !(48..=57).contains(&digit) {
+ bail!("unexpected char at pos {pos}");
+ }
+ Ok(digit - 48)
+ };
+
+ fn check_max(i: i32, max: i32) -> Result<i32, Error> {
+ if i > max {
+ bail!("value too large ({i} > {max})");
+ }
+ Ok(i)
+ }
+
+ let mut tm = proxmox_time::TmEditor::new(true);
+
+ if !VALID_DAYS_OF_WEEK
+ .iter()
+ .any(|valid| valid.as_bytes() == &input[0..3])
+ {
+ bail!("unexpected day of week, got {:?}", &input[0..3]);
+ }
+
+ expect(3, b',').context("unexpected separator after day of week")?;
+ expect(4, b' ').context("missing space after day of week separator")?;
+ tm.set_mday(check_max(digit(5)? * 10 + digit(6)?, 31)?)?;
+ expect(7, b' ').context("unexpected separator after day")?;
+ if let Some(month) = VALID_MONTHS
+ .iter()
+ .position(|month| month.as_bytes() == &input[8..11])
+ {
+ // valid conversion to i32, position stems from fixed size array of 12 months.
+ tm.set_mon(check_max(month as i32 + 1, 12)?)?;
+ } else {
+ bail!("invalid month");
+ }
+ expect(11, b' ').context("unexpected separator after month")?;
+ tm.set_year(digit(12)? * 1000 + digit(13)? * 100 + digit(14)? * 10 + digit(15)?)?;
+ expect(16, b' ').context("unexpected separator after year")?;
+ tm.set_hour(check_max(digit(17)? * 10 + digit(18)?, 23)?)?;
+ expect(19, b':').context("unexpected separator after hour")?;
+ tm.set_min(check_max(digit(20)? * 10 + digit(21)?, 59)?)?;
+ expect(22, b':').context("unexpected separator after minute")?;
+ tm.set_sec(check_max(digit(23)? * 10 + digit(24)?, 60)?)?;
+ expect(25, b' ').context("unexpected separator after second")?;
+ if !input.ends_with(b"GMT") {
+ bail!("unexpected timezone");
+ }
+
+ let _epoch = tm.into_epoch()?;
+
+ Ok(Self { _epoch })
+ }
+}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox v8 6/9] s3 client: implement methods to operate on s3 objects in bucket
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (4 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 5/9] s3 client: add helper to parse http date headers Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 7/9] s3 client: add example usage for basic operations Christian Ebner
` (49 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Adds the basic implementation of the client to use s3 object stores
as backend for PBS datastores.
This implements the basic client actions on a bucket and the objects
stored within that bucket.
This is not feature complete and is intended to be extended in an
on-demand fashion rather than implementing the whole client at once.
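A rough usage sketch of the methods introduced here (client options
elided, object key and payload made up; see the example added in a
later patch of this series for a complete setup):
```rust
use bytes::Bytes;
use proxmox_s3_client::{S3Client, S3ClientOptions, S3ObjectKey};

async fn example(options: S3ClientOptions) -> Result<(), anyhow::Error> {
    let client = S3Client::new(options)?;
    // Fails early if the bucket does not exist or is inaccessible
    client.head_bucket().await?;
    // Upload an object, retrying up to 3 times on transient errors.
    // Returns Ok(true) if the object already existed and replace was
    // set to false.
    let key = S3ObjectKey::from("example/object.bin");
    let _existed = client
        .upload_with_retry(key, Bytes::from_static(b"hello"), false)
        .await?;
    Ok(())
}
```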
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- fix formatting issue
Cargo.toml | 4 +
proxmox-s3-client/Cargo.toml | 7 +
proxmox-s3-client/debian/control | 18 +
proxmox-s3-client/src/client.rs | 469 ++++++++++++++++++++++-
proxmox-s3-client/src/lib.rs | 4 +-
proxmox-s3-client/src/response_reader.rs | 376 ++++++++++++++++++
6 files changed, 874 insertions(+), 4 deletions(-)
create mode 100644 proxmox-s3-client/src/response_reader.rs
diff --git a/Cargo.toml b/Cargo.toml
index cf0ef097..d88d8383 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -98,6 +98,7 @@ lettre = "0.11.1"
libc = "0.2.107"
log = "0.4.17"
mail-parser = "0.11"
+md5 = "0.7.0"
native-tls = "0.2"
nix = "0.29"
openssl = "0.10"
@@ -105,18 +106,21 @@ pam-sys = "0.5"
percent-encoding = "2.1"
pin-utils = "0.1.0"
proc-macro2 = "1.0"
+quick-xml = "0.36.1"
quote = "1.0"
regex = "1.5"
serde = "1.0"
serde_cbor = "0.11.1"
serde_json = "1.0"
serde_plain = "1.0"
+serde-xml-rs = "0.5"
syn = { version = "2", features = [ "full", "visit-mut" ] }
sync_wrapper = "1"
tar = "0.4"
tokio = "1.6"
tokio-openssl = "0.6.1"
tokio-stream = "0.1.0"
+tokio-util = "0.7"
tower-service = "0.3.0"
tracing = "0.1"
tracing-journald = "0.3.1"
diff --git a/proxmox-s3-client/Cargo.toml b/proxmox-s3-client/Cargo.toml
index f5b87493..18bddddd 100644
--- a/proxmox-s3-client/Cargo.toml
+++ b/proxmox-s3-client/Cargo.toml
@@ -13,16 +13,23 @@ rust-version.workspace = true
[dependencies]
anyhow.workspace = true
+bytes.workspace = true
+futures.workspace = true
const_format.workspace = true
hex = { workspace = true, features = [ "serde" ] }
http-body-util.workspace = true
hyper-util = { workspace = true, features = [ "client-legacy", "tokio", "http1" ] }
hyper.workspace = true
iso8601.workspace = true
+md5.workspace = true
openssl.workspace = true
+quick-xml = { workspace = true, features = [ "async-tokio" ] }
regex.workspace = true
serde.workspace = true
serde_plain.workspace = true
+serde-xml-rs.workspace = true
+tokio.workspace = true
+tokio-util = { workspace = true, features = [ "compat" ] }
tracing.workspace = true
url.workspace = true
diff --git a/proxmox-s3-client/debian/control b/proxmox-s3-client/debian/control
index f3ffa2d9..f0cd6b0a 100644
--- a/proxmox-s3-client/debian/control
+++ b/proxmox-s3-client/debian/control
@@ -7,7 +7,9 @@ Build-Depends-Arch: cargo:native <!nocheck>,
rustc:native (>= 1.82) <!nocheck>,
libstd-rust-dev <!nocheck>,
librust-anyhow-1+default-dev <!nocheck>,
+ librust-bytes-1+default-dev <!nocheck>,
librust-const-format-0.2+default-dev <!nocheck>,
+ librust-futures-0.3+default-dev <!nocheck>,
librust-hex-0.4+default-dev <!nocheck>,
librust-hex-0.4+serde-dev <!nocheck>,
librust-http-body-util-0.1+default-dev <!nocheck>,
@@ -17,6 +19,7 @@ Build-Depends-Arch: cargo:native <!nocheck>,
librust-hyper-util-0.1+http1-dev (>= 0.1.12-~~) <!nocheck>,
librust-hyper-util-0.1+tokio-dev (>= 0.1.12-~~) <!nocheck>,
librust-iso8601-0.6+default-dev (>= 0.6.1-~~) <!nocheck>,
+ librust-md5-0.7+default-dev <!nocheck>,
librust-openssl-0.10+default-dev <!nocheck>,
librust-proxmox-base64-1+default-dev <!nocheck>,
librust-proxmox-http-1+body-dev <!nocheck>,
@@ -30,9 +33,15 @@ Build-Depends-Arch: cargo:native <!nocheck>,
librust-proxmox-serde-1+default-dev <!nocheck>,
librust-proxmox-serde-1+serde-json-dev <!nocheck>,
librust-proxmox-time-2+default-dev (>= 2.1.0-~~) <!nocheck>,
+ librust-quick-xml-0.36+async-tokio-dev (>= 0.36.1-~~) <!nocheck>,
+ librust-quick-xml-0.36+default-dev (>= 0.36.1-~~) <!nocheck>,
librust-regex-1+default-dev (>= 1.5-~~) <!nocheck>,
librust-serde-1+default-dev <!nocheck>,
librust-serde-plain-1+default-dev <!nocheck>,
+ librust-serde-xml-rs-0.5+default-dev <!nocheck>,
+ librust-tokio-1+default-dev (>= 1.6-~~) <!nocheck>,
+ librust-tokio-util-0.7+compat-dev <!nocheck>,
+ librust-tokio-util-0.7+default-dev <!nocheck>,
librust-tracing-0.1+default-dev <!nocheck>,
librust-url-2+default-dev (>= 2.2-~~) <!nocheck>
Maintainer: Proxmox Support Team <support@proxmox.com>
@@ -49,7 +58,9 @@ Multi-Arch: same
Depends:
${misc:Depends},
librust-anyhow-1+default-dev,
+ librust-bytes-1+default-dev,
librust-const-format-0.2+default-dev,
+ librust-futures-0.3+default-dev,
librust-hex-0.4+default-dev,
librust-hex-0.4+serde-dev,
librust-http-body-util-0.1+default-dev,
@@ -59,6 +70,7 @@ Depends:
librust-hyper-util-0.1+http1-dev (>= 0.1.12-~~),
librust-hyper-util-0.1+tokio-dev (>= 0.1.12-~~),
librust-iso8601-0.6+default-dev (>= 0.6.1-~~),
+ librust-md5-0.7+default-dev,
librust-openssl-0.10+default-dev,
librust-proxmox-base64-1+default-dev,
librust-proxmox-http-1+body-dev,
@@ -72,9 +84,15 @@ Depends:
librust-proxmox-serde-1+default-dev,
librust-proxmox-serde-1+serde-json-dev,
librust-proxmox-time-2+default-dev (>= 2.1.0-~~),
+ librust-quick-xml-0.36+async-tokio-dev (>= 0.36.1-~~),
+ librust-quick-xml-0.36+default-dev (>= 0.36.1-~~),
librust-regex-1+default-dev (>= 1.5-~~),
librust-serde-1+default-dev,
librust-serde-plain-1+default-dev,
+ librust-serde-xml-rs-0.5+default-dev,
+ librust-tokio-1+default-dev (>= 1.6-~~),
+ librust-tokio-util-0.7+compat-dev,
+ librust-tokio-util-0.7+default-dev,
librust-tracing-0.1+default-dev,
librust-url-2+default-dev (>= 2.2-~~)
Provides:
diff --git a/proxmox-s3-client/src/client.rs b/proxmox-s3-client/src/client.rs
index 1ba42a08..08e45ebf 100644
--- a/proxmox-s3-client/src/client.rs
+++ b/proxmox-s3-client/src/client.rs
@@ -1,24 +1,51 @@
+use std::path::Path;
+use std::str::FromStr;
use std::sync::{Arc, Mutex};
-use std::time::Duration;
+use std::time::{Duration, Instant};
use anyhow::{bail, format_err, Context, Error};
-use hyper::http::uri::Authority;
+use hyper::body::{Bytes, Incoming};
+use hyper::http::method::Method;
+use hyper::http::uri::{Authority, Parts, PathAndQuery, Scheme};
+use hyper::http::{header, HeaderValue, StatusCode, Uri};
+use hyper::{Request, Response};
use hyper_util::client::legacy::connect::HttpConnector;
use hyper_util::client::legacy::Client;
use hyper_util::rt::TokioExecutor;
use openssl::hash::MessageDigest;
+use openssl::sha::Sha256;
use openssl::ssl::{SslConnector, SslMethod, SslVerifyMode};
use openssl::x509::X509StoreContextRef;
use tracing::error;
use proxmox_http::client::HttpsConnector;
-use proxmox_http::{Body, RateLimiter};
+use proxmox_http::{Body, RateLimit, RateLimiter};
use proxmox_schema::api_types::CERT_FINGERPRINT_SHA256_SCHEMA;
use crate::api_types::{S3ClientConfig, S3ClientSecretsConfig};
+use crate::aws_sign_v4::AWS_SIGN_V4_DATETIME_FORMAT;
+use crate::aws_sign_v4::{aws_sign_v4_signature, aws_sign_v4_uri_encode};
+use crate::object_key::S3ObjectKey;
+use crate::response_reader::{
+ CopyObjectResponse, DeleteObjectsResponse, GetObjectResponse, HeadObjectResponse,
+ ListObjectsV2Response, PutObjectResponse, ResponseReader,
+};
const S3_HTTP_CONNECT_TIMEOUT: Duration = Duration::from_secs(10);
+const S3_HTTP_REQUEST_TIMEOUT: Duration = Duration::from_secs(60);
const S3_TCP_KEEPALIVE_TIME: u32 = 120;
+const MAX_S3_UPLOAD_RETRY: usize = 3;
+
+/// S3 object key path prefix without the context prefix as defined by the client options.
+///
+/// The client option's context prefix will be prepended by the various client methods before
+/// sending api requests.
+pub enum S3PathPrefix {
+ /// Path prefix relative to client's context prefix
+ Some(String),
+ /// No prefix
+ None,
+}
/// Configuration options for client
pub struct S3ClientOptions {
@@ -190,4 +217,440 @@ impl S3Client {
"unexpected certificate fingerprint {certificate_fingerprint}"
))
}
+
+ /// Prepare API request by adding commonly required headers and perform request signing
+ async fn prepare(&self, mut request: Request<Body>) -> Result<Request<Body>, Error> {
+ let host_header = request
+ .uri()
+ .authority()
+ .ok_or_else(|| format_err!("request missing authority"))?
+ .to_string();
+
+ // Content verification for aws s3 signature
+ let mut hasher = Sha256::new();
+ let contents = request
+ .body()
+ .as_bytes()
+ .ok_or_else(|| format_err!("cannot prepare request with streaming body"))?;
+ hasher.update(contents);
+ // Use MD5 as upload integrity check: other methods are not supported by all S3 object
+ // store providers and might be ignored, and MD5 is recommended by AWS as described in
+ // https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html#API_PutObject_RequestSyntax
+ let payload_md5 = md5::compute(contents);
+ let payload_digest = hex::encode(hasher.finish());
+ let payload_len = contents.len();
+
+ let epoch = proxmox_time::epoch_i64();
+ let datetime = proxmox_time::strftime_utc(AWS_SIGN_V4_DATETIME_FORMAT, epoch)?;
+
+ request
+ .headers_mut()
+ .insert("x-amz-date", HeaderValue::from_str(&datetime)?);
+ request
+ .headers_mut()
+ .insert("host", HeaderValue::from_str(&host_header)?);
+ request.headers_mut().insert(
+ "x-amz-content-sha256",
+ HeaderValue::from_str(&payload_digest)?,
+ );
+ request.headers_mut().insert(
+ header::CONTENT_LENGTH,
+ HeaderValue::from_str(&payload_len.to_string())?,
+ );
+ if payload_len > 0 {
+ let md5_digest = proxmox_base64::encode(*payload_md5);
+ request
+ .headers_mut()
+ .insert("Content-MD5", HeaderValue::from_str(&md5_digest)?);
+ }
+
+ let signature = aws_sign_v4_signature(&request, &self.options, epoch, &payload_digest)?;
+
+ request
+ .headers_mut()
+ .insert(header::AUTHORIZATION, HeaderValue::from_str(&signature)?);
+
+ Ok(request)
+ }
+
+ /// Send API request to the configured endpoint using the inner https client.
+ async fn send(&self, request: Request<Body>) -> Result<Response<Incoming>, Error> {
+ let request = self.prepare(request).await?;
+ if request.method() == Method::PUT {
+ if let Some(limiter) = &self.put_rate_limiter {
+ let sleep = {
+ let mut limiter = limiter.lock().unwrap();
+ limiter.register_traffic(Instant::now(), 1)
+ };
+ tokio::time::sleep(sleep).await;
+ }
+ }
+ let response = tokio::time::timeout(S3_HTTP_REQUEST_TIMEOUT, self.client.request(request))
+ .await
+ .context("request timeout")??;
+ Ok(response)
+ }
+
+ /// Check if bucket exists and got permissions to access it.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadBucket.html
+ pub async fn head_bucket(&self) -> Result<(), Error> {
+ let request = Request::builder()
+ .method(Method::HEAD)
+ .uri(self.build_uri("/", &[])?)
+ .body(Body::empty())?;
+ let response = self.send(request).await?;
+ let (parts, _body) = response.into_parts();
+
+ match parts.status {
+ StatusCode::OK => (),
+ StatusCode::BAD_REQUEST | StatusCode::FORBIDDEN | StatusCode::NOT_FOUND => {
+ bail!("bucket does not exist or no permission to access it")
+ }
+ status_code => bail!("unexpected status code {status_code}"),
+ }
+
+ Ok(())
+ }
+
+ /// Fetch metadata from an object without returning the object itself.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadObject.html
+ pub async fn head_object(
+ &self,
+ object_key: S3ObjectKey,
+ ) -> Result<Option<HeadObjectResponse>, Error> {
+ let object_key = object_key.to_full_key(&self.options.common_prefix);
+ let request = Request::builder()
+ .method(Method::HEAD)
+ .uri(self.build_uri(&object_key, &[])?)
+ .body(Body::empty())?;
+ let response = self.send(request).await?;
+ let response_reader = ResponseReader::new(response);
+ response_reader.head_object_response().await
+ }
+
+ /// Fetch an object from object store.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html
+ pub async fn get_object(
+ &self,
+ object_key: S3ObjectKey,
+ ) -> Result<Option<GetObjectResponse>, Error> {
+ let object_key = object_key.to_full_key(&self.options.common_prefix);
+ let request = Request::builder()
+ .method(Method::GET)
+ .uri(self.build_uri(&object_key, &[])?)
+ .body(Body::empty())?;
+
+ let response = self.send(request).await?;
+ let response_reader = ResponseReader::new(response);
+ response_reader.get_object_response().await
+ }
+
+ /// Returns some or all (up to 1,000) of the objects in a bucket with each request.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html
+ pub async fn list_objects_v2(
+ &self,
+ prefix: &S3PathPrefix,
+ continuation_token: Option<&str>,
+ ) -> Result<ListObjectsV2Response, Error> {
+ let mut query = vec![("list-type", "2")];
+ let abs_prefix: String;
+ if let S3PathPrefix::Some(prefix) = prefix {
+ abs_prefix = if prefix.starts_with("/") {
+ format!("{}{prefix}", self.options.common_prefix)
+ } else {
+ format!("{}/{prefix}", self.options.common_prefix)
+ };
+ query.push(("prefix", &abs_prefix));
+ }
+ if let Some(token) = continuation_token {
+ query.push(("continuation-token", token));
+ }
+ let request = Request::builder()
+ .method(Method::GET)
+ .uri(self.build_uri("/", &query)?)
+ .body(Body::empty())?;
+
+ let response = self.send(request).await?;
+ let response_reader = ResponseReader::new(response);
+ response_reader.list_objects_v2_response().await
+ }
+
+ /// Add a new object to a bucket.
+ ///
+ /// If the replace flag is not set, do not reupload if an object with a matching key
+ /// already exists in the bucket.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html
+ pub async fn put_object(
+ &self,
+ object_key: S3ObjectKey,
+ object_data: Body,
+ replace: bool,
+ ) -> Result<PutObjectResponse, Error> {
+ let object_key = object_key.to_full_key(&self.options.common_prefix);
+ let mut request = Request::builder()
+ .method(Method::PUT)
+ .uri(self.build_uri(&object_key, &[])?)
+ .header(header::CONTENT_TYPE, "binary/octet");
+
+ if !replace {
+ request = request.header(header::IF_NONE_MATCH, "*");
+ }
+
+ let request = request.body(object_data)?;
+
+ let response = self.send(request).await?;
+ let response_reader = ResponseReader::new(response);
+ response_reader.put_object_response().await
+ }
+
+ /// Removes an object from a bucket.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObject.html
+ pub async fn delete_object(&self, object_key: S3ObjectKey) -> Result<(), Error> {
+ let object_key = object_key.to_full_key(&self.options.common_prefix);
+ let request = Request::builder()
+ .method(Method::DELETE)
+ .uri(self.build_uri(&object_key, &[])?)
+ .body(Body::empty())?;
+
+ let response = self.send(request).await?;
+ let response_reader = ResponseReader::new(response);
+ response_reader.delete_object_response().await
+ }
+
+ /// Delete multiple objects from a bucket using a single HTTP request.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html
+ pub async fn delete_objects(
+ &self,
+ object_keys: &[S3ObjectKey],
+ ) -> Result<DeleteObjectsResponse, Error> {
+ let mut body = String::from(r#"<Delete xmlns="http://s3.amazonaws.com/doc/2006-03-01/">"#);
+ for object_key in object_keys {
+ body.push_str("<Object><Key>");
+ body.push_str(object_key);
+ body.push_str("</Key></Object>");
+ }
+ body.push_str("</Delete>");
+ let request = Request::builder()
+ .method(Method::POST)
+ .uri(self.build_uri("/", &[("delete", "")])?)
+ .body(Body::from(body))?;
+
+ let response = self.send(request).await?;
+ let response_reader = ResponseReader::new(response);
+ response_reader.delete_objects_response().await
+ }
+
+ /// Creates a copy of an object that is already stored in Amazon S3.
+ /// Uses the `x-amz-metadata-directive` set to `REPLACE`, therefore resulting in updated metadata.
+ /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html
+ pub async fn copy_object(
+ &self,
+ source_key: S3ObjectKey,
+ destination_key: S3ObjectKey,
+ ) -> Result<CopyObjectResponse, Error> {
+ let copy_source =
+ source_key.to_copy_source_key(&self.options.bucket, &self.options.common_prefix);
+ let copy_source = aws_sign_v4_uri_encode(&copy_source, true);
+ let destination_key = destination_key.to_full_key(&self.options.common_prefix);
+ let destination_key = aws_sign_v4_uri_encode(&destination_key, true);
+ let request = Request::builder()
+ .method(Method::PUT)
+ .uri(self.build_uri(&destination_key, &[])?)
+ .header("x-amz-copy-source", HeaderValue::from_str(©_source)?)
+ .header(
+ "x-amz-metadata-directive",
+ HeaderValue::from_str("REPLACE")?,
+ )
+ .body(Body::empty())?;
+
+ let response = self.send(request).await?;
+ let response_reader = ResponseReader::new(response);
+ response_reader.copy_object_response().await
+ }
+
+ /// Delete objects by given key prefix.
+ /// Requires at least 2 api calls.
+ pub async fn delete_objects_by_prefix(&self, prefix: &S3PathPrefix) -> Result<bool, Error> {
+ // S3 API does not provide a convenient way to delete objects by key prefix.
+ // List all objects with given group prefix and delete all objects found, so this
+ // requires at least 2 API calls.
+ let mut next_continuation_token: Option<String> = None;
+ let mut delete_errors = false;
+ loop {
+ let list_objects_result = self
+ .list_objects_v2(prefix, next_continuation_token.as_deref())
+ .await?;
+
+ let objects_to_delete: Vec<S3ObjectKey> = list_objects_result
+ .contents
+ .into_iter()
+ .map(|item| item.key)
+ .collect();
+
+ let response = self.delete_objects(&objects_to_delete).await?;
+ if response.error.is_some() {
+ delete_errors = true;
+ }
+
+ if list_objects_result.is_truncated {
+ next_continuation_token = list_objects_result
+ .next_continuation_token
+ .as_ref()
+ .cloned();
+ continue;
+ }
+ break;
+ }
+ Ok(delete_errors)
+ }
+
+ /// Delete objects by given key prefix, but pre-filter and exclude items based on a suffix
+ /// match (including the parent component of the matched suffix). E.g. do not remove items
+ /// in a snapshot directory if it contains a protected file marker (given as suffix).
+ /// Entries listed in `exclude_from_parent` are additionally excluded in the parent of a
+ /// matching suffix entry, e.g. the owner and notes of a group if one of the group's
+ /// snapshots was matched by a protected marker.
+ ///
+ /// Requires at least 2 api calls.
+ pub async fn delete_objects_by_prefix_with_suffix_filter(
+ &self,
+ prefix: &S3PathPrefix,
+ suffix: &str,
+ exclude_from_parent: &[&str],
+ ) -> Result<bool, Error> {
+ // S3 API does not provide a convenient way to delete objects by key prefix.
+ // List all objects with given group prefix and delete all objects found, so this
+ // requires at least 2 API calls.
+ let mut next_continuation_token: Option<String> = None;
+ let mut delete_errors = false;
+ let mut prefix_filters = Vec::new();
+ let mut list_objects = Vec::new();
+ loop {
+ let list_objects_result = self
+ .list_objects_v2(prefix, next_continuation_token.as_deref())
+ .await?;
+
+ let mut prefixes: Vec<String> = list_objects_result
+ .contents
+ .iter()
+ .filter_map(|item| {
+ let prefix_filter = item.key.strip_suffix(suffix).map(|prefix| {
+ let path = Path::new(prefix);
+ if let Some(parent) = path.parent() {
+ for filter in exclude_from_parent {
+ let filter = parent.join(filter);
+ // valid utf-8 as combined from `str` values
+ prefix_filters.push(filter.to_string_lossy().to_string());
+ }
+ }
+ prefix.to_string()
+ });
+ if prefix_filter.is_none() {
+ list_objects.push(item.key.clone());
+ }
+ prefix_filter
+ })
+ .collect();
+ prefix_filters.append(&mut prefixes);
+
+ if list_objects_result.is_truncated {
+ next_continuation_token = list_objects_result
+ .next_continuation_token
+ .as_ref()
+ .cloned();
+ continue;
+ }
+ break;
+ }
+
+ let objects_to_delete: Vec<S3ObjectKey> = list_objects
+ .into_iter()
+ .filter_map(|item| {
+ for prefix in &prefix_filters {
+ if item.strip_prefix(prefix).is_some() {
+ return None;
+ }
+ }
+ Some(item)
+ })
+ .collect();
+
+ for objects in objects_to_delete.chunks(1000) {
+ let result = self.delete_objects(objects).await?;
+ if result.error.is_some() {
+ delete_errors = true;
+ }
+ }
+
+ Ok(delete_errors)
+ }
+
+ /// Upload the given object via the S3 api, retrying up to 3 times in case of error.
+ pub async fn upload_with_retry(
+ &self,
+ object_key: S3ObjectKey,
+ object_data: Bytes,
+ replace: bool,
+ ) -> Result<bool, Error> {
+ for retry in 0..MAX_S3_UPLOAD_RETRY {
+ let body = Body::from(object_data.clone());
+ match self.put_object(object_key.clone(), body, replace).await {
+ Ok(PutObjectResponse::Success(_response_body)) => return Ok(false),
+ Ok(PutObjectResponse::PreconditionFailed) => return Ok(true),
+ Ok(PutObjectResponse::NeedsRetry) => {
+ if retry >= MAX_S3_UPLOAD_RETRY - 1 {
+ bail!("concurrent operation, chunk upload failed")
+ }
+ }
+ Err(err) => {
+ if retry >= MAX_S3_UPLOAD_RETRY - 1 {
+ return Err(err.context("chunk upload failed"));
+ }
+ }
+ };
+ }
+ Ok(false)
+ }
+
+ #[inline(always)]
+ /// Helper to generate [`Uri`] instance with common properties based on given path and query.
+ fn build_uri(&self, mut path: &str, query: &[(&str, &str)]) -> Result<Uri, Error> {
+ if path.starts_with('/') {
+ path = &path[1..];
+ }
+ let path = aws_sign_v4_uri_encode(path, true);
+ let mut path_and_query = if self.options.path_style {
+ format!("/{bucket}/{path}", bucket = self.options.bucket)
+ } else {
+ format!("/{path}")
+ };
+
+ if !query.is_empty() {
+ path_and_query.push('?');
+ // No further input validation as http::uri::Builder will check path and query
+ let mut query_iter = query.iter().peekable();
+ while let Some((key, value)) = query_iter.next() {
+ let key = aws_sign_v4_uri_encode(key, true);
+ path_and_query.push_str(&key);
+ if !value.is_empty() {
+ let value = aws_sign_v4_uri_encode(value, true);
+ path_and_query.push('=');
+ path_and_query.push_str(&value);
+ }
+ if query_iter.peek().is_some() {
+ path_and_query.push('&');
+ }
+ }
+ }
+
+ let path_and_query =
+ PathAndQuery::from_str(&path_and_query).context("failed to parse path and query")?;
+
+ let mut uri_parts = Parts::default();
+ uri_parts.scheme = Some(Scheme::HTTPS);
+ uri_parts.authority = Some(self.authority.clone());
+ uri_parts.path_and_query = Some(path_and_query);
+
+ Uri::from_parts(uri_parts).context("failed to build uri")
+ }
}
diff --git a/proxmox-s3-client/src/lib.rs b/proxmox-s3-client/src/lib.rs
index fc314c42..991e1546 100644
--- a/proxmox-s3-client/src/lib.rs
+++ b/proxmox-s3-client/src/lib.rs
@@ -13,7 +13,7 @@ pub use aws_sign_v4::uri_decode;
#[cfg(feature = "impl")]
mod client;
#[cfg(feature = "impl")]
-pub use client::{S3Client, S3ClientOptions};
+pub use client::{S3Client, S3ClientOptions, S3PathPrefix};
#[cfg(feature = "impl")]
mod timestamps;
#[cfg(feature = "impl")]
@@ -22,3 +22,5 @@ pub use timestamps::*;
mod object_key;
#[cfg(feature = "impl")]
pub use object_key::S3ObjectKey;
+#[cfg(feature = "impl")]
+mod response_reader;
diff --git a/proxmox-s3-client/src/response_reader.rs b/proxmox-s3-client/src/response_reader.rs
new file mode 100644
index 00000000..41e8c6d8
--- /dev/null
+++ b/proxmox-s3-client/src/response_reader.rs
@@ -0,0 +1,376 @@
+use std::str::FromStr;
+
+use anyhow::{anyhow, bail, Context, Error};
+use http_body_util::BodyExt;
+use hyper::body::Incoming;
+use hyper::header::HeaderName;
+use hyper::http::header;
+use hyper::http::StatusCode;
+use hyper::{HeaderMap, Response};
+use serde::Deserialize;
+
+use crate::S3ObjectKey;
+use crate::{HttpDate, LastModifiedTimestamp};
+
+pub(crate) struct ResponseReader {
+ response: Response<Incoming>,
+}
+
+#[derive(Debug)]
+/// Subset of the list object v2 response including some header values
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html#API_ListObjectsV2_ResponseSyntax
+pub struct ListObjectsV2Response {
+ pub date: HttpDate,
+ pub name: String,
+ pub prefix: String,
+ pub key_count: u64,
+ pub max_keys: u64,
+ pub is_truncated: bool,
+ pub continuation_token: Option<String>,
+ pub next_continuation_token: Option<String>,
+ pub contents: Vec<ListObjectsV2Contents>,
+}
+
+#[derive(Deserialize, Debug)]
+#[serde(rename_all = "PascalCase")]
+/// Subset of items used to deserialize a list objects v2 response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html#API_ListObjectsV2_ResponseSyntax
+struct ListObjectsV2ResponseBody {
+ pub name: String,
+ pub prefix: String,
+ pub key_count: u64,
+ pub max_keys: u64,
+ pub is_truncated: bool,
+ pub continuation_token: Option<String>,
+ pub next_continuation_token: Option<String>,
+ pub contents: Option<Vec<ListObjectsV2Contents>>,
+}
+
+impl ListObjectsV2ResponseBody {
+ fn with_date(self, date: HttpDate) -> ListObjectsV2Response {
+ ListObjectsV2Response {
+ date,
+ name: self.name,
+ prefix: self.prefix,
+ key_count: self.key_count,
+ max_keys: self.max_keys,
+ is_truncated: self.is_truncated,
+ continuation_token: self.continuation_token,
+ next_continuation_token: self.next_continuation_token,
+ contents: self.contents.unwrap_or_default(),
+ }
+ }
+}
+
+#[derive(Deserialize, Debug)]
+#[serde(rename_all = "PascalCase")]
+/// Subset used to deserialize the contents of a list objects v2 response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html#API_ListObjectsV2_ResponseSyntax
+pub struct ListObjectsV2Contents {
+ pub key: S3ObjectKey,
+ pub last_modified: LastModifiedTimestamp,
+ pub e_tag: String,
+ pub size: u64,
+ pub storage_class: String,
+}
+
+#[derive(Debug)]
+/// Subset of the head object response (headers only, there is no body)
+/// See https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadObject.html#API_HeadObject_ResponseSyntax
+pub struct HeadObjectResponse {
+ pub content_length: u64,
+ pub content_type: String,
+ pub date: HttpDate,
+ pub e_tag: String,
+ pub last_modified: HttpDate,
+}
+
+/// Subset of the get object response including some headers
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html#API_GetObject_ResponseSyntax
+pub struct GetObjectResponse {
+ pub content_length: u64,
+ pub content_type: String,
+ pub date: HttpDate,
+ pub e_tag: String,
+ pub last_modified: HttpDate,
+ pub content: Incoming,
+}
+
+/// Subset of the put object response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html#API_PutObject_ResponseSyntax
+#[derive(Debug)]
+pub enum PutObjectResponse {
+ NeedsRetry,
+ PreconditionFailed,
+ Success(String),
+}
+
+/// Subset of the delete objects response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html#API_DeleteObjects_ResponseElements
+#[derive(Deserialize, Debug)]
+#[serde(rename_all = "PascalCase")]
+pub struct DeleteObjectsResponse {
+ pub deleted: Option<Vec<DeletedObject>>,
+ pub error: Option<Vec<DeleteObjectError>>,
+}
+
+/// Subset used to deserialize the deleted objects of a delete objects response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeletedObject.html
+#[derive(Deserialize, Debug)]
+#[serde(rename_all = "PascalCase")]
+pub struct DeletedObject {
+ pub delete_marker: Option<bool>,
+ pub delete_marker_version_id: Option<String>,
+ pub key: Option<S3ObjectKey>,
+ pub version_id: Option<String>,
+}
+
+/// Subset used to deserialize the deleted object errors of a delete objects response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_Error.html
+#[derive(Deserialize, Debug)]
+#[serde(rename_all = "PascalCase")]
+pub struct DeleteObjectError {
+ pub code: Option<String>,
+ pub key: Option<S3ObjectKey>,
+ pub message: Option<String>,
+ pub version_id: Option<String>,
+}
+
+#[derive(Debug)]
+/// Subset used to deserialize the copy object response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html#API_CopyObject_ResponseSyntax
+pub struct CopyObjectResponse {
+ pub copy_object_result: CopyObjectResult,
+ pub x_amz_version_id: Option<String>,
+}
+
+#[derive(Deserialize, Debug)]
+#[serde(rename_all = "PascalCase")]
+/// Subset used to deserialize the copy object result of a copy object response
+/// https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html#API_CopyObject_ResponseSyntax
+pub struct CopyObjectResult {
+ pub e_tag: String,
+ pub last_modified: LastModifiedTimestamp,
+}
+
+impl ResponseReader {
+ pub(crate) fn new(response: Response<Incoming>) -> Self {
+ Self { response }
+ }
+
+ pub(crate) async fn list_objects_v2_response(self) -> Result<ListObjectsV2Response, Error> {
+ let (parts, body) = self.response.into_parts();
+ let body = body.collect().await?.to_bytes();
+
+ match parts.status {
+ StatusCode::OK => (),
+ StatusCode::NOT_FOUND => bail!("bucket does not exist"),
+ status_code => {
+ if let Ok(body) = String::from_utf8(body.to_vec()) {
+ if !body.is_empty() {
+ tracing::error!("{body}");
+ }
+ }
+ bail!("unexpected status code {status_code}")
+ }
+ }
+
+ let body = String::from_utf8(body.to_vec())?;
+
+ let date: HttpDate = Self::parse_header(header::DATE, &parts.headers)?;
+
+ let response: ListObjectsV2ResponseBody =
+ serde_xml_rs::from_str(&body).context("failed to parse response body")?;
+
+ Ok(response.with_date(date))
+ }
+
+ pub(crate) async fn head_object_response(self) -> Result<Option<HeadObjectResponse>, Error> {
+ let (parts, body) = self.response.into_parts();
+ let body = body.collect().await?.to_bytes();
+
+ match parts.status {
+ StatusCode::OK => (),
+ StatusCode::NOT_FOUND => return Ok(None),
+ status_code => {
+ if let Ok(body) = String::from_utf8(body.to_vec()) {
+ if !body.is_empty() {
+ tracing::error!("{body}");
+ }
+ }
+ bail!("unexpected status code {status_code}")
+ }
+ }
+ if !body.is_empty() {
+ bail!("got unexpected non-empty response body");
+ }
+
+ let content_length: u64 = Self::parse_header(header::CONTENT_LENGTH, &parts.headers)?;
+ let content_type = Self::parse_header(header::CONTENT_TYPE, &parts.headers)?;
+ let e_tag = Self::parse_header(header::ETAG, &parts.headers)?;
+ let date = Self::parse_header(header::DATE, &parts.headers)?;
+ let last_modified = Self::parse_header(header::LAST_MODIFIED, &parts.headers)?;
+
+ Ok(Some(HeadObjectResponse {
+ content_length,
+ content_type,
+ date,
+ e_tag,
+ last_modified,
+ }))
+ }
+
+ pub(crate) async fn get_object_response(self) -> Result<Option<GetObjectResponse>, Error> {
+ let (parts, content) = self.response.into_parts();
+
+ match parts.status {
+ StatusCode::OK => (),
+ StatusCode::NOT_FOUND => return Ok(None),
+ StatusCode::FORBIDDEN => bail!("object is archived and inaccessible until restored"),
+ status_code => {
+ let body = content.collect().await?.to_bytes();
+ if let Ok(body) = String::from_utf8(body.to_vec()) {
+ if !body.is_empty() {
+ tracing::error!("{body}");
+ }
+ }
+ bail!("unexpected status code {status_code}")
+ }
+ }
+
+ let content_length: u64 = Self::parse_header(header::CONTENT_LENGTH, &parts.headers)?;
+ let content_type = Self::parse_header(header::CONTENT_TYPE, &parts.headers)?;
+ let e_tag = Self::parse_header(header::ETAG, &parts.headers)?;
+ let date = Self::parse_header(header::DATE, &parts.headers)?;
+ let last_modified = Self::parse_header(header::LAST_MODIFIED, &parts.headers)?;
+
+ Ok(Some(GetObjectResponse {
+ content_length,
+ content_type,
+ date,
+ e_tag,
+ last_modified,
+ content,
+ }))
+ }
+
+ pub(crate) async fn put_object_response(self) -> Result<PutObjectResponse, Error> {
+ let (parts, body) = self.response.into_parts();
+ let body = body.collect().await?.to_bytes();
+
+ match parts.status {
+ StatusCode::OK => (),
+ StatusCode::PRECONDITION_FAILED => return Ok(PutObjectResponse::PreconditionFailed),
+ StatusCode::CONFLICT => return Ok(PutObjectResponse::NeedsRetry),
+ StatusCode::BAD_REQUEST => bail!("invalid request"),
+ status_code => {
+ if let Ok(body) = String::from_utf8(body.to_vec()) {
+ if !body.is_empty() {
+ tracing::error!("{body}");
+ }
+ }
+ bail!("unexpected status code {status_code}")
+ }
+ };
+
+ if !body.is_empty() {
+ bail!("got unexpected non-empty response body");
+ }
+
+ let e_tag = Self::parse_header(header::ETAG, &parts.headers)?;
+
+ Ok(PutObjectResponse::Success(e_tag))
+ }
+
+ pub(crate) async fn delete_object_response(self) -> Result<(), Error> {
+ let (parts, _body) = self.response.into_parts();
+
+ match parts.status {
+ StatusCode::NO_CONTENT => (),
+ status_code => bail!("unexpected status code {status_code}"),
+ };
+
+ Ok(())
+ }
+
+ pub(crate) async fn delete_objects_response(self) -> Result<DeleteObjectsResponse, Error> {
+ let (parts, body) = self.response.into_parts();
+ let body = body.collect().await?.to_bytes();
+
+ match parts.status {
+ StatusCode::OK => (),
+ StatusCode::BAD_REQUEST => bail!("invalid request"),
+ status_code => {
+ if let Ok(body) = String::from_utf8(body.to_vec()) {
+ if !body.is_empty() {
+ tracing::error!("{body}");
+ }
+ }
+ bail!("unexpected status code {status_code}")
+ }
+ };
+
+ let body = String::from_utf8(body.to_vec())?;
+
+ let delete_objects_response: DeleteObjectsResponse =
+ serde_xml_rs::from_str(&body).context("failed to parse response body")?;
+
+ Ok(delete_objects_response)
+ }
+
+ pub(crate) async fn copy_object_response(self) -> Result<CopyObjectResponse, Error> {
+ let (parts, body) = self.response.into_parts();
+ let body = body.collect().await?.to_bytes();
+
+ match parts.status {
+ StatusCode::OK => (),
+ StatusCode::NOT_FOUND => bail!("object not found"),
+ StatusCode::FORBIDDEN => bail!("the source object is not in the active tier"),
+ status_code => {
+ if let Ok(body) = String::from_utf8(body.to_vec()) {
+ if !body.is_empty() {
+ tracing::error!("{body}");
+ }
+ }
+ bail!("unexpected status code {status_code}")
+ }
+ }
+
+ let body = String::from_utf8(body.to_vec())?;
+
+ let x_amz_version_id = match parts.headers.get("x-amz-version-id") {
+ Some(version_id) => Some(
+ version_id
+ .to_str()
+ .context("failed to parse version id header")?
+ .to_owned(),
+ ),
+ None => None,
+ };
+
+ let copy_object_result: CopyObjectResult =
+ serde_xml_rs::from_str(&body).context("failed to parse response body")?;
+
+ Ok(CopyObjectResponse {
+ copy_object_result,
+ x_amz_version_id,
+ })
+ }
+
+ fn parse_header<T: FromStr>(name: HeaderName, headers: &HeaderMap) -> Result<T, Error>
+ where
+ <T as FromStr>::Err: Send + Sync + 'static,
+ Result<T, <T as FromStr>::Err>: Context<T, <T as FromStr>::Err>,
+ {
+ let header_value = headers
+ .get(&name)
+ .ok_or_else(|| anyhow!("missing header '{name}'"))?;
+ let header_str = header_value
+ .to_str()
+ .with_context(|| format!("non UTF-8 header '{name}'"))?;
+ let value = header_str
+ .parse()
+ .with_context(|| format!("failed to parse header '{name}'"))?;
+ Ok(value)
+ }
+}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox v8 7/9] s3 client: add example usage for basic operations
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (5 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 6/9] s3 client: implement methods to operate on s3 objects in bucket Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 8/9] pbs-api-types: extend datastore config by backend config enum Christian Ebner
` (48 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Add an example showing how to create the client instance using its
configuration options and how to perform some basic api requests on
the S3 endpoint.
Guarded by the cfg attribute for conditional compilation, so the
build does not fail if the feature "impl" is not set.
Further, the example is excluded via `Cargo.toml` from being executed
as a test, as it requires an S3 object store to be available and
configured.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- fix formatting issues
proxmox-s3-client/Cargo.toml | 4 ++
proxmox-s3-client/examples/s3_client.rs | 69 +++++++++++++++++++++++++
proxmox-s3-client/src/lib.rs | 7 +++
3 files changed, 80 insertions(+)
create mode 100644 proxmox-s3-client/examples/s3_client.rs
diff --git a/proxmox-s3-client/Cargo.toml b/proxmox-s3-client/Cargo.toml
index 18bddddd..4388d5f6 100644
--- a/proxmox-s3-client/Cargo.toml
+++ b/proxmox-s3-client/Cargo.toml
@@ -42,3 +42,7 @@ proxmox-time.workspace = true
[features]
default = []
impl = []
+
+[[example]]
+name = "s3_client"
+test = false
diff --git a/proxmox-s3-client/examples/s3_client.rs b/proxmox-s3-client/examples/s3_client.rs
new file mode 100644
index 00000000..c65ceb83
--- /dev/null
+++ b/proxmox-s3-client/examples/s3_client.rs
@@ -0,0 +1,69 @@
+// Execute via `cargo run --example s3_client --features impl` in `proxmox` main repo folder
+
+#[cfg(not(feature = "impl"))]
+fn main() {
+ // intentionally left empty
+}
+
+#[cfg(feature = "impl")]
+use proxmox_s3_client::{S3Client, S3ClientOptions, S3ObjectKey, S3PathPrefix};
+
+#[cfg(feature = "impl")]
+fn main() -> Result<(), anyhow::Error> {
+ tokio::runtime::Builder::new_current_thread()
+ .enable_all()
+ .build()
+ .unwrap()
+ .block_on(run())
+}
+
+#[cfg(feature = "impl")]
+async fn run() -> Result<(), anyhow::Error> {
+ // Configure the client via the client options
+ let options = S3ClientOptions {
+ // Must be resolvable, e.g. the Ceph RADOS gateway.
+ // Allows to use {{bucket}} or {{region}} template pattern for ease of configuration.
+ // In this example, the final authority is `https://testbucket.s3.pve-c1.local:7480/`.
+ endpoint: "{{bucket}}.s3.pve-c1.local".to_string(),
+ // Must match the port the api is listening on
+ port: Some(7480),
+ // Name of the bucket to be used
+ bucket: "testbucket".to_string(),
+ common_prefix: "teststore".to_string(),
+ path_style: false,
+ access_key: "<your-access-key>".to_string(),
+ secret_key: "<your-secret-key>".to_string(),
+ region: "us-west-1".to_string(),
+ // Only required for self-signed certificates; the fingerprint can be obtained via, e.g.
+ // `openssl s_client -connect testbucket.s3.pve-c1.local:7480 < /dev/null | openssl x509 -fingerprint -sha256 -noout`
+ fingerprint: Some("<s3-api-fingerprint>".to_string()),
+ put_rate_limit: None,
+ };
+
+ // Creating a client instance and connect to api endpoint
+ let s3_client = S3Client::new(options)?;
+
+ // Check if the bucket can be accessed
+ s3_client.head_bucket().await?;
+
+ let rel_object_key = S3ObjectKey::from("object.txt");
+ let body = proxmox_http::Body::empty();
+ let replace_existing_key = true;
+ let _response = s3_client
+ .put_object(rel_object_key, body, replace_existing_key)
+ .await?;
+
+ // List objects, limited to the ones matching the given prefix. Since the api limits the
+ // response to 1000 entries, further contents might be fetched using a continuation token
+ // taken from the previous response.
+ let prefix = S3PathPrefix::Some("/teststore/".to_string());
+ let continuation_token = None;
+ let _response = s3_client
+ .list_objects_v2(&prefix, continuation_token)
+ .await?;
+
+ // Delete a single object
+ let rel_object_key = S3ObjectKey::from("object.txt");
+ let _response = s3_client.delete_object(rel_object_key).await?;
+ Ok(())
+}
diff --git a/proxmox-s3-client/src/lib.rs b/proxmox-s3-client/src/lib.rs
index 991e1546..f485ac46 100644
--- a/proxmox-s3-client/src/lib.rs
+++ b/proxmox-s3-client/src/lib.rs
@@ -1,4 +1,11 @@
//! Low level REST API client for AWS S3 compatible object stores
+//!
+//! # Example
+//! A basic example on how to use the client can be found in
+//! `proxmox-s3-client/examples/s3_client.rs` and run via
+//! `cargo run --example s3_client --features impl` from the main
+//! repository folder.
+
#![cfg_attr(docsrs, feature(doc_cfg, doc_auto_cfg))]
#![deny(unsafe_op_in_unsafe_fn)]
#![deny(missing_docs)]
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox v8 8/9] pbs-api-types: extend datastore config by backend config enum
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (6 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 7/9] s3 client: add example usage for basic operations Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 9/9] pbs-api-types: maintenance: add new maintenance mode S3 refresh Christian Ebner
` (47 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Allows configuring a backend variant for a datastore on creation.
The current default `Filesystem` backend variant is introduced to
remain compatible with existing storages. A new S3 backend variant
allows creating datastores backed by an S3 compatible object store
instead.
For S3 backends, the type, the id of the corresponding S3 client
configuration as well as the bucket name are stored as a property
string. A valid datastore backend configuration for S3 therefore
contains:
```
...
backend bucket=<BUCKET_NAME>,client=<S3_CONFIG_ID>,type=s3
...
```
Further, a maximum size for the local store cache can be assigned,
for example to limit it to at most 1G:
```
...
backend bucket=<BUCKET_NAME>,client=<S3_CONFIG_ID>,type=s3,max-cache-size=1G
...
```
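For illustration, a minimal sketch of how such a property string is
parsed and validated via the `FromStr` implementation added below
(assuming the type is re-exported at the pbs-api-types crate root
like the other datastore types):
```rust
use std::str::FromStr;

use pbs_api_types::DatastoreBackendConfig;

fn example() -> Result<(), anyhow::Error> {
    // A missing `client` or `bucket` for type=s3 fails validation here
    let config = DatastoreBackendConfig::from_str(
        "bucket=testbucket,client=my-s3-client,type=s3,max-cache-size=1G",
    )?;
    assert!(config.bucket.is_some() && config.client.is_some());
    Ok(())
}
```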
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- fix formatting issue
Cargo.toml | 1 +
pbs-api-types/Cargo.toml | 1 +
pbs-api-types/debian/control | 2 +
pbs-api-types/src/datastore.rs | 113 ++++++++++++++++++++++++++++++++-
4 files changed, 116 insertions(+), 1 deletion(-)
diff --git a/Cargo.toml b/Cargo.toml
index d88d8383..f8221877 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -151,6 +151,7 @@ proxmox-product-config = { version = "1.0.0", path = "proxmox-product-config" }
proxmox-config-digest = { version = "1.0.0", path = "proxmox-config-digest" }
proxmox-rest-server = { version = "1.0.0", path = "proxmox-rest-server" }
proxmox-router = { version = "3.2.2", path = "proxmox-router" }
+proxmox-s3-client = { version = "1.0.0", path = "proxmox-s3-client" }
proxmox-schema = { version = "4.1.0", path = "proxmox-schema" }
proxmox-section-config = { version = "3.1.0", path = "proxmox-section-config" }
proxmox-sendmail = { version = "1.0.0", path = "proxmox-sendmail" }
diff --git a/pbs-api-types/Cargo.toml b/pbs-api-types/Cargo.toml
index 0e8ca379..4929f157 100644
--- a/pbs-api-types/Cargo.toml
+++ b/pbs-api-types/Cargo.toml
@@ -20,6 +20,7 @@ proxmox-auth-api = { workspace = true, features = [ "api-types" ] }
proxmox-apt-api-types.workspace = true
proxmox-human-byte.workspace = true
proxmox-lang.workspace=true
+proxmox-s3-client.workspace = true
proxmox-schema = { workspace = true, features = [ "api-macro" ] }
proxmox-serde.workspace = true
proxmox-time.workspace = true
diff --git a/pbs-api-types/debian/control b/pbs-api-types/debian/control
index de863c3b..0c5bffb0 100644
--- a/pbs-api-types/debian/control
+++ b/pbs-api-types/debian/control
@@ -15,6 +15,7 @@ Build-Depends-Arch: cargo:native <!nocheck>,
librust-proxmox-auth-api-1+default-dev <!nocheck>,
librust-proxmox-human-byte-1+default-dev <!nocheck>,
librust-proxmox-lang-1+default-dev (>= 1.5-~~) <!nocheck>,
+ librust-proxmox-s3-client-1+default-dev <!nocheck>,
librust-proxmox-schema-4+api-macro-dev (>= 4.1.0-~~) <!nocheck>,
librust-proxmox-schema-4+default-dev (>= 4.1.0-~~) <!nocheck>,
librust-proxmox-serde-1+default-dev <!nocheck>,
@@ -46,6 +47,7 @@ Depends:
librust-proxmox-auth-api-1+default-dev,
librust-proxmox-human-byte-1+default-dev,
librust-proxmox-lang-1+default-dev (>= 1.5-~~),
+ librust-proxmox-s3-client-1+default-dev,
librust-proxmox-schema-4+api-macro-dev (>= 4.1.0-~~),
librust-proxmox-schema-4+default-dev (>= 4.1.0-~~),
librust-proxmox-serde-1+default-dev,
diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 5bd953ac..5d1e0555 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -8,6 +8,7 @@ use anyhow::{bail, format_err, Error};
use const_format::concatcp;
use serde::{Deserialize, Serialize};
+use proxmox_human_byte::HumanByte;
use proxmox_schema::{
api, const_regex, ApiStringFormat, ApiType, ArraySchema, EnumEntry, IntegerSchema, ReturnType,
Schema, StringSchema, Updater, UpdaterType,
@@ -286,6 +287,106 @@ pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore
))
.schema();
+#[api]
+#[derive(Copy, Clone, Default, Deserialize, Serialize, Updater, PartialEq)]
+#[serde(rename_all = "kebab-case")]
+/// Datastore backend type
+pub enum DatastoreBackendType {
+ /// Local filesystem
+ #[default]
+ Filesystem,
+ /// S3 object store
+ S3,
+}
+serde_plain::derive_display_from_serialize!(DatastoreBackendType);
+serde_plain::derive_fromstr_from_deserialize!(DatastoreBackendType);
+
+#[api(
+ properties: {
+ type: {
+ type: DatastoreBackendType,
+ optional: true,
+ },
+ client: {
+ schema: proxmox_s3_client::S3_CLIENT_ID_SCHEMA,
+ optional: true,
+ },
+ bucket: {
+ schema: proxmox_s3_client::S3_BUCKET_NAME_SCHEMA,
+ optional: true,
+ },
+ "max-cache-size": {
+ type: HumanByte,
+ optional: true,
+ }
+ },
+ default_key: "type",
+)]
+#[derive(Default, Deserialize, Serialize)]
+#[serde(rename_all = "kebab-case")]
+/// Datastore backend config
+pub struct DatastoreBackendConfig {
+ /// backend type
+ #[serde(rename = "type")]
+ pub ty: Option<DatastoreBackendType>,
+ /// s3 client id
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub client: Option<String>,
+ /// s3 bucket name
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub bucket: Option<String>,
+ /// maximum cache size for local datastore LRU cache
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub max_cache_size: Option<HumanByte>,
+}
+
+pub const DATASTORE_BACKEND_CONFIG_STRING_SCHEMA: Schema =
+ StringSchema::new("Datastore backend config")
+ .format(&ApiStringFormat::VerifyFn(verify_datastore_backend_config))
+ .type_text("<backend-config>")
+ .schema();
+
+fn verify_datastore_backend_config(input: &str) -> Result<(), Error> {
+ DatastoreBackendConfig::from_str(input).map(|_| ())
+}
+
+impl FromStr for DatastoreBackendConfig {
+ type Err = Error;
+
+ fn from_str(s: &str) -> Result<Self, Self::Err> {
+ let backend_config: DatastoreBackendConfig =
+ proxmox_schema::property_string::parse_with_schema(
+ s,
+ &DatastoreBackendConfig::API_SCHEMA,
+ )?;
+ let backend_type = backend_config.ty.unwrap_or_default();
+ match backend_type {
+ DatastoreBackendType::Filesystem => {
+ if backend_config.client.is_some() {
+ bail!("additional option client, not allowed for backend type filesystem");
+ }
+ if backend_config.bucket.is_some() {
+ bail!("additional option bucket, not allowed for backend type filesystem");
+ }
+ if backend_config.max_cache_size.is_some() {
+ bail!(
+ "additional option max-cache-size, not allowed for backend type filesystem"
+ );
+ }
+ }
+ DatastoreBackendType::S3 => {
+ if backend_config.client.is_none() {
+ bail!("missing option client, required for backend type s3");
+ }
+ if backend_config.bucket.is_none() {
+ bail!("missing option bucket, required for backend type s3");
+ }
+ }
+ }
+ Ok(backend_config)
+ }
+}
+
#[api(
properties: {
name: {
@@ -336,7 +437,11 @@ pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore
optional: true,
format: &proxmox_schema::api_types::UUID_FORMAT,
type: String,
- }
+ },
+ backend: {
+ schema: DATASTORE_BACKEND_CONFIG_STRING_SCHEMA,
+ optional: true,
+ },
}
)]
#[derive(Serialize, Deserialize, Updater, Clone, PartialEq)]
@@ -389,6 +494,11 @@ pub struct DataStoreConfig {
#[updater(skip)]
#[serde(skip_serializing_if = "Option::is_none")]
pub backing_device: Option<String>,
+
+ /// Backend configuration for datastore
+ #[updater(skip)]
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub backend: Option<String>,
}
#[api]
@@ -424,6 +534,7 @@ impl DataStoreConfig {
tuning: None,
maintenance_mode: None,
backing_device: None,
+ backend: None,
}
}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox v8 9/9] pbs-api-types: maintenance: add new maintenance mode S3 refresh
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (7 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 8/9] pbs-api-types: extend datastore config by backend config enum Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 01/45] datastore: add helpers for path/digest to s3 object key conversion Christian Ebner
` (46 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
The new maintenance mode S3 refresh disallows any read and write
operations on the underlying datastore, without expecting it to be
unmountable and without clearing it from the internal datastore
cache.
This mode is intended to be used when refreshing a datastore backed
by an S3 object store, that is, while downloading, clearing and
recreating its contents.
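As an illustration, such a refresh could be initiated by setting the
new maintenance type on the datastore and resetting it once finished
(hypothetical invocations; the kebab-case variant name `s3-refresh`
and the property-string key are assumptions based on the api types
below):

    # put the datastore into S3 refresh maintenance mode
    proxmox-backup-manager datastore update store1 \
        --maintenance-mode type=s3-refresh

    # reset the mode again after the refresh finished
    proxmox-backup-manager datastore update store1 --delete maintenance-mode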
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- fix formatting issues
pbs-api-types/src/datastore.rs | 1 +
pbs-api-types/src/maintenance.rs | 4 ++++
2 files changed, 5 insertions(+)
diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
index 5d1e0555..b8464a10 100644
--- a/pbs-api-types/src/datastore.rs
+++ b/pbs-api-types/src/datastore.rs
@@ -567,6 +567,7 @@ impl DataStoreConfig {
Some(MaintenanceType::Unmount) => {
/* used to reset it after failed unmount, or alternative for aborting unmount task */
}
+ Some(MaintenanceType::S3Refresh) => { /* used to reset state after refresh finished */ }
Some(MaintenanceType::Delete) => {
match new_type {
Some(MaintenanceType::Delete) => { /* allow to delete a deleted storage */ }
diff --git a/pbs-api-types/src/maintenance.rs b/pbs-api-types/src/maintenance.rs
index 3c9aa819..a516a1d9 100644
--- a/pbs-api-types/src/maintenance.rs
+++ b/pbs-api-types/src/maintenance.rs
@@ -49,6 +49,8 @@ pub enum MaintenanceType {
Delete,
/// The (removable) datastore is being unmounted.
Unmount,
+ /// The S3 cache store is being refreshed.
+ S3Refresh,
}
serde_plain::derive_display_from_serialize!(MaintenanceType);
serde_plain::derive_fromstr_from_deserialize!(MaintenanceType);
@@ -100,6 +102,8 @@ impl MaintenanceMode {
bail!("datastore is being unmounted");
} else if self.ty == MaintenanceType::Offline {
bail!("offline maintenance mode: {}", message);
+ } else if self.ty == MaintenanceType::S3Refresh {
+ bail!("S3 refresh maintenance mode: {}", message);
} else if self.ty == MaintenanceType::ReadOnly {
if let Some(Operation::Write) = operation {
bail!("read-only maintenance mode: {}", message);
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 01/45] datastore: add helpers for path/digest to s3 object key conversion
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (8 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 9/9] pbs-api-types: maintenance: add new maintenance mode S3 refresh Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 7:24 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 02/45] config: introduce s3 object store client configuration Christian Ebner
` (45 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Adds helper methods to generate the s3 object keys, given a relative
path and filename for datastore contents, or a digest in the case of
chunk files.
Regular datastore contents are stored by grouping them with a content
prefix in the object key. In order to keep the object key length
small, given the max limit of 1024 bytes [0], `.cnt` is used as
content prefix. Chunks on the other hand are prefixed by `.chunks`,
same as on regular datastores.
The prefix allows for selective listing of either contents or chunks
by providing the prefix to the respective api calls.
[0] https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
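For example, the helpers introduced below map contents and chunks to
object keys roughly as follows (a minimal sketch; the snapshot path
is made up for illustration):

    use std::path::Path;
    use pbs_datastore::s3::{object_key_from_digest, object_key_from_path};

    fn example() -> Result<(), anyhow::Error> {
        // regular content file, grouped under the `.cnt` prefix
        let key = object_key_from_path(
            Path::new("vm/100/2025-07-15T12:52:52Z"),
            "index.json.blob",
        )?;
        // => ".cnt/vm/100/2025-07-15T12:52:52Z/index.json.blob"

        // chunk, grouped under `.chunks` with the first 4 hex digits
        // of the digest as sub-prefix
        let chunk_key = object_key_from_digest(&[0u8; 32])?;
        // => ".chunks/0000/" followed by the full 64 character hex digest
        Ok(())
    }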
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
Cargo.toml | 1 +
pbs-datastore/Cargo.toml | 1 +
pbs-datastore/src/lib.rs | 1 +
pbs-datastore/src/s3.rs | 49 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 52 insertions(+)
create mode 100644 pbs-datastore/src/s3.rs
diff --git a/Cargo.toml b/Cargo.toml
index ae57e7e20..b6b779cbc 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -77,6 +77,7 @@ proxmox-rest-server = { version = "1", features = [ "templates" ] }
proxmox-router = { version = "3.2.2", default-features = false }
proxmox-rrd = "1"
proxmox-rrd-api-types = "1.0.2"
+proxmox-s3-client = "1.0.0"
# everything but pbs-config and pbs-client use "api-macro"
proxmox-schema = "4"
proxmox-section-config = "3"
diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
index 56f6e9094..c42eff165 100644
--- a/pbs-datastore/Cargo.toml
+++ b/pbs-datastore/Cargo.toml
@@ -34,6 +34,7 @@ proxmox-borrow.workspace = true
proxmox-human-byte.workspace = true
proxmox-io.workspace = true
proxmox-lang.workspace=true
+proxmox-s3-client = { workspace = true, features = [ "impl" ] }
proxmox-schema = { workspace = true, features = [ "api-macro" ] }
proxmox-serde = { workspace = true, features = [ "serde_json" ] }
proxmox-sys.workspace = true
diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
index 5014b6c09..ffd0d91b2 100644
--- a/pbs-datastore/src/lib.rs
+++ b/pbs-datastore/src/lib.rs
@@ -182,6 +182,7 @@ pub mod manifest;
pub mod paperkey;
pub mod prune;
pub mod read_chunk;
+pub mod s3;
pub mod store_progress;
pub mod task_tracking;
diff --git a/pbs-datastore/src/s3.rs b/pbs-datastore/src/s3.rs
new file mode 100644
index 000000000..82843ee26
--- /dev/null
+++ b/pbs-datastore/src/s3.rs
@@ -0,0 +1,49 @@
+use std::path::{Path, PathBuf};
+
+use anyhow::{bail, format_err, Error};
+
+use proxmox_s3_client::S3ObjectKey;
+
+/// Object key prefix to group regular datastore contents (not chunks)
+pub const S3_CONTENT_PREFIX: &str = ".cnt";
+
+/// Generate a relative object key with content prefix from given path and filename
+pub fn object_key_from_path(path: &Path, filename: &str) -> Result<S3ObjectKey, Error> {
+ // Force the use of relative paths, otherwise this would lose the content prefix
+ if path.is_absolute() {
+ bail!("cannot generate object key from absolute path");
+ }
+ if filename.contains('/') {
+ bail!("invalid filename containing slashes");
+ }
+ let mut object_path = PathBuf::from(S3_CONTENT_PREFIX);
+ object_path.push(path);
+ object_path.push(filename);
+
+ let object_key_str = object_path
+ .to_str()
+ .ok_or_else(|| format_err!("unexpected object key path"))?;
+ Ok(S3ObjectKey::from(object_key_str))
+}
+
+/// Generate a relative object key with chunk prefix from given digest
+pub fn object_key_from_digest(digest: &[u8; 32]) -> Result<S3ObjectKey, Error> {
+ let object_key = hex::encode(digest);
+ let digest_prefix = &object_key[..4];
+ let object_key_string = format!(".chunks/{digest_prefix}/{object_key}");
+ Ok(S3ObjectKey::from(object_key_string.as_str()))
+}
+
+/// Generate a relative object key with chunk prefix from given digest, extended by suffix
+pub fn object_key_from_digest_with_suffix(
+ digest: &[u8; 32],
+ suffix: &str,
+) -> Result<S3ObjectKey, Error> {
+ if suffix.contains('/') {
+ bail!("invalid suffix containing slashes");
+ }
+ let object_key = hex::encode(digest);
+ let digest_prefix = &object_key[..4];
+ let object_key_string = format!(".chunks/{digest_prefix}/{object_key}{suffix}");
+ Ok(S3ObjectKey::from(object_key_string.as_str()))
+}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 02/45] config: introduce s3 object store client configuration
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (9 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 01/45] datastore: add helpers for path/digest to s3 object key conversion Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 7:22 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs Christian Ebner
` (44 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Adds the client configuration for s3 object stores as dedicated
configuration files, with secrets stored separately from the regular
configuration and excluded from api responses for security reasons.
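The files follow the usual section-config layout; a hypothetical
example (the property names are assumptions based on the
S3ClientConfig and S3ClientSecretsConfig api types):

    # /etc/proxmox-backup/s3.cfg
    s3client: my-s3
        endpoint s3.example.com
        region eu-central-1
        access-key EXAMPLEACCESSKEY

    # /etc/proxmox-backup/s3-secrets.cfg
    s3secrets: my-s3
        secret-key EXAMPLESECRETKEY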
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-config/Cargo.toml | 1 +
pbs-config/src/lib.rs | 1 +
pbs-config/src/s3.rs | 83 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 85 insertions(+)
create mode 100644 pbs-config/src/s3.rs
diff --git a/pbs-config/Cargo.toml b/pbs-config/Cargo.toml
index 284149658..74afb3c64 100644
--- a/pbs-config/Cargo.toml
+++ b/pbs-config/Cargo.toml
@@ -19,6 +19,7 @@ serde_json.workspace = true
proxmox-notify.workspace = true
proxmox-router = { workspace = true, default-features = false }
+proxmox-s3-client.workspace = true
proxmox-schema.workspace = true
proxmox-section-config.workspace = true
proxmox-shared-memory.workspace = true
diff --git a/pbs-config/src/lib.rs b/pbs-config/src/lib.rs
index 9c4d77c24..d03c079ab 100644
--- a/pbs-config/src/lib.rs
+++ b/pbs-config/src/lib.rs
@@ -10,6 +10,7 @@ pub mod network;
pub mod notifications;
pub mod prune;
pub mod remote;
+pub mod s3;
pub mod sync;
pub mod tape_job;
pub mod token_shadow;
diff --git a/pbs-config/src/s3.rs b/pbs-config/src/s3.rs
new file mode 100644
index 000000000..ec3998834
--- /dev/null
+++ b/pbs-config/src/s3.rs
@@ -0,0 +1,83 @@
+use std::collections::HashMap;
+use std::sync::LazyLock;
+
+use anyhow::Error;
+
+use proxmox_s3_client::{S3ClientConfig, S3ClientSecretsConfig};
+use proxmox_schema::*;
+use proxmox_section_config::{SectionConfig, SectionConfigData, SectionConfigPlugin};
+
+use pbs_api_types::JOB_ID_SCHEMA;
+
+use crate::{open_backup_lockfile, replace_backup_config, BackupLockGuard};
+
+pub static CONFIG: LazyLock<SectionConfig> = LazyLock::new(init);
+
+fn init() -> SectionConfig {
+ let obj_schema = match S3ClientConfig::API_SCHEMA {
+ Schema::Object(ref obj_schema) => obj_schema,
+ _ => unreachable!(),
+ };
+ let secrets_obj_schema = match S3ClientSecretsConfig::API_SCHEMA {
+ Schema::Object(ref obj_schema) => obj_schema,
+ _ => unreachable!(),
+ };
+
+ let plugin =
+ SectionConfigPlugin::new("s3client".to_string(), Some(String::from("id")), obj_schema);
+ let secrets_plugin = SectionConfigPlugin::new(
+ "s3secrets".to_string(),
+ Some(String::from("secrets-id")),
+ secrets_obj_schema,
+ );
+ let mut config = SectionConfig::new(&JOB_ID_SCHEMA);
+ config.register_plugin(plugin);
+ config.register_plugin(secrets_plugin);
+
+ config
+}
+
+pub const S3_CFG_FILENAME: &str = "/etc/proxmox-backup/s3.cfg";
+pub const S3_SECRETS_CFG_FILENAME: &str = "/etc/proxmox-backup/s3-secrets.cfg";
+pub const S3_CFG_LOCKFILE: &str = "/etc/proxmox-backup/.s3.lck";
+
+/// Get exclusive lock
+pub fn lock_config() -> Result<BackupLockGuard, Error> {
+ open_backup_lockfile(S3_CFG_LOCKFILE, None, true)
+}
+
+pub fn config() -> Result<(SectionConfigData, [u8; 32]), Error> {
+ parse_config(S3_CFG_FILENAME)
+}
+
+pub fn secrets_config() -> Result<(SectionConfigData, [u8; 32]), Error> {
+ parse_config(S3_SECRETS_CFG_FILENAME)
+}
+
+pub fn save_config(config: &SectionConfigData, secrets: &SectionConfigData) -> Result<(), Error> {
+ let raw = CONFIG.write(S3_CFG_FILENAME, config)?;
+ replace_backup_config(S3_CFG_FILENAME, raw.as_bytes())?;
+
+ let secrets_raw = CONFIG.write(S3_SECRETS_CFG_FILENAME, secrets)?;
+ // Secrets are stored with `backup` permissions to allow reading from
+ // not protected api endpoints as well.
+ replace_backup_config(S3_SECRETS_CFG_FILENAME, secrets_raw.as_bytes())?;
+
+ Ok(())
+}
+
+// shell completion helper
+pub fn complete_s3_client_id(_arg: &str, _param: &HashMap<String, String>) -> Vec<String> {
+ match config() {
+ Ok((data, _digest)) => data.sections.keys().map(|id| id.to_string()).collect(),
+ Err(_) => Vec::new(),
+ }
+}
+
+fn parse_config(path: &str) -> Result<(SectionConfigData, [u8; 32]), Error> {
+ let content = proxmox_sys::fs::file_read_optional_string(path)?;
+ let content = content.unwrap_or_default();
+ let digest = openssl::sha::sha256(content.as_bytes());
+ let data = CONFIG.parse(path, &content)?;
+ Ok((data, digest))
+}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (10 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 02/45] config: introduce s3 object store client configuration Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 7:32 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 04/45] api: datastore: check s3 backend bucket access on datastore create Christian Ebner
` (43 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Allows creating, listing, modifying and deleting configurations for
s3 clients via the api.
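The resulting api surface, as wired up by the routers at the end of
this patch (standard /api2/json prefix assumed):

    GET    /api2/json/config/s3        list all s3 client configurations
    POST   /api2/json/config/s3        create a new s3 client configuration
    GET    /api2/json/config/s3/{id}   read a single configuration
    PUT    /api2/json/config/s3/{id}   update a configuration
    DELETE /api2/json/config/s3/{id}   remove a configuration and its secrets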
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
Cargo.toml | 1 +
src/api2/config/mod.rs | 2 +
src/api2/config/s3.rs | 310 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 313 insertions(+)
create mode 100644 src/api2/config/s3.rs
diff --git a/Cargo.toml b/Cargo.toml
index b6b779cbc..c7a77060e 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -225,6 +225,7 @@ proxmox-notify = { workspace = true, features = [ "pbs-context" ] }
proxmox-openid.workspace = true
proxmox-rest-server = { workspace = true, features = [ "rate-limited-stream" ] }
proxmox-router = { workspace = true, features = [ "cli", "server"] }
+proxmox-s3-client.workspace = true
proxmox-schema = { workspace = true, features = [ "api-macro" ] }
proxmox-section-config.workspace = true
proxmox-serde = { workspace = true, features = [ "serde_json" ] }
diff --git a/src/api2/config/mod.rs b/src/api2/config/mod.rs
index 15dc5db92..1cd9ead76 100644
--- a/src/api2/config/mod.rs
+++ b/src/api2/config/mod.rs
@@ -14,6 +14,7 @@ pub mod metrics;
pub mod notifications;
pub mod prune;
pub mod remote;
+pub mod s3;
pub mod sync;
pub mod tape_backup_job;
pub mod tape_encryption_keys;
@@ -32,6 +33,7 @@ const SUBDIRS: SubdirMap = &sorted!([
("notifications", ¬ifications::ROUTER),
("prune", &prune::ROUTER),
("remote", &remote::ROUTER),
+ ("s3", &s3::ROUTER),
("sync", &sync::ROUTER),
("tape-backup-job", &tape_backup_job::ROUTER),
("tape-encryption-keys", &tape_encryption_keys::ROUTER),
diff --git a/src/api2/config/s3.rs b/src/api2/config/s3.rs
new file mode 100644
index 000000000..c76704f5a
--- /dev/null
+++ b/src/api2/config/s3.rs
@@ -0,0 +1,310 @@
+use ::serde::{Deserialize, Serialize};
+use anyhow::Error;
+use hex::FromHex;
+use serde_json::Value;
+
+use proxmox_router::{http_bail, Permission, Router, RpcEnvironment};
+use proxmox_s3_client::{
+ S3ClientConfig, S3ClientConfigUpdater, S3ClientSecretsConfig, S3ClientSecretsConfigUpdater,
+};
+use proxmox_schema::{api, param_bail};
+
+use pbs_api_types::{JOB_ID_SCHEMA, PRIV_SYS_AUDIT, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA};
+use pbs_config::s3;
+
+#[api(
+ input: {
+ properties: {},
+ },
+ returns: {
+ description: "List configured s3 clients.",
+ type: Array,
+ items: { type: S3ClientConfig },
+ },
+ access: {
+ permission: &Permission::Privilege(&[], PRIV_SYS_AUDIT, false),
+ },
+)]
+/// List all s3 client configurations.
+pub fn list_s3_client_config(
+ _param: Value,
+ rpcenv: &mut dyn RpcEnvironment,
+) -> Result<Vec<S3ClientConfig>, Error> {
+ let (config, digest) = s3::config()?;
+ let list = config.convert_to_typed_array("s3client")?;
+
+ let (_secrets, secrets_digest) = s3::secrets_config()?;
+ let digest = digest_with_secrets(&digest, &secrets_digest);
+ rpcenv["digest"] = hex::encode(digest).into();
+
+ Ok(list)
+}
+
+#[api(
+ protected: true,
+ input: {
+ properties: {
+ config: {
+ type: S3ClientConfig,
+ flatten: true,
+ },
+ secrets: {
+ type: S3ClientSecretsConfig,
+ flatten: true,
+ },
+ },
+ },
+ access: {
+ permission: &Permission::Privilege(&[], PRIV_SYS_MODIFY, false),
+ },
+)]
+/// Create a new s3 client configuration.
+pub fn create_s3_client_config(
+ config: S3ClientConfig,
+ secrets: S3ClientSecretsConfig,
+ _rpcenv: &mut dyn RpcEnvironment,
+) -> Result<(), Error> {
+ // Assure both config and secrets are referenced by the same `id`
+ if config.id != secrets.secrets_id {
+ param_bail!(
+ "id",
+ "config and secrets must use the same id ({} != {})",
+ config.id,
+ secrets.secrets_id
+ );
+ }
+
+ let _lock = s3::lock_config()?;
+ let (mut section_config, _digest) = s3::config()?;
+ if section_config.sections.contains_key(&config.id) {
+ param_bail!("id", "s3 client config '{}' already exists.", config.id);
+ }
+
+ let (mut section_secrets, _secrets_digest) = s3::secrets_config()?;
+ if section_secrets.sections.contains_key(&config.id) {
+ param_bail!("id", "s3 secrets config '{}' already exists.", config.id);
+ }
+
+ section_config.set_data(&config.id, "s3client", &config)?;
+ section_secrets.set_data(&config.id, "s3secrets", &secrets)?;
+ s3::save_config(§ion_config, §ion_secrets)?;
+
+ Ok(())
+}
+
+#[api(
+ input: {
+ properties: {
+ id: {
+ schema: JOB_ID_SCHEMA,
+ },
+ },
+ },
+ returns: { type: S3ClientConfig },
+ access: {
+ permission: &Permission::Privilege(&[], PRIV_SYS_AUDIT, false),
+ },
+)]
+/// Read an s3 client configuration.
+pub fn read_s3_client_config(
+ id: String,
+ rpcenv: &mut dyn RpcEnvironment,
+) -> Result<S3ClientConfig, Error> {
+ let (config, digest) = s3::config()?;
+ let s3_client_config: S3ClientConfig = config.lookup("s3client", &id)?;
+
+ let (_secrets, secrets_digest) = s3::secrets_config()?;
+ let digest = digest_with_secrets(&digest, &secrets_digest);
+ rpcenv["digest"] = hex::encode(digest).into();
+
+ Ok(s3_client_config)
+}
+
+#[api()]
+#[derive(Serialize, Deserialize)]
+#[serde(rename_all = "kebab-case")]
+/// Deletable property name
+pub enum DeletableProperty {
+ /// Delete the port property.
+ Port,
+ /// Delete the region property.
+ Region,
+ /// Delete the fingerprint property.
+ Fingerprint,
+ /// Delete the path-style property.
+ PathStyle,
+}
+
+#[api(
+ protected: true,
+ input: {
+ properties: {
+ id: {
+ schema: JOB_ID_SCHEMA,
+ },
+ update: {
+ type: S3ClientConfigUpdater,
+ flatten: true,
+ },
+ "update-secrets": {
+ type: S3ClientSecretsConfigUpdater,
+ flatten: true,
+ },
+ delete: {
+ description: "List of properties to delete.",
+ type: Array,
+ optional: true,
+ items: {
+ type: DeletableProperty,
+ }
+ },
+ digest: {
+ optional: true,
+ schema: PROXMOX_CONFIG_DIGEST_SCHEMA,
+ },
+ },
+ },
+ access: {
+ permission: &Permission::Privilege(&[], PRIV_SYS_MODIFY, false),
+ },
+)]
+/// Update an s3 client configuration.
+#[allow(clippy::too_many_arguments)]
+pub fn update_s3_client_config(
+ id: String,
+ update: S3ClientConfigUpdater,
+ update_secrets: S3ClientSecretsConfigUpdater,
+ delete: Option<Vec<DeletableProperty>>,
+ digest: Option<String>,
+ _rpcenv: &mut dyn RpcEnvironment,
+) -> Result<(), Error> {
+ let _lock = s3::lock_config()?;
+ let (mut config, expected_digest) = s3::config()?;
+ let (mut secrets, secrets_digest) = s3::secrets_config()?;
+ let expected_digest = digest_with_secrets(&expected_digest, &secrets_digest);
+
+ // Secrets are not included in the digest; concurrent changes to them are therefore not detected.
+ if let Some(ref digest) = digest {
+ let digest = <[u8; 32]>::from_hex(digest)?;
+ crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
+ }
+
+ let mut data: S3ClientConfig = config.lookup("s3client", &id)?;
+
+ if let Some(delete) = delete {
+ for delete_prop in delete {
+ match delete_prop {
+ DeletableProperty::Port => {
+ data.port = None;
+ }
+ DeletableProperty::Region => {
+ data.region = None;
+ }
+ DeletableProperty::Fingerprint => {
+ data.fingerprint = None;
+ }
+ DeletableProperty::PathStyle => {
+ data.path_style = None;
+ }
+ }
+ }
+ }
+
+ if let Some(endpoint) = update.endpoint {
+ data.endpoint = endpoint;
+ }
+ if let Some(port) = update.port {
+ data.port = Some(port);
+ }
+ if let Some(region) = update.region {
+ data.region = Some(region);
+ }
+ if let Some(access_key) = update.access_key {
+ data.access_key = access_key;
+ }
+ if let Some(fingerprint) = update.fingerprint {
+ data.fingerprint = Some(fingerprint);
+ }
+ if let Some(path_style) = update.path_style {
+ data.path_style = Some(path_style);
+ }
+
+ let mut secrets_data: S3ClientSecretsConfig = secrets.lookup("s3secrets", &id)?;
+ if let Some(secret_key) = update_secrets.secret_key {
+ secrets_data.secret_key = secret_key;
+ }
+
+ config.set_data(&id, "s3client", &data)?;
+ secrets.set_data(&id, "s3secrets", &secrets_data)?;
+ s3::save_config(&config, &secrets)?;
+
+ Ok(())
+}
+
+#[api(
+ protected: true,
+ input: {
+ properties: {
+ id: {
+ schema: JOB_ID_SCHEMA,
+ },
+ digest: {
+ optional: true,
+ schema: PROXMOX_CONFIG_DIGEST_SCHEMA,
+ },
+ },
+ },
+ access: {
+ permission: &Permission::Privilege(&[], PRIV_SYS_MODIFY, false),
+ },
+)]
+/// Remove an s3 client configuration.
+pub fn delete_s3_client_config(
+ id: String,
+ digest: Option<String>,
+ _rpcenv: &mut dyn RpcEnvironment,
+) -> Result<(), Error> {
+ let _lock = s3::lock_config()?;
+ let (mut config, expected_digest) = s3::config()?;
+ let (mut secrets, secrets_digest) = s3::secrets_config()?;
+ let expected_digest = digest_with_secrets(&expected_digest, &secrets_digest);
+
+ if let Some(ref digest) = digest {
+ let digest = <[u8; 32]>::from_hex(digest)?;
+ crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
+ }
+
+ match (config.sections.remove(&id), secrets.sections.remove(&id)) {
+ (Some(_), Some(_)) => {}
+ (None, None) => http_bail!(
+ NOT_FOUND,
+ "s3 client config and secrets '{id}' do not exist."
+ ),
+ (Some(_), None) => http_bail!(
+ NOT_FOUND,
+ "removed s3 client config, but no secrets for '{id}' found."
+ ),
+ (None, Some(_)) => http_bail!(
+ NOT_FOUND,
+ "removed s3 client secrets, but no config for '{id}' found."
+ ),
+ }
+ s3::save_config(&config, &secrets)
+}
+
+// Calculate the digest based on the digest of config and secrets to detect changes for both
+fn digest_with_secrets(digest: &[u8; 32], secrets_digest: &[u8; 32]) -> [u8; 32] {
+ let mut digest = digest.to_vec();
+ digest.append(&mut secrets_digest.to_vec());
+ openssl::sha::sha256(&digest)
+}
+
+const ITEM_ROUTER: Router = Router::new()
+ .get(&API_METHOD_READ_S3_CLIENT_CONFIG)
+ .put(&API_METHOD_UPDATE_S3_CLIENT_CONFIG)
+ .delete(&API_METHOD_DELETE_S3_CLIENT_CONFIG);
+
+pub const ROUTER: Router = Router::new()
+ .get(&API_METHOD_LIST_S3_CLIENT_CONFIG)
+ .post(&API_METHOD_CREATE_S3_CLIENT_CONFIG)
+ .match_all("id", &ITEM_ROUTER);
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 04/45] api: datastore: check s3 backend bucket access on datastore create
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (11 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 7:40 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 05/45] api/cli: add endpoint and command to check s3 client connection Christian Ebner
` (42 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Check if the configured S3 object store backend can be reached and
the provided secrets have sufficient permissions to access the
bucket.
Perform the check before creating the chunk store, so it is not left
behind if the bucket cannot be reached.
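For example, creating a datastore with an s3 backend could then look
like this (hypothetical invocation; the property-string keys follow
the backend config verifier):

    proxmox-backup-manager datastore create store1 /mnt/datastore/store1 \
        --backend type=s3,client=my-s3,bucket=pbs-bucket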
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
Cargo.toml | 2 +-
src/api2/config/datastore.rs | 48 ++++++++++++++++++++++++++++++++----
2 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/Cargo.toml b/Cargo.toml
index c7a77060e..a5954635a 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -77,7 +77,7 @@ proxmox-rest-server = { version = "1", features = [ "templates" ] }
proxmox-router = { version = "3.2.2", default-features = false }
proxmox-rrd = "1"
proxmox-rrd-api-types = "1.0.2"
-proxmox-s3-client = "1.0.0"
+proxmox-s3-client = { version = "1.0.0", features = [ "impl" ] }
# everything but pbs-config and pbs-client use "api-macro"
proxmox-schema = "4"
proxmox-section-config = "3"
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index b133be707..0fb822c79 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -1,21 +1,22 @@
use std::path::{Path, PathBuf};
use ::serde::{Deserialize, Serialize};
-use anyhow::{bail, Context, Error};
+use anyhow::{bail, format_err, Context, Error};
use hex::FromHex;
use serde_json::Value;
use tracing::{info, warn};
use proxmox_router::{http_bail, Permission, Router, RpcEnvironment, RpcEnvironmentType};
+use proxmox_s3_client::{S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig};
use proxmox_schema::{api, param_bail, ApiType};
use proxmox_section_config::SectionConfigData;
use proxmox_uuid::Uuid;
use pbs_api_types::{
- Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreNotify, DatastoreTuning, KeepOptions,
- MaintenanceMode, PruneJobConfig, PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE,
- PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA,
- UPID_SCHEMA,
+ Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreBackendConfig, DatastoreBackendType,
+ DatastoreNotify, DatastoreTuning, KeepOptions, MaintenanceMode, PruneJobConfig,
+ PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT,
+ PRIV_DATASTORE_MODIFY, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
};
use pbs_config::BackupLockGuard;
use pbs_datastore::chunk_store::ChunkStore;
@@ -116,6 +117,43 @@ pub(crate) fn do_create_datastore(
.parse_property_string(datastore.tuning.as_deref().unwrap_or(""))?,
)?;
+ if let Some(ref backend_config) = datastore.backend {
+ let backend_config: DatastoreBackendConfig = backend_config.parse()?;
+ match backend_config.ty.unwrap_or_default() {
+ DatastoreBackendType::Filesystem => (),
+ DatastoreBackendType::S3 => {
+ let s3_client_id = backend_config
+ .client
+ .as_ref()
+ .ok_or_else(|| format_err!("missing required client"))?;
+ let bucket = backend_config
+ .bucket
+ .clone()
+ .ok_or_else(|| format_err!("missing required bucket"))?;
+ let (config, _config_digest) =
+ pbs_config::s3::config().context("failed to get s3 config")?;
+ let (secrets, _secrets_digest) =
+ pbs_config::s3::secrets_config().context("failed to get s3 secrets")?;
+ let config: S3ClientConfig = config
+ .lookup("s3client", s3_client_id)
+ .with_context(|| format!("no '{s3_client_id}' in config"))?;
+ let secrets: S3ClientSecretsConfig = secrets
+ .lookup("s3secrets", s3_client_id)
+ .with_context(|| format!("no '{s3_client_id}' in secrets"))?;
+ let options = S3ClientOptions::from_config(
+ config,
+ secrets,
+ bucket,
+ datastore.name.to_owned(),
+ );
+ let s3_client = S3Client::new(options).context("failed to create s3 client")?;
+ // Fine to block since this runs in a worker task
+ proxmox_async::runtime::block_on(s3_client.head_bucket())
+ .context("failed to access bucket")?;
+ }
+ }
+ }
+
let unmount_guard = if datastore.backing_device.is_some() {
do_mount_device(datastore.clone())?;
UnmountGuard::new(Some(path.clone()))
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 05/45] api/cli: add endpoint and command to check s3 client connection
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (12 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 04/45] api: datastore: check s3 backend bucket access on datastore create Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 7:43 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 06/45] datastore: allow to get the backend for a datastore Christian Ebner
` (41 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Adds a dedicated api endpoint and a proxmox-backup-manager command to
check if the configured S3 client can reach the bucket.
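Example usage (the positional arguments follow the arg_param order
defined below; the store prefix commonly is the datastore name):

    proxmox-backup-manager s3 check my-s3 pbs-bucket --store-prefix store1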
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/admin/mod.rs | 2 +
src/api2/admin/s3.rs | 80 +++++++++++++++++++++++++++
src/bin/proxmox-backup-manager.rs | 1 +
src/bin/proxmox_backup_manager/mod.rs | 2 +
src/bin/proxmox_backup_manager/s3.rs | 46 +++++++++++++++
5 files changed, 131 insertions(+)
create mode 100644 src/api2/admin/s3.rs
create mode 100644 src/bin/proxmox_backup_manager/s3.rs
diff --git a/src/api2/admin/mod.rs b/src/api2/admin/mod.rs
index a1c49f8e2..7694de4b9 100644
--- a/src/api2/admin/mod.rs
+++ b/src/api2/admin/mod.rs
@@ -9,6 +9,7 @@ pub mod gc;
pub mod metrics;
pub mod namespace;
pub mod prune;
+pub mod s3;
pub mod sync;
pub mod traffic_control;
pub mod verify;
@@ -19,6 +20,7 @@ const SUBDIRS: SubdirMap = &sorted!([
("metrics", &metrics::ROUTER),
("prune", &prune::ROUTER),
("gc", &gc::ROUTER),
+ ("s3", &s3::ROUTER),
("sync", &sync::ROUTER),
("traffic-control", &traffic_control::ROUTER),
("verify", &verify::ROUTER),
diff --git a/src/api2/admin/s3.rs b/src/api2/admin/s3.rs
new file mode 100644
index 000000000..d20031707
--- /dev/null
+++ b/src/api2/admin/s3.rs
@@ -0,0 +1,80 @@
+//! S3 bucket operations
+
+use anyhow::{Context, Error};
+use serde_json::Value;
+
+use proxmox_http::Body;
+use proxmox_router::{list_subdirs_api_method, Permission, Router, RpcEnvironment, SubdirMap};
+use proxmox_s3_client::{
+ S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3_BUCKET_NAME_SCHEMA,
+ S3_CLIENT_ID_SCHEMA,
+};
+use proxmox_schema::*;
+use proxmox_sortable_macro::sortable;
+
+use pbs_api_types::PRIV_SYS_MODIFY;
+
+#[api(
+ input: {
+ properties: {
+ "s3-client-id": {
+ schema: S3_CLIENT_ID_SCHEMA,
+ },
+ bucket: {
+ schema: S3_BUCKET_NAME_SCHEMA,
+ },
+ "store-prefix": {
+ type: String,
+ description: "Store prefix within bucket for S3 object keys (commonly datastore name)",
+ },
+ },
+ },
+ access: {
+ permission: &Permission::Privilege(&[], PRIV_SYS_MODIFY, false),
+ },
+)]
+/// Perform basic sanity check for given s3 client configuration
+pub async fn check(
+ s3_client_id: String,
+ bucket: String,
+ store_prefix: String,
+ _rpcenv: &mut dyn RpcEnvironment,
+) -> Result<Value, Error> {
+ let (config, _digest) = pbs_config::s3::config()?;
+ let config: S3ClientConfig = config
+ .lookup("s3client", &s3_client_id)
+ .context("config lookup failed")?;
+ let (secrets, _secrets_digest) = pbs_config::s3::secrets_config()?;
+ let secrets: S3ClientSecretsConfig = secrets
+ .lookup("s3secrets", &s3_client_id)
+ .context("secrets lookup failed")?;
+
+ let options = S3ClientOptions::from_config(config, secrets, bucket, store_prefix);
+
+ let test_object_key = ".s3-client-test";
+ let client = S3Client::new(options).context("client creation failed")?;
+ client.head_bucket().await.context("head bucket failed")?;
+ client
+ .put_object(test_object_key.into(), Body::empty(), true)
+ .await
+ .context("put object failed")?;
+ client
+ .get_object(test_object_key.into())
+ .await
+ .context("get object failed")?;
+ client
+ .delete_object(test_object_key.into())
+ .await
+ .context("delete object failed")?;
+
+ Ok(Value::Null)
+}
+
+#[sortable]
+const S3_OPERATION_SUBDIRS: SubdirMap = &[("check", &Router::new().get(&API_METHOD_CHECK))];
+
+const S3_OPERATION_ROUTER: Router = Router::new()
+ .get(&list_subdirs_api_method!(S3_OPERATION_SUBDIRS))
+ .subdirs(S3_OPERATION_SUBDIRS);
+
+pub const ROUTER: Router = Router::new().match_all("s3-client-id", &S3_OPERATION_ROUTER);
diff --git a/src/bin/proxmox-backup-manager.rs b/src/bin/proxmox-backup-manager.rs
index d4363e717..68d87c676 100644
--- a/src/bin/proxmox-backup-manager.rs
+++ b/src/bin/proxmox-backup-manager.rs
@@ -677,6 +677,7 @@ async fn run() -> Result<(), Error> {
.insert("garbage-collection", garbage_collection_commands())
.insert("acme", acme_mgmt_cli())
.insert("cert", cert_mgmt_cli())
+ .insert("s3", s3_commands())
.insert("subscription", subscription_commands())
.insert("sync-job", sync_job_commands())
.insert("verify-job", verify_job_commands())
diff --git a/src/bin/proxmox_backup_manager/mod.rs b/src/bin/proxmox_backup_manager/mod.rs
index 9b5c73e9a..312a6db6b 100644
--- a/src/bin/proxmox_backup_manager/mod.rs
+++ b/src/bin/proxmox_backup_manager/mod.rs
@@ -26,6 +26,8 @@ mod prune;
pub use prune::*;
mod remote;
pub use remote::*;
+mod s3;
+pub use s3::*;
mod subscription;
pub use subscription::*;
mod sync;
diff --git a/src/bin/proxmox_backup_manager/s3.rs b/src/bin/proxmox_backup_manager/s3.rs
new file mode 100644
index 000000000..9bb89ff55
--- /dev/null
+++ b/src/bin/proxmox_backup_manager/s3.rs
@@ -0,0 +1,46 @@
+use proxmox_router::{cli::*, RpcEnvironment};
+use proxmox_s3_client::{S3_BUCKET_NAME_SCHEMA, S3_CLIENT_ID_SCHEMA};
+use proxmox_schema::api;
+
+use proxmox_backup::api2;
+
+use anyhow::Error;
+use serde_json::Value;
+
+#[api(
+ input: {
+ properties: {
+ "s3-client-id": {
+ schema: S3_CLIENT_ID_SCHEMA,
+ },
+ bucket: {
+ schema: S3_BUCKET_NAME_SCHEMA,
+ },
+ "store-prefix": {
+ type: String,
+ description: "Store prefix within bucket for S3 object keys (commonly datastore name)",
+ },
+ },
+ },
+)]
+/// Perform basic sanity checks for given S3 client configuration
+async fn check(
+ s3_client_id: String,
+ bucket: String,
+ store_prefix: String,
+ rpcenv: &mut dyn RpcEnvironment,
+) -> Result<Value, Error> {
+ api2::admin::s3::check(s3_client_id, bucket, store_prefix, rpcenv).await?;
+ Ok(Value::Null)
+}
+
+pub fn s3_commands() -> CommandLineInterface {
+ let cmd_def = CliCommandMap::new().insert(
+ "check",
+ CliCommand::new(&API_METHOD_CHECK)
+ .arg_param(&["s3-client-id", "bucket"])
+ .completion_cb("s3-client-id", pbs_config::s3::complete_s3_client_id),
+ );
+
+ cmd_def.into()
+}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 06/45] datastore: allow to get the backend for a datastore
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (13 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 05/45] api/cli: add endpoint and command to check s3 client connection Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 7:52 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 07/45] api: backup: store datastore backend in runtime environment Christian Ebner
` (40 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Implements an enum with variants Filesystem and S3 to distinguish
between the available backends. Filesystem is used as the default if
no backend is configured in the datastore's configuration. If the
datastore has an s3 backend configured, the backend method
instantiates an s3 client and returns it in the S3 variant.
This allows to instantiate the client once, keeping and reusing the
same open connection to the api for the lifetime of the task or job,
e.g. in the backup writer/reader runtime environment.
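A minimal sketch of how call sites (inside some fallible function)
can then dispatch on the backend, with the names as introduced below:

    match datastore.backend()? {
        DatastoreBackend::Filesystem => {
            // chunks and metadata live on the local filesystem only
        }
        DatastoreBackend::S3(s3_client) => {
            // reuse the wrapped Arc<S3Client> for all api requests
            // during the lifetime of the task or job
        }
    }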
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/datastore.rs | 52 ++++++++++++++++++++++++++++++++--
pbs-datastore/src/lib.rs | 1 +
2 files changed, 51 insertions(+), 2 deletions(-)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 924d8cf9c..90ab80005 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -12,6 +12,7 @@ use pbs_tools::lru_cache::LruCache;
use tracing::{info, warn};
use proxmox_human_byte::HumanByte;
+use proxmox_s3_client::{S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig};
use proxmox_schema::ApiType;
use proxmox_sys::error::SysError;
@@ -23,8 +24,8 @@ use proxmox_worker_task::WorkerTaskContext;
use pbs_api_types::{
ArchiveType, Authid, BackupGroupDeleteStats, BackupNamespace, BackupType, ChunkOrder,
- DataStoreConfig, DatastoreFSyncLevel, DatastoreTuning, GarbageCollectionStatus,
- MaintenanceMode, MaintenanceType, Operation, UPID,
+ DataStoreConfig, DatastoreBackendConfig, DatastoreBackendType, DatastoreFSyncLevel,
+ DatastoreTuning, GarbageCollectionStatus, MaintenanceMode, MaintenanceType, Operation, UPID,
};
use pbs_config::BackupLockGuard;
@@ -127,6 +128,7 @@ pub struct DataStoreImpl {
chunk_order: ChunkOrder,
last_digest: Option<[u8; 32]>,
sync_level: DatastoreFSyncLevel,
+ backend_config: DatastoreBackendConfig,
}
impl DataStoreImpl {
@@ -141,6 +143,7 @@ impl DataStoreImpl {
chunk_order: Default::default(),
last_digest: None,
sync_level: Default::default(),
+ backend_config: Default::default(),
})
}
}
@@ -196,6 +199,12 @@ impl Drop for DataStore {
}
}
+#[derive(Clone)]
+pub enum DatastoreBackend {
+ Filesystem,
+ S3(Arc<S3Client>),
+}
+
impl DataStore {
// This one just panics on everything
#[doc(hidden)]
@@ -206,6 +215,39 @@ impl DataStore {
})
}
+ /// Get the backend for this datastore based on its configuration
+ pub fn backend(&self) -> Result<DatastoreBackend, Error> {
+ let backend_type = match self.inner.backend_config.ty.unwrap_or_default() {
+ DatastoreBackendType::Filesystem => DatastoreBackend::Filesystem,
+ DatastoreBackendType::S3 => {
+ let s3_client_id = self
+ .inner
+ .backend_config
+ .client
+ .as_ref()
+ .ok_or_else(|| format_err!("missing client for s3 backend"))?;
+ let bucket = self
+ .inner
+ .backend_config
+ .bucket
+ .clone()
+ .ok_or_else(|| format_err!("missing bucket for s3 backend"))?;
+
+ let (config, _config_digest) = pbs_config::s3::config()?;
+ let (secrets, _secrets_digest) = pbs_config::s3::secrets_config()?;
+ let config: S3ClientConfig = config.lookup("s3client", s3_client_id)?;
+ let secrets: S3ClientSecretsConfig = secrets.lookup("s3secrets", s3_client_id)?;
+
+ let options =
+ S3ClientOptions::from_config(config, secrets, bucket, self.name().to_owned());
+ let s3_client = S3Client::new(options)?;
+ DatastoreBackend::S3(Arc::new(s3_client))
+ }
+ };
+
+ Ok(backend_type)
+ }
+
pub fn lookup_datastore(
name: &str,
operation: Option<Operation>,
@@ -383,6 +425,11 @@ impl DataStore {
.parse_property_string(config.tuning.as_deref().unwrap_or(""))?,
)?;
+ let backend_config: DatastoreBackendConfig = serde_json::from_value(
+ DatastoreBackendConfig::API_SCHEMA
+ .parse_property_string(config.backend.as_deref().unwrap_or(""))?,
+ )?;
+
Ok(DataStoreImpl {
chunk_store,
gc_mutex: Mutex::new(()),
@@ -391,6 +438,7 @@ impl DataStore {
chunk_order: tuning.chunk_order.unwrap_or_default(),
last_digest,
sync_level: tuning.sync_level.unwrap_or_default(),
+ backend_config,
})
}
diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
index ffd0d91b2..ca6fdb7d8 100644
--- a/pbs-datastore/src/lib.rs
+++ b/pbs-datastore/src/lib.rs
@@ -204,6 +204,7 @@ pub use store_progress::StoreProgress;
mod datastore;
pub use datastore::{
check_backup_owner, ensure_datastore_is_mounted, get_datastore_mount_status, DataStore,
+ DatastoreBackend,
};
mod hierarchy;
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 07/45] api: backup: store datastore backend in runtime environment
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (14 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 06/45] datastore: allow to get the backend for a datastore Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 7:54 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 08/45] api: backup: conditionally upload chunks to s3 object store backend Christian Ebner
` (39 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Get and store the datastore's backend during creation of the backup
runtime environment, and upload the chunks to the local filesystem or
the s3 object store based on the backend variant.
By storing the backend variant in the environment, the s3 client is
instantiated only once and reused for all api calls in the same
backup http/2 connection.
Refactor the upgrade method by moving all logic into the async block,
so that the now possible error on backup environment creation gets
propagated to the thread spawn call site.
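A minimal sketch of the resulting control flow (signatures as changed
below):

    // creation is now fallible, since the backend gets resolved here
    let mut env = BackupEnvironment::new(
        env_type, auth_id, worker.clone(), datastore, backup_dir,
    )?;
    // env.backend holds the backend variant for the whole http/2
    // connection; the `?` propagates setup errors to the spawn site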
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/backup/environment.rs | 11 +--
src/api2/backup/mod.rs | 128 ++++++++++++++++-----------------
2 files changed, 71 insertions(+), 68 deletions(-)
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 1d8f64aa0..7bd86f39c 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -16,7 +16,7 @@ use pbs_api_types::Authid;
use pbs_datastore::backup_info::{BackupDir, BackupInfo};
use pbs_datastore::dynamic_index::DynamicIndexWriter;
use pbs_datastore::fixed_index::FixedIndexWriter;
-use pbs_datastore::{DataBlob, DataStore};
+use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
use proxmox_rest_server::{formatter::*, WorkerTask};
use crate::backup::VerifyWorker;
@@ -116,6 +116,7 @@ pub struct BackupEnvironment {
pub datastore: Arc<DataStore>,
pub backup_dir: BackupDir,
pub last_backup: Option<BackupInfo>,
+ pub backend: DatastoreBackend,
state: Arc<Mutex<SharedBackupState>>,
}
@@ -126,7 +127,7 @@ impl BackupEnvironment {
worker: Arc<WorkerTask>,
datastore: Arc<DataStore>,
backup_dir: BackupDir,
- ) -> Self {
+ ) -> Result<Self, Error> {
let state = SharedBackupState {
finished: false,
uid_counter: 0,
@@ -138,7 +139,8 @@ impl BackupEnvironment {
backup_stat: UploadStatistic::new(),
};
- Self {
+ let backend = datastore.backend()?;
+ Ok(Self {
result_attributes: json!({}),
env_type,
auth_id,
@@ -148,8 +150,9 @@ impl BackupEnvironment {
formatter: JSON_FORMATTER,
backup_dir,
last_backup: None,
+ backend,
state: Arc::new(Mutex::new(state)),
- }
+ })
}
/// Register a Chunk with associated length.
diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
index a723e7cb0..026f1f106 100644
--- a/src/api2/backup/mod.rs
+++ b/src/api2/backup/mod.rs
@@ -187,7 +187,8 @@ fn upgrade_to_backup_protocol(
}
// lock last snapshot to prevent forgetting/pruning it during backup
- let guard = last.backup_dir
+ let guard = last
+ .backup_dir
.lock_shared()
.with_context(|| format!("while locking last snapshot during backup '{last:?}'"))?;
Some(guard)
@@ -206,14 +207,14 @@ fn upgrade_to_backup_protocol(
Some(worker_id),
auth_id.to_string(),
true,
- move |worker| {
+ move |worker| async move {
let mut env = BackupEnvironment::new(
env_type,
auth_id,
worker.clone(),
datastore,
backup_dir,
- );
+ )?;
env.debug = debug;
env.last_backup = last_backup;
@@ -247,74 +248,73 @@ fn upgrade_to_backup_protocol(
http.max_frame_size(4 * 1024 * 1024);
let env3 = env2.clone();
- http.serve_connection(conn, TowerToHyperService::new(service)).map(move |result| {
- match result {
- Err(err) => {
- // Avoid Transport endpoint is not connected (os error 107)
- // fixme: find a better way to test for that error
- if err.to_string().starts_with("connection error")
- && env3.finished()
- {
- Ok(())
- } else {
- Err(Error::from(err))
+ http.serve_connection(conn, TowerToHyperService::new(service))
+ .map(move |result| {
+ match result {
+ Err(err) => {
+ // Avoid Transport endpoint is not connected (os error 107)
+ // fixme: find a better way to test for that error
+ if err.to_string().starts_with("connection error")
+ && env3.finished()
+ {
+ Ok(())
+ } else {
+ Err(Error::from(err))
+ }
}
+ Ok(()) => Ok(()),
}
- Ok(()) => Ok(()),
- }
- })
+ })
});
let mut abort_future = abort_future.map(|_| Err(format_err!("task aborted")));
- async move {
- // keep flock until task ends
- let _group_guard = _group_guard;
- let snap_guard = snap_guard;
- let _last_guard = _last_guard;
-
- let res = select! {
- req = req_fut => req,
- abrt = abort_future => abrt,
- };
- if benchmark {
- env.log("benchmark finished successfully");
- proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
- return Ok(());
+ // keep flock until task ends
+ let _group_guard = _group_guard;
+ let snap_guard = snap_guard;
+ let _last_guard = _last_guard;
+
+ let res = select! {
+ req = req_fut => req,
+ abrt = abort_future => abrt,
+ };
+ if benchmark {
+ env.log("benchmark finished successfully");
+ proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
+ return Ok(());
+ }
+
+ let verify = |env: BackupEnvironment| {
+ if let Err(err) = env.verify_after_complete(snap_guard) {
+ env.log(format!(
+ "backup finished, but starting the requested verify task failed: {}",
+ err
+ ));
}
+ };
- let verify = |env: BackupEnvironment| {
- if let Err(err) = env.verify_after_complete(snap_guard) {
- env.log(format!(
- "backup finished, but starting the requested verify task failed: {}",
- err
- ));
- }
- };
-
- match (res, env.ensure_finished()) {
- (Ok(_), Ok(())) => {
- env.log("backup finished successfully");
- verify(env);
- Ok(())
- }
- (Err(err), Ok(())) => {
- // ignore errors after finish
- env.log(format!("backup had errors but finished: {}", err));
- verify(env);
- Ok(())
- }
- (Ok(_), Err(err)) => {
- env.log(format!("backup ended and finish failed: {}", err));
- env.log("removing unfinished backup");
- proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
- Err(err)
- }
- (Err(err), Err(_)) => {
- env.log(format!("backup failed: {}", err));
- env.log("removing failed backup");
- proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
- Err(err)
- }
+ match (res, env.ensure_finished()) {
+ (Ok(_), Ok(())) => {
+ env.log("backup finished successfully");
+ verify(env);
+ Ok(())
+ }
+ (Err(err), Ok(())) => {
+ // ignore errors after finish
+ env.log(format!("backup had errors but finished: {}", err));
+ verify(env);
+ Ok(())
+ }
+ (Ok(_), Err(err)) => {
+ env.log(format!("backup ended and finish failed: {}", err));
+ env.log("removing unfinished backup");
+ proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
+ Err(err)
+ }
+ (Err(err), Err(_)) => {
+ env.log(format!("backup failed: {}", err));
+ env.log("removing failed backup");
+ proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
+ Err(err)
}
}
},
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 08/45] api: backup: conditionally upload chunks to s3 object store backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (15 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 07/45] api: backup: store datastore backend in runtime environment Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 8:11 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 09/45] api: backup: conditionally upload blobs " Christian Ebner
` (38 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Upload fixed and dynamically sized chunks to either the filesystem or
the S3 object store, depending on the configured backend.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/backup/upload_chunk.rs | 71 +++++++++++++++++++--------------
1 file changed, 42 insertions(+), 29 deletions(-)
diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
index 2c66c2855..3ad8c3c75 100644
--- a/src/api2/backup/upload_chunk.rs
+++ b/src/api2/backup/upload_chunk.rs
@@ -16,7 +16,7 @@ use proxmox_sortable_macro::sortable;
use pbs_api_types::{BACKUP_ARCHIVE_NAME_SCHEMA, CHUNK_DIGEST_SCHEMA};
use pbs_datastore::file_formats::{DataBlobHeader, EncryptedDataBlobHeader};
-use pbs_datastore::{DataBlob, DataStore};
+use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
use pbs_tools::json::{required_integer_param, required_string_param};
use super::environment::*;
@@ -154,22 +154,10 @@ fn upload_fixed_chunk(
) -> ApiResponseFuture {
async move {
let wid = required_integer_param(¶m, "wid")? as usize;
- let size = required_integer_param(¶m, "size")? as u32;
- let encoded_size = required_integer_param(¶m, "encoded-size")? as u32;
-
- let digest_str = required_string_param(¶m, "digest")?;
- let digest = <[u8; 32]>::from_hex(digest_str)?;
-
let env: &BackupEnvironment = rpcenv.as_ref();
- let (digest, size, compressed_size, is_duplicate) = UploadChunk::new(
- BodyDataStream::new(req_body),
- env.datastore.clone(),
- digest,
- size,
- encoded_size,
- )
- .await?;
+ let (digest, size, compressed_size, is_duplicate) =
+ upload_to_backend(req_body, param, env).await?;
env.register_fixed_chunk(wid, digest, size, compressed_size, is_duplicate)?;
let digest_str = hex::encode(digest);
@@ -229,22 +217,10 @@ fn upload_dynamic_chunk(
) -> ApiResponseFuture {
async move {
let wid = required_integer_param(¶m, "wid")? as usize;
- let size = required_integer_param(¶m, "size")? as u32;
- let encoded_size = required_integer_param(¶m, "encoded-size")? as u32;
-
- let digest_str = required_string_param(¶m, "digest")?;
- let digest = <[u8; 32]>::from_hex(digest_str)?;
-
let env: &BackupEnvironment = rpcenv.as_ref();
- let (digest, size, compressed_size, is_duplicate) = UploadChunk::new(
- BodyDataStream::new(req_body),
- env.datastore.clone(),
- digest,
- size,
- encoded_size,
- )
- .await?;
+ let (digest, size, compressed_size, is_duplicate) =
+ upload_to_backend(req_body, param, env).await?;
env.register_dynamic_chunk(wid, digest, size, compressed_size, is_duplicate)?;
let digest_str = hex::encode(digest);
@@ -256,6 +232,43 @@ fn upload_dynamic_chunk(
.boxed()
}
+async fn upload_to_backend(
+ req_body: Incoming,
+ param: Value,
+ env: &BackupEnvironment,
+) -> Result<([u8; 32], u32, u32, bool), Error> {
+ let size = required_integer_param(¶m, "size")? as u32;
+ let encoded_size = required_integer_param(¶m, "encoded-size")? as u32;
+ let digest_str = required_string_param(¶m, "digest")?;
+ let digest = <[u8; 32]>::from_hex(digest_str)?;
+
+ match &env.backend {
+ DatastoreBackend::Filesystem => {
+ UploadChunk::new(
+ BodyDataStream::new(req_body),
+ env.datastore.clone(),
+ digest,
+ size,
+ encoded_size,
+ )
+ .await
+ }
+ DatastoreBackend::S3(s3_client) => {
+ let data = req_body.collect().await?.to_bytes();
+ if encoded_size != data.len() as u32 {
+ bail!(
+ "got blob with unexpected length ({encoded_size} != {})",
+ data.len()
+ );
+ }
+
+ let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
+ let is_duplicate = s3_client.upload_with_retry(object_key, data, false).await?;
+ Ok((digest, size, encoded_size, is_duplicate))
+ }
+ }
+}
+
pub const API_METHOD_UPLOAD_SPEEDTEST: ApiMethod = ApiMethod::new(
&ApiHandler::AsyncHttp(&upload_speedtest),
&ObjectSchema::new("Test upload speed.", &[]),
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 09/45] api: backup: conditionally upload blobs to s3 object store backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (16 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 08/45] api: backup: conditionally upload chunks to s3 object store backend Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 8:13 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 10/45] api: backup: conditionally upload indices " Christian Ebner
` (37 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Upload blobs to both the local datastore cache and the S3 object
store, if s3 is configured as the backend.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/backup/environment.rs | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 7bd86f39c..3d4677975 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -581,6 +581,22 @@ impl BackupEnvironment {
let blob = DataBlob::load_from_reader(&mut &data[..])?;
let raw_data = blob.raw_data();
+ if let DatastoreBackend::S3(s3_client) = &self.backend {
+ let object_key = pbs_datastore::s3::object_key_from_path(
+ &self.backup_dir.relative_path(),
+ file_name,
+ )
+ .context("invalid blob object key")?;
+ let data = hyper::body::Bytes::copy_from_slice(raw_data);
+ proxmox_async::runtime::block_on(s3_client.upload_with_retry(
+ object_key.clone(),
+ data,
+ true,
+ ))
+ .context("failed to upload blob to s3 backend")?;
+ self.log(format!("Uploaded blob to object store: {object_key}"))
+ }
+
replace_file(&path, raw_data, CreateOptions::new(), false)?;
self.log(format!(
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 10/45] api: backup: conditionally upload indices to s3 object store backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (17 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 09/45] api: backup: conditionally upload blobs " Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 8:20 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 11/45] api: backup: conditionally upload manifest " Christian Ebner
` (36 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
If the datastore is backed by an S3-compatible object store, upload
the dynamic or fixed index files to the object store after closing
them. The local index files are kept in the local caching datastore
to allow for fast and efficient content lookups, avoiding expensive
(as in monetary cost and IO latency) requests.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- fix clippy warning and formatting
src/api2/backup/environment.rs | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 3d4677975..9ad13aeb3 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -2,6 +2,7 @@ use anyhow::{bail, format_err, Context, Error};
use pbs_config::BackupLockGuard;
use std::collections::HashMap;
+use std::io::Read;
use std::sync::{Arc, Mutex};
use tracing::info;
@@ -18,6 +19,7 @@ use pbs_datastore::dynamic_index::DynamicIndexWriter;
use pbs_datastore::fixed_index::FixedIndexWriter;
use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
use proxmox_rest_server::{formatter::*, WorkerTask};
+use proxmox_s3_client::S3Client;
use crate::backup::VerifyWorker;
@@ -479,6 +481,13 @@ impl BackupEnvironment {
);
}
+ // For S3 backends, upload the index file to the object store after closing
+ if let DatastoreBackend::S3(s3_client) = &self.backend {
+ self.s3_upload_index(s3_client, &data.name)
+ .context("failed to upload dynamic index to s3 backend")?;
+ self.log(format!("Uploaded index file to s3 backend: {}", data.name))
+ }
+
self.log_upload_stat(
&data.name,
&csum,
@@ -553,6 +562,16 @@ impl BackupEnvironment {
);
}
+ // For S3 backends, upload the index file to the object store after closing
+ if let DatastoreBackend::S3(s3_client) = &self.backend {
+ self.s3_upload_index(s3_client, &data.name)
+ .context("failed to upload fixed index to s3 backend")?;
+ self.log(format!(
+ "Uploaded fixed index file to object store: {}",
+ data.name
+ ))
+ }
+
self.log_upload_stat(
&data.name,
&expected_csum,
@@ -753,6 +772,21 @@ impl BackupEnvironment {
Ok(())
}
+
+ fn s3_upload_index(&self, s3_client: &S3Client, name: &str) -> Result<(), Error> {
+ let object_key =
+ pbs_datastore::s3::object_key_from_path(&self.backup_dir.relative_path(), name)
+ .context("invalid index file object key")?;
+
+ let mut full_path = self.backup_dir.full_path();
+ full_path.push(name);
+ let mut file = std::fs::File::open(&full_path)?;
+ let mut buffer = Vec::new();
+ file.read_to_end(&mut buffer)?;
+ let data = hyper::body::Bytes::from(buffer);
+ proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))?;
+ Ok(())
+ }
}
impl RpcEnvironment for BackupEnvironment {
--
2.47.2
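Note that s3_upload_index reads the whole index into memory before
the upload. An equivalent async sketch using tokio::fs::read instead
of the manual read_to_end loop (illustrative only, not part of the
patch):
async fn upload_index_async(
    s3_client: &proxmox_s3_client::S3Client,
    snapshot_path: &std::path::Path,
    index_path: &std::path::Path,
    name: &str,
) -> Result<(), anyhow::Error> {
    let object_key = pbs_datastore::s3::object_key_from_path(snapshot_path, name)?;
    // read the freshly closed index file from the local cache store
    let contents = tokio::fs::read(index_path).await?;
    let data = hyper::body::Bytes::from(contents);
    s3_client.upload_with_retry(object_key, data, true).await?;
    Ok(())
}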
* [pbs-devel] [PATCH proxmox-backup v8 11/45] api: backup: conditionally upload manifest to s3 object store backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (18 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 10/45] api: backup: conditionally upload indices " Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 8:26 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 12/45] api: datastore: conditionally upload client log to s3 backend Christian Ebner
` (35 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
Reupload the manifest to the S3 object store backend on manifest
updates, if S3 is configured as the backend.
This also triggers the initial manifest upload when finishing a
backup snapshot in the backup API call handler.
The locally cached version is updated as well, allowing fast and
efficient listing of contents without the need to perform expensive
(as in monetary cost and IO latency) requests.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/Cargo.toml | 3 +++
pbs-datastore/src/backup_info.rs | 12 +++++++++++-
src/api2/admin/datastore.rs | 14 ++++++++++++--
src/api2/backup/environment.rs | 16 ++++++++--------
src/backup/verify.rs | 2 +-
5 files changed, 35 insertions(+), 12 deletions(-)
diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
index c42eff165..7e56dbd31 100644
--- a/pbs-datastore/Cargo.toml
+++ b/pbs-datastore/Cargo.toml
@@ -13,6 +13,7 @@ crc32fast.workspace = true
endian_trait.workspace = true
futures.workspace = true
hex = { workspace = true, features = [ "serde" ] }
+hyper.workspace = true
libc.workspace = true
log.workspace = true
nix.workspace = true
@@ -29,8 +30,10 @@ zstd-safe.workspace = true
pathpatterns.workspace = true
pxar.workspace = true
+proxmox-async.workspace = true
proxmox-base64.workspace = true
proxmox-borrow.workspace = true
+proxmox-http.workspace = true
proxmox-human-byte.workspace = true
proxmox-io.workspace = true
proxmox-lang.workspace=true
diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index e3ecd437f..46e5b61f0 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -19,7 +19,7 @@ use pbs_api_types::{
use pbs_config::{open_backup_lockfile, BackupLockGuard};
use crate::manifest::{BackupManifest, MANIFEST_LOCK_NAME};
-use crate::{DataBlob, DataStore};
+use crate::{DataBlob, DataStore, DatastoreBackend};
pub const DATASTORE_LOCKS_DIR: &str = "/run/proxmox-backup/locks";
const PROTECTED_MARKER_FILENAME: &str = ".protected";
@@ -666,6 +666,7 @@ impl BackupDir {
/// only use this method - anything else may break locking guarantees.
pub fn update_manifest(
&self,
+ backend: &DatastoreBackend,
update_fn: impl FnOnce(&mut BackupManifest),
) -> Result<(), Error> {
let _guard = self.lock_manifest()?;
@@ -678,6 +679,15 @@ impl BackupDir {
let blob = DataBlob::encode(manifest.as_bytes(), None, true)?;
let raw_data = blob.raw_data();
+ if let DatastoreBackend::S3(s3_client) = backend {
+ let object_key =
+ super::s3::object_key_from_path(&self.relative_path(), MANIFEST_BLOB_NAME.as_ref())
+ .context("invalid manifest object key")?;
+ let data = hyper::body::Bytes::copy_from_slice(raw_data);
+ proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
+ .context("failed to update manifest on s3 backend")?;
+ }
+
let mut path = self.full_path();
path.push(MANIFEST_BLOB_NAME.as_ref());
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index e24bc1c1b..02666afda 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -65,7 +65,7 @@ use pbs_datastore::manifest::BackupManifest;
use pbs_datastore::prune::compute_prune_info;
use pbs_datastore::{
check_backup_owner, ensure_datastore_is_mounted, task_tracking, BackupDir, BackupGroup,
- DataStore, LocalChunkReader, StoreProgress,
+ DataStore, DatastoreBackend, LocalChunkReader, StoreProgress,
};
use pbs_tools::json::required_string_param;
use proxmox_rest_server::{formatter, WorkerTask};
@@ -2086,6 +2086,16 @@ pub fn set_group_notes(
&backup_group,
)?;
+ if let DatastoreBackend::S3(s3_client) = datastore.backend()? {
+ let mut path = ns.path();
+ path.push(format!("{backup_group}"));
+ let object_key = pbs_datastore::s3::object_key_from_path(&path, "notes")
+ .context("invalid owner file object key")?;
+ let data = hyper::body::Bytes::copy_from_slice(notes.as_bytes());
+ let _is_duplicate =
+ proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
+ .context("failed to set notes on s3 backend")?;
+ }
let notes_path = datastore.group_notes_path(&ns, &backup_group);
replace_file(notes_path, notes.as_bytes(), CreateOptions::new(), false)?;
@@ -2188,7 +2198,7 @@ pub fn set_notes(
let backup_dir = datastore.backup_dir(ns, backup_dir)?;
backup_dir
- .update_manifest(|manifest| {
+ .update_manifest(&datastore.backend()?, |manifest| {
manifest.unprotected["notes"] = notes.into();
})
.map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 9ad13aeb3..0017b347d 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -646,14 +646,6 @@ impl BackupEnvironment {
bail!("backup does not contain valid files (file count == 0)");
}
- // check for valid manifest and store stats
- let stats = serde_json::to_value(state.backup_stat)?;
- self.backup_dir
- .update_manifest(|manifest| {
- manifest.unprotected["chunk_upload_stats"] = stats;
- })
- .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
-
if let Some(base) = &self.last_backup {
let path = base.backup_dir.full_path();
if !path.exists() {
@@ -664,6 +656,14 @@ impl BackupEnvironment {
}
}
+ // check for valid manifest and store stats
+ let stats = serde_json::to_value(state.backup_stat)?;
+ self.backup_dir
+ .update_manifest(&self.backend, |manifest| {
+ manifest.unprotected["chunk_upload_stats"] = stats;
+ })
+ .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
+
self.datastore.try_ensure_sync_level()?;
// marks the backup as successful
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index 0b954ae23..9344033d8 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -359,7 +359,7 @@ impl VerifyWorker {
if let Err(err) = {
let verify_state = serde_json::to_value(verify_state)?;
- backup_dir.update_manifest(|manifest| {
+ backup_dir.update_manifest(&self.datastore.backend()?, |manifest| {
manifest.unprotected["verify_state"] = verify_state;
})
} {
--
2.47.2
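On the caller side the signature change is mechanical; a usage
sketch based on the set_notes hunk above:
// the backend is now passed in, so the manifest update is mirrored
// to the S3 bucket before the local blob is rewritten
let backend = datastore.backend()?;
backup_dir.update_manifest(&backend, |manifest| {
    manifest.unprotected["notes"] = notes.into();
})?;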
* [pbs-devel] [PATCH proxmox-backup v8 12/45] api: datastore: conditionally upload client log to s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (19 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 11/45] api: backup: conditionally upload manifest " Christian Ebner
@ 2025-07-15 12:52 ` Christian Ebner
2025-07-18 8:28 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 13/45] sync: pull: conditionally upload content " Christian Ebner
` (34 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:52 UTC (permalink / raw)
To: pbs-devel
If the datastore is backed by an s3 compatible object store, upload
the client log content to the s3 backend before persisting it to the
local cache store.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/admin/datastore.rs | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 02666afda..b28b646e8 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1637,6 +1637,17 @@ pub fn upload_backup_log(
// always verify blob/CRC at server side
let blob = DataBlob::load_from_reader(&mut &data[..])?;
+ if let DatastoreBackend::S3(s3_client) = datastore.backend()? {
+ let object_key = pbs_datastore::s3::object_key_from_path(
+ &backup_dir.relative_path(),
+ file_name.as_ref(),
+ )
+ .context("invalid client log object key")?;
+ let data = hyper::body::Bytes::copy_from_slice(blob.raw_data());
+ proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
+ .context("failed to upload client log to s3 backend")?;
+ };
+
replace_file(&path, blob.raw_data(), CreateOptions::new(), false)?;
// fixme: use correct formatter
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 13/45] sync: pull: conditionally upload content to s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (20 preceding siblings ...)
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 12/45] api: datastore: conditionally upload client log to s3 backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 8:35 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend Christian Ebner
` (33 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
If the datastore is backed by an S3 object store, not only insert
the pulled contents into the local cache store, but also upload them
to the S3 backend.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/server/pull.rs | 66 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 63 insertions(+), 3 deletions(-)
diff --git a/src/server/pull.rs b/src/server/pull.rs
index b1724c142..fe87359ab 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -6,8 +6,9 @@ use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::time::SystemTime;
-use anyhow::{bail, format_err, Error};
+use anyhow::{bail, format_err, Context, Error};
use proxmox_human_byte::HumanByte;
+use tokio::io::AsyncReadExt;
use tracing::info;
use pbs_api_types::{
@@ -24,7 +25,7 @@ use pbs_datastore::fixed_index::FixedIndexReader;
use pbs_datastore::index::IndexFile;
use pbs_datastore::manifest::{BackupManifest, FileInfo};
use pbs_datastore::read_chunk::AsyncReadChunk;
-use pbs_datastore::{check_backup_owner, DataStore, StoreProgress};
+use pbs_datastore::{check_backup_owner, DataStore, DatastoreBackend, StoreProgress};
use pbs_tools::sha::sha256;
use super::sync::{
@@ -167,7 +168,20 @@ async fn pull_index_chunks<I: IndexFile>(
move |(chunk, digest, size): (DataBlob, [u8; 32], u64)| {
// println!("verify and write {}", hex::encode(&digest));
chunk.verify_unencrypted(size as usize, &digest)?;
- target2.insert_chunk(&chunk, &digest)?;
+ match target2.backend()? {
+ DatastoreBackend::Filesystem => {
+ target2.insert_chunk(&chunk, &digest)?;
+ }
+ DatastoreBackend::S3(s3_client) => {
+ let data = chunk.raw_data().to_vec();
+ let upload_data = hyper::body::Bytes::from(data);
+ let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
+ let _is_duplicate = proxmox_async::runtime::block_on(
+ s3_client.upload_with_retry(object_key, upload_data, false),
+ )
+ .context("failed to upload chunk to s3 backend")?;
+ }
+ }
Ok(())
},
);
@@ -331,6 +345,18 @@ async fn pull_single_archive<'a>(
if let Err(err) = std::fs::rename(&tmp_path, &path) {
bail!("Atomic rename file {:?} failed - {}", path, err);
}
+ if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
+ let object_key =
+ pbs_datastore::s3::object_key_from_path(&snapshot.relative_path(), archive_name)
+ .context("invalid archive object key")?;
+
+ let archive = tokio::fs::File::open(&path).await?;
+ let mut reader = tokio::io::BufReader::new(archive);
+ let mut contents = Vec::new();
+ reader.read_to_end(&mut contents).await?;
+ let data = hyper::body::Bytes::from(contents);
+ let _is_duplicate = s3_client.upload_with_retry(object_key, data, true).await?;
+ }
Ok(sync_stats)
}
@@ -401,6 +427,7 @@ async fn pull_snapshot<'a>(
}
}
+ let manifest_data = tmp_manifest_blob.raw_data().to_vec();
let manifest = BackupManifest::try_from(tmp_manifest_blob)?;
if ignore_not_verified_or_encrypted(
@@ -467,9 +494,42 @@ async fn pull_snapshot<'a>(
if let Err(err) = std::fs::rename(&tmp_manifest_name, &manifest_name) {
bail!("Atomic rename file {:?} failed - {}", manifest_name, err);
}
+ if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
+ let object_key = pbs_datastore::s3::object_key_from_path(
+ &snapshot.relative_path(),
+ MANIFEST_BLOB_NAME.as_ref(),
+ )
+ .context("invalid manifest object key")?;
+
+ let data = hyper::body::Bytes::from(manifest_data);
+ let _is_duplicate = s3_client
+ .upload_with_retry(object_key, data, true)
+ .await
+ .context("failed to upload manifest to s3 backend")?;
+ }
if !client_log_name.exists() {
reader.try_download_client_log(&client_log_name).await?;
+ if client_log_name.exists() {
+ if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
+ let object_key = pbs_datastore::s3::object_key_from_path(
+ &snapshot.relative_path(),
+ CLIENT_LOG_BLOB_NAME.as_ref(),
+ )
+ .context("invalid archive object key")?;
+
+ let log_file = tokio::fs::File::open(&client_log_name).await?;
+ let mut reader = tokio::io::BufReader::new(log_file);
+ let mut contents = Vec::new();
+ reader.read_to_end(&mut contents).await?;
+
+ let data = hyper::body::Bytes::from(contents);
+ let _is_duplicate = s3_client
+ .upload_with_retry(object_key, data, true)
+ .await
+ .context("failed to upload client log to s3 backend")?;
+ }
+ }
};
snapshot
.cleanup_unreferenced_files(&manifest)
--
2.47.2
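Condensed, the per-chunk dispatch added to pull_index_chunks looks
as follows; note that the final upload_with_retry argument is false
for chunks but true for metadata files (the exact semantics of that
flag are an assumption here, based on its usage in this series):
match target.backend()? {
    DatastoreBackend::Filesystem => {
        // local cache store is the only backend, insert as before
        target.insert_chunk(&chunk, &digest)?;
    }
    DatastoreBackend::S3(s3_client) => {
        let upload_data = hyper::body::Bytes::from(chunk.raw_data().to_vec());
        let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
        let _is_duplicate = proxmox_async::runtime::block_on(
            s3_client.upload_with_retry(object_key, upload_data, false),
        )?;
    }
}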
* [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (21 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 13/45] sync: pull: conditionally upload content " Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 8:38 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 15/45] datastore: local chunk reader: read chunks based on backend Christian Ebner
` (32 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Read the chunk based on the datastore's backend, either reading it
from the local filesystem or fetching it from the S3 object store.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/reader/environment.rs | 12 ++++++----
src/api2/reader/mod.rs | 41 +++++++++++++++++++++++-----------
2 files changed, 36 insertions(+), 17 deletions(-)
diff --git a/src/api2/reader/environment.rs b/src/api2/reader/environment.rs
index 3b2f06f43..8924352b0 100644
--- a/src/api2/reader/environment.rs
+++ b/src/api2/reader/environment.rs
@@ -1,13 +1,14 @@
use std::collections::HashSet;
use std::sync::{Arc, RwLock};
+use anyhow::Error;
use serde_json::{json, Value};
use proxmox_router::{RpcEnvironment, RpcEnvironmentType};
use pbs_api_types::Authid;
use pbs_datastore::backup_info::BackupDir;
-use pbs_datastore::DataStore;
+use pbs_datastore::{DataStore, DatastoreBackend};
use proxmox_rest_server::formatter::*;
use proxmox_rest_server::WorkerTask;
use tracing::info;
@@ -23,6 +24,7 @@ pub struct ReaderEnvironment {
pub worker: Arc<WorkerTask>,
pub datastore: Arc<DataStore>,
pub backup_dir: BackupDir,
+ pub backend: DatastoreBackend,
allowed_chunks: Arc<RwLock<HashSet<[u8; 32]>>>,
}
@@ -33,8 +35,9 @@ impl ReaderEnvironment {
worker: Arc<WorkerTask>,
datastore: Arc<DataStore>,
backup_dir: BackupDir,
- ) -> Self {
- Self {
+ ) -> Result<Self, Error> {
+ let backend = datastore.backend()?;
+ Ok(Self {
result_attributes: json!({}),
env_type,
auth_id,
@@ -43,8 +46,9 @@ impl ReaderEnvironment {
debug: tracing::enabled!(tracing::Level::DEBUG),
formatter: JSON_FORMATTER,
backup_dir,
+ backend,
allowed_chunks: Arc::new(RwLock::new(HashSet::new())),
- }
+ })
}
pub fn log<S: AsRef<str>>(&self, msg: S) {
diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs
index a77216043..997d9ca77 100644
--- a/src/api2/reader/mod.rs
+++ b/src/api2/reader/mod.rs
@@ -3,6 +3,7 @@
use anyhow::{bail, format_err, Context, Error};
use futures::*;
use hex::FromHex;
+use http_body_util::BodyExt;
use hyper::body::Incoming;
use hyper::header::{self, HeaderValue, CONNECTION, UPGRADE};
use hyper::http::request::Parts;
@@ -27,8 +28,9 @@ use pbs_api_types::{
};
use pbs_config::CachedUserInfo;
use pbs_datastore::index::IndexFile;
-use pbs_datastore::{DataStore, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
+use pbs_datastore::{DataStore, DatastoreBackend, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
use pbs_tools::json::required_string_param;
+use proxmox_s3_client::S3Client;
use crate::api2::backup::optional_ns_param;
use crate::api2::helpers;
@@ -162,7 +164,7 @@ fn upgrade_to_backup_reader_protocol(
worker.clone(),
datastore,
backup_dir,
- );
+ )?;
env.debug = debug;
@@ -323,17 +325,10 @@ fn download_chunk(
));
}
- let (path, _) = env.datastore.chunk_path(&digest);
- let path2 = path.clone();
-
- env.debug(format!("download chunk {:?}", path));
-
- let data =
- proxmox_async::runtime::block_in_place(|| std::fs::read(path)).map_err(move |err| {
- http_err!(BAD_REQUEST, "reading file {:?} failed: {}", path2, err)
- })?;
-
- let body = Body::from(data);
+ let body = match &env.backend {
+ DatastoreBackend::Filesystem => load_from_filesystem(env, &digest)?,
+ DatastoreBackend::S3(s3_client) => fetch_from_object_store(s3_client, &digest).await?,
+ };
// fixme: set other headers ?
Ok(Response::builder()
@@ -345,6 +340,26 @@ fn download_chunk(
.boxed()
}
+async fn fetch_from_object_store(s3_client: &S3Client, digest: &[u8; 32]) -> Result<Body, Error> {
+ let object_key = pbs_datastore::s3::object_key_from_digest(digest)?;
+ if let Some(response) = s3_client.get_object(object_key).await? {
+ let data = response.content.collect().await?.to_bytes();
+ return Ok(Body::from(data));
+ }
+ bail!("cannot find chunk with digest {}", hex::encode(digest));
+}
+
+fn load_from_filesystem(env: &ReaderEnvironment, digest: &[u8; 32]) -> Result<Body, Error> {
+ let (path, _) = env.datastore.chunk_path(digest);
+ let path2 = path.clone();
+
+ env.debug(format!("download chunk {path:?}"));
+
+ let data = proxmox_async::runtime::block_in_place(|| std::fs::read(path))
+ .map_err(move |err| http_err!(BAD_REQUEST, "reading file {path2:?} failed: {err}"))?;
+ Ok(Body::from(data))
+}
+
/* this is too slow
fn download_chunk_old(
_parts: Parts,
--
2.47.2
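The S3 read path boils down to a get_object call followed by
collecting the streamed body; a minimal sketch of the helper
introduced above:
use http_body_util::BodyExt;
async fn get_chunk_bytes(
    s3_client: &proxmox_s3_client::S3Client,
    digest: &[u8; 32],
) -> Result<hyper::body::Bytes, anyhow::Error> {
    let object_key = pbs_datastore::s3::object_key_from_digest(digest)?;
    match s3_client.get_object(object_key).await? {
        // collect the response body stream into a contiguous buffer
        Some(response) => Ok(response.content.collect().await?.to_bytes()),
        None => anyhow::bail!("cannot find chunk with digest {}", hex::encode(digest)),
    }
}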
* [pbs-devel] [PATCH proxmox-backup v8 15/45] datastore: local chunk reader: read chunks based on backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (22 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 8:45 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 16/45] verify worker: add datastore backend to verify worker Christian Ebner
` (31 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Get and store the datastore's backend on local chunk reader
instantiation and fetch chunks based on the variant, from either the
filesystem or the S3 object store.
By storing the backend variant, the S3 client is instantiated only
once and reused until the local chunk reader instance is dropped.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/Cargo.toml | 1 +
pbs-datastore/src/local_chunk_reader.rs | 38 +++++++++++++++++++++----
2 files changed, 33 insertions(+), 6 deletions(-)
diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
index 7e56dbd31..8ce930a94 100644
--- a/pbs-datastore/Cargo.toml
+++ b/pbs-datastore/Cargo.toml
@@ -13,6 +13,7 @@ crc32fast.workspace = true
endian_trait.workspace = true
futures.workspace = true
hex = { workspace = true, features = [ "serde" ] }
+http-body-util.workspace = true
hyper.workspace = true
libc.workspace = true
log.workspace = true
diff --git a/pbs-datastore/src/local_chunk_reader.rs b/pbs-datastore/src/local_chunk_reader.rs
index 05a70c068..f5aa217ae 100644
--- a/pbs-datastore/src/local_chunk_reader.rs
+++ b/pbs-datastore/src/local_chunk_reader.rs
@@ -3,17 +3,21 @@ use std::pin::Pin;
use std::sync::Arc;
use anyhow::{bail, Error};
+use http_body_util::BodyExt;
use pbs_api_types::CryptMode;
use pbs_tools::crypt_config::CryptConfig;
+use proxmox_s3_client::S3Client;
use crate::data_blob::DataBlob;
+use crate::datastore::DatastoreBackend;
use crate::read_chunk::{AsyncReadChunk, ReadChunk};
use crate::DataStore;
#[derive(Clone)]
pub struct LocalChunkReader {
store: Arc<DataStore>,
+ backend: DatastoreBackend,
crypt_config: Option<Arc<CryptConfig>>,
crypt_mode: CryptMode,
}
@@ -24,8 +28,11 @@ impl LocalChunkReader {
crypt_config: Option<Arc<CryptConfig>>,
crypt_mode: CryptMode,
) -> Self {
+ // TODO: Error handling!
+ let backend = store.backend().unwrap();
Self {
store,
+ backend,
crypt_config,
crypt_mode,
}
@@ -47,10 +54,26 @@ impl LocalChunkReader {
}
}
+async fn fetch(s3_client: Arc<S3Client>, digest: &[u8; 32]) -> Result<DataBlob, Error> {
+ let object_key = crate::s3::object_key_from_digest(digest)?;
+ if let Some(response) = s3_client.get_object(object_key).await? {
+ let bytes = response.content.collect().await?.to_bytes();
+ DataBlob::from_raw(bytes.to_vec())
+ } else {
+ bail!("no object with digest {}", hex::encode(digest));
+ }
+}
+
impl ReadChunk for LocalChunkReader {
fn read_raw_chunk(&self, digest: &[u8; 32]) -> Result<DataBlob, Error> {
- let chunk = self.store.load_chunk(digest)?;
+ let chunk = match &self.backend {
+ DatastoreBackend::Filesystem => self.store.load_chunk(digest)?,
+ DatastoreBackend::S3(s3_client) => {
+ proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?
+ }
+ };
self.ensure_crypt_mode(chunk.crypt_mode()?)?;
+
Ok(chunk)
}
@@ -69,11 +92,14 @@ impl AsyncReadChunk for LocalChunkReader {
digest: &'a [u8; 32],
) -> Pin<Box<dyn Future<Output = Result<DataBlob, Error>> + Send + 'a>> {
Box::pin(async move {
- let (path, _) = self.store.chunk_path(digest);
-
- let raw_data = tokio::fs::read(&path).await?;
-
- let chunk = DataBlob::load_from_reader(&mut &raw_data[..])?;
+ let chunk = match &self.backend {
+ DatastoreBackend::Filesystem => {
+ let (path, _) = self.store.chunk_path(digest);
+ let raw_data = tokio::fs::read(&path).await?;
+ DataBlob::load_from_reader(&mut &raw_data[..])?
+ }
+ DatastoreBackend::S3(s3_client) => fetch(s3_client.clone(), digest).await?,
+ };
self.ensure_crypt_mode(chunk.crypt_mode()?)?;
Ok(chunk)
--
2.47.2
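The TODO above could later be resolved by making the constructor
fallible instead of unwrapping; a sketch with the same field layout
(hypothetical, not part of this patch):
impl LocalChunkReader {
    pub fn try_new(
        store: Arc<DataStore>,
        crypt_config: Option<Arc<CryptConfig>>,
        crypt_mode: CryptMode,
    ) -> Result<Self, Error> {
        // propagate backend instantiation errors to the caller
        let backend = store.backend()?;
        Ok(Self {
            store,
            backend,
            crypt_config,
            crypt_mode,
        })
    }
}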
* [pbs-devel] [PATCH proxmox-backup v8 16/45] verify worker: add datastore backend to verify worker
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (23 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 15/45] datastore: local chunk reader: read chunks based on backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 17/45] verify: implement chunk verification for stores with s3 backend Christian Ebner
` (30 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
In order to fetch chunks from an S3 compatible object store,
instantiate and store the S3 client in the verify worker by storing
the datastore's backend. This allows reusing the same instance for
the whole verification task.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/admin/datastore.rs | 2 +-
src/api2/backup/environment.rs | 2 +-
src/backup/verify.rs | 14 ++++++++++----
src/server/verify_job.rs | 2 +-
4 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index b28b646e8..35fcb2ac5 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -883,7 +883,7 @@ pub fn verify(
auth_id.to_string(),
to_stdout,
move |worker| {
- let verify_worker = VerifyWorker::new(worker.clone(), datastore);
+ let verify_worker = VerifyWorker::new(worker.clone(), datastore)?;
let failed_dirs = if let Some(backup_dir) = backup_dir {
let mut res = Vec::new();
if !verify_worker.verify_backup_dir(
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 0017b347d..369385368 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -710,7 +710,7 @@ impl BackupEnvironment {
move |worker| {
worker.log_message("Automatically verifying newly added snapshot");
- let verify_worker = VerifyWorker::new(worker.clone(), datastore);
+ let verify_worker = VerifyWorker::new(worker.clone(), datastore)?;
if !verify_worker.verify_backup_dir_with_lock(
&backup_dir,
worker.upid().clone(),
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index 9344033d8..dea10f618 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -17,7 +17,7 @@ use pbs_api_types::{
use pbs_datastore::backup_info::{BackupDir, BackupGroup, BackupInfo};
use pbs_datastore::index::IndexFile;
use pbs_datastore::manifest::{BackupManifest, FileInfo};
-use pbs_datastore::{DataBlob, DataStore, StoreProgress};
+use pbs_datastore::{DataBlob, DataStore, DatastoreBackend, StoreProgress};
use crate::tools::parallel_handler::ParallelHandler;
@@ -30,19 +30,25 @@ pub struct VerifyWorker {
datastore: Arc<DataStore>,
verified_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
corrupt_chunks: Arc<Mutex<HashSet<[u8; 32]>>>,
+ backend: DatastoreBackend,
}
impl VerifyWorker {
/// Creates a new VerifyWorker for a given task worker and datastore.
- pub fn new(worker: Arc<dyn WorkerTaskContext>, datastore: Arc<DataStore>) -> Self {
- Self {
+ pub fn new(
+ worker: Arc<dyn WorkerTaskContext>,
+ datastore: Arc<DataStore>,
+ ) -> Result<Self, Error> {
+ let backend = datastore.backend()?;
+ Ok(Self {
worker,
datastore,
// start with 16k chunks == up to 64G data
verified_chunks: Arc::new(Mutex::new(HashSet::with_capacity(16 * 1024))),
// start with 64 chunks since we assume there are few corrupt ones
corrupt_chunks: Arc::new(Mutex::new(HashSet::with_capacity(64))),
- }
+ backend,
+ })
}
fn verify_blob(backup_dir: &BackupDir, info: &FileInfo) -> Result<(), Error> {
diff --git a/src/server/verify_job.rs b/src/server/verify_job.rs
index 95a7b2a9b..c8792174b 100644
--- a/src/server/verify_job.rs
+++ b/src/server/verify_job.rs
@@ -41,7 +41,7 @@ pub fn do_verification_job(
None => Default::default(),
};
- let verify_worker = VerifyWorker::new(worker.clone(), datastore);
+ let verify_worker = VerifyWorker::new(worker.clone(), datastore)?;
let result = verify_worker.verify_all_backups(
worker.upid(),
ns,
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 17/45] verify: implement chunk verification for stores with s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (24 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 16/45] verify worker: add datastore backend to verify worker Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 8:56 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 18/45] datastore: create namespace marker in " Christian Ebner
` (29 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
For datastores backed by an S3 compatible object store, rather than
reading the chunks to be verified from the local filesystem, fetch
them via the s3 client from the configured bucket.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/backup/verify.rs | 89 ++++++++++++++++++++++++++++++++++++++------
1 file changed, 77 insertions(+), 12 deletions(-)
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index dea10f618..3a4a1d0d5 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
use std::time::Instant;
use anyhow::{bail, Error};
+use http_body_util::BodyExt;
use tracing::{error, info, warn};
use proxmox_worker_task::WorkerTaskContext;
@@ -89,6 +90,38 @@ impl VerifyWorker {
}
}
+ if let Ok(DatastoreBackend::S3(s3_client)) = datastore.backend() {
+ let suffix = format!(".{}.bad", counter);
+ let target_key =
+ match pbs_datastore::s3::object_key_from_digest_with_suffix(digest, &suffix) {
+ Ok(target_key) => target_key,
+ Err(err) => {
+ info!("could not generate target key for corrupted chunk {path:?} - {err}");
+ return;
+ }
+ };
+ let object_key = match pbs_datastore::s3::object_key_from_digest(digest) {
+ Ok(object_key) => object_key,
+ Err(err) => {
+ info!("could not generate object key for corrupted chunk {path:?} - {err}");
+ return;
+ }
+ };
+ if proxmox_async::runtime::block_on(
+ s3_client.copy_object(object_key.clone(), target_key),
+ )
+ .is_ok()
+ {
+ if proxmox_async::runtime::block_on(s3_client.delete_object(object_key)).is_err() {
+ info!("failed to delete corrupt chunk on s3 backend: {digest_str}");
+ }
+ } else {
+ info!("failed to copy corrupt chunk on s3 backend: {digest_str}");
+ }
+ } else {
+ info!("failed to get s3 backend while trying to rename bad chunk: {digest_str}");
+ }
+
match std::fs::rename(&path, &new_path) {
Ok(_) => {
info!("corrupted chunk renamed to {:?}", &new_path);
@@ -189,18 +222,50 @@ impl VerifyWorker {
continue; // already verified or marked corrupt
}
- match self.datastore.load_chunk(&info.digest) {
- Err(err) => {
- self.corrupt_chunks.lock().unwrap().insert(info.digest);
- error!("can't verify chunk, load failed - {err}");
- errors.fetch_add(1, Ordering::SeqCst);
- Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
- }
- Ok(chunk) => {
- let size = info.size();
- read_bytes += chunk.raw_size();
- decoder_pool.send((chunk, info.digest, size))?;
- decoded_bytes += size;
+ match &self.backend {
+ DatastoreBackend::Filesystem => match self.datastore.load_chunk(&info.digest) {
+ Err(err) => {
+ self.corrupt_chunks.lock().unwrap().insert(info.digest);
+ error!("can't verify chunk, load failed - {err}");
+ errors.fetch_add(1, Ordering::SeqCst);
+ Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
+ }
+ Ok(chunk) => {
+ let size = info.size();
+ read_bytes += chunk.raw_size();
+ decoder_pool.send((chunk, info.digest, size))?;
+ decoded_bytes += size;
+ }
+ },
+ DatastoreBackend::S3(s3_client) => {
+ let object_key = pbs_datastore::s3::object_key_from_digest(&info.digest)?;
+ match proxmox_async::runtime::block_on(s3_client.get_object(object_key)) {
+ Ok(Some(response)) => {
+ let bytes =
+ proxmox_async::runtime::block_on(response.content.collect())?
+ .to_bytes();
+ let chunk = DataBlob::from_raw(bytes.to_vec())?;
+ let size = info.size();
+ read_bytes += chunk.raw_size();
+ decoder_pool.send((chunk, info.digest, size))?;
+ decoded_bytes += size;
+ }
+ Ok(None) => {
+ self.corrupt_chunks.lock().unwrap().insert(info.digest);
+ error!(
+ "can't verify missing chunk with digest {}",
+ hex::encode(info.digest)
+ );
+ errors.fetch_add(1, Ordering::SeqCst);
+ Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
+ }
+ Err(err) => {
+ self.corrupt_chunks.lock().unwrap().insert(info.digest);
+ error!("can't verify chunk, load failed - {err}");
+ errors.fetch_add(1, Ordering::SeqCst);
+ Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
+ }
+ }
}
}
}
--
2.47.2
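Since S3 offers no server-side rename, moving a corrupt chunk aside
is emulated as copy-then-delete; condensed from the hunk above:
// derive the ".N.bad" key next to the original chunk object
let suffix = format!(".{counter}.bad");
let bad_key = pbs_datastore::s3::object_key_from_digest_with_suffix(digest, &suffix)?;
let object_key = pbs_datastore::s3::object_key_from_digest(digest)?;
// copy first, only then delete the original, so a failure in
// between leaves the chunk object in place
proxmox_async::runtime::block_on(s3_client.copy_object(object_key.clone(), bad_key))?;
proxmox_async::runtime::block_on(s3_client.delete_object(object_key))?;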
* [pbs-devel] [PATCH proxmox-backup v8 18/45] datastore: create namespace marker in s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (25 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 17/45] verify: implement chunk verification for stores with s3 backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 19/45] datastore: create/delete protected marker file on s3 storage backend Christian Ebner
` (28 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
The S3 object store only allows storing objects, referenced by their
key. Datastores, however, use directories for backup namespaces, so
namespaces cannot be represented by a one-to-one mapping.
Instead, create an empty marker file for each namespace and operate
based on that.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/datastore.rs | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 90ab80005..0bc14e31d 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -42,6 +42,7 @@ static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
LazyLock::new(|| Mutex::new(HashMap::new()));
const GROUP_NOTES_FILE_NAME: &str = "notes";
+const NAMESPACE_MARKER_FILENAME: &str = ".namespace";
/// checks if auth_id is owner, or, if owner is a token, if
/// auth_id is the user of the token
@@ -607,6 +608,17 @@ impl DataStore {
// construct ns before mkdir to enforce max-depth and name validity
let ns = BackupNamespace::from_parent_ns(parent, name)?;
+ if let DatastoreBackend::S3(s3_client) = self.backend()? {
+ let object_key = crate::s3::object_key_from_path(&ns.path(), NAMESPACE_MARKER_FILENAME)
+ .context("invalid namespace marker object key")?;
+ let _is_duplicate = proxmox_async::runtime::block_on(s3_client.upload_with_retry(
+ object_key,
+ hyper::body::Bytes::from(""),
+ false,
+ ))
+ .context("failed to create namespace on s3 backend")?;
+ }
+
let mut ns_full_path = self.base_path();
ns_full_path.push(ns.path());
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 19/45] datastore: create/delete protected marker file on s3 storage backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (26 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 18/45] datastore: create namespace marker in " Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 20/45] datastore: prune groups/snapshots from s3 object store backend Christian Ebner
` (27 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Commit 8292d3d2 ("api2/admin/datastore: add get/set_protection")
introduced the protected flag for backup snapshots, considering
snapshots as protected based on the presence/absence of the
`.protected` marker file in the corresponding snapshot directory.
To allow independent recovery of a datastore backed by an S3 bucket,
also create/delete the marker file on the object store backend. For
actual checks, still rely on the marker as encountered in the local
cache store.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/backup_info.rs | 2 +-
pbs-datastore/src/datastore.rs | 43 +++++++++++++++++++++++++++-----
2 files changed, 38 insertions(+), 7 deletions(-)
diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index 46e5b61f0..112d82658 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -22,7 +22,7 @@ use crate::manifest::{BackupManifest, MANIFEST_LOCK_NAME};
use crate::{DataBlob, DataStore, DatastoreBackend};
pub const DATASTORE_LOCKS_DIR: &str = "/run/proxmox-backup/locks";
-const PROTECTED_MARKER_FILENAME: &str = ".protected";
+pub const PROTECTED_MARKER_FILENAME: &str = ".protected";
proxmox_schema::const_regex! {
pub BACKUP_FILES_AND_PROTECTED_REGEX = concatcp!(r"^(.*\.([fd]idx|blob)|\", PROTECTED_MARKER_FILENAME, ")$");
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 0bc14e31d..099c65ce2 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -29,7 +29,9 @@ use pbs_api_types::{
};
use pbs_config::BackupLockGuard;
-use crate::backup_info::{BackupDir, BackupGroup, BackupInfo, OLD_LOCKING};
+use crate::backup_info::{
+ BackupDir, BackupGroup, BackupInfo, OLD_LOCKING, PROTECTED_MARKER_FILENAME,
+};
use crate::chunk_store::ChunkStore;
use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
@@ -1554,12 +1556,41 @@ impl DataStore {
let protected_path = backup_dir.protected_file();
if protection {
- std::fs::File::create(protected_path)
+ std::fs::File::create(&protected_path)
.map_err(|err| format_err!("could not create protection file: {}", err))?;
- } else if let Err(err) = std::fs::remove_file(protected_path) {
- // ignore error for non-existing file
- if err.kind() != std::io::ErrorKind::NotFound {
- bail!("could not remove protection file: {}", err);
+ if let DatastoreBackend::S3(s3_client) = self.backend()? {
+ let object_key = crate::s3::object_key_from_path(
+ &backup_dir.relative_path(),
+ PROTECTED_MARKER_FILENAME,
+ )
+ .context("invalid protected marker object key")?;
+ let _is_duplicate = proxmox_async::runtime::block_on(s3_client.upload_with_retry(
+ object_key,
+ hyper::body::Bytes::from(""),
+ false,
+ ))
+ .context("failed to mark snapshot as protected on s3 backend")?;
+ }
+ } else {
+ if let Err(err) = std::fs::remove_file(&protected_path) {
+ // ignore error for non-existing file
+ if err.kind() != std::io::ErrorKind::NotFound {
+ bail!("could not remove protection file: {err}");
+ }
+ }
+ if let DatastoreBackend::S3(s3_client) = self.backend()? {
+ let object_key = crate::s3::object_key_from_path(
+ &backup_dir.relative_path(),
+ PROTECTED_MARKER_FILENAME,
+ )
+ .context("invalid protected marker object key")?;
+ if let Err(err) =
+ proxmox_async::runtime::block_on(s3_client.delete_object(object_key))
+ {
+ std::fs::File::create(&protected_path)
+ .map_err(|err| format_err!("could not re-create protection file: {err}"))?;
+ return Err(err);
+ }
}
}
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 20/45] datastore: prune groups/snapshots from s3 object store backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (27 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 19/45] datastore: create/delete protected marker file on s3 storage backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 21/45] datastore: get and set owner for s3 " Christian Ebner
` (26 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
When pruning a backup group or a backup snapshot for a datastore
with an S3 object store backend, remove the associated objects based
on their common key prefix.
In order to exclude protected contents, add filtering based on the
presence of the protected marker.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/backup_info.rs | 49 ++++++++++++++++++++++++++++++--
pbs-datastore/src/datastore.rs | 43 ++++++++++++++++++++++++----
src/api2/admin/datastore.rs | 24 ++++++++++------
3 files changed, 100 insertions(+), 16 deletions(-)
diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index 112d82658..2e811f32a 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -9,6 +9,7 @@ use std::time::Duration;
use anyhow::{bail, format_err, Context, Error};
use const_format::concatcp;
+use proxmox_s3_client::S3PathPrefix;
use proxmox_sys::fs::{lock_dir_noblock, lock_dir_noblock_shared, replace_file, CreateOptions};
use proxmox_systemd::escape_unit;
@@ -18,7 +19,9 @@ use pbs_api_types::{
};
use pbs_config::{open_backup_lockfile, BackupLockGuard};
+use crate::datastore::GROUP_NOTES_FILE_NAME;
use crate::manifest::{BackupManifest, MANIFEST_LOCK_NAME};
+use crate::s3::S3_CONTENT_PREFIX;
use crate::{DataBlob, DataStore, DatastoreBackend};
pub const DATASTORE_LOCKS_DIR: &str = "/run/proxmox-backup/locks";
@@ -218,7 +221,7 @@ impl BackupGroup {
///
/// Returns `BackupGroupDeleteStats`, containing the number of deleted snapshots
/// and number of protected snapshots, which therefore were not removed.
- pub fn destroy(&self) -> Result<BackupGroupDeleteStats, Error> {
+ pub fn destroy(&self, backend: &DatastoreBackend) -> Result<BackupGroupDeleteStats, Error> {
let _guard = self
.lock()
.with_context(|| format!("while destroying group '{self:?}'"))?;
@@ -232,10 +235,30 @@ impl BackupGroup {
delete_stats.increment_protected_snapshots();
continue;
}
- snap.destroy(false)?;
+ // also for S3 cleanup local only, the actual S3 objects will be removed below,
+ // reducing the number of required API calls.
+ snap.destroy(false, &DatastoreBackend::Filesystem)?;
delete_stats.increment_removed_snapshots();
}
+ if let DatastoreBackend::S3(s3_client) = backend {
+ let path = self.relative_group_path();
+ let group_prefix = path
+ .to_str()
+ .ok_or_else(|| format_err!("invalid group path prefix"))?;
+ let prefix = format!("{S3_CONTENT_PREFIX}/{group_prefix}");
+ let delete_objects_error = proxmox_async::runtime::block_on(
+ s3_client.delete_objects_by_prefix_with_suffix_filter(
+ &S3PathPrefix::Some(prefix),
+ PROTECTED_MARKER_FILENAME,
+ &["owner", GROUP_NOTES_FILE_NAME],
+ ),
+ )?;
+ if delete_objects_error {
+ bail!("deleting objects failed");
+ }
+ }
+
// Note: make sure the old locking mechanism isn't used as `remove_dir_all` is not safe in
// that case
if delete_stats.all_removed() && !*OLD_LOCKING {
@@ -588,7 +611,7 @@ impl BackupDir {
/// Destroy the whole snapshot, bails if it's protected
///
/// Setting `force` to true skips locking and thus ignores if the backup is currently in use.
- pub fn destroy(&self, force: bool) -> Result<(), Error> {
+ pub fn destroy(&self, force: bool, backend: &DatastoreBackend) -> Result<(), Error> {
let (_guard, _manifest_guard);
if !force {
_guard = self
@@ -601,6 +624,20 @@ impl BackupDir {
bail!("cannot remove protected snapshot"); // use special error type?
}
+ if let DatastoreBackend::S3(s3_client) = backend {
+ let path = self.relative_path();
+ let snapshot_prefix = path
+ .to_str()
+ .ok_or_else(|| format_err!("invalid snapshot path"))?;
+ let prefix = format!("{S3_CONTENT_PREFIX}/{snapshot_prefix}");
+ let delete_objects_error = proxmox_async::runtime::block_on(
+ s3_client.delete_objects_by_prefix(&S3PathPrefix::Some(prefix)),
+ )?;
+ if delete_objects_error {
+ bail!("deleting objects failed");
+ }
+ }
+
let full_path = self.full_path();
log::info!("removing backup snapshot {:?}", full_path);
std::fs::remove_dir_all(&full_path).map_err(|err| {
@@ -630,6 +667,12 @@ impl BackupDir {
// do to rectify the situation.
if guard.is_ok() && group.list_backups()?.is_empty() && !*OLD_LOCKING {
group.remove_group_dir()?;
+ if let DatastoreBackend::S3(s3_client) = backend {
+ let object_key =
+ super::s3::object_key_from_path(&group.relative_group_path(), "owner")
+ .context("invalid owner file object key")?;
+ proxmox_async::runtime::block_on(s3_client.delete_object(object_key))?;
+ }
} else if let Err(err) = guard {
log::debug!("{err:#}");
}
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 099c65ce2..265624229 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -12,7 +12,9 @@ use pbs_tools::lru_cache::LruCache;
use tracing::{info, warn};
use proxmox_human_byte::HumanByte;
-use proxmox_s3_client::{S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig};
+use proxmox_s3_client::{
+ S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3PathPrefix,
+};
use proxmox_schema::ApiType;
use proxmox_sys::error::SysError;
@@ -37,13 +39,14 @@ use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
use crate::hierarchy::{ListGroups, ListGroupsType, ListNamespaces, ListNamespacesRecursive};
use crate::index::IndexFile;
+use crate::s3::S3_CONTENT_PREFIX;
use crate::task_tracking::{self, update_active_operations};
use crate::DataBlob;
static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
LazyLock::new(|| Mutex::new(HashMap::new()));
-const GROUP_NOTES_FILE_NAME: &str = "notes";
+pub const GROUP_NOTES_FILE_NAME: &str = "notes";
const NAMESPACE_MARKER_FILENAME: &str = ".namespace";
/// checks if auth_id is owner, or, if owner is a token, if
@@ -654,7 +657,9 @@ impl DataStore {
let mut stats = BackupGroupDeleteStats::default();
for group in self.iter_backup_groups(ns.to_owned())? {
- let delete_stats = group?.destroy()?;
+ let group = group?;
+ let backend = self.backend()?;
+ let delete_stats = group.destroy(&backend)?;
stats.add(&delete_stats);
removed_all_groups = removed_all_groups && delete_stats.all_removed();
}
@@ -688,6 +693,8 @@ impl DataStore {
let store = self.name();
let mut removed_all_requested = true;
let mut stats = BackupGroupDeleteStats::default();
+ let backend = self.backend()?;
+
if delete_groups {
log::info!("removing whole namespace recursively below {store}:/{ns}",);
for ns in self.recursive_iter_backup_ns(ns.to_owned())? {
@@ -695,6 +702,24 @@ impl DataStore {
stats.add(&delete_stats);
removed_all_requested = removed_all_requested && removed_ns_groups;
}
+
+ if let DatastoreBackend::S3(s3_client) = &backend {
+ let ns_dir = ns.path();
+ let ns_prefix = ns_dir
+ .to_str()
+ .ok_or_else(|| format_err!("invalid namespace path prefix"))?;
+ let prefix = format!("{S3_CONTENT_PREFIX}/{ns_prefix}");
+ let delete_objects_error = proxmox_async::runtime::block_on(
+ s3_client.delete_objects_by_prefix_with_suffix_filter(
+ &S3PathPrefix::Some(prefix),
+ PROTECTED_MARKER_FILENAME,
+ &["owner", GROUP_NOTES_FILE_NAME],
+ ),
+ )?;
+ if delete_objects_error {
+ bail!("deleting objects failed");
+ }
+ }
} else {
log::info!("pruning empty namespace recursively below {store}:/{ns}");
}
@@ -730,6 +755,14 @@ impl DataStore {
log::warn!("failed to remove namespace {ns} - {err}")
}
}
+ if let DatastoreBackend::S3(s3_client) = &backend {
+ // Only remove the namespace marker, if it was empty,
+ // than this is the same as the namespace being removed.
+ let object_key =
+ crate::s3::object_key_from_path(&ns.path(), NAMESPACE_MARKER_FILENAME)
+ .context("invalid namespace marker object key")?;
+ proxmox_async::runtime::block_on(s3_client.delete_object(object_key))?;
+ }
}
}
@@ -747,7 +780,7 @@ impl DataStore {
) -> Result<BackupGroupDeleteStats, Error> {
let backup_group = self.backup_group(ns.clone(), backup_group.clone());
- backup_group.destroy()
+ backup_group.destroy(&self.backend()?)
}
/// Remove a backup directory including all content
@@ -759,7 +792,7 @@ impl DataStore {
) -> Result<(), Error> {
let backup_dir = self.backup_dir(ns.clone(), backup_dir.clone())?;
- backup_dir.destroy(force)
+ backup_dir.destroy(force, &self.backend()?)
}
/// Returns the time of the last successful backup
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 35fcb2ac5..80740e3fb 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -422,7 +422,7 @@ pub async fn delete_snapshot(
let snapshot = datastore.backup_dir(ns, backup_dir)?;
- snapshot.destroy(false)?;
+ snapshot.destroy(false, &datastore.backend()?)?;
Ok(Value::Null)
})
@@ -1088,13 +1088,21 @@ pub fn prune(
});
if !keep {
- if let Err(err) = backup_dir.destroy(false) {
- warn!(
- "failed to remove dir {:?}: {}",
- backup_dir.relative_path(),
- err,
- );
- }
+ match datastore.backend() {
+ Ok(backend) => {
+ if let Err(err) = backup_dir.destroy(false, &backend) {
+ warn!(
+ "failed to remove dir {:?}: {}",
+ backup_dir.relative_path(),
+ err,
+ );
+ }
+ }
+ Err(err) => warn!(
+ "failed to remove dir {:?}: {err}",
+ backup_dir.relative_path()
+ ),
+ };
}
}
prune_result
--
2.47.2
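The group removal on the S3 side is a single batched call; condensed
from the destroy hunk above:
// delete everything below the group's content prefix, but keep
// objects whose key ends in the protected marker, plus the listed
// exceptions
let prefix = format!("{S3_CONTENT_PREFIX}/{group_prefix}");
let delete_objects_error = proxmox_async::runtime::block_on(
    s3_client.delete_objects_by_prefix_with_suffix_filter(
        &S3PathPrefix::Some(prefix),
        PROTECTED_MARKER_FILENAME,
        &["owner", GROUP_NOTES_FILE_NAME],
    ),
)?;
if delete_objects_error {
    bail!("deleting objects failed");
}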
* [pbs-devel] [PATCH proxmox-backup v8 21/45] datastore: get and set owner for s3 store backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (28 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 20/45] datastore: prune groups/snapshots from s3 object store backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 9:25 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 22/45] datastore: implement garbage collection for s3 backend Christian Ebner
` (25 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Read or write the ownership information from/to the corresponding
object in the S3 object store. Keep that information available if
the bucket is reused as a datastore.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/datastore.rs | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 265624229..ca099c1d0 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -7,6 +7,7 @@ use std::sync::{Arc, LazyLock, Mutex};
use std::time::Duration;
use anyhow::{bail, format_err, Context, Error};
+use http_body_util::BodyExt;
use nix::unistd::{unlinkat, UnlinkatFlags};
use pbs_tools::lru_cache::LruCache;
use tracing::{info, warn};
@@ -832,6 +833,21 @@ impl DataStore {
backup_group: &pbs_api_types::BackupGroup,
) -> Result<Authid, Error> {
let full_path = self.owner_path(ns, backup_group);
+
+ if let DatastoreBackend::S3(s3_client) = self.backend()? {
+ let mut path = ns.path();
+ path.push(format!("{backup_group}"));
+ let object_key = crate::s3::object_key_from_path(&path, "owner")
+ .context("invalid owner file object key")?;
+ let response = proxmox_async::runtime::block_on(s3_client.get_object(object_key))?
+ .ok_or_else(|| format_err!("fetching owner failed"))?;
+ let content = proxmox_async::runtime::block_on(response.content.collect())?;
+ let owner = String::from_utf8(content.to_bytes().trim_ascii_end().to_vec())?;
+ return owner
+ .parse()
+ .map_err(|err| format_err!("parsing owner for {backup_group} failed: {err}"));
+ }
+
let owner = proxmox_sys::fs::file_read_firstline(full_path)?;
owner
.trim_end() // remove trailing newline
@@ -860,6 +876,18 @@ impl DataStore {
) -> Result<(), Error> {
let path = self.owner_path(ns, backup_group);
+ if let DatastoreBackend::S3(s3_client) = self.backend()? {
+ let mut path = ns.path();
+ path.push(format!("{backup_group}"));
+ let object_key = crate::s3::object_key_from_path(&path, "owner")
+ .context("invalid owner file object key")?;
+ let data = hyper::body::Bytes::from(format!("{auth_id}\n"));
+ let _is_duplicate = proxmox_async::runtime::block_on(
+ s3_client.upload_with_retry(object_key, data, true),
+ )
+ .context("failed to set owner on s3 backend")?;
+ }
+
let mut open_options = std::fs::OpenOptions::new();
open_options.write(true);
--
2.47.2
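The owner object round-trip, condensed from the two hunks above; the
stored value is the Authid plus a trailing newline, so the read path
trims before parsing:
// write: serialize the Authid with a trailing newline, as on disk
let data = hyper::body::Bytes::from(format!("{auth_id}\n"));
let _is_duplicate =
    proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key.clone(), data, true))?;
// read: fetch, collect the body and trim before parsing
let response = proxmox_async::runtime::block_on(s3_client.get_object(object_key))?
    .ok_or_else(|| format_err!("fetching owner failed"))?;
let content = proxmox_async::runtime::block_on(response.content.collect())?;
let owner: Authid = String::from_utf8(content.to_bytes().trim_ascii_end().to_vec())?
    .parse()
    .map_err(|err| format_err!("parsing owner failed: {err}"))?;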
* [pbs-devel] [PATCH proxmox-backup v8 22/45] datastore: implement garbage collection for s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (29 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 21/45] datastore: get and set owner for s3 " Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 9:47 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 23/45] ui: add datastore type selector and reorganize component layout Christian Ebner
` (24 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Implements the garbage collection for datastores backed by an s3
object store.
Take advantage of the local datastore by placing marker files in the
chunk store during phase 1 of the garbage collection, updating their
atime if already present.
This allows us to avoid making expensive API calls to update object
metadata, which would only be possible via a copy object operation.
Phase 2 is implemented by fetching a list of all the chunks via the
ListObjectsV2 API call, filtered by the chunk folder prefix. This
operation has to be performed in batches of at most 1000 objects, as
imposed by the API's response limits.
For each object key, look up the marker file and decide based on the
marker's existence and its atime whether the chunk object needs to be
removed. Deletion happens via the DeleteObjects operation, which
allows multiple chunks to be removed with a single request.
This makes it possible to efficiently identify chunks which are no
longer in use, while remaining performant and cost effective.
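To make the decision rule explicit, the following is a minimal,
illustrative sketch (the `Disposition` enum and `classify` helper are
not part of the patch; `min_atime` and `oldest_writer` correspond to
the cutoffs used in the code below):

use std::path::Path;
use std::time::SystemTime;

/// Illustrative only: possible outcomes for a chunk object in phase 2.
enum Disposition {
    Delete,  // marker missing or atime older than the cutoff
    Pending, // not recently used, but a writer might still reference it
    InUse,   // marker was touched during phase 1, keep the object
}

fn classify(
    marker: &Path,
    min_atime: i64,
    oldest_writer: i64,
) -> std::io::Result<Disposition> {
    // A missing marker means phase 1 never touched this chunk, treat
    // it as if its atime were the unix epoch so it gets deleted.
    let atime = match std::fs::metadata(marker) {
        Ok(stat) => stat
            .accessed()?
            .duration_since(SystemTime::UNIX_EPOCH)
            .map_or(0, |d| d.as_secs() as i64),
        Err(err) if err.kind() == std::io::ErrorKind::NotFound => 0,
        Err(err) => return Err(err),
    };
    Ok(if atime < min_atime {
        Disposition::Delete
    } else if atime < oldest_writer {
        Disposition::Pending
    } else {
        Disposition::InUse
    })
}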
Baseline runtime performance tests:
-----------------------------------
3 garbage collection runs were performed with hot filesystem caches
(ensured by an additional GC run before the test runs). The PBS
instance was virtualized, with the same virtualized disk using ZFS
for all the local cache stores:
All datastores contained the same encrypted data, with the following
content statistics:
Original data usage: 269.685 GiB
On-Disk usage: 9.018 GiB (3.34%)
On-Disk chunks: 6477
Deduplication factor: 29.90
Average chunk size: 1.426 MiB
The results demonstrate the overhead caused by the additional
ListObjectsV2 API calls and their processing, which varies depending
on the object store backend.
Average garbage collection runtime:
Local datastore: (2.04 ± 0.01) s
Local RADOS gateway (Squid): (3.05 ± 0.01) s
AWS S3: (3.05 ± 0.01) s
Cloudflare R2: (6.71 ± 0.58) s
After pruning of all datastore contents (therefore including
DeleteObjects requests):
Local datastore: 3.04 s
Local RADOS gateway (Squid): 14.08 s
AWS S3: 13.06 s
Cloudflare R2: 78.21 s
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/chunk_store.rs | 4 +
pbs-datastore/src/datastore.rs | 211 +++++++++++++++++++++++++++----
2 files changed, 190 insertions(+), 25 deletions(-)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 8c195df54..95f00e8d5 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -353,6 +353,10 @@ impl ChunkStore {
ProcessLocker::oldest_shared_lock(self.locker.clone().unwrap())
}
+ pub fn mutex(&self) -> &std::sync::Mutex<()> {
+ &self.mutex
+ }
+
pub fn sweep_unused_chunks(
&self,
oldest_writer: i64,
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index ca099c1d0..6cc7fdbaa 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -4,7 +4,7 @@ use std::os::unix::ffi::OsStrExt;
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};
use std::sync::{Arc, LazyLock, Mutex};
-use std::time::Duration;
+use std::time::{Duration, SystemTime};
use anyhow::{bail, format_err, Context, Error};
use http_body_util::BodyExt;
@@ -1209,6 +1209,7 @@ impl DataStore {
chunk_lru_cache: &mut Option<LruCache<[u8; 32], ()>>,
status: &mut GarbageCollectionStatus,
worker: &dyn WorkerTaskContext,
+ s3_client: Option<Arc<S3Client>>,
) -> Result<(), Error> {
status.index_file_count += 1;
status.index_data_bytes += index.index_bytes();
@@ -1225,21 +1226,41 @@ impl DataStore {
}
}
- if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
- let hex = hex::encode(digest);
- warn!(
- "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
- );
-
- // touch any corresponding .bad files to keep them around, meaning if a chunk is
- // rewritten correctly they will be removed automatically, as well as if no index
- // file requires the chunk anymore (won't get to this loop then)
- for i in 0..=9 {
- let bad_ext = format!("{}.bad", i);
- let mut bad_path = PathBuf::new();
- bad_path.push(self.chunk_path(digest).0);
- bad_path.set_extension(bad_ext);
- self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
+ match s3_client {
+ None => {
+ // Filesystem backend
+ if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
+ let hex = hex::encode(digest);
+ warn!(
+ "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
+ );
+
+ // touch any corresponding .bad files to keep them around, meaning if a chunk is
+ // rewritten correctly they will be removed automatically, as well as if no index
+ // file requires the chunk anymore (won't get to this loop then)
+ for i in 0..=9 {
+ let bad_ext = format!("{}.bad", i);
+ let mut bad_path = PathBuf::new();
+ bad_path.push(self.chunk_path(digest).0);
+ bad_path.set_extension(bad_ext);
+ self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
+ }
+ }
+ }
+ Some(ref _s3_client) => {
+ // Update atime on local cache marker files.
+ if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
+ let (chunk_path, _digest) = self.chunk_path(digest);
+ // Insert empty file as marker to tell GC phase2 that this is
+ // a chunk still in-use, so it is kept in the S3 object store.
+ std::fs::File::options()
+ .write(true)
+ .create_new(true)
+ .open(&chunk_path)
+ .with_context(|| {
+ format!("failed to create marker for chunk {}", hex::encode(digest))
+ })?;
+ }
}
}
}
@@ -1251,6 +1272,7 @@ impl DataStore {
status: &mut GarbageCollectionStatus,
worker: &dyn WorkerTaskContext,
cache_capacity: usize,
+ s3_client: Option<Arc<S3Client>>,
) -> Result<(), Error> {
// Iterate twice over the datastore to fetch index files, even if this comes with an
// additional runtime cost:
@@ -1344,6 +1366,7 @@ impl DataStore {
&mut chunk_lru_cache,
status,
worker,
+ s3_client.as_ref().cloned(),
)?;
if !unprocessed_index_list.remove(&path) {
@@ -1378,7 +1401,14 @@ impl DataStore {
continue;
}
};
- self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?;
+ self.index_mark_used_chunks(
+ index,
+ &path,
+ &mut chunk_lru_cache,
+ status,
+ worker,
+ s3_client.as_ref().cloned(),
+ )?;
warn!("Marked chunks for unexpected index file at '{path:?}'");
}
if strange_paths_count > 0 {
@@ -1476,18 +1506,149 @@ impl DataStore {
1024 * 1024
};
- info!("Start GC phase1 (mark used chunks)");
+ let s3_client = match self.backend()? {
+ DatastoreBackend::Filesystem => None,
+ DatastoreBackend::S3(s3_client) => {
+ proxmox_async::runtime::block_on(s3_client.head_bucket())
+ .context("failed to reach bucket")?;
+ Some(s3_client)
+ }
+ };
- self.mark_used_chunks(&mut gc_status, worker, gc_cache_capacity)
- .context("marking used chunks failed")?;
+ info!("Start GC phase1 (mark used chunks)");
- info!("Start GC phase2 (sweep unused chunks)");
- self.inner.chunk_store.sweep_unused_chunks(
- oldest_writer,
- min_atime,
+ self.mark_used_chunks(
&mut gc_status,
worker,
- )?;
+ gc_cache_capacity,
+ s3_client.as_ref().cloned(),
+ )
+ .context("marking used chunks failed")?;
+
+ info!("Start GC phase2 (sweep unused chunks)");
+
+ if let Some(ref s3_client) = s3_client {
+ let mut chunk_count = 0;
+ let prefix = S3PathPrefix::Some(".chunks/".to_string());
+ // Operates in batches of 1000 objects max per request
+ let mut list_bucket_result =
+ proxmox_async::runtime::block_on(s3_client.list_objects_v2(&prefix, None))
+ .context("failed to list chunk in s3 object store")?;
+
+ let mut delete_list = Vec::with_capacity(1000);
+ loop {
+ let lock = self.inner.chunk_store.mutex().lock().unwrap();
+
+ for content in list_bucket_result.contents {
+ // Check object is actually a chunk
+ let digest = match Path::new::<str>(&content.key).file_name() {
+ Some(file_name) => file_name,
+ // should never be the case as objects will have a filename
+ None => continue,
+ };
+ let bytes = digest.as_bytes();
+ if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
+ continue;
+ }
+ if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
+ continue;
+ }
+
+ let bad = bytes.ends_with(b".bad");
+
+ // Safe since contains valid ascii hexdigits only as checked above.
+ let digest_str = digest.to_string_lossy();
+ let hexdigit_prefix = unsafe { digest_str.get_unchecked(0..4) };
+ let mut chunk_path = self.base_path();
+ chunk_path.push(".chunks");
+ chunk_path.push(hexdigit_prefix);
+ chunk_path.push(digest);
+
+ // Check local markers (created or atime updated during phase1) and
+ // keep or delete chunk based on that.
+ let atime = match std::fs::metadata(chunk_path) {
+ Ok(stat) => stat.accessed()?,
+ Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
+ // File not found, delete by setting atime to unix epoch
+ info!("Not found, mark for deletion: {}", content.key);
+ SystemTime::UNIX_EPOCH
+ }
+ Err(err) => return Err(err.into()),
+ };
+ let atime = atime.duration_since(SystemTime::UNIX_EPOCH)?.as_secs() as i64;
+
+ chunk_count += 1;
+
+ if atime < min_atime {
+ delete_list.push(content.key);
+ if bad {
+ gc_status.removed_bad += 1;
+ } else {
+ gc_status.removed_chunks += 1;
+ }
+ gc_status.removed_bytes += content.size;
+ } else if atime < oldest_writer {
+ if bad {
+ gc_status.still_bad += 1;
+ } else {
+ gc_status.pending_chunks += 1;
+ }
+ gc_status.pending_bytes += content.size;
+ } else {
+ if !bad {
+ gc_status.disk_chunks += 1;
+ }
+ gc_status.disk_bytes += content.size;
+ }
+ }
+
+ if !delete_list.is_empty() {
+ let delete_objects_result = proxmox_async::runtime::block_on(
+ s3_client.delete_objects(&delete_list),
+ )?;
+ if let Some(_err) = delete_objects_result.error {
+ bail!("failed to delete some objects");
+ }
+ delete_list.clear();
+ }
+
+ drop(lock);
+
+ // Process next batch of chunks if there is more
+ if list_bucket_result.is_truncated {
+ list_bucket_result =
+ proxmox_async::runtime::block_on(s3_client.list_objects_v2(
+ &prefix,
+ list_bucket_result.next_continuation_token.as_deref(),
+ ))?;
+ continue;
+ }
+
+ break;
+ }
+ info!("processed {chunk_count} total chunks");
+
+ // Phase 2 GC of Filesystem backed storage is phase 3 for S3 backed GC
+ info!("Start GC phase3 (sweep unused chunk markers)");
+
+ let mut tmp_gc_status = GarbageCollectionStatus {
+ upid: Some(upid.to_string()),
+ ..Default::default()
+ };
+ self.inner.chunk_store.sweep_unused_chunks(
+ oldest_writer,
+ min_atime,
+ &mut tmp_gc_status,
+ worker,
+ )?;
+ } else {
+ self.inner.chunk_store.sweep_unused_chunks(
+ oldest_writer,
+ min_atime,
+ &mut gc_status,
+ worker,
+ )?;
+ }
info!(
"Removed garbage: {}",
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 23/45] ui: add datastore type selector and reorganize component layout
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (30 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 22/45] datastore: implement garbage collection for s3 backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 9:55 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 24/45] ui: add s3 client edit window for configuration create/edit Christian Ebner
` (23 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
In preparation for adding the S3 backed datastore variant to the edit
window, introduce a datastore type selector in order to distinguish
between creation of regular and removable datastores, instead of
using a checkbox as is currently the case.
This makes it easier to add further datastore type variants while
keeping the datastore edit window compact.
Since selecting the type is one of the first steps during datastore
creation, position the component right below the datastore name field
and re-organize the components related to the removable datastore
creation, while keeping additional required components for the S3
backed datastore creation in mind.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
www/window/DataStoreEdit.js | 78 +++++++++++++++++++++----------------
1 file changed, 45 insertions(+), 33 deletions(-)
diff --git a/www/window/DataStoreEdit.js b/www/window/DataStoreEdit.js
index 372984e37..cd94f0335 100644
--- a/www/window/DataStoreEdit.js
+++ b/www/window/DataStoreEdit.js
@@ -52,6 +52,41 @@ Ext.define('PBS.DataStoreEdit', {
allowBlank: false,
fieldLabel: gettext('Name'),
},
+ {
+ xtype: 'proxmoxKVComboBox',
+ name: 'datastore-type',
+ fieldLabel: gettext('Datastore Type'),
+ value: '__default__',
+ submitValue: false,
+ comboItems: [
+ ['__default__', 'Local'],
+ ['removable', 'Removable'],
+ ],
+ cbind: {
+ disabled: '{!isCreate}',
+ },
+ listeners: {
+ change: function (checkbox, selected) {
+ let isRemovable = selected === 'removable';
+
+ let inputPanel = checkbox.up('inputpanel');
+ let pathField = inputPanel.down('[name=path]');
+ let uuidEditField = inputPanel.down('[name=backing-device]');
+
+ uuidEditField.setDisabled(!isRemovable);
+ uuidEditField.allowBlank = !isRemovable;
+ uuidEditField.setValue('');
+
+ if (isRemovable) {
+ pathField.setFieldLabel(gettext('Path on Device'));
+ pathField.setEmptyText(gettext('A relative path'));
+ } else {
+ pathField.setFieldLabel(gettext('Backing Path'));
+ pathField.setEmptyText(gettext('An absolute path'));
+ }
+ },
+ },
+ },
{
xtype: 'pmxDisplayEditField',
cbind: {
@@ -63,17 +98,6 @@ Ext.define('PBS.DataStoreEdit', {
emptyText: gettext('An absolute path'),
validator: (val) => val?.trim() !== '/',
},
- {
- xtype: 'pbsPartitionSelector',
- fieldLabel: gettext('Device'),
- name: 'backing-device',
- disabled: true,
- allowBlank: true,
- cbind: {
- editable: '{isCreate}',
- },
- emptyText: gettext('Device path'),
- },
],
column2: [
{
@@ -97,31 +121,19 @@ Ext.define('PBS.DataStoreEdit', {
value: '{scheduleValue}',
},
},
- ],
- columnB: [
{
- xtype: 'checkbox',
- boxLabel: gettext('Removable datastore'),
- submitValue: false,
- listeners: {
- change: function (checkbox, isRemovable) {
- let inputPanel = checkbox.up('inputpanel');
- let pathField = inputPanel.down('[name=path]');
- let uuidEditField = inputPanel.down('[name=backing-device]');
-
- uuidEditField.setDisabled(!isRemovable);
- uuidEditField.allowBlank = !isRemovable;
- uuidEditField.setValue('');
- if (isRemovable) {
- pathField.setFieldLabel(gettext('Path on Device'));
- pathField.setEmptyText(gettext('A relative path'));
- } else {
- pathField.setFieldLabel(gettext('Backing Path'));
- pathField.setEmptyText(gettext('An absolute path'));
- }
- },
+ xtype: 'pbsPartitionSelector',
+ fieldLabel: gettext('Device'),
+ name: 'backing-device',
+ disabled: true,
+ allowBlank: true,
+ cbind: {
+ editable: '{isCreate}',
},
+ emptyText: gettext('Device path'),
},
+ ],
+ columnB: [
{
xtype: 'textfield',
name: 'comment',
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 24/45] ui: add s3 client edit window for configuration create/edit
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (31 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 23/45] ui: add datastore type selector and reorganize component layout Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 25/45] ui: add s3 client view for configuration Christian Ebner
` (22 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Adds an edit window for creating or editing S3 client configurations.
Loosely based on the same edit window for the remote configuration.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- s/Unique Identifier/S3 Client ID/
www/window/S3ClientEdit.js | 148 +++++++++++++++++++++++++++++++++++++
1 file changed, 148 insertions(+)
create mode 100644 www/window/S3ClientEdit.js
diff --git a/www/window/S3ClientEdit.js b/www/window/S3ClientEdit.js
new file mode 100644
index 000000000..b22e920f8
--- /dev/null
+++ b/www/window/S3ClientEdit.js
@@ -0,0 +1,148 @@
+Ext.define('PBS.window.S3ClientEdit', {
+ extend: 'Proxmox.window.Edit',
+ alias: 'widget.pbsS3ClientEdit',
+ mixins: ['Proxmox.Mixin.CBind'],
+
+ onlineHelp: 'backup_s3client',
+
+ isAdd: true,
+
+ subject: gettext('S3 Client'),
+
+ fieldDefaults: { labelWidth: 120 },
+
+ cbindData: function (initialConfig) {
+ let me = this;
+
+ let baseurl = '/api2/extjs/config/s3';
+ let id = initialConfig.id;
+
+ me.isCreate = !id;
+ me.url = id ? `${baseurl}/${id}` : baseurl;
+ me.method = id ? 'PUT' : 'POST';
+ me.autoLoad = !!id;
+ return {
+ passwordEmptyText: me.isCreate ? '' : gettext('Unchanged'),
+ };
+ },
+
+ items: {
+ xtype: 'inputpanel',
+ column1: [
+ {
+ xtype: 'pmxDisplayEditField',
+ name: 'id',
+ fieldLabel: gettext('S3 Client ID'),
+ renderer: Ext.htmlEncode,
+ allowBlank: false,
+ minLength: 4,
+ cbind: {
+ editable: '{isCreate}',
+ },
+ },
+ {
+ xtype: 'proxmoxtextfield',
+ name: 'endpoint',
+ fieldLabel: gettext('Endpoint'),
+ allowBlank: false,
+ emptyText: gettext('e.g. {{bucket}}.s3.{{region}}.amazonaws.com'),
+ autoEl: {
+ tag: 'div',
+ 'data-qtip': gettext(
+ 'IP or FQDN S3 endpoint (allows {{bucket}} or {{region}} templating)',
+ ),
+ },
+ },
+ {
+ xtype: 'proxmoxtextfield',
+ name: 'port',
+ fieldLabel: gettext('Port'),
+ emptyText: gettext('default (443)'),
+ cbind: {
+ deleteEmpty: '{!isCreate}',
+ },
+ },
+ {
+ xtype: 'proxmoxcheckbox',
+ name: 'path-style',
+ fieldLabel: gettext('Path Style'),
+ autoEl: {
+ tag: 'div',
+ 'data-qtip': gettext('Use path style over vhost style bucket addressing.'),
+ },
+ uncheckedValue: false,
+ value: false,
+ },
+ ],
+
+ column2: [
+ {
+ xtype: 'proxmoxtextfield',
+ name: 'region',
+ fieldLabel: gettext('Region'),
+ emptyText: gettext('default (us-west-1)'),
+ cbind: {
+ deleteEmpty: '{!isCreate}',
+ },
+ },
+ {
+ xtype: 'proxmoxtextfield',
+ name: 'access-key',
+ fieldLabel: gettext('Access Key'),
+ cbind: {
+ emptyText: '{passwordEmptyText}',
+ allowBlank: '{!isCreate}',
+ },
+ },
+ {
+ xtype: 'textfield',
+ name: 'secret-key',
+ inputType: 'password',
+ fieldLabel: gettext('Secret Key'),
+ cbind: {
+ emptyText: '{passwordEmptyText}',
+ allowBlank: '{!isCreate}',
+ },
+ },
+ ],
+
+ columnB: [
+ {
+ xtype: 'proxmoxtextfield',
+ name: 'fingerprint',
+ fieldLabel: gettext('Fingerprint'),
+ emptyText: gettext(
+ "Server certificate's SHA-256 fingerprint, required for self-signed certificates",
+ ),
+ cbind: {
+ deleteEmpty: '{!isCreate}',
+ },
+ },
+ ],
+ },
+
+ getValues: function () {
+ let me = this;
+ let values = me.callParent(arguments);
+
+ if (me.isCreate) {
+ /// Secrets are stored into separate config, but set the same id for both configs
+ values['secrets-id'] = values.id;
+ }
+
+ if (values.delete && !Ext.isArray(values.delete)) {
+ values.delete = values.delete.split(',');
+ }
+ PBS.Utils.delete_if_default(values, 'path-style', false, me.isCreate);
+
+ if (values['access-key'] === '') {
+ delete values['access-key'];
+ }
+
+ if (values['secret-key'] === '') {
+ delete values['secret-key'];
+ }
+
+ return values;
+ },
+});
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 25/45] ui: add s3 client view for configuration
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (32 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 24/45] ui: add s3 client edit window for configuration create/edit Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 26/45] ui: expose the s3 client view in the navigation tree Christian Ebner
` (21 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Adds the view to configure S3 clients in the Configuration section of
the UI.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- s/Unique Identifier/S3 Client ID/
www/config/S3ClientView.js | 141 +++++++++++++++++++++++++++++++++++++
1 file changed, 141 insertions(+)
create mode 100644 www/config/S3ClientView.js
diff --git a/www/config/S3ClientView.js b/www/config/S3ClientView.js
new file mode 100644
index 000000000..15236960b
--- /dev/null
+++ b/www/config/S3ClientView.js
@@ -0,0 +1,141 @@
+Ext.define('pmx-s3client', {
+ extend: 'Ext.data.Model',
+ fields: ['id', 'endpoint', 'port', 'access-key', 'secret-key', 'region', 'fingerprint'],
+ idProperty: 'id',
+ proxy: {
+ type: 'proxmox',
+ url: '/api2/json/config/s3',
+ },
+});
+
+Ext.define('PBS.config.S3ClientView', {
+ extend: 'Ext.grid.GridPanel',
+ alias: 'widget.pbsS3ClientView',
+
+ title: gettext('S3 Clients'),
+
+ stateful: true,
+ stateId: 'grid-s3clients',
+ tools: [PBS.Utils.get_help_tool('backup-s3-client')],
+
+ controller: {
+ xclass: 'Ext.app.ViewController',
+
+ addS3Client: function () {
+ let me = this;
+ Ext.create('PBS.window.S3ClientEdit', {
+ listeners: {
+ destroy: function () {
+ me.reload();
+ },
+ },
+ }).show();
+ },
+
+ editS3Client: function () {
+ let me = this;
+ let view = me.getView();
+ let selection = view.getSelection();
+ if (selection.length < 1) {
+ return;
+ }
+
+ Ext.create('PBS.window.S3ClientEdit', {
+ id: selection[0].data.id,
+ listeners: {
+ destroy: function () {
+ me.reload();
+ },
+ },
+ }).show();
+ },
+
+ reload: function () {
+ this.getView().getStore().rstore.load();
+ },
+
+ init: function (view) {
+ Proxmox.Utils.monStoreErrors(view, view.getStore().rstore);
+ },
+ },
+
+ listeners: {
+ activate: 'reload',
+ itemdblclick: 'editS3Client',
+ },
+
+ store: {
+ type: 'diff',
+ autoDestroy: true,
+ autoDestroyRstore: true,
+ sorters: 'id',
+ rstore: {
+ type: 'update',
+ storeid: 'pmx-s3client',
+ model: 'pmx-s3client',
+ autoStart: true,
+ interval: 5000,
+ },
+ },
+
+ tbar: [
+ {
+ xtype: 'proxmoxButton',
+ text: gettext('Add'),
+ handler: 'addS3Client',
+ selModel: false,
+ },
+ {
+ xtype: 'proxmoxButton',
+ text: gettext('Edit'),
+ handler: 'editS3Client',
+ disabled: true,
+ },
+ {
+ xtype: 'proxmoxStdRemoveButton',
+ baseurl: '/config/s3',
+ callback: 'reload',
+ },
+ ],
+
+ viewConfig: {
+ trackOver: false,
+ },
+
+ columns: [
+ {
+ dataIndex: 'id',
+ header: gettext('S3 Client ID'),
+ renderer: Ext.String.htmlEncode,
+ sortable: true,
+ width: 200,
+ },
+ {
+ dataIndex: 'endpoint',
+ header: gettext('Endpoint'),
+ sortable: true,
+ width: 200,
+ },
+ {
+ dataIndex: 'port',
+ header: gettext('Port'),
+ renderer: Ext.String.htmlEncode,
+ sortable: true,
+ width: 100,
+ },
+ {
+ dataIndex: 'region',
+ header: gettext('Region'),
+ renderer: Ext.String.htmlEncode,
+ sortable: true,
+ width: 100,
+ },
+ {
+ dataIndex: 'fingerprint',
+ header: gettext('Fingerprint'),
+ renderer: Ext.String.htmlEncode,
+ sortable: false,
+ flex: 1,
+ },
+ ],
+});
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 26/45] ui: expose the s3 client view in the navigation tree
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (33 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 25/45] ui: add s3 client view for configuration Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup Christian Ebner
` (20 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Add an `S3 Clients` item to the navigation tree to allow accessing the
S3 client configuration view and edit windows.
Adds the required source files to the Makefile.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
www/Makefile | 2 ++
www/NavigationTree.js | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/www/Makefile b/www/Makefile
index ab9946be0..767713c75 100644
--- a/www/Makefile
+++ b/www/Makefile
@@ -61,6 +61,7 @@ JSSRC= \
config/RemoteView.js \
config/TrafficControlView.js \
config/ACLView.js \
+ config/S3ClientView.js \
config/SyncView.js \
config/VerifyView.js \
config/PruneView.js \
@@ -85,6 +86,7 @@ JSSRC= \
window/PruneJobEdit.js \
window/GCJobEdit.js \
window/UserEdit.js \
+ window/S3ClientEdit.js \
window/Settings.js \
window/TokenEdit.js \
window/VerifyJobEdit.js \
diff --git a/www/NavigationTree.js b/www/NavigationTree.js
index aac9bd1b2..f445da49d 100644
--- a/www/NavigationTree.js
+++ b/www/NavigationTree.js
@@ -80,6 +80,12 @@ Ext.define('PBS.store.NavigationStore', {
path: 'pbsSubscription',
leaf: true,
},
+ {
+ text: gettext('S3 Clients'),
+ iconCls: 'fa fa-cloud-upload',
+ path: 'pbsS3ClientView',
+ leaf: true,
+ },
],
},
{
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (34 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 26/45] ui: expose the s3 client view in the navigation tree Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 10:02 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 28/45] tools: lru cache: add removed callback for evicted cache nodes Christian Ebner
` (19 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
In order to be able to create datastores with an s3 object store
backend, implement an s3 client selector and expose it in the
datastore edit window, together with the additional bucket name field
to associate a bucket with the datastore's s3 backend.
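For reference, the backend property submitted on creation is a
property string assembled in getValues() below from the selected
client and bucket; with a (hypothetical) client id `my-s3-client` and
bucket `backup-data`, the submitted value would look roughly like:

backend=bucket=backup-data,client=my-s3-client,type=s3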
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- use field endpoint instead of host, fixing the selector listing
www/Makefile | 1 +
www/form/S3ClientSelector.js | 33 +++++++++++++++++++++++++++
www/window/DataStoreEdit.js | 44 ++++++++++++++++++++++++++++++++++++
3 files changed, 78 insertions(+)
create mode 100644 www/form/S3ClientSelector.js
diff --git a/www/Makefile b/www/Makefile
index 767713c75..410e9f3e0 100644
--- a/www/Makefile
+++ b/www/Makefile
@@ -42,6 +42,7 @@ JSSRC= \
Schema.js \
form/TokenSelector.js \
form/AuthidSelector.js \
+ form/S3ClientSelector.js \
form/RemoteSelector.js \
form/RemoteTargetSelector.js \
form/DataStoreSelector.js \
diff --git a/www/form/S3ClientSelector.js b/www/form/S3ClientSelector.js
new file mode 100644
index 000000000..243484909
--- /dev/null
+++ b/www/form/S3ClientSelector.js
@@ -0,0 +1,33 @@
+Ext.define('PBS.form.S3ClientSelector', {
+ extend: 'Proxmox.form.ComboGrid',
+ alias: 'widget.pbsS3ClientSelector',
+
+ allowBlank: false,
+ autoSelect: false,
+ valueField: 'id',
+ displayField: 'id',
+
+ store: {
+ model: 'pmx-s3client',
+ autoLoad: true,
+ sorters: 'id',
+ },
+
+ listConfig: {
+ columns: [
+ {
+ header: gettext('S3 Client ID'),
+ sortable: true,
+ dataIndex: 'id',
+ renderer: Ext.String.htmlEncode,
+ flex: 1,
+ },
+ {
+ header: gettext('Endpoint'),
+ sortable: true,
+ dataIndex: 'endpoint',
+ flex: 1,
+ },
+ ],
+ },
+});
diff --git a/www/window/DataStoreEdit.js b/www/window/DataStoreEdit.js
index cd94f0335..3379bf773 100644
--- a/www/window/DataStoreEdit.js
+++ b/www/window/DataStoreEdit.js
@@ -61,6 +61,7 @@ Ext.define('PBS.DataStoreEdit', {
comboItems: [
['__default__', 'Local'],
['removable', 'Removable'],
+ ['s3', 'S3 (experimental)'],
],
cbind: {
disabled: '{!isCreate}',
@@ -68,18 +69,32 @@ Ext.define('PBS.DataStoreEdit', {
listeners: {
change: function (checkbox, selected) {
let isRemovable = selected === 'removable';
+ let isS3 = selected === 's3';
let inputPanel = checkbox.up('inputpanel');
let pathField = inputPanel.down('[name=path]');
let uuidEditField = inputPanel.down('[name=backing-device]');
+ let bucketField = inputPanel.down('[name=bucket]');
+ let s3ClientSelector = inputPanel.down('[name=s3client]');
uuidEditField.setDisabled(!isRemovable);
uuidEditField.allowBlank = !isRemovable;
uuidEditField.setValue('');
+ bucketField.setDisabled(!isS3);
+ bucketField.allowBlank = !isS3;
+ bucketField.setValue('');
+
+ s3ClientSelector.setDisabled(!isS3);
+ s3ClientSelector.allowBlank = !isS3;
+ s3ClientSelector.setValue('');
+
if (isRemovable) {
pathField.setFieldLabel(gettext('Path on Device'));
pathField.setEmptyText(gettext('A relative path'));
+ } else if (isS3) {
+ pathField.setFieldLabel(gettext('Store Cache'));
+ pathField.setEmptyText(gettext('An absolute path'));
} else {
pathField.setFieldLabel(gettext('Backing Path'));
pathField.setEmptyText(gettext('An absolute path'));
@@ -98,6 +113,15 @@ Ext.define('PBS.DataStoreEdit', {
emptyText: gettext('An absolute path'),
validator: (val) => val?.trim() !== '/',
},
+ {
+ xtype: 'pbsS3ClientSelector',
+ name: 's3client',
+ fieldLabel: gettext('S3 Client ID'),
+ disabled: true,
+ cbind: {
+ editable: '{isCreate}',
+ },
+ },
],
column2: [
{
@@ -132,6 +156,13 @@ Ext.define('PBS.DataStoreEdit', {
},
emptyText: gettext('Device path'),
},
+ {
+ xtype: 'proxmoxtextfield',
+ name: 'bucket',
+ fieldLabel: gettext('Bucket'),
+ allowBlank: false,
+ disabled: true,
+ },
],
columnB: [
{
@@ -154,7 +185,20 @@ Ext.define('PBS.DataStoreEdit', {
if (me.isCreate) {
// New datastores default to using the notification system
values['notification-mode'] = 'notification-system';
+
+ if (values.s3client) {
+ let s3BackendConf = {
+ type: 's3',
+ client: values.s3client,
+ bucket: values.bucket,
+ };
+ values.backend = PBS.Utils.printPropertyString(s3BackendConf);
+ }
}
+
+ delete values.s3client;
+ delete values.bucket;
+
return values;
},
},
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 28/45] tools: lru cache: add removed callback for evicted cache nodes
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (35 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 29/45] tools: async lru cache: implement insert, remove and contains methods Christian Ebner
` (18 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Add a callback function to be executed on evicted cache nodes. The
callback gets the key of the removed node, allowing callers to act
externally based on that value.
Since the callback might fail, extend the current LRU cache api to
return an error on insert, propagating errors from the `removed`
callback.
Async lru cache, callsites and tests are adapted to include the
additional callback parameter accordingly.
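As a rough usage sketch (the per-key file layout here is made up for
illustration and this assumes the usual LruCache::new constructor),
the callback can clean up an on-disk artifact for each evicted key:

use anyhow::Error;
use pbs_tools::lru_cache::LruCache;

fn fill_cache() -> Result<(), Error> {
    let mut cache: LruCache<u64, ()> = LruCache::new(2);
    for key in 0..4u64 {
        // The closure runs for every key evicted to make room; an
        // error returned by it is propagated by insert().
        cache.insert(key, (), |evicted| {
            match std::fs::remove_file(format!("/tmp/cache-{evicted}")) {
                Ok(()) => Ok(()),
                Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(()),
                Err(err) => Err(err.into()),
            }
        })?;
    }
    Ok(())
}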
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/cached_chunk_reader.rs | 6 +++-
pbs-datastore/src/datastore.rs | 2 +-
pbs-datastore/src/dynamic_index.rs | 1 +
pbs-tools/src/async_lru_cache.rs | 23 +++++++++----
pbs-tools/src/lru_cache.rs | 42 +++++++++++++++---------
5 files changed, 50 insertions(+), 24 deletions(-)
diff --git a/pbs-datastore/src/cached_chunk_reader.rs b/pbs-datastore/src/cached_chunk_reader.rs
index be7f2a1e2..95ac23a54 100644
--- a/pbs-datastore/src/cached_chunk_reader.rs
+++ b/pbs-datastore/src/cached_chunk_reader.rs
@@ -81,7 +81,11 @@ impl<I: IndexFile, R: AsyncReadChunk + Send + Sync + 'static> CachedChunkReader<
let info = self.index.chunk_info(chunk.0).unwrap();
// will never be None, see AsyncChunkCacher
- let data = self.cache.access(info.digest, &self.cacher).await?.unwrap();
+ let data = self
+ .cache
+ .access(info.digest, &self.cacher, |_| Ok(()))
+ .await?
+ .unwrap();
let want_bytes = ((info.range.end - cur_offset) as usize).min(size - read);
let slice = &mut buf[read..(read + want_bytes)];
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 6cc7fdbaa..89f45e7f8 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1221,7 +1221,7 @@ impl DataStore {
// Avoid multiple expensive atime updates by utimensat
if let Some(chunk_lru_cache) = chunk_lru_cache {
- if chunk_lru_cache.insert(*digest, ()) {
+ if chunk_lru_cache.insert(*digest, (), |_| Ok(()))? {
continue;
}
}
diff --git a/pbs-datastore/src/dynamic_index.rs b/pbs-datastore/src/dynamic_index.rs
index b1d85a049..83e13b311 100644
--- a/pbs-datastore/src/dynamic_index.rs
+++ b/pbs-datastore/src/dynamic_index.rs
@@ -599,6 +599,7 @@ impl<S: ReadChunk> BufferedDynamicReader<S> {
store: &mut self.store,
index: &self.index,
},
+ |_| Ok(()),
)?
.ok_or_else(|| format_err!("chunk not found by cacher"))?;
diff --git a/pbs-tools/src/async_lru_cache.rs b/pbs-tools/src/async_lru_cache.rs
index c43b87717..141114933 100644
--- a/pbs-tools/src/async_lru_cache.rs
+++ b/pbs-tools/src/async_lru_cache.rs
@@ -42,7 +42,16 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V: Clone + Send + 'static> AsyncL
/// Access an item either via the cache or by calling cacher.fetch. A return value of Ok(None)
/// means the item requested has no representation, Err(_) means a call to fetch() failed,
/// regardless of whether it was initiated by this call or a previous one.
- pub async fn access(&self, key: K, cacher: &dyn AsyncCacher<K, V>) -> Result<Option<V>, Error> {
+ /// Calls the removed callback on the evicted item, if any.
+ pub async fn access<F>(
+ &self,
+ key: K,
+ cacher: &dyn AsyncCacher<K, V>,
+ removed: F,
+ ) -> Result<Option<V>, Error>
+ where
+ F: Fn(K) -> Result<(), Error>,
+ {
let (owner, result_fut) = {
// check if already requested
let mut maps = self.maps.lock().unwrap();
@@ -71,7 +80,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V: Clone + Send + 'static> AsyncL
// this call was the one initiating the request, put into LRU and remove from map
let mut maps = self.maps.lock().unwrap();
if let Ok(Some(ref value)) = result {
- maps.0.insert(key, value.clone());
+ maps.0.insert(key, value.clone(), removed)?;
}
maps.1.remove(&key);
}
@@ -106,15 +115,15 @@ mod test {
let cache: AsyncLruCache<i32, String> = AsyncLruCache::new(2);
assert_eq!(
- cache.access(10, &cacher).await.unwrap(),
+ cache.access(10, &cacher, |_| Ok(())).await.unwrap(),
Some("x10".to_string())
);
assert_eq!(
- cache.access(20, &cacher).await.unwrap(),
+ cache.access(20, &cacher, |_| Ok(())).await.unwrap(),
Some("x20".to_string())
);
assert_eq!(
- cache.access(30, &cacher).await.unwrap(),
+ cache.access(30, &cacher, |_| Ok(())).await.unwrap(),
Some("x30".to_string())
);
@@ -123,14 +132,14 @@ mod test {
tokio::spawn(async move {
let cacher = TestAsyncCacher { prefix: "y" };
assert_eq!(
- c.access(40, &cacher).await.unwrap(),
+ c.access(40, &cacher, |_| Ok(())).await.unwrap(),
Some("y40".to_string())
);
});
}
assert_eq!(
- cache.access(20, &cacher).await.unwrap(),
+ cache.access(20, &cacher, |_| Ok(())).await.unwrap(),
Some("x20".to_string())
);
});
diff --git a/pbs-tools/src/lru_cache.rs b/pbs-tools/src/lru_cache.rs
index 94757bbf7..a7aea6528 100644
--- a/pbs-tools/src/lru_cache.rs
+++ b/pbs-tools/src/lru_cache.rs
@@ -60,10 +60,10 @@ impl<K, V> CacheNode<K, V> {
/// assert_eq!(cache.get_mut(1), None);
/// assert_eq!(cache.len(), 0);
///
-/// cache.insert(1, 1);
-/// cache.insert(2, 2);
-/// cache.insert(3, 3);
-/// cache.insert(4, 4);
+/// cache.insert(1, 1, |_| Ok(()));
+/// cache.insert(2, 2, |_| Ok(()));
+/// cache.insert(3, 3, |_| Ok(()));
+/// cache.insert(4, 4, |_| Ok(()));
/// assert_eq!(cache.len(), 3);
///
/// assert_eq!(cache.get_mut(1), None);
@@ -77,9 +77,9 @@ impl<K, V> CacheNode<K, V> {
/// assert_eq!(cache.len(), 0);
/// assert_eq!(cache.get_mut(2), None);
/// // access will fill in missing cache entry by fetching from LruCacher
-/// assert_eq!(cache.access(2, &mut LruCacher {}).unwrap(), Some(&mut 2));
+/// assert_eq!(cache.access(2, &mut LruCacher {}, |_| Ok(())).unwrap(), Some(&mut 2));
///
-/// cache.insert(1, 1);
+/// cache.insert(1, 1, |_| Ok(()));
/// assert_eq!(cache.get_mut(1), Some(&mut 1));
///
/// cache.clear();
@@ -135,7 +135,10 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
/// Insert or update an entry identified by `key` with the given `value`.
/// This entry is placed as the most recently used node at the head.
- pub fn insert(&mut self, key: K, value: V) -> bool {
+ pub fn insert<F>(&mut self, key: K, value: V, removed: F) -> Result<bool, anyhow::Error>
+ where
+ F: Fn(K) -> Result<(), anyhow::Error>,
+ {
match self.map.entry(key) {
Entry::Occupied(mut o) => {
// Node present, update value
@@ -144,7 +147,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
let mut node = unsafe { Box::from_raw(node_ptr) };
node.value = value;
let _node_ptr = Box::into_raw(node);
- true
+ Ok(true)
}
Entry::Vacant(v) => {
// Node not present, insert a new one
@@ -160,9 +163,11 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
// avoid borrow conflict. This means there are temporarily
// self.capacity + 1 cache nodes.
if self.map.len() > self.capacity {
- self.pop_tail();
+ if let Some(removed_node) = self.pop_tail() {
+ removed(removed_node)?;
+ }
}
- false
+ Ok(false)
}
}
}
@@ -176,11 +181,12 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
}
/// Remove the least recently used node from the cache.
- fn pop_tail(&mut self) {
+ fn pop_tail(&mut self) -> Option<K> {
if let Some(old_tail) = self.list.pop_tail() {
// Remove HashMap entry for old tail
- self.map.remove(&old_tail.key);
+ return self.map.remove(&old_tail.key).map(|_| old_tail.key);
}
+ None
}
/// Get a mutable reference to the value identified by `key`.
@@ -208,11 +214,15 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
/// value.
/// If fetch returns a value, it is inserted as the most recently used entry
/// in the cache.
- pub fn access<'a>(
+ pub fn access<'a, F>(
&'a mut self,
key: K,
cacher: &mut dyn Cacher<K, V>,
- ) -> Result<Option<&'a mut V>, anyhow::Error> {
+ removed: F,
+ ) -> Result<Option<&'a mut V>, anyhow::Error>
+ where
+ F: Fn(K) -> Result<(), anyhow::Error>,
+ {
match self.map.entry(key) {
Entry::Occupied(mut o) => {
// Cache hit, bring node to front of list
@@ -236,7 +246,9 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
// avoid borrow conflict. This means there are temporarily
// self.capacity + 1 cache nodes.
if self.map.len() > self.capacity {
- self.pop_tail();
+ if let Some(removed_node) = self.pop_tail() {
+ removed(removed_node)?;
+ }
}
}
}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 29/45] tools: async lru cache: implement insert, remove and contains methods
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (36 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 28/45] tools: lru cache: add removed callback for evicted cache nodes Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 30/45] datastore: add local datastore cache for network attached storages Christian Ebner
` (17 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Add methods to insert new cache entries without using the cacher,
remove cache entries given their key and check if the cache contains
a key, marking it the most recently used one if it does.
These methods will be used to implement the local datastore cache
which stores the values (chunks) on the filesystem rather than
keeping track of them by storing them in-memory in the cache. The lru
cache will only be used to allow for fast lookup and keep track of
the lookup order.
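A short sketch of the intended call pattern (values are `()` since
the chunk data itself lives on the filesystem and the cache only
tracks digests and their use order):

use anyhow::Error;
use pbs_tools::async_lru_cache::AsyncLruCache;

fn track(
    cache: &AsyncLruCache<[u8; 32], ()>,
    digest: [u8; 32],
) -> Result<(), Error> {
    // contains() also bumps the entry to most recently used on a hit
    if cache.contains(digest) {
        return Ok(());
    }
    // the removed callback would truncate or remove the evicted chunk
    // file in the real cache implementation, here it is a no-op
    cache.insert(digest, (), |_evicted| Ok(()))
}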
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-tools/src/async_lru_cache.rs | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/pbs-tools/src/async_lru_cache.rs b/pbs-tools/src/async_lru_cache.rs
index 141114933..3a975de32 100644
--- a/pbs-tools/src/async_lru_cache.rs
+++ b/pbs-tools/src/async_lru_cache.rs
@@ -87,6 +87,29 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V: Clone + Send + 'static> AsyncL
result
}
+
+ /// Insert an item as the most recently used one into the cache, calling the removed callback
+ /// on the evicted cache item, if any.
+ pub fn insert<F>(&self, key: K, value: V, removed: F) -> Result<(), Error>
+ where
+ F: Fn(K) -> Result<(), Error>,
+ {
+ let mut maps = self.maps.lock().unwrap();
+ maps.0.insert(key, value.clone(), removed)?;
+ Ok(())
+ }
+
+ /// Check if the item exists and if so, mark it as the most recently used one.
+ pub fn contains(&self, key: K) -> bool {
+ let mut maps = self.maps.lock().unwrap();
+ maps.0.get_mut(key).is_some()
+ }
+
+ /// Remove the item from the cache.
+ pub fn remove(&self, key: K) {
+ let mut maps = self.maps.lock().unwrap();
+ maps.0.remove(key);
+ }
}
#[cfg(test)]
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 30/45] datastore: add local datastore cache for network attached storages
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (37 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 29/45] tools: async lru cache: implement insert, remove and contains methods Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 11:24 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 31/45] api: backup: use local datastore cache on s3 backend chunk upload Christian Ebner
` (16 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Use a local datastore as a cache with an LRU replacement policy for
operations on a datastore backed by a network layer, e.g. an S3
object store backend. The goal is to reduce the number of requests to
the backend and thereby save costs (monetary as well as time).
Cached chunks are stored on the local datastore cache, already
containing the datastore's contents metadata (namespace, group,
snapshot, owner, index files, etc.), used to perform fast lookups.
The cache itself only stores chunk digests, not the raw data.
When payload data is required, contents are looked up and read from
the local datastore cache filesystem, including fallback to fetch from
the backend if the presumably cached entry is not found.
The cacher allows fetching cache items on cache misses via the access
method.
The capacity of the cache is derived from the local datastore cache
filesystem, or from the user configured value, whichever is smaller.
The capacity is only set on instantiation of the store, and the
current value is kept as long as the datastore remains cached in the
datastore cache. To change the value, the store either has to be set
to offline mode and back, or the services have to be restarted.
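As a worked example: with 100 GiB available on the cache filesystem
and the fixed per-chunk budget of 16 MiB used below, the derived
capacity is 100 GiB / 16 MiB = 6400 cache entries; a configured cache
size limit of 50 GiB would further cap that at 3200 entries.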
Basic performance tests:
Backup and upload of contents of linux git repository to AWS S3,
snapshots removed in-between each backup run to avoid other chunk
reuse optimizations of PBS.
no-cache:
had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.76 s (average 102.258 MiB/s)
empty-cache:
had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.42 s (average 102.945 MiB/s)
all-cached:
had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 43.78 s (average 118.554 MiB/s)
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- use info instead of warn, as these might end up in the task logs as
well, possibly causing confusion if logged at warning level
pbs-datastore/src/datastore.rs | 70 ++++++-
pbs-datastore/src/lib.rs | 3 +
.../src/local_datastore_lru_cache.rs | 172 ++++++++++++++++++
3 files changed, 244 insertions(+), 1 deletion(-)
create mode 100644 pbs-datastore/src/local_datastore_lru_cache.rs
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 89f45e7f8..cab0f5b4d 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -40,9 +40,10 @@ use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
use crate::hierarchy::{ListGroups, ListGroupsType, ListNamespaces, ListNamespacesRecursive};
use crate::index::IndexFile;
+use crate::local_datastore_lru_cache::S3Cacher;
use crate::s3::S3_CONTENT_PREFIX;
use crate::task_tracking::{self, update_active_operations};
-use crate::DataBlob;
+use crate::{DataBlob, LocalDatastoreLruCache};
static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
LazyLock::new(|| Mutex::new(HashMap::new()));
@@ -136,6 +137,7 @@ pub struct DataStoreImpl {
last_digest: Option<[u8; 32]>,
sync_level: DatastoreFSyncLevel,
backend_config: DatastoreBackendConfig,
+ lru_store_caching: Option<LocalDatastoreLruCache>,
}
impl DataStoreImpl {
@@ -151,6 +153,7 @@ impl DataStoreImpl {
last_digest: None,
sync_level: Default::default(),
backend_config: Default::default(),
+ lru_store_caching: None,
})
}
}
@@ -255,6 +258,37 @@ impl DataStore {
Ok(backend_type)
}
+ pub fn cache(&self) -> Option<&LocalDatastoreLruCache> {
+ self.inner.lru_store_caching.as_ref()
+ }
+
+ /// Check if the digest is present in the local datastore cache.
+ /// Always returns false if there is no cache configured for this datastore.
+ pub fn cache_contains(&self, digest: &[u8; 32]) -> bool {
+ if let Some(cache) = self.inner.lru_store_caching.as_ref() {
+ return cache.contains(digest);
+ }
+ false
+ }
+
+ /// Insert digest as most recently used one in the cache.
+ /// Returns with success if there is no cache configured for this datastore.
+ pub fn cache_insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
+ if let Some(cache) = self.inner.lru_store_caching.as_ref() {
+ return cache.insert(digest, chunk);
+ }
+ Ok(())
+ }
+
+ pub fn cacher(&self) -> Result<Option<S3Cacher>, Error> {
+ self.backend().map(|backend| match backend {
+ DatastoreBackend::S3(s3_client) => {
+ Some(S3Cacher::new(s3_client, self.inner.chunk_store.clone()))
+ }
+ DatastoreBackend::Filesystem => None,
+ })
+ }
+
pub fn lookup_datastore(
name: &str,
operation: Option<Operation>,
@@ -437,6 +471,33 @@ impl DataStore {
.parse_property_string(config.backend.as_deref().unwrap_or(""))?,
)?;
+ let lru_store_caching = if DatastoreBackendType::S3 == backend_config.ty.unwrap_or_default()
+ {
+ let mut cache_capacity = 0;
+ if let Ok(fs_info) = proxmox_sys::fs::fs_info(&chunk_store.base_path()) {
+ cache_capacity = fs_info.available / (16 * 1024 * 1024);
+ }
+ if let Some(max_cache_size) = backend_config.max_cache_size {
+ info!(
+ "Got requested max cache size {max_cache_size} for store {}",
+ config.name
+ );
+ let max_cache_capacity = max_cache_size.as_u64() / (16 * 1024 * 1024);
+ cache_capacity = cache_capacity.min(max_cache_capacity);
+ }
+ let cache_capacity = usize::try_from(cache_capacity).unwrap_or_default();
+
+ info!(
+ "Using datastore cache with capacity {cache_capacity} for store {}",
+ config.name
+ );
+
+ let cache = LocalDatastoreLruCache::new(cache_capacity, chunk_store.clone());
+ Some(cache)
+ } else {
+ None
+ };
+
Ok(DataStoreImpl {
chunk_store,
gc_mutex: Mutex::new(()),
@@ -446,6 +507,7 @@ impl DataStore {
last_digest,
sync_level: tuning.sync_level.unwrap_or_default(),
backend_config,
+ lru_store_caching,
})
}
@@ -1580,6 +1642,12 @@ impl DataStore {
chunk_count += 1;
if atime < min_atime {
+ if let Some(cache) = self.cache() {
+ let mut digest_bytes = [0u8; 32];
+ hex::decode_to_slice(digest.as_bytes(), &mut digest_bytes)?;
+ // ignore errors, phase 3 will retry cleanup anyways
+ let _ = cache.remove(&digest_bytes);
+ }
delete_list.push(content.key);
if bad {
gc_status.removed_bad += 1;
diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
index ca6fdb7d8..b9eb035c2 100644
--- a/pbs-datastore/src/lib.rs
+++ b/pbs-datastore/src/lib.rs
@@ -217,3 +217,6 @@ pub use snapshot_reader::SnapshotReader;
mod local_chunk_reader;
pub use local_chunk_reader::LocalChunkReader;
+
+mod local_datastore_lru_cache;
+pub use local_datastore_lru_cache::LocalDatastoreLruCache;
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
new file mode 100644
index 000000000..bb64c52f3
--- /dev/null
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -0,0 +1,172 @@
+//! Use a local datastore as cache for operations on a datastore attached via
+//! a network layer (e.g. via the S3 backend).
+
+use std::future::Future;
+use std::sync::Arc;
+
+use anyhow::{bail, Error};
+use http_body_util::BodyExt;
+
+use pbs_tools::async_lru_cache::{AsyncCacher, AsyncLruCache};
+use proxmox_s3_client::S3Client;
+
+use crate::ChunkStore;
+use crate::DataBlob;
+
+#[derive(Clone)]
+pub struct S3Cacher {
+ client: Arc<S3Client>,
+ store: Arc<ChunkStore>,
+}
+
+impl AsyncCacher<[u8; 32], ()> for S3Cacher {
+ fn fetch(
+ &self,
+ key: [u8; 32],
+ ) -> Box<dyn Future<Output = Result<Option<()>, Error>> + Send + 'static> {
+ let client = self.client.clone();
+ let store = self.store.clone();
+ Box::new(async move {
+ let object_key = crate::s3::object_key_from_digest(&key)?;
+ match client.get_object(object_key).await? {
+ None => bail!("could not fetch object with key {}", hex::encode(key)),
+ Some(response) => {
+ let bytes = response.content.collect().await?.to_bytes();
+ let chunk = DataBlob::from_raw(bytes.to_vec())?;
+ store.insert_chunk(&chunk, &key)?;
+ Ok(Some(()))
+ }
+ }
+ })
+ }
+}
+
+impl S3Cacher {
+ pub fn new(client: Arc<S3Client>, store: Arc<ChunkStore>) -> Self {
+ Self { client, store }
+ }
+}
+
+/// LRU cache using local datastore for caching chunks
+///
+/// Uses a LRU cache, but without storing the values in-memory but rather
+/// on the filesystem
+pub struct LocalDatastoreLruCache {
+ cache: AsyncLruCache<[u8; 32], ()>,
+ store: Arc<ChunkStore>,
+}
+
+impl LocalDatastoreLruCache {
+ pub fn new(capacity: usize, store: Arc<ChunkStore>) -> Self {
+ Self {
+ cache: AsyncLruCache::new(capacity),
+ store,
+ }
+ }
+
+ /// Insert a new chunk into the local datastore cache.
+ ///
+ /// Fails if the chunk cannot be inserted successfully.
+ pub fn insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
+ self.store.insert_chunk(chunk, digest)?;
+ self.cache.insert(*digest, (), |digest| {
+ let (path, _digest_str) = self.store.chunk_path(&digest);
+ // Truncate to free up space but keep the inode around, since that
+ // is used as marker for chunks in use by garbage collection.
+ if let Err(err) = nix::unistd::truncate(&path, 0) {
+ if err != nix::errno::Errno::ENOENT {
+ return Err(Error::from(err));
+ }
+ }
+ Ok(())
+ })
+ }
+
+ /// Remove a chunk from the local datastore cache.
+ ///
+ /// Fails if the chunk cannot be deleted successfully.
+ pub fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
+ self.cache.remove(*digest);
+ let (path, _digest_str) = self.store.chunk_path(digest);
+ std::fs::remove_file(path).map_err(Error::from)
+ }
+
+ pub async fn access(
+ &self,
+ digest: &[u8; 32],
+ cacher: &mut S3Cacher,
+ ) -> Result<Option<DataBlob>, Error> {
+ if self
+ .cache
+ .access(*digest, cacher, |digest| {
+ let (path, _digest_str) = self.store.chunk_path(&digest);
+ // Truncate to free up space but keep the inode around, since that
+ // is used as marker for chunks in use by garbage collection.
+ if let Err(err) = nix::unistd::truncate(&path, 0) {
+ if err != nix::errno::Errno::ENOENT {
+ return Err(Error::from(err));
+ }
+ }
+ Ok(())
+ })
+ .await?
+ .is_some()
+ {
+ let (path, _digest_str) = self.store.chunk_path(digest);
+ let mut file = match std::fs::File::open(&path) {
+ Ok(file) => file,
+ Err(err) => {
+ // Expected chunk to be present since LRU cache has it, but it is missing
+ // locally, try to fetch again
+ if err.kind() == std::io::ErrorKind::NotFound {
+ let object_key = crate::s3::object_key_from_digest(digest)?;
+ match cacher.client.get_object(object_key).await? {
+ None => {
+ bail!("could not fetch object with key {}", hex::encode(digest))
+ }
+ Some(response) => {
+ let bytes = response.content.collect().await?.to_bytes();
+ let chunk = DataBlob::from_raw(bytes.to_vec())?;
+ self.store.insert_chunk(&chunk, digest)?;
+ std::fs::File::open(&path)?
+ }
+ }
+ } else {
+ return Err(Error::from(err));
+ }
+ }
+ };
+ let chunk = match DataBlob::load_from_reader(&mut file) {
+ Ok(chunk) => chunk,
+ Err(err) => {
+ use std::io::Seek;
+ // Check if file is empty marker file, try fetching content if so
+ if file.seek(std::io::SeekFrom::End(0))? == 0 {
+ let object_key = crate::s3::object_key_from_digest(digest)?;
+ match cacher.client.get_object(object_key).await? {
+ None => {
+ bail!("could not fetch object with key {}", hex::encode(digest))
+ }
+ Some(response) => {
+ let bytes = response.content.collect().await?.to_bytes();
+ let chunk = DataBlob::from_raw(bytes.to_vec())?;
+ self.store.insert_chunk(&chunk, digest)?;
+ let mut file = std::fs::File::open(&path)?;
+ DataBlob::load_from_reader(&mut file)?
+ }
+ }
+ } else {
+ return Err(err);
+ }
+ }
+ };
+ Ok(Some(chunk))
+ } else {
+ Ok(None)
+ }
+ }
+
+ pub fn contains(&self, digest: &[u8; 32]) -> bool {
+ self.cache.contains(*digest)
+ }
+}
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 31/45] api: backup: use local datastore cache on s3 backend chunk upload
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (38 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 30/45] datastore: add local datastore cache for network attached storages Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 32/45] api: reader: use local datastore cache on s3 backend chunk fetching Christian Ebner
` (15 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Take advantage of the local datastore cache to avoid re-uploading
already known chunks. This not only helps improve the backup/upload
speeds, but also avoids additional costs by reducing the number of
requests and the amount of payload data transferred to the S3 object
store api.
If the cache is present, check whether it contains the chunk and skip
the upload altogether if it does. Otherwise, read the chunk into
memory, upload it to the S3 object store api and insert it into the
local datastore cache.
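Schematically, the per-chunk decision is as in the following
simplified sketch (`needs_upload` is not an actual helper in the
patch, which inlines this logic in upload_to_backend()):

use anyhow::Error;
use pbs_datastore::DataStore;

fn needs_upload(
    datastore: &DataStore,
    digest: [u8; 32],
) -> Result<bool, Error> {
    // Known to the LRU cache: the chunk is on the backend already.
    if datastore.cache_contains(&digest) {
        return Ok(false);
    }
    // Chunk file (or marker) exists locally: it was uploaded in the
    // past and not garbage collected, so present on S3 as well.
    if datastore.cond_touch_chunk(&digest, false)? {
        return Ok(false);
    }
    // New chunk: upload to S3 first, then insert into the local cache.
    Ok(true)
}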
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/backup/upload_chunk.rs | 36 +++++++++++++++++++++++++++++++--
src/server/pull.rs | 4 ++++
2 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
index 3ad8c3c75..d97975b34 100644
--- a/src/api2/backup/upload_chunk.rs
+++ b/src/api2/backup/upload_chunk.rs
@@ -2,7 +2,7 @@ use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll};
-use anyhow::{bail, format_err, Error};
+use anyhow::{bail, format_err, Context as AnyhowContext, Error};
use futures::*;
use hex::FromHex;
use http_body_util::{BodyDataStream, BodyExt};
@@ -262,8 +262,40 @@ async fn upload_to_backend(
);
}
+ // Avoid re-upload to S3 if the chunk is either present in the LRU cache or the chunk
+ // file exists on the filesystem. The latter means that the chunk has been present in the
+ // past and was not cleaned up by garbage collection, so it is contained in the S3 object store.
+ if env.datastore.cache_contains(&digest) {
+ tracing::info!("Skip upload of cached chunk {}", hex::encode(digest));
+ return Ok((digest, size, encoded_size, true));
+ }
+ if let Ok(true) = env.datastore.cond_touch_chunk(&digest, false) {
+ tracing::info!(
+ "Skip upload of already encountered chunk {}",
+ hex::encode(digest)
+ );
+ return Ok((digest, size, encoded_size, true));
+ }
+
+ tracing::info!("Upload of new chunk {}", hex::encode(digest));
let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
- let is_duplicate = s3_client.upload_with_retry(object_key, data, false).await?;
+ let is_duplicate = s3_client
+ .upload_with_retry(object_key, data.clone(), false)
+ .await
+ .context("failed to upload chunk to s3 backend")?;
+
+ // Only insert the chunk into the cache after it has been successfully uploaded.
+ // Although less performant than doing this in parallel, it is required for consistency
+ // since chunks are considered present on the backend if the file exists in the local
+ // cache store.
+ let datastore = env.datastore.clone();
+ tracing::info!("Caching of chunk {}", hex::encode(digest));
+ let _ = tokio::task::spawn_blocking(move || {
+ let chunk = DataBlob::from_raw(data.to_vec())?;
+ datastore.cache_insert(&digest, &chunk)
+ })
+ .await?;
+
Ok((digest, size, encoded_size, is_duplicate))
}
}
diff --git a/src/server/pull.rs b/src/server/pull.rs
index fe87359ab..e34766226 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -173,6 +173,10 @@ async fn pull_index_chunks<I: IndexFile>(
target2.insert_chunk(&chunk, &digest)?;
}
DatastoreBackend::S3(s3_client) => {
+ if target2.cache_contains(&digest) {
+ return Ok(());
+ }
+ target2.cache_insert(&digest, &chunk)?;
let data = chunk.raw_data().to_vec();
let upload_data = hyper::body::Bytes::from(data);
let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 32/45] api: reader: use local datastore cache on s3 backend chunk fetching
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (39 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 31/45] api: backup: use local datastore cache on s3 backend chunk upload Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 33/45] datastore: local chunk reader: get cached chunk from local cache store Christian Ebner
` (14 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Take advantage of the local datastore filesystem cache for datastores
backed by an s3 object store in order to reduce the number of requests
and latency, and to increase throughput.
Reducing the number of requests is also cost-beneficial for S3 object
stores that charge for fetching objects.
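Conceptually, the read path follows a fetch-or-populate pattern. A
self-contained toy model (plain std types, not the PBS api): `access`
consults the local store first and only invokes the cacher, which
stands in for the S3 download, on a miss, inserting the result before
returning it so subsequent reads stay local.

use std::collections::HashMap;

struct Cache {
    map: HashMap<[u8; 32], Vec<u8>>,
}

impl Cache {
    // Return the cached chunk, or fetch it via the cacher and remember it.
    fn access<F>(&mut self, digest: [u8; 32], mut cacher: F) -> Option<Vec<u8>>
    where
        F: FnMut(&[u8; 32]) -> Option<Vec<u8>>,
    {
        if let Some(data) = self.map.get(&digest) {
            return Some(data.clone());
        }
        let data = cacher(&digest)?;
        self.map.insert(digest, data.clone());
        Some(data)
    }
}

fn main() {
    let mut cache = Cache { map: HashMap::new() };
    // The first access misses and "downloads"; the second is served locally.
    let fetched = cache.access([0u8; 32], |_| Some(b"chunk".to_vec()));
    assert_eq!(fetched, Some(b"chunk".to_vec()));
    let cached = cache.access([0u8; 32], |_| None);
    assert_eq!(cached, Some(b"chunk".to_vec()));
}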
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/api2/reader/mod.rs | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs
index 997d9ca77..846493c61 100644
--- a/src/api2/reader/mod.rs
+++ b/src/api2/reader/mod.rs
@@ -327,7 +327,28 @@ fn download_chunk(
let body = match &env.backend {
DatastoreBackend::Filesystem => load_from_filesystem(env, &digest)?,
- DatastoreBackend::S3(s3_client) => fetch_from_object_store(s3_client, &digest).await?,
+ DatastoreBackend::S3(s3_client) => {
+ match env.datastore.cache() {
+ None => fetch_from_object_store(s3_client, &digest).await?,
+ Some(cache) => {
+ let mut cacher = env
+ .datastore
+ .cacher()?
+ .ok_or(format_err!("no cacher for datastore"))?;
+ // Download from the object store, insert into the local cache store and
+ // read from file. Can this be optimized?
+ let chunk =
+ cache
+ .access(&digest, &mut cacher)
+ .await?
+ .ok_or(format_err!(
+ "unable to access chunk with digest {}",
+ hex::encode(digest)
+ ))?;
+ Body::from(chunk.raw_data().to_owned())
+ }
+ }
+ }
};
// fixme: set other headers ?
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 33/45] datastore: local chunk reader: get cached chunk from local cache store
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (40 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 32/45] api: reader: use local datastore cache on s3 backend chunk fetching Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 11:36 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 34/45] api: backup: add no-cache flag to bypass local datastore cache Christian Ebner
` (13 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Check if a chunk is contained in the local cache and, if so, prefer
fetching it from there instead of pulling it via the S3 api. This
improves performance and reduces the number of requests to the backend.
Basic restore performance tests:
Restored a snapshot containing the linux git repository (on-disk size
5.069 GiB, compressed 3.718 GiB) from an AWS S3 backed datastore, with
and without cached contents:
non cached: 691.95 s
all cached: 74.89 s
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
pbs-datastore/src/local_chunk_reader.rs | 31 +++++++++++++++++++++----
1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/pbs-datastore/src/local_chunk_reader.rs b/pbs-datastore/src/local_chunk_reader.rs
index f5aa217ae..7ad44c4fa 100644
--- a/pbs-datastore/src/local_chunk_reader.rs
+++ b/pbs-datastore/src/local_chunk_reader.rs
@@ -2,7 +2,7 @@ use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
-use anyhow::{bail, Error};
+use anyhow::{bail, format_err, Error};
use http_body_util::BodyExt;
use pbs_api_types::CryptMode;
@@ -68,9 +68,18 @@ impl ReadChunk for LocalChunkReader {
fn read_raw_chunk(&self, digest: &[u8; 32]) -> Result<DataBlob, Error> {
let chunk = match &self.backend {
DatastoreBackend::Filesystem => self.store.load_chunk(digest)?,
- DatastoreBackend::S3(s3_client) => {
- proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?
- }
+ DatastoreBackend::S3(s3_client) => match self.store.cache() {
+ None => proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?,
+ Some(cache) => {
+ let mut cacher = self
+ .store
+ .cacher()?
+ .ok_or(format_err!("no cacher for datastore"))?;
+ proxmox_async::runtime::block_on(cache.access(digest, &mut cacher))?.ok_or(
+ format_err!("unable to access chunk with digest {}", hex::encode(digest)),
+ )?
+ }
+ },
};
self.ensure_crypt_mode(chunk.crypt_mode()?)?;
@@ -98,7 +107,19 @@ impl AsyncReadChunk for LocalChunkReader {
let raw_data = tokio::fs::read(&path).await?;
DataBlob::load_from_reader(&mut &raw_data[..])?
}
- DatastoreBackend::S3(s3_client) => fetch(s3_client.clone(), digest).await?,
+ DatastoreBackend::S3(s3_client) => match self.store.cache() {
+ None => fetch(s3_client.clone(), digest).await?,
+ Some(cache) => {
+ let mut cacher = self
+ .store
+ .cacher()?
+ .ok_or(format_err!("no cacher for datastore"))?;
+ cache.access(digest, &mut cacher).await?.ok_or(format_err!(
+ "unable to access chunk with digest {}",
+ hex::encode(digest)
+ ))?
+ }
+ },
};
self.ensure_crypt_mode(chunk.crypt_mode()?)?;
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 34/45] api: backup: add no-cache flag to bypass local datastore cache
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (41 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 33/45] datastore: local chunk reader: get cached chunk from local cache store Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 11:41 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 35/45] api/datastore: implement refresh endpoint for stores with s3 backend Christian Ebner
` (12 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Adds the `no-cache` flag so the client can request to bypass the
local datastore cache for chunk uploads. This is mainly intended for
debugging and benchmarking, but can be used in cases where the caching
is known to be ineffective (no possible deduplication).
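A rough self-contained sketch of how the flag travels (toy types, not
the PBS api; requires the serde_json crate): it is parsed once when
the backup protocol is upgraded, stored on the backup environment, and
consulted on every chunk upload.

use serde_json::json;

struct BackupEnvironment {
    no_cache: bool,
}

fn upgrade_to_backup_protocol(param: &serde_json::Value) -> BackupEnvironment {
    // Parsed once on protocol upgrade, defaulting to false when absent.
    BackupEnvironment {
        no_cache: param["no-cache"].as_bool().unwrap_or(false),
    }
}

fn main() {
    let env = upgrade_to_backup_protocol(&json!({ "no-cache": true }));
    assert!(env.no_cache); // the upload path then bypasses the cache entirely
}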
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
examples/upload-speed.rs | 1 +
pbs-client/src/backup_writer.rs | 4 +++-
proxmox-backup-client/src/benchmark.rs | 1 +
proxmox-backup-client/src/main.rs | 8 ++++++++
src/api2/backup/environment.rs | 3 +++
src/api2/backup/mod.rs | 3 +++
src/api2/backup/upload_chunk.rs | 9 +++++++++
src/server/push.rs | 1 +
8 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/examples/upload-speed.rs b/examples/upload-speed.rs
index e4b570ec5..8a6594a47 100644
--- a/examples/upload-speed.rs
+++ b/examples/upload-speed.rs
@@ -25,6 +25,7 @@ async fn upload_speed() -> Result<f64, Error> {
&(BackupType::Host, "speedtest".to_string(), backup_time).into(),
false,
true,
+ false,
)
.await?;
diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
index 1253ef561..ce5bd9375 100644
--- a/pbs-client/src/backup_writer.rs
+++ b/pbs-client/src/backup_writer.rs
@@ -82,6 +82,7 @@ impl BackupWriter {
backup: &BackupDir,
debug: bool,
benchmark: bool,
+ no_cache: bool,
) -> Result<Arc<BackupWriter>, Error> {
let mut param = json!({
"backup-type": backup.ty(),
@@ -89,7 +90,8 @@ impl BackupWriter {
"backup-time": backup.time,
"store": datastore,
"debug": debug,
- "benchmark": benchmark
+ "benchmark": benchmark,
+ "no-cache": no_cache,
});
if !ns.is_root() {
diff --git a/proxmox-backup-client/src/benchmark.rs b/proxmox-backup-client/src/benchmark.rs
index a6f24d745..ed21c7a91 100644
--- a/proxmox-backup-client/src/benchmark.rs
+++ b/proxmox-backup-client/src/benchmark.rs
@@ -236,6 +236,7 @@ async fn test_upload_speed(
&(BackupType::Host, "benchmark".to_string(), backup_time).into(),
false,
true,
+ true,
)
.await?;
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 44f4f5db5..83fc9309a 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -742,6 +742,12 @@ fn spawn_catalog_upload(
optional: true,
default: false,
},
+ "no-cache": {
+ type: Boolean,
+ description: "Bypass local datastore cache for network storages.",
+ optional: true,
+ default: false,
+ },
}
}
)]
@@ -754,6 +760,7 @@ async fn create_backup(
change_detection_mode: Option<BackupDetectionMode>,
dry_run: bool,
skip_e2big_xattr: bool,
+ no_cache: bool,
limit: ClientRateLimitConfig,
_info: &ApiMethod,
_rpcenv: &mut dyn RpcEnvironment,
@@ -960,6 +967,7 @@ async fn create_backup(
&snapshot,
true,
false,
+ no_cache,
)
.await?;
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 369385368..448659e74 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -113,6 +113,7 @@ pub struct BackupEnvironment {
result_attributes: Value,
auth_id: Authid,
pub debug: bool,
+ pub no_cache: bool,
pub formatter: &'static dyn OutputFormatter,
pub worker: Arc<WorkerTask>,
pub datastore: Arc<DataStore>,
@@ -129,6 +130,7 @@ impl BackupEnvironment {
worker: Arc<WorkerTask>,
datastore: Arc<DataStore>,
backup_dir: BackupDir,
+ no_cache: bool,
) -> Result<Self, Error> {
let state = SharedBackupState {
finished: false,
@@ -149,6 +151,7 @@ impl BackupEnvironment {
worker,
datastore,
debug: tracing::enabled!(tracing::Level::DEBUG),
+ no_cache,
formatter: JSON_FORMATTER,
backup_dir,
last_backup: None,
diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
index 026f1f106..ae61ff697 100644
--- a/src/api2/backup/mod.rs
+++ b/src/api2/backup/mod.rs
@@ -53,6 +53,7 @@ pub const API_METHOD_UPGRADE_BACKUP: ApiMethod = ApiMethod::new(
("backup-time", false, &BACKUP_TIME_SCHEMA),
("debug", true, &BooleanSchema::new("Enable verbose debug logging.").schema()),
("benchmark", true, &BooleanSchema::new("Job is a benchmark (do not keep data).").schema()),
+ ("no-cache", true, &BooleanSchema::new("Disable local datastore cache for network storages").schema()),
]),
)
).access(
@@ -79,6 +80,7 @@ fn upgrade_to_backup_protocol(
async move {
let debug = param["debug"].as_bool().unwrap_or(false);
let benchmark = param["benchmark"].as_bool().unwrap_or(false);
+ let no_cache = param["no-cache"].as_bool().unwrap_or(false);
let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
@@ -214,6 +216,7 @@ fn upgrade_to_backup_protocol(
worker.clone(),
datastore,
backup_dir,
+ no_cache,
)?;
env.debug = debug;
diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
index d97975b34..623b405dd 100644
--- a/src/api2/backup/upload_chunk.rs
+++ b/src/api2/backup/upload_chunk.rs
@@ -262,6 +262,15 @@ async fn upload_to_backend(
);
}
+ if env.no_cache {
+ let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
+ let is_duplicate = s3_client
+ .upload_with_retry(object_key, data, false)
+ .await
+ .context("failed to upload chunk to s3 backend")?;
+ return Ok((digest, size, encoded_size, is_duplicate));
+ }
+
// Avoid re-upload to S3 if the chunk is either present in the LRU cache or the chunk
// file exists on filesystem. The latter means that the chunk has been present in the
// past an was not cleaned up by garbage collection, so contained in the S3 object store.
diff --git a/src/server/push.rs b/src/server/push.rs
index e71012ed8..6a31d2abe 100644
--- a/src/server/push.rs
+++ b/src/server/push.rs
@@ -828,6 +828,7 @@ pub(crate) async fn push_snapshot(
snapshot,
false,
false,
+ false,
)
.await?;
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 35/45] api/datastore: implement refresh endpoint for stores with s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (42 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 34/45] api: backup: add no-cache flag to bypass local datastore cache Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 12:01 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 36/45] cli: add dedicated subcommand for datastore s3 refresh Christian Ebner
` (11 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Allows to easily refresh the contents of the local cache store for
datastores backed by an S3 object store.
In order to guarantee that no read or write operations are ongoing,
the store is first set into the maintenance mode `S3Refresh`. Objects
are then fetched into a temporary directory to avoid losing contents
and consistency in case of an error. Once all objects have been
fetched, the existing contents are cleared out and the newly fetched
contents are moved in place.
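The per-type swap at the end can be sketched as follows (std only,
with made-up demo paths; the real implementation below additionally
handles ownership, permissions and error context):

use std::fs;
use std::io;
use std::path::Path;

// Replace a live directory with freshly fetched contents by renaming a
// temporary sibling into place. ENOENT is tolerated on both sides, so a
// type directory that exists in neither location is not an error.
fn swap_dir(tmp_dir: &Path, live_dir: &Path) -> io::Result<()> {
    match fs::remove_dir_all(live_dir) {
        Ok(()) => {}
        Err(err) if err.kind() == io::ErrorKind::NotFound => {}
        Err(err) => return Err(err),
    }
    match fs::rename(tmp_dir, live_dir) {
        Ok(()) => Ok(()),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}

fn main() -> io::Result<()> {
    fs::create_dir_all("tmp_base/vm")?;
    fs::write("tmp_base/vm/marker", b"fetched")?;
    fs::create_dir_all("store_base/vm")?;
    for ty in ["vm", "ct", "host", "ns"] {
        swap_dir(&Path::new("tmp_base").join(ty), &Path::new("store_base").join(ty))?;
    }
    assert!(fs::read("store_base/vm/marker").is_ok());
    Ok(())
}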
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- add more error context
- fix clippy warning
pbs-datastore/src/datastore.rs | 172 ++++++++++++++++++++++++++++++++-
src/api2/admin/datastore.rs | 34 +++++++
2 files changed, 205 insertions(+), 1 deletion(-)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index cab0f5b4d..c63759f9a 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -10,11 +10,13 @@ use anyhow::{bail, format_err, Context, Error};
use http_body_util::BodyExt;
use nix::unistd::{unlinkat, UnlinkatFlags};
use pbs_tools::lru_cache::LruCache;
+use proxmox_lang::try_block;
+use tokio::io::AsyncWriteExt;
use tracing::{info, warn};
use proxmox_human_byte::HumanByte;
use proxmox_s3_client::{
- S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3PathPrefix,
+ S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3ObjectKey, S3PathPrefix,
};
use proxmox_schema::ApiType;
@@ -2132,4 +2134,172 @@ impl DataStore {
pub fn old_locking(&self) -> bool {
*OLD_LOCKING
}
+
+ /// Set the datastore's maintenance mode to `S3Refresh`, fetch from S3 object store, clear and
+ /// replace the local cache store contents. Once finished disable the maintenance mode again.
+ /// Returns with error for other datastore backends without setting the maintenance mode.
+ pub async fn s3_refresh(self: &Arc<Self>) -> Result<(), Error> {
+ match self.backend()? {
+ DatastoreBackend::Filesystem => bail!("store '{}' not backed by S3", self.name()),
+ DatastoreBackend::S3(s3_client) => {
+ try_block!({
+ let _lock = pbs_config::datastore::lock_config()?;
+ let (mut section_config, _digest) = pbs_config::datastore::config()?;
+ let mut datastore: DataStoreConfig =
+ section_config.lookup("datastore", self.name())?;
+ datastore.set_maintenance_mode(Some(MaintenanceMode {
+ ty: MaintenanceType::S3Refresh,
+ message: None,
+ }))?;
+ section_config.set_data(self.name(), "datastore", &datastore)?;
+ pbs_config::datastore::save_config(§ion_config)?;
+ drop(_lock);
+ Ok::<(), Error>(())
+ })
+ .context("failed to set maintenance mode")?;
+
+ let store_base = self.base_path();
+
+ let tmp_base = proxmox_sys::fs::make_tmp_dir(&store_base, None)
+ .context("failed to create temporary content folder in {store_base}")?;
+
+ let backup_user = pbs_config::backup_user().context("failed to get backup user")?;
+ let mode = nix::sys::stat::Mode::from_bits_truncate(0o0644);
+ let file_create_options = CreateOptions::new()
+ .perm(mode)
+ .owner(backup_user.uid)
+ .group(backup_user.gid);
+ let mode = nix::sys::stat::Mode::from_bits_truncate(0o0755);
+ let dir_create_options = CreateOptions::new()
+ .perm(mode)
+ .owner(backup_user.uid)
+ .group(backup_user.gid);
+
+ let list_prefix = S3PathPrefix::Some(S3_CONTENT_PREFIX.to_string());
+ let store_prefix = format!("{}/{S3_CONTENT_PREFIX}/", self.name());
+ let mut next_continuation_token: Option<String> = None;
+ loop {
+ let list_objects_result = s3_client
+ .list_objects_v2(&list_prefix, next_continuation_token.as_deref())
+ .await
+ .context("failed to list object")?;
+
+ let objects_to_fetch: Vec<S3ObjectKey> = list_objects_result
+ .contents
+ .into_iter()
+ .map(|item| item.key)
+ .collect();
+
+ for object_key in objects_to_fetch {
+ let object_path = format!("{object_key}");
+ let object_path = object_path.strip_prefix(&store_prefix).with_context(||
+ format!("failed to strip store context prefix {store_prefix} for {object_key}")
+ )?;
+ if object_path.ends_with(NAMESPACE_MARKER_FILENAME) {
+ continue;
+ }
+
+ info!("Fetching object {object_path}");
+
+ let file_path = tmp_base.join(object_path);
+ if let Some(parent) = file_path.parent() {
+ proxmox_sys::fs::create_path(
+ parent,
+ Some(dir_create_options),
+ Some(dir_create_options),
+ )?;
+ }
+
+ let mut target_file = tokio::fs::OpenOptions::new()
+ .write(true)
+ .create(true)
+ .truncate(true)
+ .read(true)
+ .open(&file_path)
+ .await
+ .with_context(|| {
+ format!("failed to create target file {file_path:?}")
+ })?;
+
+ if let Some(response) = s3_client
+ .get_object(object_key)
+ .await
+ .with_context(|| format!("failed to fetch object {object_path}"))?
+ {
+ let data = response
+ .content
+ .collect()
+ .await
+ .context("failed to collect object contents")?;
+ target_file
+ .write_all(&data.to_bytes())
+ .await
+ .context("failed to write to target file")?;
+ file_create_options
+ .apply_to(&mut target_file, &file_path)
+ .context("failed to set target file create options")?;
+ target_file
+ .flush()
+ .await
+ .context("failed to flush target file")?;
+ } else {
+ bail!("failed to download {object_path}, not found");
+ }
+ }
+
+ if list_objects_result.is_truncated {
+ next_continuation_token = list_objects_result
+ .next_continuation_token
+ .as_ref()
+ .cloned();
+ continue;
+ }
+ break;
+ }
+
+ for ty in ["vm", "ct", "host", "ns"] {
+ let store_base_clone = store_base.clone();
+ let tmp_base_clone = tmp_base.clone();
+ tokio::task::spawn_blocking(move || {
+ let type_dir = store_base_clone.join(ty);
+ if let Err(err) = std::fs::remove_dir_all(&type_dir) {
+ if err.kind() != io::ErrorKind::NotFound {
+ return Err(err).with_context(|| {
+ format!("failed to remove old contents in {type_dir:?}")
+ });
+ }
+ }
+ let tmp_type_dir = tmp_base_clone.join(ty);
+ if let Err(err) = std::fs::rename(&tmp_type_dir, &type_dir) {
+ if err.kind() != io::ErrorKind::NotFound {
+ return Err(err)
+ .with_context(|| format!("failed to rename {tmp_type_dir:?}"));
+ }
+ }
+ Ok::<(), Error>(())
+ })
+ .await?
+ .with_context(|| format!("failed to refresh {store_base:?}"))?;
+ }
+
+ std::fs::remove_dir_all(&tmp_base).with_context(|| {
+ format!("failed to cleanup temporary content in {tmp_base:?}")
+ })?;
+
+ try_block!({
+ let _lock = pbs_config::datastore::lock_config()?;
+ let (mut section_config, _digest) = pbs_config::datastore::config()?;
+ let mut datastore: DataStoreConfig =
+ section_config.lookup("datastore", self.name())?;
+ datastore.set_maintenance_mode(None)?;
+ section_config.set_data(self.name(), "datastore", &datastore)?;
+ pbs_config::datastore::save_config(§ion_config)?;
+ drop(_lock);
+ Ok::<(), Error>(())
+ })
+ .context("failed to clear maintenance mode")?;
+ }
+ }
+ Ok(())
+ }
}
diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 80740e3fb..41cbee4de 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -2707,6 +2707,39 @@ pub async fn unmount(store: String, rpcenv: &mut dyn RpcEnvironment) -> Result<V
Ok(json!(upid))
}
+#[api(
+ protected: true,
+ input: {
+ properties: {
+ store: {
+ schema: DATASTORE_SCHEMA,
+ },
+ }
+ },
+ returns: {
+ schema: UPID_SCHEMA,
+ },
+ access: {
+ permission: &Permission::Privilege(&["datastore", "{store}"], PRIV_DATASTORE_MODIFY, false),
+ },
+)]
+/// Refresh datastore contents from S3 to local cache store.
+pub async fn s3_refresh(store: String, rpcenv: &mut dyn RpcEnvironment) -> Result<Value, Error> {
+ let datastore = DataStore::lookup_datastore(&store, Some(Operation::Lookup))?;
+ let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
+ let to_stdout = rpcenv.env_type() == RpcEnvironmentType::CLI;
+
+ let upid = WorkerTask::spawn(
+ "s3-refresh",
+ Some(store),
+ auth_id.to_string(),
+ to_stdout,
+ move |_worker| async move { datastore.s3_refresh().await },
+ )?;
+
+ Ok(json!(upid))
+}
+
#[sortable]
const DATASTORE_INFO_SUBDIRS: SubdirMap = &[
(
@@ -2773,6 +2806,7 @@ const DATASTORE_INFO_SUBDIRS: SubdirMap = &[
&Router::new().download(&API_METHOD_PXAR_FILE_DOWNLOAD),
),
("rrd", &Router::new().get(&API_METHOD_GET_RRD_STATS)),
+ ("s3-refresh", &Router::new().put(&API_METHOD_S3_REFRESH)),
(
"snapshots",
&Router::new()
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 36/45] cli: add dedicated subcommand for datastore s3 refresh
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (43 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 35/45] api/datastore: implement refresh endpoint for stores with s3 backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 37/45] ui: render s3 refresh as valid maintenance type and task description Christian Ebner
` (10 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Allows to manually trigger an s3 refresh via proxmox-backup-manager
by calling the corresponding api endpoint handler.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
src/bin/proxmox_backup_manager/datastore.rs | 30 +++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/src/bin/proxmox_backup_manager/datastore.rs b/src/bin/proxmox_backup_manager/datastore.rs
index 1922a55a2..4d8b8bf3a 100644
--- a/src/bin/proxmox_backup_manager/datastore.rs
+++ b/src/bin/proxmox_backup_manager/datastore.rs
@@ -290,6 +290,30 @@ async fn uuid_mount(param: Value, _rpcenv: &mut dyn RpcEnvironment) -> Result<Va
Ok(Value::Null)
}
+#[api(
+ protected: true,
+ input: {
+ properties: {
+ store: {
+ schema: DATASTORE_SCHEMA,
+ },
+ },
+ },
+)]
+/// Refresh datastore contents from S3 to local cache store.
+async fn s3_refresh(mut param: Value, rpcenv: &mut dyn RpcEnvironment) -> Result<(), Error> {
+ param["node"] = "localhost".into();
+
+ let info = &api2::admin::datastore::API_METHOD_S3_REFRESH;
+ let result = match info.handler {
+ ApiHandler::Async(handler) => (handler)(param, info, rpcenv).await?,
+ _ => unreachable!(),
+ };
+
+ crate::wait_for_local_worker(result.as_str().unwrap()).await?;
+ Ok(())
+}
+
pub fn datastore_commands() -> CommandLineInterface {
let cmd_def = CliCommandMap::new()
.insert("list", CliCommand::new(&API_METHOD_LIST_DATASTORES))
@@ -302,6 +326,12 @@ pub fn datastore_commands() -> CommandLineInterface {
pbs_config::datastore::complete_removable_datastore_name,
),
)
+ .insert(
+ "s3-refresh",
+ CliCommand::new(&API_METHOD_S3_REFRESH)
+ .arg_param(&["store"])
+ .completion_cb("store", pbs_config::datastore::complete_datastore_name),
+ )
.insert(
"show",
CliCommand::new(&API_METHOD_SHOW_DATASTORE)
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 37/45] ui: render s3 refresh as valid maintenance type and task description
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (44 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 36/45] cli: add dedicated subcommand for datastore s3 refresh Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 38/45] ui: expose s3 refresh button for datastores backed by object store Christian Ebner
` (9 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Analogous to the maintenance type `unmount`, show `s3-refresh` as a
translated string in the maintenance mode options and the task
description.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
www/Utils.js | 4 ++++
www/window/MaintenanceOptions.js | 6 +++++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/www/Utils.js b/www/Utils.js
index 30b4a6e79..61f504c3e 100644
--- a/www/Utils.js
+++ b/www/Utils.js
@@ -441,6 +441,7 @@ Ext.define('PBS.Utils', {
prunejob: (type, id) => PBS.Utils.render_prune_job_worker_id(id, gettext('Prune Job')),
reader: (type, id) => PBS.Utils.render_datastore_worker_id(id, gettext('Read Objects')),
'rewind-media': [gettext('Drive'), gettext('Rewind Media')],
+ 's3-refresh': [gettext('Datastore'), gettext('S3 Refresh')],
sync: ['Datastore', gettext('Remote Sync')],
syncjob: [gettext('Sync Job'), gettext('Remote Sync')],
'tape-backup': (type, id) =>
@@ -838,6 +839,9 @@ Ext.define('PBS.Utils', {
case 'unmount':
modeText = gettext('Unmounting');
break;
+ case 's3-refresh':
+ modeText = gettext('S3 refresh');
+ break;
}
return `${modeText} ${extra}`;
},
diff --git a/www/window/MaintenanceOptions.js b/www/window/MaintenanceOptions.js
index 292353556..9a735e5e8 100644
--- a/www/window/MaintenanceOptions.js
+++ b/www/window/MaintenanceOptions.js
@@ -90,13 +90,17 @@ Ext.define('PBS.window.MaintenanceOptions', {
}
let unmounting = options['maintenance-type'] === 'unmount';
+ let s3Refresh = options['maintenance-type'] === 's3-refresh';
let defaultType = options['maintenance-type'] === '__default__';
if (unmounting) {
options['maintenance-type'] = gettext('Unmounting');
}
+ if (s3Refresh) {
+ options['maintenance-type'] = gettext('S3 Refresh');
+ }
me.callParent([options]);
- me.lookupReference('message-field').setDisabled(unmounting || defaultType);
+ me.lookupReference('message-field').setDisabled(unmounting || s3Refresh || defaultType);
},
});
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 38/45] ui: expose s3 refresh button for datastores backed by object store
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (45 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 37/45] ui: render s3 refresh as valid maintenance type and task description Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 12:46 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 39/45] datastore: conditionally upload atime marker chunk to s3 backend Christian Ebner
` (8 subsequent siblings)
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Allows to trigger a refresh of the local datastore contents from
the WebUI.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- no changes
www/datastore/Summary.js | 44 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/www/datastore/Summary.js b/www/datastore/Summary.js
index cdb34aea3..d8f59ebc5 100644
--- a/www/datastore/Summary.js
+++ b/www/datastore/Summary.js
@@ -301,6 +301,31 @@ Ext.define('PBS.DataStoreSummary', {
});
},
},
+ {
+ xtype: 'button',
+ text: gettext('S3 Refresh'),
+ hidden: true,
+ itemId: 's3RefreshButton',
+ reference: 's3RefreshButton',
+ handler: function () {
+ let me = this;
+ let datastore = me.up('panel').datastore;
+ Proxmox.Utils.API2Request({
+ url: `/admin/datastore/${datastore}/s3-refresh`,
+ method: 'PUT',
+ failure: (response) => Ext.Msg.alert(gettext('Error'), response.htmlStatus),
+ success: function (response, options) {
+ Ext.create('Proxmox.window.TaskViewer', {
+ upid: response.result.data,
+ taskDone: () => {
+ me.up('panel').statusStore.load();
+ Ext.ComponentQuery.query('navigationtree')[0]?.reloadStore();
+ },
+ }).show();
+ },
+ });
+ },
+ },
'->',
{
xtype: 'proxmoxRRDTypeSelector',
@@ -398,6 +423,7 @@ Ext.define('PBS.DataStoreSummary', {
me.mon(me.statusStore, 'load', (s, records, success) => {
let mountBtn = me.lookupReferenceHolder().lookupReference('mountButton');
let unmountBtn = me.lookupReferenceHolder().lookupReference('unmountButton');
+ let s3RefreshBtn = me.lookupReferenceHolder().lookupReference('s3RefreshButton');
if (!success) {
lastRequestWasFailue = true;
@@ -413,6 +439,16 @@ Ext.define('PBS.DataStoreSummary', {
success: (response) => {
let mode = response.result.data['maintenance-mode'];
let [type, _message] = PBS.Utils.parseMaintenanceMode(mode);
+
+ if (!type && response.result.data.backend) {
+ let backendConfig = PBS.Utils.parsePropertyString(
+ response.result.data.backend,
+ );
+ if (backendConfig.type === 's3') {
+ s3RefreshBtn.setDisabled(true);
+ }
+ }
+
if (!response.result.data['backing-device']) {
return;
}
@@ -466,6 +502,14 @@ Ext.define('PBS.DataStoreSummary', {
me.lookupReferenceHolder().lookupReference('mountButton').setHidden(!removable);
me.lookupReferenceHolder().lookupReference('unmountButton').setHidden(!removable);
+ if (data.backend) {
+ let backendConfig = PBS.Utils.parsePropertyString(data.backend);
+ let s3Backend = backendConfig.type === 's3';
+ me.lookupReferenceHolder()
+ .lookupReference('s3RefreshButton')
+ .setHidden(!s3Backend);
+ }
+
let path = Ext.htmlEncode(data.path);
me.down('pbsDataStoreInfo').setTitle(`${me.datastore} (${path})`);
me.down('pbsDataStoreNotes').setNotes(data.comment);
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 39/45] datastore: conditionally upload atime marker chunk to s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (46 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 38/45] ui: expose s3 refresh button for datastores backed by object store Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 40/45] bin: implement client subcommands for s3 configuration manipulation Christian Ebner
` (7 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Since commit b18eab64 ("fix #5982: garbage collection: check atime
updates are honored"), the 4 MiB fixed-size, unencrypted and
compressed chunk containing all zeros is inserted at datastore
creation if the atime safety check is enabled.
If the datastore is backed by an S3 object store, chunk uploads are
avoided by checking the presence of the chunks in the local cache
store. The all-zero chunk would therefore never be uploaded, since it
is already inserted locally.
Fix this by conditionally uploading the chunk before performing the
atime update check for datastores backed by S3.
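The required ordering can be modeled with a self-contained toy sketch
(sets standing in for the local chunk store and the bucket, not the
PBS api): once a chunk exists locally, presence checks would skip its
upload forever, so a chunk missing locally must be pushed to the
bucket before the local insert makes it look already known.

use std::collections::HashSet;

fn ensure_marker_chunk(
    local: &mut HashSet<[u8; 32]>,
    bucket: &mut HashSet<[u8; 32]>,
    digest: [u8; 32],
) {
    // Missing locally means it was never uploaded for this store: push it
    // to the bucket *before* inserting it locally.
    if !local.contains(&digest) {
        bucket.insert(digest);
    }
    local.insert(digest); // then insert locally for the atime check
}

fn main() {
    let (mut local, mut bucket) = (HashSet::new(), HashSet::new());
    ensure_marker_chunk(&mut local, &mut bucket, [0; 32]);
    assert!(bucket.contains(&[0; 32]) && local.contains(&[0; 32]));
}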
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- new in this version
pbs-datastore/src/chunk_store.rs | 26 +++++++++++++++++++++++---
pbs-datastore/src/datastore.rs | 20 ++++++++++----------
src/api2/config/datastore.rs | 5 ++++-
3 files changed, 37 insertions(+), 14 deletions(-)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 95f00e8d5..3e0994aed 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -9,6 +9,7 @@ use tracing::{info, warn};
use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
use proxmox_io::ReadExt;
+use proxmox_s3_client::S3Client;
use proxmox_sys::fs::{create_dir, create_path, file_type_from_file_stat, CreateOptions};
use proxmox_sys::process_locker::{
ProcessLockExclusiveGuard, ProcessLockSharedGuard, ProcessLocker,
@@ -454,11 +455,30 @@ impl ChunkStore {
/// Uses a 4 MiB fixed size, compressed but unencrypted chunk to test. The chunk is inserted in
/// the chunk store if not yet present.
/// Returns with error if the check could not be performed.
- pub fn check_fs_atime_updates(&self, retry_on_file_changed: bool) -> Result<(), Error> {
+ pub fn check_fs_atime_updates(
+ &self,
+ retry_on_file_changed: bool,
+ s3_client: Option<Arc<S3Client>>,
+ ) -> Result<(), Error> {
let (zero_chunk, digest) = DataChunkBuilder::build_zero_chunk(None, 4096 * 1024, true)?;
- let (pre_existing, _) = self.insert_chunk(&zero_chunk, &digest)?;
let (path, _digest) = self.chunk_path(&digest);
+ if let Some(ref s3_client) = s3_client {
+ if let Err(err) = std::fs::metadata(&path) {
+ if err.kind() == std::io::ErrorKind::NotFound {
+ let object_key = crate::s3::object_key_from_digest(&digest)?;
+ proxmox_async::runtime::block_on(s3_client.upload_with_retry(
+ object_key,
+ zero_chunk.raw_data().to_vec().into(),
+ false,
+ ))
+ .context("failed to upload chunk to s3 backend")?;
+ }
+ }
+ }
+
+ let (pre_existing, _) = self.insert_chunk(&zero_chunk, &digest)?;
+
// Take into account timestamp update granularity in the kernel
// Blocking the thread is fine here since this runs in a worker.
std::thread::sleep(Duration::from_secs(1));
@@ -478,7 +498,7 @@ impl ChunkStore {
// two metadata calls, try to check once again on changed file
if metadata_before.ino() != metadata_now.ino() {
if retry_on_file_changed {
- return self.check_fs_atime_updates(false);
+ return self.check_fs_atime_updates(false, s3_client);
}
bail!("chunk {path:?} changed twice during access time safety check, cannot proceed.");
}
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index c63759f9a..c72d1d228 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1528,10 +1528,19 @@ impl DataStore {
.parse_property_string(gc_store_config.tuning.as_deref().unwrap_or(""))?,
)?;
+ let s3_client = match self.backend()? {
+ DatastoreBackend::Filesystem => None,
+ DatastoreBackend::S3(s3_client) => {
+ proxmox_async::runtime::block_on(s3_client.head_bucket())
+ .context("failed to reach bucket")?;
+ Some(s3_client)
+ }
+ };
+
if tuning.gc_atime_safety_check.unwrap_or(true) {
self.inner
.chunk_store
- .check_fs_atime_updates(true)
+ .check_fs_atime_updates(true, s3_client.clone())
.context("atime safety check failed")?;
info!("Access time update check successful, proceeding with GC.");
} else {
@@ -1570,15 +1579,6 @@ impl DataStore {
1024 * 1024
};
- let s3_client = match self.backend()? {
- DatastoreBackend::Filesystem => None,
- DatastoreBackend::S3(s3_client) => {
- proxmox_async::runtime::block_on(s3_client.head_bucket())
- .context("failed to reach bucket")?;
- Some(s3_client)
- }
- };
-
info!("Start GC phase1 (mark used chunks)");
self.mark_used_chunks(
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index 0fb822c79..c5fca67bc 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -1,4 +1,5 @@
use std::path::{Path, PathBuf};
+use std::sync::Arc;
use ::serde::{Deserialize, Serialize};
use anyhow::{bail, format_err, Context, Error};
@@ -117,6 +118,7 @@ pub(crate) fn do_create_datastore(
.parse_property_string(datastore.tuning.as_deref().unwrap_or(""))?,
)?;
+ let mut backend_s3_client = None;
if let Some(ref backend_config) = datastore.backend {
let backend_config: DatastoreBackendConfig = backend_config.parse()?;
match backend_config.ty.unwrap_or_default() {
@@ -150,6 +152,7 @@ pub(crate) fn do_create_datastore(
// Fine to block since this runs in worker task
proxmox_async::runtime::block_on(s3_client.head_bucket())
.context("failed to access bucket")?;
+ backend_s3_client = Some(Arc::new(s3_client));
}
}
}
@@ -193,7 +196,7 @@ pub(crate) fn do_create_datastore(
if tuning.gc_atime_safety_check.unwrap_or(true) {
chunk_store
- .check_fs_atime_updates(true)
+ .check_fs_atime_updates(true, backend_s3_client)
.context("access time safety check failed")?;
info!("Access time update check successful.");
} else {
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 40/45] bin: implement client subcommands for s3 configuration manipulation
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (47 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 39/45] datastore: conditionally upload atime marker chunk to s3 backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 41/45] bin: expose reuse-datastore flag for proxmox-backup-manager Christian Ebner
` (6 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Implement and expose the proxmox-backup-manager commands to interact
with the s3 client configuration.
This mostly requires inserting the commands into the cli command map
and binding them to the corresponding api methods. The list method is
the only exception, as it requires rendering the output according to
the provided output format.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- new in this version
src/bin/proxmox_backup_manager/s3.rs | 70 +++++++++++++++++++++++++---
1 file changed, 63 insertions(+), 7 deletions(-)
diff --git a/src/bin/proxmox_backup_manager/s3.rs b/src/bin/proxmox_backup_manager/s3.rs
index 9bb89ff55..82bc9413a 100644
--- a/src/bin/proxmox_backup_manager/s3.rs
+++ b/src/bin/proxmox_backup_manager/s3.rs
@@ -1,4 +1,4 @@
-use proxmox_router::{cli::*, RpcEnvironment};
+use proxmox_router::{cli::*, ApiHandler, RpcEnvironment};
use proxmox_s3_client::{S3_BUCKET_NAME_SCHEMA, S3_CLIENT_ID_SCHEMA};
use proxmox_schema::api;
@@ -34,13 +34,69 @@ async fn check(
Ok(Value::Null)
}
+#[api(
+ input: {
+ properties: {
+ "output-format": {
+ schema: OUTPUT_FORMAT,
+ optional: true,
+ },
+ }
+ }
+)]
+/// List configured s3 clients.
+fn list_s3_clients(param: Value, rpcenv: &mut dyn RpcEnvironment) -> Result<Value, Error> {
+ let output_format = get_output_format(¶m);
+
+ let info = &api2::config::s3::API_METHOD_LIST_S3_CLIENT_CONFIG;
+ let mut data = match info.handler {
+ ApiHandler::Sync(handler) => (handler)(param, info, rpcenv)?,
+ _ => unreachable!(),
+ };
+
+ let options = default_table_format_options()
+ .column(ColumnConfig::new("id"))
+ .column(ColumnConfig::new("endpoint"))
+ .column(ColumnConfig::new("port"))
+ .column(ColumnConfig::new("region"))
+ .column(ColumnConfig::new("access-key"))
+ .column(ColumnConfig::new("fingerprint"))
+ .column(ColumnConfig::new("path-style"));
+
+ format_and_print_result_full(&mut data, &info.returns, &output_format, &options);
+
+ Ok(Value::Null)
+}
+
pub fn s3_commands() -> CommandLineInterface {
- let cmd_def = CliCommandMap::new().insert(
- "check",
- CliCommand::new(&API_METHOD_CHECK)
- .arg_param(&["s3-client-id", "bucket"])
- .completion_cb("s3-client-id", pbs_config::s3::complete_s3_client_id),
- );
+ let client_cmd_def = CliCommandMap::new()
+ .insert("list", CliCommand::new(&API_METHOD_LIST_S3_CLIENTS))
+ .insert(
+ "create",
+ CliCommand::new(&api2::config::s3::API_METHOD_CREATE_S3_CLIENT_CONFIG)
+ .arg_param(&["id"]),
+ )
+ .insert(
+ "update",
+ CliCommand::new(&api2::config::s3::API_METHOD_UPDATE_S3_CLIENT_CONFIG)
+ .arg_param(&["id"])
+ .completion_cb("id", pbs_config::s3::complete_s3_client_id),
+ )
+ .insert(
+ "remove",
+ CliCommand::new(&api2::config::s3::API_METHOD_DELETE_S3_CLIENT_CONFIG)
+ .arg_param(&["id"])
+ .completion_cb("id", pbs_config::s3::complete_s3_client_id),
+ );
+
+ let cmd_def = CliCommandMap::new()
+ .insert(
+ "check",
+ CliCommand::new(&API_METHOD_CHECK)
+ .arg_param(&["s3-client-id", "bucket"])
+ .completion_cb("s3-client-id", pbs_config::s3::complete_s3_client_id),
+ )
+ .insert("client", client_cmd_def);
cmd_def.into()
}
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 41/45] bin: expose reuse-datastore flag for proxmox-backup-manager
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (48 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 40/45] bin: implement client subcommands for s3 configuration manipulation Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 42/45] datastore: mark store as in-use by setting marker on s3 backend Christian Ebner
` (5 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
It is currently not possible to create a new datastore config and
reuse an existing datastore. Also expose the `reuse-datastore` flag
for the proxmox-backup-manager command, equivalent to what is already
exposed in the WebUI.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- new in this version
src/bin/proxmox_backup_manager/datastore.rs | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/bin/proxmox_backup_manager/datastore.rs b/src/bin/proxmox_backup_manager/datastore.rs
index 4d8b8bf3a..703974882 100644
--- a/src/bin/proxmox_backup_manager/datastore.rs
+++ b/src/bin/proxmox_backup_manager/datastore.rs
@@ -107,6 +107,12 @@ fn show_datastore(param: Value, rpcenv: &mut dyn RpcEnvironment) -> Result<Value
type: DataStoreConfig,
flatten: true,
},
+ "reuse-datastore": {
+ type: Boolean,
+ optional: true,
+ default: false,
+ description: "Re-use existing datastore directory."
+ },
"output-format": {
schema: OUTPUT_FORMAT,
optional: true,
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 42/45] datastore: mark store as in-use by setting marker on s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (49 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 41/45] bin: expose reuse-datastore flag for proxmox-backup-manager Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 43/45] datastore: run s3-refresh when reusing a datastore with " Christian Ebner
` (4 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Adds an in-use marker on the S3 store to protect against accidental
reuse of the same datastore by multiple Proxmox Backup Server
instances. The marker file is set on store creation.
The local cache folder, however, is always assumed to be empty and is
created on datastore creation to guarantee consistency.
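The marker payload itself is a tiny JSON object. A self-contained
round-trip sketch (requires the serde and serde_json crates; the
hostname value is made up), mirroring the `InUseContent` struct from
the diff below:

use serde::{Deserialize, Serialize};

#[derive(Default, Deserialize, Serialize)]
#[serde(rename_all = "kebab-case")]
struct InUseContent {
    #[serde(skip_serializing_if = "Option::is_none")]
    hostname: Option<String>,
}

fn main() -> Result<(), serde_json::Error> {
    let marker = serde_json::to_string(&InUseContent {
        hostname: Some("pbs-host-1".to_string()),
    })?;
    assert_eq!(marker, r#"{"hostname":"pbs-host-1"}"#);
    // An empty or unparsable marker degrades to the default, which still
    // signals "in use", just without a hostname to report.
    let parsed: InUseContent = serde_json::from_str("").unwrap_or_default();
    assert!(parsed.hostname.is_none());
    Ok(())
}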
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- new in this version
src/api2/config/datastore.rs | 44 +++++++++++++++++++++++++++++++++++-
1 file changed, 43 insertions(+), 1 deletion(-)
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index c5fca67bc..2678a71fd 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -4,6 +4,7 @@ use std::sync::Arc;
use ::serde::{Deserialize, Serialize};
use anyhow::{bail, format_err, Context, Error};
use hex::FromHex;
+use http_body_util::BodyExt;
use serde_json::Value;
use tracing::{info, warn};
@@ -34,10 +35,20 @@ use pbs_config::CachedUserInfo;
use pbs_datastore::get_datastore_mount_status;
use proxmox_rest_server::WorkerTask;
+use proxmox_s3_client::S3ObjectKey;
use crate::server::jobstate;
use crate::tools::disks::unmount_by_mountpoint;
+const S3_DATASTORE_IN_USE_MARKER: &str = ".in-use";
+
+#[derive(Default, serde::Deserialize, serde::Serialize)]
+#[serde(rename_all = "kebab-case")]
+struct InUseContent {
+ #[serde(skip_serializing_if = "Option::is_none")]
+ hostname: Option<String>,
+}
+
#[api(
input: {
properties: {},
@@ -152,6 +163,23 @@ pub(crate) fn do_create_datastore(
// Fine to block since this runs in worker task
proxmox_async::runtime::block_on(s3_client.head_bucket())
.context("failed to access bucket")?;
+
+ let object_key = S3ObjectKey::from(S3_DATASTORE_IN_USE_MARKER);
+ if let Some(response) =
+ proxmox_async::runtime::block_on(s3_client.get_object(object_key.clone()))
+ .context("failed to get in-use marker from bucket")?
+ {
+ let content = proxmox_async::runtime::block_on(response.content.collect())
+ .unwrap_or_default();
+ let content =
+ String::from_utf8(content.to_bytes().to_vec()).unwrap_or_default();
+ let in_use: InUseContent = serde_json::from_str(&content).unwrap_or_default();
+ if let Some(hostname) = in_use.hostname {
+ bail!("Bucket already contains datastore in use by host {hostname}");
+ } else {
+ bail!("Bucket already contains datastore in use");
+ }
+ }
backend_s3_client = Some(Arc::new(s3_client));
}
}
@@ -164,7 +192,7 @@ pub(crate) fn do_create_datastore(
UnmountGuard::new(None)
};
- let chunk_store = if reuse_datastore {
+ let chunk_store = if reuse_datastore && backend_s3_client.is_none() {
ChunkStore::verify_chunkstore(&path).and_then(|_| {
// Must be the only instance accessing and locking the chunk store,
// dropping will close all other locks from this process on the lockfile as well.
@@ -194,6 +222,20 @@ pub(crate) fn do_create_datastore(
)?
};
+ if let Some(ref s3_client) = backend_s3_client {
+ let object_key = S3ObjectKey::from(S3_DATASTORE_IN_USE_MARKER);
+ let content = serde_json::to_string(&InUseContent {
+ hostname: Some(proxmox_sys::nodename().to_string()),
+ })
+ .context("failed to encode hostname")?;
+ proxmox_async::runtime::block_on(s3_client.put_object(
+ object_key,
+ hyper::body::Bytes::from(content).into(),
+ true,
+ ))
+ .context("failed to upload in-use marker for datastore")?;
+ }
+
if tuning.gc_atime_safety_check.unwrap_or(true) {
chunk_store
.check_fs_atime_updates(true, backend_s3_client)
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 43/45] datastore: run s3-refresh when reusing a datastore with s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (50 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 42/45] datastore: mark store as in-use by setting marker on s3 backend Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 44/45] api/ui: add flag to allow overwriting in-use marker for " Christian Ebner
` (3 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Instead of relying on the user to manually trigger the refresh after
datastore creation, do it automatically in the datastore creation
task, thereby improving ergonomics.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- new in this version
src/api2/config/datastore.rs | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index 2678a71fd..3e0a01a5b 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -16,7 +16,7 @@ use proxmox_uuid::Uuid;
use pbs_api_types::{
Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreBackendConfig, DatastoreBackendType,
- DatastoreNotify, DatastoreTuning, KeepOptions, MaintenanceMode, PruneJobConfig,
+ DatastoreNotify, DatastoreTuning, KeepOptions, MaintenanceMode, Operation, PruneJobConfig,
PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT,
PRIV_DATASTORE_MODIFY, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
};
@@ -33,7 +33,7 @@ use crate::api2::config::tape_backup_job::{delete_tape_backup_job, list_tape_bac
use crate::api2::config::verify::delete_verification_job;
use pbs_config::CachedUserInfo;
-use pbs_datastore::get_datastore_mount_status;
+use pbs_datastore::{get_datastore_mount_status, DatastoreBackend};
use proxmox_rest_server::WorkerTask;
use proxmox_s3_client::S3ObjectKey;
@@ -342,19 +342,37 @@ pub fn create_datastore(
..config
};
+ let store_name = config.name.to_string();
WorkerTask::new_thread(
"create-datastore",
- Some(config.name.to_string()),
+ Some(store_name.clone()),
auth_id.to_string(),
to_stdout,
move |_worker| {
do_create_datastore(lock, section_config, config, reuse_datastore)?;
if let Some(prune_job_config) = prune_job_config {
- do_create_prune_job(prune_job_config)
- } else {
- Ok(())
+ do_create_prune_job(prune_job_config)?;
}
+
+ if reuse_datastore {
+ let datastore = pbs_datastore::DataStore::lookup_datastore(
+ &store_name,
+ Some(Operation::Lookup),
+ )
+ .context("failed to lookup datastore")?;
+ match datastore
+ .backend()
+ .context("failed to get datastore backend")?
+ {
+ DatastoreBackend::Filesystem => (),
+ DatastoreBackend::S3(_s3_client) => {
+ proxmox_async::runtime::block_on(datastore.s3_refresh())
+ .context("S3 refresh failed")?;
+ }
+ }
+ }
+ Ok(())
},
)
}
--
2.47.2
* [pbs-devel] [PATCH proxmox-backup v8 44/45] api/ui: add flag to allow overwriting in-use marker for s3 backend
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (51 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 43/45] datastore: run s3-refresh when reusing a datastore with " Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 45/45] docs: Add section describing how to setup s3 backed datastore Christian Ebner
` (2 subsequent siblings)
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Datastores backed by an s3 object store mark the corresponding bucket
prefix, given by the datastore name, as in-use to protect against
accidental reuse of the same datastore by other instances.
If the datastore has to be re-created because the original Proxmox
Backup Server instance is no longer available, it is necessary to
skip the check and overwrite the marker with the current hostname.
Add this flag to the datastore create api endpoint and expose it in
the web ui and the cli command.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- new in this version
src/api2/config/datastore.rs | 43 +++++++++++++--------
src/api2/node/disks/directory.rs | 2 +-
src/api2/node/disks/zfs.rs | 2 +-
src/bin/proxmox_backup_manager/datastore.rs | 6 +++
www/window/DataStoreEdit.js | 20 ++++++++++
5 files changed, 55 insertions(+), 18 deletions(-)
diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
index 3e0a01a5b..b03d7394c 100644
--- a/src/api2/config/datastore.rs
+++ b/src/api2/config/datastore.rs
@@ -112,6 +112,7 @@ pub(crate) fn do_create_datastore(
mut config: SectionConfigData,
datastore: DataStoreConfig,
reuse_datastore: bool,
+ overwrite_in_use: bool,
) -> Result<(), Error> {
let path: PathBuf = datastore.absolute_path().into();
@@ -164,20 +165,23 @@ pub(crate) fn do_create_datastore(
proxmox_async::runtime::block_on(s3_client.head_bucket())
.context("failed to access bucket")?;
- let object_key = S3ObjectKey::from(S3_DATASTORE_IN_USE_MARKER);
- if let Some(response) =
- proxmox_async::runtime::block_on(s3_client.get_object(object_key.clone()))
- .context("failed to get in-use marker from bucket")?
- {
- let content = proxmox_async::runtime::block_on(response.content.collect())
- .unwrap_or_default();
- let content =
- String::from_utf8(content.to_bytes().to_vec()).unwrap_or_default();
- let in_use: InUseContent = serde_json::from_str(&content).unwrap_or_default();
- if let Some(hostname) = in_use.hostname {
- bail!("Bucket already contains datastore in use by host {hostname}");
- } else {
- bail!("Bucket already contains datastore in use");
+ if !overwrite_in_use {
+ let object_key = S3ObjectKey::from(S3_DATASTORE_IN_USE_MARKER);
+ if let Some(response) =
+ proxmox_async::runtime::block_on(s3_client.get_object(object_key.clone()))
+ .context("failed to get in-use marker from bucket")?
+ {
+ let content = proxmox_async::runtime::block_on(response.content.collect())
+ .unwrap_or_default();
+ let content =
+ String::from_utf8(content.to_bytes().to_vec()).unwrap_or_default();
+ let in_use: InUseContent =
+ serde_json::from_str(&content).unwrap_or_default();
+ if let Some(hostname) = in_use.hostname {
+ bail!("Bucket already contains datastore in use by host {hostname}");
+ } else {
+ bail!("Bucket already contains datastore in use");
+ }
}
}
backend_s3_client = Some(Arc::new(s3_client));
@@ -269,7 +273,13 @@ pub(crate) fn do_create_datastore(
optional: true,
default: false,
description: "Re-use existing datastore directory."
- }
+ },
+ "overwrite-in-use": {
+ type: Boolean,
+ optional: true,
+ default: false,
+ description: "Overwrite in use marker (S3 backed datastores only)."
+ },
},
},
access: {
@@ -281,6 +291,7 @@ pub(crate) fn do_create_datastore(
pub fn create_datastore(
config: DataStoreConfig,
reuse_datastore: bool,
+ overwrite_in_use: bool,
rpcenv: &mut dyn RpcEnvironment,
) -> Result<String, Error> {
let lock = pbs_config::datastore::lock_config()?;
@@ -349,7 +360,7 @@ pub fn create_datastore(
auth_id.to_string(),
to_stdout,
move |_worker| {
- do_create_datastore(lock, section_config, config, reuse_datastore)?;
+ do_create_datastore(lock, section_config, config, reuse_datastore, overwrite_in_use)?;
if let Some(prune_job_config) = prune_job_config {
do_create_prune_job(prune_job_config)?;
diff --git a/src/api2/node/disks/directory.rs b/src/api2/node/disks/directory.rs
index 62f463437..74819079c 100644
--- a/src/api2/node/disks/directory.rs
+++ b/src/api2/node/disks/directory.rs
@@ -254,7 +254,7 @@ pub fn create_datastore_disk(
}
crate::api2::config::datastore::do_create_datastore(
- lock, config, datastore, false,
+ lock, config, datastore, false, false,
)?;
}
diff --git a/src/api2/node/disks/zfs.rs b/src/api2/node/disks/zfs.rs
index b6cf18265..cdb7cc6a1 100644
--- a/src/api2/node/disks/zfs.rs
+++ b/src/api2/node/disks/zfs.rs
@@ -314,7 +314,7 @@ pub fn create_zpool(
}
crate::api2::config::datastore::do_create_datastore(
- lock, config, datastore, false,
+ lock, config, datastore, false, false,
)?;
}
diff --git a/src/bin/proxmox_backup_manager/datastore.rs b/src/bin/proxmox_backup_manager/datastore.rs
index 703974882..45ad27049 100644
--- a/src/bin/proxmox_backup_manager/datastore.rs
+++ b/src/bin/proxmox_backup_manager/datastore.rs
@@ -113,6 +113,12 @@ fn show_datastore(param: Value, rpcenv: &mut dyn RpcEnvironment) -> Result<Value
default: false,
description: "Re-use existing datastore directory."
},
+ "overwrite-in-use": {
+ type: Boolean,
+ optional: true,
+ default: false,
+ description: "Overwrite in use marker (S3 backed datastores only)."
+ },
"output-format": {
schema: OUTPUT_FORMAT,
optional: true,
diff --git a/www/window/DataStoreEdit.js b/www/window/DataStoreEdit.js
index 3379bf773..3cf4990c6 100644
--- a/www/window/DataStoreEdit.js
+++ b/www/window/DataStoreEdit.js
@@ -76,6 +76,7 @@ Ext.define('PBS.DataStoreEdit', {
let uuidEditField = inputPanel.down('[name=backing-device]');
let bucketField = inputPanel.down('[name=bucket]');
let s3ClientSelector = inputPanel.down('[name=s3client]');
+ let overwriteInUseField = inputPanel.down('[name=overwrite-in-use]');
uuidEditField.setDisabled(!isRemovable);
uuidEditField.allowBlank = !isRemovable;
@@ -89,6 +90,10 @@ Ext.define('PBS.DataStoreEdit', {
s3ClientSelector.allowBlank = !isS3;
s3ClientSelector.setValue('');
+ overwriteInUseField.setHidden(!isS3);
+ overwriteInUseField.setDisabled(!isS3);
+ overwriteInUseField.setValue(false);
+
if (isRemovable) {
pathField.setFieldLabel(gettext('Path on Device'));
pathField.setEmptyText(gettext('A relative path'));
@@ -176,6 +181,21 @@ Ext.define('PBS.DataStoreEdit', {
xtype: 'checkbox',
name: 'reuse-datastore',
fieldLabel: gettext('Reuse existing datastore'),
+ listeners: {
+ change: function (checkbox, selected) {
+ let inputPanel = checkbox.up('inputpanel');
+ let overwriteInUseField = inputPanel.down('[name=overwrite-in-use]');
+ overwriteInUseField.setDisabled(!selected);
+ overwriteInUseField.setValue(false);
+ }
+ },
+ },
+ ],
+ advancedColumn2: [
+ {
+ xtype: 'checkbox',
+ name: 'overwrite-in-use',
+ fieldLabel: gettext('Overwrite in-use marker'),
},
],
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] [PATCH proxmox-backup v8 45/45] docs: Add section describing how to setup s3 backed datastore
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (52 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 44/45] api/ui: add flag to allow overwriting in-use marker for " Christian Ebner
@ 2025-07-15 12:53 ` Christian Ebner
2025-07-18 13:14 ` Maximiliano Sandoval
2025-07-18 13:16 ` [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Lukas Wagner
2025-07-19 12:52 ` [pbs-devel] superseded: " Christian Ebner
55 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-15 12:53 UTC (permalink / raw)
To: pbs-devel
Describe the required basic S3 client setup and the possible
configuration options, as well as the actual setup of a datastore using
the client and a bucket as backend.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 7:
- new in this version
docs/storage.rst | 68 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 68 insertions(+)
diff --git a/docs/storage.rst b/docs/storage.rst
index 4a8d8255e..0bac85fc3 100644
--- a/docs/storage.rst
+++ b/docs/storage.rst
@@ -233,6 +233,74 @@ datastore is not mounted when they are scheduled. Sync jobs start, but fail
with an error saying the datastore was not mounted. The reason is that syncs
not happening as scheduled should at least be noticeable.
+Datastores with S3 Backend (experimental)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Proxmox Backup Server supports S3 compatible object stores as storage backend for datastores. For
+this, an S3 client needs to be set up under "Configuration" > "S3 Clients".
+
+In the client configuration, provide the REST API endpoint for the object store. The endpoint
+is provider dependent and supports templating for the bucket and region. For example, an endpoint
+configured as ``{{bucket}}.s3.{{region}}.amazonaws.com`` expands to
+``my-pbs-bucket.s3.eu-central-1.amazonaws.com`` for a configured bucket named ``my-pbs-bucket``
+located in region ``eu-central-1``.
+
+The bucket name is part of the datastore backend configuration rather than the client configuration,
+as the same client might be reused for multiple buckets. Objects placed in the bucket are prefixed
+with the datastore name, so it is possible to create multiple datastores using the same bucket.
+
+.. note:: Proxmox Backup Server does not handle bucket creation and access control. The bucket used
+   to store the datastore's objects as well as the access key have to be set up beforehand in your
+   S3 provider interface. The Proxmox Backup Server acts as a client and requires permissions to
+   get, put, list and delete objects in the bucket.
+
+Most providers allow bucket access either via vhost style addressing, where the bucket name is part
+of the endpoint address, or via path style addressing, where the bucket name prefixes the path
+components of requests. Proxmox Backup Server supports both styles, favoring vhost style URLs over
+path style ones. To use path style addresses, set the corresponding configuration flag.
+
+Proxmox Backup Server does not support plain text communication with the S3 API; all communication
+is encrypted using HTTPS in transit. Therefore, for self-hosted S3 object stores using a self-signed
+certificate, the matching fingerprint has to be provided in the client configuration. Otherwise, the
+client refuses connections to the S3 object store.
+
+The following example shows the setup of a new S3 client configuration:
+
+.. code-block:: console
+
+ # proxmox-backup-manager s3 client create my-s3-client --secrets-id my-s3-client --access-key 'my-access-key' --secret-key 'my-secret-key' --endpoint '{{bucket}}.s3.{{region}}.amazonaws.com' --region eu-central-1
+
+To list your S3 client configurations, run:
+
+.. code-block:: console
+
+ # proxmox-backup-manager s3 client list
+
+A new datastore with an S3 backend can be created using one of the configured S3 clients. Although
+all contents are stored on the S3 object store, the datastore nevertheless requires a local cache
+store, used to increase performance and reduce the number of requests to the backend. For this, a
+local filesystem path has to be provided during datastore creation, just like for a regular
+datastore setup. A minimum size of a few GiB of storage is recommended, given that the cached
+datastore contents also include the data chunks.
+
+To set up a new datastore called ``my-s3-store`` placed in a bucket called ``pbs-s3-bucket``, run:
+
+.. code-block:: console
+
+ # proxmox-backup-manager datastore create my-s3-store /mnt/datastore/my-s3-store-cache --backend type=s3,client=my-s3-client,bucket=pbs-s3-bucket
+
+A datastore cannot be shared between multiple instances; only one instance can operate on the
+datastore at a time. However, datastore contents used on a Proxmox Backup Server instance which is
+no longer available can be reused on a fresh installation. To recreate the datastore, you must pass
+the ``reuse-datastore`` and ``overwrite-in-use`` flags. Since the datastore name is used as the
+prefix, the same datastore name must be used.
+
+.. code-block:: console
+
+ # proxmox-backup-manager datastore create my-s3-store /mnt/datastore/my-new-s3-store-cache --backend type=s3,client=my-s3-client,bucket=pbs-s3-bucket --reuse-datastore true --overwrite-in-use true
+
+
Managing Datastores
^^^^^^^^^^^^^^^^^^^
--
2.47.2
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pbs-devel] partially-applied-series: [PATCH proxmox v8 1/9] s3 client: add crate for AWS s3 compatible object store client
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 1/9] s3 client: add crate for AWS s3 compatible object store client Christian Ebner
@ 2025-07-15 21:13 ` Thomas Lamprecht
0 siblings, 0 replies; 109+ messages in thread
From: Thomas Lamprecht @ 2025-07-15 21:13 UTC (permalink / raw)
To: pve-devel, pbs-devel, Christian Ebner
On Tue, 15 Jul 2025 14:52:39 +0200, Christian Ebner wrote:
> Adds the client to connect to an AWS S3 compatible object store REST
> API. Force the use of a TLS encrypted connection as the communication
> with the object store will contain sensitive information.
>
> For self-signed certificates, check the fingerprint against the one
> configured. This follows along the lines of the PBS client, used to
> connect to the PBS server API.
>
> [...]
Applied the s3 client patches, thanks!
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* [pve-devel] partially-applied-series: [pbs-devel] [PATCH proxmox v8 1/9] s3 client: add crate for AWS s3 compatible object store client
@ 2025-07-15 21:13 ` Thomas Lamprecht
0 siblings, 0 replies; 109+ messages in thread
From: Thomas Lamprecht @ 2025-07-15 21:13 UTC (permalink / raw)
To: pve-devel, pbs-devel, Christian Ebner
On Tue, 15 Jul 2025 14:52:39 +0200, Christian Ebner wrote:
> Adds the client to connect to an AWS S3 compatible object store REST
> API. Force the use of a TLS encrypted connection as the communication
> with the object store will contain sensitive information.
>
> For self-signed certificates, check the fingerprint against the one
> configured. This follows along the lines of the PBS client, used to
> connect to the PBS server API.
>
> [...]
Applied the s3 client patches, thanks!
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 02/45] config: introduce s3 object store client configuration
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 02/45] config: introduce s3 object store client configuration Christian Ebner
@ 2025-07-18 7:22 ` Lukas Wagner
2025-07-18 8:37 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 7:22 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
With my minor complaints fixed:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> Adds the client configuration for s3 object store as dedicated
> configuration files, with secrets being stored separately from the
> regular configuration and excluded from api responses for security
> reasons.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> pbs-config/Cargo.toml | 1 +
> pbs-config/src/lib.rs | 1 +
> pbs-config/src/s3.rs | 83 +++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 85 insertions(+)
> create mode 100644 pbs-config/src/s3.rs
>
> diff --git a/pbs-config/Cargo.toml b/pbs-config/Cargo.toml
> index 284149658..74afb3c64 100644
> --- a/pbs-config/Cargo.toml
> +++ b/pbs-config/Cargo.toml
> @@ -19,6 +19,7 @@ serde_json.workspace = true
>
> proxmox-notify.workspace = true
> proxmox-router = { workspace = true, default-features = false }
> +proxmox-s3-client.workspace = true
> proxmox-schema.workspace = true
> proxmox-section-config.workspace = true
> proxmox-shared-memory.workspace = true
> diff --git a/pbs-config/src/lib.rs b/pbs-config/src/lib.rs
> index 9c4d77c24..d03c079ab 100644
> --- a/pbs-config/src/lib.rs
> +++ b/pbs-config/src/lib.rs
> @@ -10,6 +10,7 @@ pub mod network;
> pub mod notifications;
> pub mod prune;
> pub mod remote;
> +pub mod s3;
> pub mod sync;
> pub mod tape_job;
> pub mod token_shadow;
> diff --git a/pbs-config/src/s3.rs b/pbs-config/src/s3.rs
> new file mode 100644
> index 000000000..ec3998834
> --- /dev/null
> +++ b/pbs-config/src/s3.rs
> @@ -0,0 +1,83 @@
> +use std::collections::HashMap;
> +use std::sync::LazyLock;
> +
> +use anyhow::Error;
> +
> +use proxmox_s3_client::{S3ClientConfig, S3ClientSecretsConfig};
> +use proxmox_schema::*;
> +use proxmox_section_config::{SectionConfig, SectionConfigData, SectionConfigPlugin};
> +
> +use pbs_api_types::JOB_ID_SCHEMA;
> +
> +use crate::{open_backup_lockfile, replace_backup_config, BackupLockGuard};
> +
> +pub static CONFIG: LazyLock<SectionConfig> = LazyLock::new(init);
> +
> +fn init() -> SectionConfig {
> + let obj_schema = match S3ClientConfig::API_SCHEMA {
> + Schema::Object(ref obj_schema) => obj_schema,
> + _ => unreachable!(),
> + };
> + let secrets_obj_schema = match S3ClientSecretsConfig::API_SCHEMA {
> + Schema::Object(ref obj_schema) => obj_schema,
> + _ => unreachable!(),
> + };
You can use API_SCHEMA::unwrap_object_schema here; that's a bit nicer to read :)
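Something along these lines (untested sketch, assuming the const unwrap
helpers of proxmox-schema are available in the version we use):

    let obj_schema = S3ClientConfig::API_SCHEMA.unwrap_object_schema();
    let secrets_obj_schema = S3ClientSecretsConfig::API_SCHEMA.unwrap_object_schema();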
> +
> + let plugin =
> + SectionConfigPlugin::new("s3client".to_string(), Some(String::from("id")), obj_schema);
> + let secrets_plugin = SectionConfigPlugin::new(
> + "s3secrets".to_string(),
> + Some(String::from("secrets-id")),
> + secrets_obj_schema,
> + );
> + let mut config = SectionConfig::new(&JOB_ID_SCHEMA);
> + config.register_plugin(plugin);
> + config.register_plugin(secrets_plugin);
> +
> + config
> +}
> +
> +pub const S3_CFG_FILENAME: &str = "/etc/proxmox-backup/s3.cfg";
> +pub const S3_SECRETS_CFG_FILENAME: &str = "/etc/proxmox-backup/s3-secrets.cfg";
> +pub const S3_CFG_LOCKFILE: &str = "/etc/proxmox-backup/.s3.lck";
You can use the pbs_buildcfg::configdir macro to build these paths. Also please
add some docstrings to public consts like these.
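For example (sketch, assuming the macro joins the given path onto the
usual /etc/proxmox-backup configuration directory):

    /// S3 client configuration file
    pub const S3_CFG_FILENAME: &str = pbs_buildcfg::configdir!("/s3.cfg");
    /// S3 client secrets, stored separately from the regular configuration
    pub const S3_SECRETS_CFG_FILENAME: &str = pbs_buildcfg::configdir!("/s3-secrets.cfg");
    /// Lockfile guarding concurrent modification of the S3 configs
    pub const S3_CFG_LOCKFILE: &str = pbs_buildcfg::configdir!("/.s3.lck");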
> +
> +/// Get exclusive lock
> +pub fn lock_config() -> Result<BackupLockGuard, Error> {
> + open_backup_lockfile(S3_CFG_LOCKFILE, None, true)
> +}
> +
> +pub fn config() -> Result<(SectionConfigData, [u8; 32]), Error> {
> + parse_config(S3_CFG_FILENAME)
> +}
> +
> +pub fn secrets_config() -> Result<(SectionConfigData, [u8; 32]), Error> {
> + parse_config(S3_SECRETS_CFG_FILENAME)
> +}
> +
> +pub fn save_config(config: &SectionConfigData, secrets: &SectionConfigData) -> Result<(), Error> {
> + let raw = CONFIG.write(S3_CFG_FILENAME, config)?;
> + replace_backup_config(S3_CFG_FILENAME, raw.as_bytes())?;
> +
> + let secrets_raw = CONFIG.write(S3_SECRETS_CFG_FILENAME, secrets)?;
> + // Secrets are stored with `backup` permissions to allow reading from
> + // not protected api endpoints as well.
> + replace_backup_config(S3_SECRETS_CFG_FILENAME, secrets_raw.as_bytes())?;
> +
> + Ok(())
> +}
^ These public functions lack docstrings
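For example (sketch):

    /// Parse the S3 client configuration file, returning the parsed data and its digest
    pub fn config() -> Result<(SectionConfigData, [u8; 32]), Error> {
        parse_config(S3_CFG_FILENAME)
    }

and analogous one-liners for secrets_config() and save_config().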
> +
> +// shell completion helper
> +pub fn complete_s3_client_id(_arg: &str, _param: &HashMap<String, String>) -> Vec<String> {
> + match config() {
> + Ok((data, _digest)) => data.sections.keys().map(|id| id.to_string()).collect(),
> + Err(_) => Vec::new(),
> + }
> +}
> +
> +fn parse_config(path: &str) -> Result<(SectionConfigData, [u8; 32]), Error> {
> + let content = proxmox_sys::fs::file_read_optional_string(path)?;
> + let content = content.unwrap_or_default();
> + let digest = openssl::sha::sha256(content.as_bytes());
> + let data = CONFIG.parse(path, &content)?;
> + Ok((data, digest))
> +}
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 01/45] datastore: add helpers for path/digest to s3 object key conversion
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 01/45] datastore: add helpers for path/digest to s3 object key conversion Christian Ebner
@ 2025-07-18 7:24 ` Lukas Wagner
2025-07-18 8:34 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 7:24 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
Ideally I'd like to see some basic unit tests for these helpers.
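Rough sketch of what I have in mind (untested; assumes S3ObjectKey can
be compared via its string representation, e.g. through Display):

    #[cfg(test)]
    mod tests {
        use std::path::Path;

        use super::*;

        #[test]
        fn chunk_key_uses_digest_prefix() {
            let digest = [0u8; 32];
            let key = object_key_from_digest(&digest).unwrap();
            // the first 4 hex characters of the digest act as grouping prefix
            assert_eq!(
                key.to_string(),
                format!(".chunks/0000/{}", hex::encode(digest)),
            );
        }

        #[test]
        fn invalid_inputs_are_rejected() {
            assert!(object_key_from_path(Path::new("/absolute"), "file").is_err());
            assert!(object_key_from_path(Path::new("relative"), "with/slash").is_err());
        }
    }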
On 2025-07-15 14:52, Christian Ebner wrote:
> Adds helper methods to generate the s3 object keys given a relative
> path and filename for datastore contents or digest in case of chunk
> files.
>
> Regular datastore contents are stored by grouping them with a content
> prefix in the object key. In order to keep the object key length
> small, given the max limit of 1024 bytes [0], `.cnt` is used as
> content prefix. Chunks on the other hand are prefixed by `.chunks`,
> same as on regular datastores.
>
> The prefix allows for selective listing of either contents or chunks
> by providing the prefix to the respective api calls.
>
> [0] https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> Cargo.toml | 1 +
> pbs-datastore/Cargo.toml | 1 +
> pbs-datastore/src/lib.rs | 1 +
> pbs-datastore/src/s3.rs | 49 ++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 52 insertions(+)
> create mode 100644 pbs-datastore/src/s3.rs
>
> diff --git a/Cargo.toml b/Cargo.toml
> index ae57e7e20..b6b779cbc 100644
> --- a/Cargo.toml
> +++ b/Cargo.toml
> @@ -77,6 +77,7 @@ proxmox-rest-server = { version = "1", features = [ "templates" ] }
> proxmox-router = { version = "3.2.2", default-features = false }
> proxmox-rrd = "1"
> proxmox-rrd-api-types = "1.0.2"
> +proxmox-s3-client = "1.0.0"
> # everything but pbs-config and pbs-client use "api-macro"
> proxmox-schema = "4"
> proxmox-section-config = "3"
> diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
> index 56f6e9094..c42eff165 100644
> --- a/pbs-datastore/Cargo.toml
> +++ b/pbs-datastore/Cargo.toml
> @@ -34,6 +34,7 @@ proxmox-borrow.workspace = true
> proxmox-human-byte.workspace = true
> proxmox-io.workspace = true
> proxmox-lang.workspace=true
> +proxmox-s3-client = { workspace = true, features = [ "impl" ] }
> proxmox-schema = { workspace = true, features = [ "api-macro" ] }
> proxmox-serde = { workspace = true, features = [ "serde_json" ] }
> proxmox-sys.workspace = true
> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
> index 5014b6c09..ffd0d91b2 100644
> --- a/pbs-datastore/src/lib.rs
> +++ b/pbs-datastore/src/lib.rs
> @@ -182,6 +182,7 @@ pub mod manifest;
> pub mod paperkey;
> pub mod prune;
> pub mod read_chunk;
> +pub mod s3;
> pub mod store_progress;
> pub mod task_tracking;
>
> diff --git a/pbs-datastore/src/s3.rs b/pbs-datastore/src/s3.rs
> new file mode 100644
> index 000000000..82843ee26
> --- /dev/null
> +++ b/pbs-datastore/src/s3.rs
> @@ -0,0 +1,49 @@
> +use std::path::{Path, PathBuf};
> +
> +use anyhow::{bail, format_err, Error};
> +
> +use proxmox_s3_client::S3ObjectKey;
> +
> +/// Object key prefix to group regular datastore contents (not chunks)
> +pub const S3_CONTENT_PREFIX: &str = ".cnt";
> +
> +/// Generate a relative object key with content prefix from given path and filename
> +pub fn object_key_from_path(path: &Path, filename: &str) -> Result<S3ObjectKey, Error> {
> + // Force the use of relative paths, otherwise this would lose the content prefix
> + if path.is_absolute() {
> + bail!("cannot generate object key from absolute path");
> + }
> + if filename.contains('/') {
> + bail!("invalid filename containing slashes");
> + }
> + let mut object_path = PathBuf::from(S3_CONTENT_PREFIX);
> + object_path.push(path);
> + object_path.push(filename);
> +
> + let object_key_str = object_path
> + .to_str()
> + .ok_or_else(|| format_err!("unexpected object key path"))?;
> + Ok(S3ObjectKey::from(object_key_str))
> +}
> +
> +/// Generate a relative object key with chunk prefix from given digest
> +pub fn object_key_from_digest(digest: &[u8; 32]) -> Result<S3ObjectKey, Error> {
> + let object_key = hex::encode(digest);
> + let digest_prefix = &object_key[..4];
> + let object_key_string = format!(".chunks/{digest_prefix}/{object_key}");
> + Ok(S3ObjectKey::from(object_key_string.as_str()))
> +}
> +
> +/// Generate a relative object key with chunk prefix from given digest, extended by suffix
> +pub fn object_key_from_digest_with_suffix(
> + digest: &[u8; 32],
> + suffix: &str,
> +) -> Result<S3ObjectKey, Error> {
> + if suffix.contains('/') {
> + bail!("invalid suffix containing slashes");
> + }
> + let object_key = hex::encode(digest);
> + let digest_prefix = &object_key[..4];
> + let object_key_string = format!(".chunks/{digest_prefix}/{object_key}{suffix}");
> + Ok(S3ObjectKey::from(object_key_string.as_str()))
> +}
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs Christian Ebner
@ 2025-07-18 7:32 ` Lukas Wagner
2025-07-18 8:40 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 7:32 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> +/// Update an s3 client configuration.
> +#[allow(clippy::too_many_arguments)]
> +pub fn update_s3_client_config(
> + id: String,
> + update: S3ClientConfigUpdater,
> + update_secrets: S3ClientSecretsConfigUpdater,
> + delete: Option<Vec<DeletableProperty>>,
> + digest: Option<String>,
> + _rpcenv: &mut dyn RpcEnvironment,
> +) -> Result<(), Error> {
> + let _lock = s3::lock_config()?;
> + let (mut config, expected_digest) = s3::config()?;
> + let (mut secrets, secrets_digest) = s3::secrets_config()?;
> + let expected_digest = digest_with_secrets(&expected_digest, &secrets_digest);
> +
> + // Secrets are not included in digest concurrent changes therefore not detected.
> + if let Some(ref digest) = digest {
> + let digest = <[u8; 32]>::from_hex(digest)?;
> + crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
> + }
> +
> + let mut data: S3ClientConfig = config.lookup("s3client", &id)?;
> +
> + if let Some(delete) = delete {
> + for delete_prop in delete {
> + match delete_prop {
> + DeletableProperty::Port => {
> + data.port = None;
> + }
> + DeletableProperty::Region => {
> + data.region = None;
> + }
> + DeletableProperty::Fingerprint => {
> + data.fingerprint = None;
> + }
> + DeletableProperty::PathStyle => {
> + data.path_style = None;
> + }
> + }
> + }
> + }
Some time ago I found that it is quite useful to
destructure the updater like I did in proxmox-notify [1].
This ensures that you don't forget to update the
API handler after adding a new field to the config struct.
Not a must, just a suggestion, since I like this pattern quite a bit :)
[1] https://git.proxmox.com/?p=proxmox.git;a=blob;f=proxmox-notify/src/api/webhook.rs;h=9d904d0bf57f9f789bb6723e1d8ca710fcf0cb96;hb=HEAD#l175
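Applied here, that would look something like this (sketch, using the
updater fields handled below):

    let S3ClientConfigUpdater {
        endpoint,
        port,
        region,
        access_key,
        fingerprint,
        path_style,
    } = update;

    if let Some(endpoint) = endpoint {
        data.endpoint = endpoint;
    }
    // ... and so on for the remaining fields; adding a new field to the
    // updater then fails to compile until it is handled here.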
> +
> + if let Some(endpoint) = update.endpoint {
> + data.endpoint = endpoint;
> + }
> + if let Some(port) = update.port {
> + data.port = Some(port);
> + }
> + if let Some(region) = update.region {
> + data.region = Some(region);
> + }
> + if let Some(access_key) = update.access_key {
> + data.access_key = access_key;
> + }
> + if let Some(fingerprint) = update.fingerprint {
> + data.fingerprint = Some(fingerprint);
> + }
> + if let Some(path_style) = update.path_style {
> + data.path_style = Some(path_style);
> + }
> +
> + let mut secrets_data: S3ClientSecretsConfig = secrets.lookup("s3secrets", &id)?;
> + if let Some(secret_key) = update_secrets.secret_key {
> + secrets_data.secret_key = secret_key;
> + }
> +
> + config.set_data(&id, "s3client", &data)?;
> + secrets.set_data(&id, "s3secrets", &secrets_data)?;
> + s3::save_config(&config, &secrets)?;
> +
> + Ok(())
> +}
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 04/45] api: datastore: check s3 backend bucket access on datastore create
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 04/45] api: datastore: check s3 backend bucket access on datastore create Christian Ebner
@ 2025-07-18 7:40 ` Lukas Wagner
2025-07-18 8:55 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 7:40 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
With the two string constants moved:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> Check if the configured S3 object store backend can be reached and
> the provided secrets have the permissions to access the bucket.
>
> Perform the check before creating the chunk store, so it is not left
> behind if the bucket cannot be reached.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> Cargo.toml | 2 +-
> src/api2/config/datastore.rs | 48 ++++++++++++++++++++++++++++++++----
> 2 files changed, 44 insertions(+), 6 deletions(-)
>
> diff --git a/Cargo.toml b/Cargo.toml
> index c7a77060e..a5954635a 100644
> --- a/Cargo.toml
> +++ b/Cargo.toml
> @@ -77,7 +77,7 @@ proxmox-rest-server = { version = "1", features = [ "templates" ] }
> proxmox-router = { version = "3.2.2", default-features = false }
> proxmox-rrd = "1"
> proxmox-rrd-api-types = "1.0.2"
> -proxmox-s3-client = "1.0.0"
> +proxmox-s3-client = { version = "1.0.0", features = [ "impl" ] }
> # everything but pbs-config and pbs-client use "api-macro"
> proxmox-schema = "4"
> proxmox-section-config = "3"
> diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
> index b133be707..0fb822c79 100644
> --- a/src/api2/config/datastore.rs
> +++ b/src/api2/config/datastore.rs
> @@ -1,21 +1,22 @@
> use std::path::{Path, PathBuf};
>
> use ::serde::{Deserialize, Serialize};
> -use anyhow::{bail, Context, Error};
> +use anyhow::{bail, format_err, Context, Error};
> use hex::FromHex;
> use serde_json::Value;
> use tracing::{info, warn};
>
> use proxmox_router::{http_bail, Permission, Router, RpcEnvironment, RpcEnvironmentType};
> +use proxmox_s3_client::{S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig};
> use proxmox_schema::{api, param_bail, ApiType};
> use proxmox_section_config::SectionConfigData;
> use proxmox_uuid::Uuid;
>
> use pbs_api_types::{
> - Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreNotify, DatastoreTuning, KeepOptions,
> - MaintenanceMode, PruneJobConfig, PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE,
> - PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA,
> - UPID_SCHEMA,
> + Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreBackendConfig, DatastoreBackendType,
> + DatastoreNotify, DatastoreTuning, KeepOptions, MaintenanceMode, PruneJobConfig,
> + PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT,
> + PRIV_DATASTORE_MODIFY, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
> };
> use pbs_config::BackupLockGuard;
> use pbs_datastore::chunk_store::ChunkStore;
> @@ -116,6 +117,43 @@ pub(crate) fn do_create_datastore(
> .parse_property_string(datastore.tuning.as_deref().unwrap_or(""))?,
> )?;
>
> + if let Some(ref backend_config) = datastore.backend {
> + let backend_config: DatastoreBackendConfig = backend_config.parse()?;
> + match backend_config.ty.unwrap_or_default() {
> + DatastoreBackendType::Filesystem => (),
> + DatastoreBackendType::S3 => {
> + let s3_client_id = backend_config
> + .client
> + .as_ref()
> + .ok_or_else(|| format_err!("missing required client"))?;
> + let bucket = backend_config
> + .bucket
> + .clone()
> + .ok_or_else(|| format_err!("missing required bucket"))?;
> + let (config, _config_digest) =
> + pbs_config::s3::config().context("failed to get s3 config")?;
> + let (secrets, _secrets_digest) =
> + pbs_config::s3::secrets_config().context("failed to get s3 secrets")?;
> + let config: S3ClientConfig = config
> + .lookup("s3client", s3_client_id)
> + .with_context(|| format!("no '{s3_client_id}' in config"))?;
> + let secrets: S3ClientSecretsConfig = secrets
> + .lookup("s3secrets", s3_client_id)
> + .with_context(|| format!("no '{s3_client_id}' in secrets"))?;
The "s3client" and "s3secrets" section type strings should be `pub const` where the the config parser is defined.
> + let options = S3ClientOptions::from_config(
> + config,
> + secrets,
> + bucket,
> + datastore.name.to_owned(),
> + );
> + let s3_client = S3Client::new(options).context("failed to create s3 client")?;
> + // Fine to block since this runs in worker task
> + proxmox_async::runtime::block_on(s3_client.head_bucket())
> + .context("failed to access bucket")?;
I wonder whether we should add some kind of retry logic, not only here but also anywhere else
where we interact with S3. It might of course be easier to implement that right in the s3 client crate.
Also, no need to add this right away, just an idea for future improvements.
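Just to illustrate the direction, a naive sketch (the helper and the
fixed attempt count are made up here, not a concrete proposal):

    async fn with_retry<T, F, Fut>(mut op: F, attempts: usize) -> Result<T, Error>
    where
        F: FnMut() -> Fut,
        Fut: std::future::Future<Output = Result<T, Error>>,
    {
        let mut last_err = None;
        for _ in 0..attempts {
            match op().await {
                Ok(value) => return Ok(value),
                // could also back off here before the next attempt
                Err(err) => last_err = Some(err),
            }
        }
        Err(last_err.unwrap_or_else(|| format_err!("no attempts made")))
    }

which would turn the call above into `with_retry(|| s3_client.head_bucket(), 3).await`.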
> + }
> + }
> + }
> +
> let unmount_guard = if datastore.backing_device.is_some() {
> do_mount_device(datastore.clone())?;
> UnmountGuard::new(Some(path.clone()))
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 05/45] api/cli: add endpoint and command to check s3 client connection
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 05/45] api/cli: add endpoint and command to check s3 client connection Christian Ebner
@ 2025-07-18 7:43 ` Lukas Wagner
2025-07-18 9:04 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 7:43 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
With the magic string replaced by constants:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> Adds a dedicated api endpoint and a proxmox-backup-manager command to
> check if the configured S3 client can reach the bucket.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> src/api2/admin/mod.rs | 2 +
> src/api2/admin/s3.rs | 80 +++++++++++++++++++++++++++
> src/bin/proxmox-backup-manager.rs | 1 +
> src/bin/proxmox_backup_manager/mod.rs | 2 +
> src/bin/proxmox_backup_manager/s3.rs | 46 +++++++++++++++
> 5 files changed, 131 insertions(+)
> create mode 100644 src/api2/admin/s3.rs
> create mode 100644 src/bin/proxmox_backup_manager/s3.rs
>
> diff --git a/src/api2/admin/mod.rs b/src/api2/admin/mod.rs
> index a1c49f8e2..7694de4b9 100644
> --- a/src/api2/admin/mod.rs
> +++ b/src/api2/admin/mod.rs
> @@ -9,6 +9,7 @@ pub mod gc;
> pub mod metrics;
> pub mod namespace;
> pub mod prune;
> +pub mod s3;
> pub mod sync;
> pub mod traffic_control;
> pub mod verify;
> @@ -19,6 +20,7 @@ const SUBDIRS: SubdirMap = &sorted!([
> ("metrics", &metrics::ROUTER),
> ("prune", &prune::ROUTER),
> ("gc", &gc::ROUTER),
> + ("s3", &s3::ROUTER),
> ("sync", &sync::ROUTER),
> ("traffic-control", &traffic_control::ROUTER),
> ("verify", &verify::ROUTER),
> diff --git a/src/api2/admin/s3.rs b/src/api2/admin/s3.rs
> new file mode 100644
> index 000000000..d20031707
> --- /dev/null
> +++ b/src/api2/admin/s3.rs
> @@ -0,0 +1,80 @@
> +//! S3 bucket operations
> +
> +use anyhow::{Context, Error};
> +use serde_json::Value;
> +
> +use proxmox_http::Body;
> +use proxmox_router::{list_subdirs_api_method, Permission, Router, RpcEnvironment, SubdirMap};
> +use proxmox_s3_client::{
> + S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3_BUCKET_NAME_SCHEMA,
> + S3_CLIENT_ID_SCHEMA,
> +};
> +use proxmox_schema::*;
> +use proxmox_sortable_macro::sortable;
> +
> +use pbs_api_types::PRIV_SYS_MODIFY;
> +
> +#[api(
> + input: {
> + properties: {
> + "s3-client-id": {
> + schema: S3_CLIENT_ID_SCHEMA,
> + },
> + bucket: {
> + schema: S3_BUCKET_NAME_SCHEMA,
> + },
> + "store-prefix": {
> + type: String,
> + description: "Store prefix within bucket for S3 object keys (commonly datastore name)",
> + },
> + },
> + },
> + access: {
> + permission: &Permission::Privilege(&[], PRIV_SYS_MODIFY, false),
> + },
> +)]
> +/// Perform basic sanity check for given s3 client configuration
> +pub async fn check(
> + s3_client_id: String,
> + bucket: String,
> + store_prefix: String,
> + _rpcenv: &mut dyn RpcEnvironment,
> +) -> Result<Value, Error> {
> + let (config, _digest) = pbs_config::s3::config()?;
> + let config: S3ClientConfig = config
> + .lookup("s3client", &s3_client_id)
> + .context("config lookup failed")?;
> + let (secrets, _secrets_digest) = pbs_config::s3::secrets_config()?;
> + let secrets: S3ClientSecretsConfig = secrets
> + .lookup("s3secrets", &s3_client_id)
> + .context("secrets lookup failed")?;
Same thing here with regards to the section config type strings.
> +
> + let options = S3ClientOptions::from_config(config, secrets, bucket, store_prefix);
> +
> + let test_object_key = ".s3-client-test";
> + let client = S3Client::new(options).context("client creation failed")?;
> + client.head_bucket().await.context("head object failed")?;
> + client
> + .put_object(test_object_key.into(), Body::empty(), true)
> + .await
> + .context("put object failed")?;
> + client
> + .get_object(test_object_key.into())
> + .await
> + .context("get object failed")?;
> + client
> + .delete_object(test_object_key.into())
> + .await
> + .context("delete object failed")?;
> +
> + Ok(Value::Null)
> +}
> +
> +#[sortable]
> +const S3_OPERATION_SUBDIRS: SubdirMap = &[("check", &Router::new().get(&API_METHOD_CHECK))];
> +
> +const S3_OPERATION_ROUTER: Router = Router::new()
> + .get(&list_subdirs_api_method!(S3_OPERATION_SUBDIRS))
> + .subdirs(S3_OPERATION_SUBDIRS);
> +
> +pub const ROUTER: Router = Router::new().match_all("s3-client-id", &S3_OPERATION_ROUTER);
> diff --git a/src/bin/proxmox-backup-manager.rs b/src/bin/proxmox-backup-manager.rs
> index d4363e717..68d87c676 100644
> --- a/src/bin/proxmox-backup-manager.rs
> +++ b/src/bin/proxmox-backup-manager.rs
> @@ -677,6 +677,7 @@ async fn run() -> Result<(), Error> {
> .insert("garbage-collection", garbage_collection_commands())
> .insert("acme", acme_mgmt_cli())
> .insert("cert", cert_mgmt_cli())
> + .insert("s3", s3_commands())
> .insert("subscription", subscription_commands())
> .insert("sync-job", sync_job_commands())
> .insert("verify-job", verify_job_commands())
> diff --git a/src/bin/proxmox_backup_manager/mod.rs b/src/bin/proxmox_backup_manager/mod.rs
> index 9b5c73e9a..312a6db6b 100644
> --- a/src/bin/proxmox_backup_manager/mod.rs
> +++ b/src/bin/proxmox_backup_manager/mod.rs
> @@ -26,6 +26,8 @@ mod prune;
> pub use prune::*;
> mod remote;
> pub use remote::*;
> +mod s3;
> +pub use s3::*;
> mod subscription;
> pub use subscription::*;
> mod sync;
> diff --git a/src/bin/proxmox_backup_manager/s3.rs b/src/bin/proxmox_backup_manager/s3.rs
> new file mode 100644
> index 000000000..9bb89ff55
> --- /dev/null
> +++ b/src/bin/proxmox_backup_manager/s3.rs
> @@ -0,0 +1,46 @@
> +use proxmox_router::{cli::*, RpcEnvironment};
> +use proxmox_s3_client::{S3_BUCKET_NAME_SCHEMA, S3_CLIENT_ID_SCHEMA};
> +use proxmox_schema::api;
> +
> +use proxmox_backup::api2;
> +
> +use anyhow::Error;
> +use serde_json::Value;
> +
> +#[api(
> + input: {
> + properties: {
> + "s3-client-id": {
> + schema: S3_CLIENT_ID_SCHEMA,
> + },
> + bucket: {
> + schema: S3_BUCKET_NAME_SCHEMA,
> + },
> + "store-prefix": {
> + type: String,
> + description: "Store prefix within bucket for S3 object keys (commonly datastore name)",
> + },
> + },
> + },
> +)]
> +/// Perform basic sanity checks for given S3 client configuration
> +async fn check(
> + s3_client_id: String,
> + bucket: String,
> + store_prefix: String,
> + rpcenv: &mut dyn RpcEnvironment,
> +) -> Result<Value, Error> {
> + api2::admin::s3::check(s3_client_id, bucket, store_prefix, rpcenv).await?;
> + Ok(Value::Null)
> +}
> +
> +pub fn s3_commands() -> CommandLineInterface {
> + let cmd_def = CliCommandMap::new().insert(
> + "check",
> + CliCommand::new(&API_METHOD_CHECK)
> + .arg_param(&["s3-client-id", "bucket"])
> + .completion_cb("s3-client-id", pbs_config::s3::complete_s3_client_id),
> + );
> +
> + cmd_def.into()
> +}
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 06/45] datastore: allow to get the backend for a datastore
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 06/45] datastore: allow to get the backend for a datastore Christian Ebner
@ 2025-07-18 7:52 ` Lukas Wagner
2025-07-18 9:10 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 7:52 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
With my feedback addressed:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> Implements an enum with variants Filesystem and S3 to distinguish
> between available backends. Filesystem will be used as default, if no
> backend is configured in the datastore's configuration. If the
> datastore has an s3 backend configured, the backend method will
> instantiate an s3 client and return it with the S3 variant.
>
> This allows instantiating the client once, keeping and reusing the
> same open connection to the api for the lifetime of a task or job,
> e.g. in the backup writer/reader runtime environment.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> pbs-datastore/src/datastore.rs | 52 ++++++++++++++++++++++++++++++++--
> pbs-datastore/src/lib.rs | 1 +
> 2 files changed, 51 insertions(+), 2 deletions(-)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 924d8cf9c..90ab80005 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -12,6 +12,7 @@ use pbs_tools::lru_cache::LruCache;
> use tracing::{info, warn};
>
> use proxmox_human_byte::HumanByte;
> +use proxmox_s3_client::{S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig};
> use proxmox_schema::ApiType;
>
> use proxmox_sys::error::SysError;
> @@ -23,8 +24,8 @@ use proxmox_worker_task::WorkerTaskContext;
>
> use pbs_api_types::{
> ArchiveType, Authid, BackupGroupDeleteStats, BackupNamespace, BackupType, ChunkOrder,
> - DataStoreConfig, DatastoreFSyncLevel, DatastoreTuning, GarbageCollectionStatus,
> - MaintenanceMode, MaintenanceType, Operation, UPID,
> + DataStoreConfig, DatastoreBackendConfig, DatastoreBackendType, DatastoreFSyncLevel,
> + DatastoreTuning, GarbageCollectionStatus, MaintenanceMode, MaintenanceType, Operation, UPID,
> };
> use pbs_config::BackupLockGuard;
>
> @@ -127,6 +128,7 @@ pub struct DataStoreImpl {
> chunk_order: ChunkOrder,
> last_digest: Option<[u8; 32]>,
> sync_level: DatastoreFSyncLevel,
> + backend_config: DatastoreBackendConfig,
> }
>
> impl DataStoreImpl {
> @@ -141,6 +143,7 @@ impl DataStoreImpl {
> chunk_order: Default::default(),
> last_digest: None,
> sync_level: Default::default(),
> + backend_config: Default::default(),
> })
> }
> }
> @@ -196,6 +199,12 @@ impl Drop for DataStore {
> }
> }
>
> +#[derive(Clone)]
> +pub enum DatastoreBackend {
> + Filesystem,
> + S3(Arc<S3Client>),
> +}
> +
Missing doc comments for this public enum
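E.g. (sketch):

    /// The backend a datastore operates on
    #[derive(Clone)]
    pub enum DatastoreBackend {
        /// Local filesystem backend, the default
        Filesystem,
        /// S3 object store backend, holding the instantiated client
        S3(Arc<S3Client>),
    }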
> impl DataStore {
> // This one just panics on everything
> #[doc(hidden)]
> @@ -206,6 +215,39 @@ impl DataStore {
> })
> }
>
> + /// Get the backend for this datastore based on it's configuration
> + pub fn backend(&self) -> Result<DatastoreBackend, Error> {
> + let backend_type = match self.inner.backend_config.ty.unwrap_or_default() {
> + DatastoreBackendType::Filesystem => DatastoreBackend::Filesystem,
> + DatastoreBackendType::S3 => {
> + let s3_client_id = self
> + .inner
> + .backend_config
> + .client
> + .as_ref()
> + .ok_or_else(|| format_err!("missing client for s3 backend"))?;
> + let bucket = self
> + .inner
> + .backend_config
> + .bucket
> + .clone()
> + .ok_or_else(|| format_err!("missing bucket for s3 backend"))?;
> +
> + let (config, _config_digest) = pbs_config::s3::config()?;
> + let (secrets, _secrets_digest) = pbs_config::s3::secrets_config()?;
> + let config: S3ClientConfig = config.lookup("s3client", s3_client_id)?;
> + let secrets: S3ClientSecretsConfig = secrets.lookup("s3secrets", s3_client_id)?;
Same thing here with regards to the hard-coded section type names.
> +
> + let options =
> + S3ClientOptions::from_config(config, secrets, bucket, self.name().to_owned());
> + let s3_client = S3Client::new(options)?;
> + DatastoreBackend::S3(Arc::new(s3_client))
> + }
> + };
> +
> + Ok(backend_type)
> + }
> +
> pub fn lookup_datastore(
> name: &str,
> operation: Option<Operation>,
> @@ -383,6 +425,11 @@ impl DataStore {
> .parse_property_string(config.tuning.as_deref().unwrap_or(""))?,
> )?;
>
> + let backend_config: DatastoreBackendConfig = serde_json::from_value(
> + DatastoreBackendConfig::API_SCHEMA
> + .parse_property_string(config.backend.as_deref().unwrap_or(""))?,
> + )?;
> +
> Ok(DataStoreImpl {
> chunk_store,
> gc_mutex: Mutex::new(()),
> @@ -391,6 +438,7 @@ impl DataStore {
> chunk_order: tuning.chunk_order.unwrap_or_default(),
> last_digest,
> sync_level: tuning.sync_level.unwrap_or_default(),
> + backend_config,
> })
> }
>
> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
> index ffd0d91b2..ca6fdb7d8 100644
> --- a/pbs-datastore/src/lib.rs
> +++ b/pbs-datastore/src/lib.rs
> @@ -204,6 +204,7 @@ pub use store_progress::StoreProgress;
> mod datastore;
> pub use datastore::{
> check_backup_owner, ensure_datastore_is_mounted, get_datastore_mount_status, DataStore,
> + DatastoreBackend,
> };
>
> mod hierarchy;
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 07/45] api: backup: store datastore backend in runtime environment
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 07/45] api: backup: store datastore backend in runtime environment Christian Ebner
@ 2025-07-18 7:54 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 7:54 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> Get and store the datastore's backend during creation of the backup
> runtime environment and upload the chunks to the local filesystem or
> s3 object store based on the backend variant.
>
> By storing the backend variant in the environment the s3 client is
> instantiated only once and reused for all api calls in the same
> backup http/2 connection.
>
> Refactor the upgrade method by moving all logic into the async block,
> such that the now possible error on backup environment creation gets
> propagated to the thread spawn call site.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> src/api2/backup/environment.rs | 11 +--
> src/api2/backup/mod.rs | 128 ++++++++++++++++-----------------
> 2 files changed, 71 insertions(+), 68 deletions(-)
>
> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
> index 1d8f64aa0..7bd86f39c 100644
> --- a/src/api2/backup/environment.rs
> +++ b/src/api2/backup/environment.rs
> @@ -16,7 +16,7 @@ use pbs_api_types::Authid;
> use pbs_datastore::backup_info::{BackupDir, BackupInfo};
> use pbs_datastore::dynamic_index::DynamicIndexWriter;
> use pbs_datastore::fixed_index::FixedIndexWriter;
> -use pbs_datastore::{DataBlob, DataStore};
> +use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
> use proxmox_rest_server::{formatter::*, WorkerTask};
>
> use crate::backup::VerifyWorker;
> @@ -116,6 +116,7 @@ pub struct BackupEnvironment {
> pub datastore: Arc<DataStore>,
> pub backup_dir: BackupDir,
> pub last_backup: Option<BackupInfo>,
> + pub backend: DatastoreBackend,
> state: Arc<Mutex<SharedBackupState>>,
> }
>
> @@ -126,7 +127,7 @@ impl BackupEnvironment {
> worker: Arc<WorkerTask>,
> datastore: Arc<DataStore>,
> backup_dir: BackupDir,
> - ) -> Self {
> + ) -> Result<Self, Error> {
> let state = SharedBackupState {
> finished: false,
> uid_counter: 0,
> @@ -138,7 +139,8 @@ impl BackupEnvironment {
> backup_stat: UploadStatistic::new(),
> };
>
> - Self {
> + let backend = datastore.backend()?;
> + Ok(Self {
> result_attributes: json!({}),
> env_type,
> auth_id,
> @@ -148,8 +150,9 @@ impl BackupEnvironment {
> formatter: JSON_FORMATTER,
> backup_dir,
> last_backup: None,
> + backend,
> state: Arc::new(Mutex::new(state)),
> - }
> + })
> }
>
> /// Register a Chunk with associated length.
> diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
> index a723e7cb0..026f1f106 100644
> --- a/src/api2/backup/mod.rs
> +++ b/src/api2/backup/mod.rs
> @@ -187,7 +187,8 @@ fn upgrade_to_backup_protocol(
> }
>
> // lock last snapshot to prevent forgetting/pruning it during backup
> - let guard = last.backup_dir
> + let guard = last
> + .backup_dir
> .lock_shared()
> .with_context(|| format!("while locking last snapshot during backup '{last:?}'"))?;
> Some(guard)
> @@ -206,14 +207,14 @@ fn upgrade_to_backup_protocol(
> Some(worker_id),
> auth_id.to_string(),
> true,
> - move |worker| {
> + move |worker| async move {
> let mut env = BackupEnvironment::new(
> env_type,
> auth_id,
> worker.clone(),
> datastore,
> backup_dir,
> - );
> + )?;
>
> env.debug = debug;
> env.last_backup = last_backup;
> @@ -247,74 +248,73 @@ fn upgrade_to_backup_protocol(
> http.max_frame_size(4 * 1024 * 1024);
>
> let env3 = env2.clone();
> - http.serve_connection(conn, TowerToHyperService::new(service)).map(move |result| {
> - match result {
> - Err(err) => {
> - // Avoid Transport endpoint is not connected (os error 107)
> - // fixme: find a better way to test for that error
> - if err.to_string().starts_with("connection error")
> - && env3.finished()
> - {
> - Ok(())
> - } else {
> - Err(Error::from(err))
> + http.serve_connection(conn, TowerToHyperService::new(service))
> + .map(move |result| {
> + match result {
> + Err(err) => {
> + // Avoid Transport endpoint is not connected (os error 107)
> + // fixme: find a better way to test for that error
> + if err.to_string().starts_with("connection error")
> + && env3.finished()
> + {
> + Ok(())
> + } else {
> + Err(Error::from(err))
> + }
> }
> + Ok(()) => Ok(()),
> }
> - Ok(()) => Ok(()),
> - }
> - })
> + })
> });
> let mut abort_future = abort_future.map(|_| Err(format_err!("task aborted")));
>
> - async move {
> - // keep flock until task ends
> - let _group_guard = _group_guard;
> - let snap_guard = snap_guard;
> - let _last_guard = _last_guard;
> -
> - let res = select! {
> - req = req_fut => req,
> - abrt = abort_future => abrt,
> - };
> - if benchmark {
> - env.log("benchmark finished successfully");
> - proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
> - return Ok(());
> + // keep flock until task ends
> + let _group_guard = _group_guard;
> + let snap_guard = snap_guard;
> + let _last_guard = _last_guard;
> +
> + let res = select! {
> + req = req_fut => req,
> + abrt = abort_future => abrt,
> + };
> + if benchmark {
> + env.log("benchmark finished successfully");
> + proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
> + return Ok(());
> + }
> +
> + let verify = |env: BackupEnvironment| {
> + if let Err(err) = env.verify_after_complete(snap_guard) {
> + env.log(format!(
> + "backup finished, but starting the requested verify task failed: {}",
> + err
> + ));
> }
> + };
>
> - let verify = |env: BackupEnvironment| {
> - if let Err(err) = env.verify_after_complete(snap_guard) {
> - env.log(format!(
> - "backup finished, but starting the requested verify task failed: {}",
> - err
> - ));
> - }
> - };
> -
> - match (res, env.ensure_finished()) {
> - (Ok(_), Ok(())) => {
> - env.log("backup finished successfully");
> - verify(env);
> - Ok(())
> - }
> - (Err(err), Ok(())) => {
> - // ignore errors after finish
> - env.log(format!("backup had errors but finished: {}", err));
> - verify(env);
> - Ok(())
> - }
> - (Ok(_), Err(err)) => {
> - env.log(format!("backup ended and finish failed: {}", err));
> - env.log("removing unfinished backup");
> - proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
> - Err(err)
> - }
> - (Err(err), Err(_)) => {
> - env.log(format!("backup failed: {}", err));
> - env.log("removing failed backup");
> - proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
> - Err(err)
> - }
> + match (res, env.ensure_finished()) {
> + (Ok(_), Ok(())) => {
> + env.log("backup finished successfully");
> + verify(env);
> + Ok(())
> + }
> + (Err(err), Ok(())) => {
> + // ignore errors after finish
> + env.log(format!("backup had errors but finished: {}", err));
> + verify(env);
> + Ok(())
> + }
> + (Ok(_), Err(err)) => {
> + env.log(format!("backup ended and finish failed: {}", err));
> + env.log("removing unfinished backup");
> + proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
> + Err(err)
> + }
> + (Err(err), Err(_)) => {
> + env.log(format!("backup failed: {}", err));
> + env.log("removing failed backup");
> + proxmox_async::runtime::block_in_place(|| env.remove_backup())?;
> + Err(err)
> }
> }
> },
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 08/45] api: backup: conditionally upload chunks to s3 object store backend
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 08/45] api: backup: conditionally upload chunks to s3 object store backend Christian Ebner
@ 2025-07-18 8:11 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:11 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> Upload fixed and dynamic sized chunks to either the filesystem or
> the S3 object store, depending on the configured backend.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> src/api2/backup/upload_chunk.rs | 71 +++++++++++++++++++--------------
> 1 file changed, 42 insertions(+), 29 deletions(-)
>
> diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
> index 2c66c2855..3ad8c3c75 100644
> --- a/src/api2/backup/upload_chunk.rs
> +++ b/src/api2/backup/upload_chunk.rs
> @@ -16,7 +16,7 @@ use proxmox_sortable_macro::sortable;
>
> use pbs_api_types::{BACKUP_ARCHIVE_NAME_SCHEMA, CHUNK_DIGEST_SCHEMA};
> use pbs_datastore::file_formats::{DataBlobHeader, EncryptedDataBlobHeader};
> -use pbs_datastore::{DataBlob, DataStore};
> +use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
> use pbs_tools::json::{required_integer_param, required_string_param};
>
> use super::environment::*;
> @@ -154,22 +154,10 @@ fn upload_fixed_chunk(
> ) -> ApiResponseFuture {
> async move {
> let wid = required_integer_param(¶m, "wid")? as usize;
> - let size = required_integer_param(¶m, "size")? as u32;
> - let encoded_size = required_integer_param(¶m, "encoded-size")? as u32;
> -
> - let digest_str = required_string_param(¶m, "digest")?;
> - let digest = <[u8; 32]>::from_hex(digest_str)?;
> -
> let env: &BackupEnvironment = rpcenv.as_ref();
>
> - let (digest, size, compressed_size, is_duplicate) = UploadChunk::new(
> - BodyDataStream::new(req_body),
> - env.datastore.clone(),
> - digest,
> - size,
> - encoded_size,
> - )
> - .await?;
> + let (digest, size, compressed_size, is_duplicate) =
> + upload_to_backend(req_body, param, env).await?;
>
> env.register_fixed_chunk(wid, digest, size, compressed_size, is_duplicate)?;
> let digest_str = hex::encode(digest);
> @@ -229,22 +217,10 @@ fn upload_dynamic_chunk(
> ) -> ApiResponseFuture {
> async move {
> let wid = required_integer_param(¶m, "wid")? as usize;
> - let size = required_integer_param(¶m, "size")? as u32;
> - let encoded_size = required_integer_param(¶m, "encoded-size")? as u32;
> -
> - let digest_str = required_string_param(¶m, "digest")?;
> - let digest = <[u8; 32]>::from_hex(digest_str)?;
> -
> let env: &BackupEnvironment = rpcenv.as_ref();
>
> - let (digest, size, compressed_size, is_duplicate) = UploadChunk::new(
> - BodyDataStream::new(req_body),
> - env.datastore.clone(),
> - digest,
> - size,
> - encoded_size,
> - )
> - .await?;
> + let (digest, size, compressed_size, is_duplicate) =
> + upload_to_backend(req_body, param, env).await?;
>
> env.register_dynamic_chunk(wid, digest, size, compressed_size, is_duplicate)?;
> let digest_str = hex::encode(digest);
> @@ -256,6 +232,43 @@ fn upload_dynamic_chunk(
> .boxed()
> }
>
> +async fn upload_to_backend(
> + req_body: Incoming,
> + param: Value,
> + env: &BackupEnvironment,
> +) -> Result<([u8; 32], u32, u32, bool), Error> {
> + let size = required_integer_param(&param, "size")? as u32;
> + let encoded_size = required_integer_param(&param, "encoded-size")? as u32;
> + let digest_str = required_string_param(&param, "digest")?;
> + let digest = <[u8; 32]>::from_hex(digest_str)?;
> +
> + match &env.backend {
> + DatastoreBackend::Filesystem => {
> + UploadChunk::new(
> + BodyDataStream::new(req_body),
> + env.datastore.clone(),
> + digest,
> + size,
> + encoded_size,
> + )
> + .await
> + }
> + DatastoreBackend::S3(s3_client) => {
> + let data = req_body.collect().await?.to_bytes();
> + if encoded_size != data.len() as u32 {
> + bail!(
> + "got blob with unexpected length ({encoded_size} != {})",
> + data.len()
> + );
> + }
> +
> + let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
> + let is_duplicate = s3_client.upload_with_retry(object_key, data, false).await?;
> + Ok((digest, size, encoded_size, is_duplicate))
> + }
> + }
> +}
> +
> pub const API_METHOD_UPLOAD_SPEEDTEST: ApiMethod = ApiMethod::new(
> &ApiHandler::AsyncHttp(&upload_speedtest),
> &ObjectSchema::new("Test upload speed.", &[]),
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 09/45] api: backup: conditionally upload blobs to s3 object store backend
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 09/45] api: backup: conditionally upload blobs " Christian Ebner
@ 2025-07-18 8:13 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:13 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 10/45] api: backup: conditionally upload indices to s3 object store backend
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 10/45] api: backup: conditionally upload indices " Christian Ebner
@ 2025-07-18 8:20 ` Lukas Wagner
2025-07-18 9:24 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:20 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
Two nits inline.
On 2025-07-15 14:52, Christian Ebner wrote:
> If the datastore is backed by an S3 compatible object store, upload
> the dynamic or fixed index files to the object store after closing
> them. The local index files are kept in the local caching datastore
> to allow for fast and efficient content lookups, avoiding expensive
> (as in monetary cost and IO latency) requests.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - fix clippy warning and formatting
>
> src/api2/backup/environment.rs | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
> index 3d4677975..9ad13aeb3 100644
> --- a/src/api2/backup/environment.rs
> +++ b/src/api2/backup/environment.rs
> @@ -2,6 +2,7 @@ use anyhow::{bail, format_err, Context, Error};
> use pbs_config::BackupLockGuard;
>
> use std::collections::HashMap;
> +use std::io::Read;
> use std::sync::{Arc, Mutex};
> use tracing::info;
>
> @@ -18,6 +19,7 @@ use pbs_datastore::dynamic_index::DynamicIndexWriter;
> use pbs_datastore::fixed_index::FixedIndexWriter;
> use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
> use proxmox_rest_server::{formatter::*, WorkerTask};
> +use proxmox_s3_client::S3Client;
>
> use crate::backup::VerifyWorker;
>
> @@ -479,6 +481,13 @@ impl BackupEnvironment {
> );
> }
>
> + // For S3 backends, upload the index file to the object store after closing
> + if let DatastoreBackend::S3(s3_client) = &self.backend {
> + self.s3_upload_index(s3_client, &data.name)
> + .context("failed to upload dynamic index to s3 backend")?;
> + self.log(format!("Uploaded index file to s3 backend: {}", data.name))
> + }
> +
> self.log_upload_stat(
> &data.name,
> &csum,
> @@ -553,6 +562,16 @@ impl BackupEnvironment {
> );
> }
>
> + // For S3 backends, upload the index file to the object store after closing
> + if let DatastoreBackend::S3(s3_client) = &self.backend {
> + self.s3_upload_index(s3_client, &data.name)
> + .context("failed to upload fixed index to s3 backend")?;
> + self.log(format!(
> + "Uploaded fixed index file to object store: {}",
> + data.name
> + ))
> + }
nit: the log message differs between the two cases
> +
> self.log_upload_stat(
> &data.name,
> &expected_csum,
> @@ -753,6 +772,21 @@ impl BackupEnvironment {
>
> Ok(())
> }
> +
> + fn s3_upload_index(&self, s3_client: &S3Client, name: &str) -> Result<(), Error> {
> + let object_key =
> + pbs_datastore::s3::object_key_from_path(&self.backup_dir.relative_path(), name)
> + .context("invalid index file object key")?;
> +
> + let mut full_path = self.backup_dir.full_path();
> + full_path.push(name);
> + let mut file = std::fs::File::open(&full_path)?;
> + let mut buffer = Vec::new();
> + file.read_to_end(&mut buffer)?;
nit: You can use std::fs::read() to get the Vec right away :)
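i.e. something like (untested):

    let buffer = std::fs::read(&full_path)?;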
> + let data = hyper::body::Bytes::from(buffer);
> + proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))?;
> + Ok(())
> + }
> }
>
> impl RpcEnvironment for BackupEnvironment {
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 11/45] api: backup: conditionally upload manifest to s3 object store backend
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 11/45] api: backup: conditionally upload manifest " Christian Ebner
@ 2025-07-18 8:26 ` Lukas Wagner
2025-07-18 9:33 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:26 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Two minor suggestions, but nothing that would prohibit my R-b:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> Reupload the manifest to the S3 object store backend on manifest
> updates, if s3 is configured as backend.
> This also triggers the initial manifest upload when finishing backup
> snapshot in the backup api call handler.
> Also update the locally cached version for fast and efficient
> listing of contents without the need to perform expensive (as in
> monetary cost and IO latency) requests.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> pbs-datastore/Cargo.toml | 3 +++
> pbs-datastore/src/backup_info.rs | 12 +++++++++++-
> src/api2/admin/datastore.rs | 14 ++++++++++++--
> src/api2/backup/environment.rs | 16 ++++++++--------
> src/backup/verify.rs | 2 +-
> 5 files changed, 35 insertions(+), 12 deletions(-)
>
> diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
> index c42eff165..7e56dbd31 100644
> --- a/pbs-datastore/Cargo.toml
> +++ b/pbs-datastore/Cargo.toml
> @@ -13,6 +13,7 @@ crc32fast.workspace = true
> endian_trait.workspace = true
> futures.workspace = true
> hex = { workspace = true, features = [ "serde" ] }
> +hyper.workspace = true
> libc.workspace = true
> log.workspace = true
> nix.workspace = true
> @@ -29,8 +30,10 @@ zstd-safe.workspace = true
> pathpatterns.workspace = true
> pxar.workspace = true
>
> +proxmox-async.workspace = true
> proxmox-base64.workspace = true
> proxmox-borrow.workspace = true
> +proxmox-http.workspace = true
> proxmox-human-byte.workspace = true
> proxmox-io.workspace = true
> proxmox-lang.workspace=true
> diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
> index e3ecd437f..46e5b61f0 100644
> --- a/pbs-datastore/src/backup_info.rs
> +++ b/pbs-datastore/src/backup_info.rs
> @@ -19,7 +19,7 @@ use pbs_api_types::{
> use pbs_config::{open_backup_lockfile, BackupLockGuard};
>
> use crate::manifest::{BackupManifest, MANIFEST_LOCK_NAME};
> -use crate::{DataBlob, DataStore};
> +use crate::{DataBlob, DataStore, DatastoreBackend};
>
> pub const DATASTORE_LOCKS_DIR: &str = "/run/proxmox-backup/locks";
> const PROTECTED_MARKER_FILENAME: &str = ".protected";
> @@ -666,6 +666,7 @@ impl BackupDir {
> /// only use this method - anything else may break locking guarantees.
> pub fn update_manifest(
> &self,
> + backend: &DatastoreBackend,
> update_fn: impl FnOnce(&mut BackupManifest),
> ) -> Result<(), Error> {
> let _guard = self.lock_manifest()?;
> @@ -678,6 +679,15 @@ impl BackupDir {
> let blob = DataBlob::encode(manifest.as_bytes(), None, true)?;
> let raw_data = blob.raw_data();
>
> + if let DatastoreBackend::S3(s3_client) = backend {
> + let object_key =
> + super::s3::object_key_from_path(&self.relative_path(), MANIFEST_BLOB_NAME.as_ref())
> + .context("invalid manifest object key")?;
> + let data = hyper::body::Bytes::copy_from_slice(raw_data);
> + proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
> + .context("failed to update manifest on s3 backend")?;
> + }
> +
> let mut path = self.full_path();
> path.push(MANIFEST_BLOB_NAME.as_ref());
>
> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> index e24bc1c1b..02666afda 100644
> --- a/src/api2/admin/datastore.rs
> +++ b/src/api2/admin/datastore.rs
> @@ -65,7 +65,7 @@ use pbs_datastore::manifest::BackupManifest;
> use pbs_datastore::prune::compute_prune_info;
> use pbs_datastore::{
> check_backup_owner, ensure_datastore_is_mounted, task_tracking, BackupDir, BackupGroup,
> - DataStore, LocalChunkReader, StoreProgress,
> + DataStore, DatastoreBackend, LocalChunkReader, StoreProgress,
> };
> use pbs_tools::json::required_string_param;
> use proxmox_rest_server::{formatter, WorkerTask};
> @@ -2086,6 +2086,16 @@ pub fn set_group_notes(
> &backup_group,
> )?;
>
> + if let DatastoreBackend::S3(s3_client) = datastore.backend()? {
> + let mut path = ns.path();
> + path.push(format!("{backup_group}"));
You can just use .to_string() here, reads a bit nicer
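i.e.:

    path.push(backup_group.to_string());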
> + let object_key = pbs_datastore::s3::object_key_from_path(&path, "notes")
> + .context("invalid owner file object key")?;
> + let data = hyper::body::Bytes::copy_from_slice(notes.as_bytes());
> + let _is_duplicate =
> + proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
> + .context("failed to set notes on s3 backend")?;
> + }
> let notes_path = datastore.group_notes_path(&ns, &backup_group);
> replace_file(notes_path, notes.as_bytes(), CreateOptions::new(), false)?;
>
> @@ -2188,7 +2198,7 @@ pub fn set_notes(
> let backup_dir = datastore.backup_dir(ns, backup_dir)?;
>
> backup_dir
> - .update_manifest(|manifest| {
> + .update_manifest(&datastore.backend()?, |manifest| {
> manifest.unprotected["notes"] = notes.into();
> })
> .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
> index 9ad13aeb3..0017b347d 100644
> --- a/src/api2/backup/environment.rs
> +++ b/src/api2/backup/environment.rs
> @@ -646,14 +646,6 @@ impl BackupEnvironment {
> bail!("backup does not contain valid files (file count == 0)");
> }
>
> - // check for valid manifest and store stats
> - let stats = serde_json::to_value(state.backup_stat)?;
> - self.backup_dir
> - .update_manifest(|manifest| {
> - manifest.unprotected["chunk_upload_stats"] = stats;
> - })
> - .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
> -
> if let Some(base) = &self.last_backup {
> let path = base.backup_dir.full_path();
> if !path.exists() {
> @@ -664,6 +656,14 @@ impl BackupEnvironment {
> }
> }
>
> + // check for valid manifest and store stats
> + let stats = serde_json::to_value(state.backup_stat)?;
> + self.backup_dir
> + .update_manifest(&self.backend, |manifest| {
> + manifest.unprotected["chunk_upload_stats"] = stats;
> + })
> + .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
nit: you can inline the `err` variable here
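i.e.:

    .map_err(|err| format_err!("unable to update manifest blob - {err}"))?;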
> +
> self.datastore.try_ensure_sync_level()?;
>
> // marks the backup as successful
> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
> index 0b954ae23..9344033d8 100644
> --- a/src/backup/verify.rs
> +++ b/src/backup/verify.rs
> @@ -359,7 +359,7 @@ impl VerifyWorker {
>
> if let Err(err) = {
> let verify_state = serde_json::to_value(verify_state)?;
> - backup_dir.update_manifest(|manifest| {
> + backup_dir.update_manifest(&self.datastore.backend()?, |manifest| {
> manifest.unprotected["verify_state"] = verify_state;
> })
> } {
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 12/45] api: datastore: conditionally upload client log to s3 backend
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 12/45] api: datastore: conditionally upload client log to s3 backend Christian Ebner
@ 2025-07-18 8:28 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:28 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:52, Christian Ebner wrote:
> If the datastore is backed by an s3 compatible object store, upload
> the client log content to the s3 backend before persisting it to the
> local cache store.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> src/api2/admin/datastore.rs | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> index 02666afda..b28b646e8 100644
> --- a/src/api2/admin/datastore.rs
> +++ b/src/api2/admin/datastore.rs
> @@ -1637,6 +1637,17 @@ pub fn upload_backup_log(
> // always verify blob/CRC at server side
> let blob = DataBlob::load_from_reader(&mut &data[..])?;
>
> + if let DatastoreBackend::S3(s3_client) = datastore.backend()? {
> + let object_key = pbs_datastore::s3::object_key_from_path(
> + &backup_dir.relative_path(),
> + file_name.as_ref(),
> + )
> + .context("invalid client log object key")?;
> + let data = hyper::body::Bytes::copy_from_slice(blob.raw_data());
> + proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
> + .context("failed to upload client log to s3 backend")?;
> + };
> +
> replace_file(&path, blob.raw_data(), CreateOptions::new(), false)?;
>
> // fixme: use correct formatter
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 01/45] datastore: add helpers for path/digest to s3 object key conversion
2025-07-18 7:24 ` Lukas Wagner
@ 2025-07-18 8:34 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 8:34 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 9:24 AM, Lukas Wagner wrote:
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
> Ideally I'd like to see some basic unit tests for these helpers.
Added the requested tests for the upcoming version 9 of the patch series!
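For reference, the rough shape of one of them (assuming the chunk object
keys mirror the local `.chunks/<prefix>/<digest>` layout - adjust to the
actual helper output):

    #[test]
    fn test_object_key_from_digest() {
        let digest = [0u8; 32];
        let key = object_key_from_digest(&digest).unwrap();
        assert_eq!(
            key.to_string(),
            format!(".chunks/0000/{}", hex::encode(digest))
        );
    }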
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 13/45] sync: pull: conditionally upload content to s3 backend
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 13/45] sync: pull: conditionally upload content " Christian Ebner
@ 2025-07-18 8:35 ` Lukas Wagner
2025-07-18 9:43 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:35 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> If the datastore is backed by an S3 object store, not only insert the
> pulled contents into the local cache store, but also upload them to the
> S3 backend.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> src/server/pull.rs | 66 +++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 63 insertions(+), 3 deletions(-)
>
> diff --git a/src/server/pull.rs b/src/server/pull.rs
> index b1724c142..fe87359ab 100644
> --- a/src/server/pull.rs
> +++ b/src/server/pull.rs
> @@ -6,8 +6,9 @@ use std::sync::atomic::{AtomicUsize, Ordering};
> use std::sync::{Arc, Mutex};
> use std::time::SystemTime;
>
> -use anyhow::{bail, format_err, Error};
> +use anyhow::{bail, format_err, Context, Error};
> use proxmox_human_byte::HumanByte;
> +use tokio::io::AsyncReadExt;
> use tracing::info;
>
> use pbs_api_types::{
> @@ -24,7 +25,7 @@ use pbs_datastore::fixed_index::FixedIndexReader;
> use pbs_datastore::index::IndexFile;
> use pbs_datastore::manifest::{BackupManifest, FileInfo};
> use pbs_datastore::read_chunk::AsyncReadChunk;
> -use pbs_datastore::{check_backup_owner, DataStore, StoreProgress};
> +use pbs_datastore::{check_backup_owner, DataStore, DatastoreBackend, StoreProgress};
> use pbs_tools::sha::sha256;
>
> use super::sync::{
> @@ -167,7 +168,20 @@ async fn pull_index_chunks<I: IndexFile>(
> move |(chunk, digest, size): (DataBlob, [u8; 32], u64)| {
> // println!("verify and write {}", hex::encode(&digest));
> chunk.verify_unencrypted(size as usize, &digest)?;
> - target2.insert_chunk(&chunk, &digest)?;
> + match target2.backend()? {
> + DatastoreBackend::Filesystem => {
> + target2.insert_chunk(&chunk, &digest)?;
> + }
> + DatastoreBackend::S3(s3_client) => {
> + let data = chunk.raw_data().to_vec();
> + let upload_data = hyper::body::Bytes::from(data);
> + let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
> + let _is_duplicate = proxmox_async::runtime::block_on(
> + s3_client.upload_with_retry(object_key, upload_data, false),
> + )
> + .context("failed to upload chunk to s3 backend")?;
> + }
> + }
> Ok(())
> },
> );
> @@ -331,6 +345,18 @@ async fn pull_single_archive<'a>(
> if let Err(err) = std::fs::rename(&tmp_path, &path) {
> bail!("Atomic rename file {:?} failed - {}", path, err);
> }
> + if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
> + let object_key =
> + pbs_datastore::s3::object_key_from_path(&snapshot.relative_path(), archive_name)
> + .context("invalid archive object key")?;
> +
> + let archive = tokio::fs::File::open(&path).await?;
> + let mut reader = tokio::io::BufReader::new(archive);
> + let mut contents = Vec::new();
> + reader.read_to_end(&mut contents).await?;
You can use tokio::fs::read here
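i.e. something like (untested):

    let contents = tokio::fs::read(&path).await?;
    let data = hyper::body::Bytes::from(contents);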
> + let data = hyper::body::Bytes::from(contents);
> + let _is_duplicate = s3_client.upload_with_retry(object_key, data, true).await?;
I might do a review of the already merged s3 client code later, but I really don't like the
`replace: bool` parameter for this function very much. I think I'd prefer having
two separate functions for replace vs. not replace (which might delegate to a common
fn internally, where a bool param is fine IMO), or alternatively, using an enum
instead. Personally, I'm gravitating more towards the separate functions.
What do you think?
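Something like this is what I have in mind (rough sketch, naming obviously
up for debate; `upload_with_retry_impl` would be the current bool-taking
implementation, made private):

    impl S3Client {
        /// Upload the object, keeping an existing one untouched.
        /// Returns whether the object was already present.
        pub async fn upload_with_retry(
            &self,
            object_key: S3ObjectKey,
            data: Bytes,
        ) -> Result<bool, Error> {
            self.upload_with_retry_impl(object_key, data, false).await
        }

        /// Upload the object, replacing any existing one.
        pub async fn upload_replace_with_retry(
            &self,
            object_key: S3ObjectKey,
            data: Bytes,
        ) -> Result<bool, Error> {
            self.upload_with_retry_impl(object_key, data, true).await
        }
    }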
> + }
> Ok(sync_stats)
> }
>
> @@ -401,6 +427,7 @@ async fn pull_snapshot<'a>(
> }
> }
>
> + let manifest_data = tmp_manifest_blob.raw_data().to_vec();
> let manifest = BackupManifest::try_from(tmp_manifest_blob)?;
>
> if ignore_not_verified_or_encrypted(
> @@ -467,9 +494,42 @@ async fn pull_snapshot<'a>(
> if let Err(err) = std::fs::rename(&tmp_manifest_name, &manifest_name) {
> bail!("Atomic rename file {:?} failed - {}", manifest_name, err);
> }
> + if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
> + let object_key = pbs_datastore::s3::object_key_from_path(
> + &snapshot.relative_path(),
> + MANIFEST_BLOB_NAME.as_ref(),
> + )
> + .context("invalid manifest object key")?;
> +
> + let data = hyper::body::Bytes::from(manifest_data);
> + let _is_duplicate = s3_client
> + .upload_with_retry(object_key, data, true)
> + .await
> + .context("failed to upload manifest to s3 backend")?;
> + }
>
> if !client_log_name.exists() {
> reader.try_download_client_log(&client_log_name).await?;
> + if client_log_name.exists() {
> + if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
> + let object_key = pbs_datastore::s3::object_key_from_path(
> + &snapshot.relative_path(),
> + CLIENT_LOG_BLOB_NAME.as_ref(),
> + )
> + .context("invalid archive object key")?;
> +
> + let log_file = tokio::fs::File::open(&client_log_name).await?;
> + let mut reader = tokio::io::BufReader::new(log_file);
> + let mut contents = Vec::new();
> + reader.read_to_end(&mut contents).await?;
You can use tokio::fs::read(...) here
> +
> + let data = hyper::body::Bytes::from(contents);
> + let _is_duplicate = s3_client
> + .upload_with_retry(object_key, data, true)
> + .await
> + .context("failed to upload client log to s3 backend")?;
> + }
> + }
> };
> snapshot
> .cleanup_unreferenced_files(&manifest)
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 02/45] config: introduce s3 object store client configuration
2025-07-18 7:22 ` Lukas Wagner
@ 2025-07-18 8:37 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 8:37 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 9:22 AM, Lukas Wagner wrote:
> With my minor complaints fixed:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
> On 2025-07-15 14:52, Christian Ebner wrote:
>> Adds the client configuration for s3 object store as dedicated
>> configuration files, with secrets being stored separately from the
>> regular configuration and excluded from api responses for security
>> reasons.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> pbs-config/Cargo.toml | 1 +
>> pbs-config/src/lib.rs | 1 +
>> pbs-config/src/s3.rs | 83 +++++++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 85 insertions(+)
>> create mode 100644 pbs-config/src/s3.rs
>>
>> diff --git a/pbs-config/Cargo.toml b/pbs-config/Cargo.toml
>> index 284149658..74afb3c64 100644
>> --- a/pbs-config/Cargo.toml
>> +++ b/pbs-config/Cargo.toml
>> @@ -19,6 +19,7 @@ serde_json.workspace = true
>>
>> proxmox-notify.workspace = true
>> proxmox-router = { workspace = true, default-features = false }
>> +proxmox-s3-client.workspace = true
>> proxmox-schema.workspace = true
>> proxmox-section-config.workspace = true
>> proxmox-shared-memory.workspace = true
>> diff --git a/pbs-config/src/lib.rs b/pbs-config/src/lib.rs
>> index 9c4d77c24..d03c079ab 100644
>> --- a/pbs-config/src/lib.rs
>> +++ b/pbs-config/src/lib.rs
>> @@ -10,6 +10,7 @@ pub mod network;
>> pub mod notifications;
>> pub mod prune;
>> pub mod remote;
>> +pub mod s3;
>> pub mod sync;
>> pub mod tape_job;
>> pub mod token_shadow;
>> diff --git a/pbs-config/src/s3.rs b/pbs-config/src/s3.rs
>> new file mode 100644
>> index 000000000..ec3998834
>> --- /dev/null
>> +++ b/pbs-config/src/s3.rs
>> @@ -0,0 +1,83 @@
>> +use std::collections::HashMap;
>> +use std::sync::LazyLock;
>> +
>> +use anyhow::Error;
>> +
>> +use proxmox_s3_client::{S3ClientConfig, S3ClientSecretsConfig};
>> +use proxmox_schema::*;
>> +use proxmox_section_config::{SectionConfig, SectionConfigData, SectionConfigPlugin};
>> +
>> +use pbs_api_types::JOB_ID_SCHEMA;
>> +
>> +use crate::{open_backup_lockfile, replace_backup_config, BackupLockGuard};
>> +
>> +pub static CONFIG: LazyLock<SectionConfig> = LazyLock::new(init);
>> +
>> +fn init() -> SectionConfig {
>> + let obj_schema = match S3ClientConfig::API_SCHEMA {
>> + Schema::Object(ref obj_schema) => obj_schema,
>> + _ => unreachable!(),
>> + };
>> + let secrets_obj_schema = match S3ClientSecretsConfig::API_SCHEMA {
>> + Schema::Object(ref obj_schema) => obj_schema,
>> + _ => unreachable!(),
>> + };
>
> You can use API_SCHEMA::unwrap_object_schema here, that's a bit nicer to read :)
agreed, incorporated this for the next iteration of the patches
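i.e. (sketch):

    let obj_schema = S3ClientConfig::API_SCHEMA.unwrap_object_schema();
    let secrets_obj_schema = S3ClientSecretsConfig::API_SCHEMA.unwrap_object_schema();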
>> +
>> + let plugin =
>> + SectionConfigPlugin::new("s3client".to_string(), Some(String::from("id")), obj_schema);
>> + let secrets_plugin = SectionConfigPlugin::new(
>> + "s3secrets".to_string(),
>> + Some(String::from("secrets-id")),
>> + secrets_obj_schema,
>> + );
>> + let mut config = SectionConfig::new(&JOB_ID_SCHEMA);
>> + config.register_plugin(plugin);
>> + config.register_plugin(secrets_plugin);
>> +
>> + config
>> +}
>> +
>> +pub const S3_CFG_FILENAME: &str = "/etc/proxmox-backup/s3.cfg";
>> +pub const S3_SECRETS_CFG_FILENAME: &str = "/etc/proxmox-backup/s3-secrets.cfg";
>> +pub const S3_CFG_LOCKFILE: &str = "/etc/proxmox-backup/.s3.lck";
>
> You can use the pbs_buildcfg::configdir macro to build these paths. Also please
> add some docstrings to public consts like these.
same, added the helper so we always use the configured path from
buildcfg as base path for these constants.
>
>> +
>> +/// Get exclusive lock
>> +pub fn lock_config() -> Result<BackupLockGuard, Error> {
>> + open_backup_lockfile(S3_CFG_LOCKFILE, None, true)
>> +}
>> +
>> +pub fn config() -> Result<(SectionConfigData, [u8; 32]), Error> {
>> + parse_config(S3_CFG_FILENAME)
>> +}
>> +
>> +pub fn secrets_config() -> Result<(SectionConfigData, [u8; 32]), Error> {
>> + parse_config(S3_SECRETS_CFG_FILENAME)
>> +}
>> +
>> +pub fn save_config(config: &SectionConfigData, secrets: &SectionConfigData) -> Result<(), Error> {
>> + let raw = CONFIG.write(S3_CFG_FILENAME, config)?;
>> + replace_backup_config(S3_CFG_FILENAME, raw.as_bytes())?;
>> +
>> + let secrets_raw = CONFIG.write(S3_SECRETS_CFG_FILENAME, secrets)?;
>> + // Secrets are stored with `backup` permissions to allow reading from
>> + // not protected api endpoints as well.
>> + replace_backup_config(S3_SECRETS_CFG_FILENAME, secrets_raw.as_bytes())?;
>> +
>> + Ok(())
>> +}
>
> ^ These public functions lack docstrings
added these as well ...
>
>> +
>> +// shell completion helper
... and expanded a bit on this one while already at it.
>> +pub fn complete_s3_client_id(_arg: &str, _param: &HashMap<String, String>) -> Vec<String> {
>> + match config() {
>> + Ok((data, _digest)) => data.sections.keys().map(|id| id.to_string()).collect(),
>> + Err(_) => Vec::new(),
>> + }
>> +}
>> +
>> +fn parse_config(path: &str) -> Result<(SectionConfigData, [u8; 32]), Error> {
>> + let content = proxmox_sys::fs::file_read_optional_string(path)?;
>> + let content = content.unwrap_or_default();
>> + let digest = openssl::sha::sha256(content.as_bytes());
>> + let data = CONFIG.parse(path, &content)?;
>> + Ok((data, digest))
>> +}
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend Christian Ebner
@ 2025-07-18 8:38 ` Lukas Wagner
2025-07-18 9:58 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:38 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
One comment inline, but nothing that would prohibit an R-b:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:53, Christian Ebner wrote:
> Read the chunk based on the datastore's backend, reading from the local
> filesystem or fetching from the S3 object store.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> src/api2/reader/environment.rs | 12 ++++++----
> src/api2/reader/mod.rs | 41 +++++++++++++++++++++++-----------
> 2 files changed, 36 insertions(+), 17 deletions(-)
>
> diff --git a/src/api2/reader/environment.rs b/src/api2/reader/environment.rs
> index 3b2f06f43..8924352b0 100644
> --- a/src/api2/reader/environment.rs
> +++ b/src/api2/reader/environment.rs
> @@ -1,13 +1,14 @@
> use std::collections::HashSet;
> use std::sync::{Arc, RwLock};
>
> +use anyhow::Error;
> use serde_json::{json, Value};
>
> use proxmox_router::{RpcEnvironment, RpcEnvironmentType};
>
> use pbs_api_types::Authid;
> use pbs_datastore::backup_info::BackupDir;
> -use pbs_datastore::DataStore;
> +use pbs_datastore::{DataStore, DatastoreBackend};
> use proxmox_rest_server::formatter::*;
> use proxmox_rest_server::WorkerTask;
> use tracing::info;
> @@ -23,6 +24,7 @@ pub struct ReaderEnvironment {
> pub worker: Arc<WorkerTask>,
> pub datastore: Arc<DataStore>,
> pub backup_dir: BackupDir,
> + pub backend: DatastoreBackend,
> allowed_chunks: Arc<RwLock<HashSet<[u8; 32]>>>,
> }
>
> @@ -33,8 +35,9 @@ impl ReaderEnvironment {
> worker: Arc<WorkerTask>,
> datastore: Arc<DataStore>,
> backup_dir: BackupDir,
> - ) -> Self {
> - Self {
> + ) -> Result<Self, Error> {
> + let backend = datastore.backend()?;
> + Ok(Self {
> result_attributes: json!({}),
> env_type,
> auth_id,
> @@ -43,8 +46,9 @@ impl ReaderEnvironment {
> debug: tracing::enabled!(tracing::Level::DEBUG),
> formatter: JSON_FORMATTER,
> backup_dir,
> + backend,
> allowed_chunks: Arc::new(RwLock::new(HashSet::new())),
> - }
> + })
> }
>
> pub fn log<S: AsRef<str>>(&self, msg: S) {
> diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs
> index a77216043..997d9ca77 100644
> --- a/src/api2/reader/mod.rs
> +++ b/src/api2/reader/mod.rs
> @@ -3,6 +3,7 @@
> use anyhow::{bail, format_err, Context, Error};
> use futures::*;
> use hex::FromHex;
> +use http_body_util::BodyExt;
> use hyper::body::Incoming;
> use hyper::header::{self, HeaderValue, CONNECTION, UPGRADE};
> use hyper::http::request::Parts;
> @@ -27,8 +28,9 @@ use pbs_api_types::{
> };
> use pbs_config::CachedUserInfo;
> use pbs_datastore::index::IndexFile;
> -use pbs_datastore::{DataStore, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
> +use pbs_datastore::{DataStore, DatastoreBackend, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
> use pbs_tools::json::required_string_param;
> +use proxmox_s3_client::S3Client;
>
> use crate::api2::backup::optional_ns_param;
> use crate::api2::helpers;
> @@ -162,7 +164,7 @@ fn upgrade_to_backup_reader_protocol(
> worker.clone(),
> datastore,
> backup_dir,
> - );
> + )?;
>
> env.debug = debug;
>
> @@ -323,17 +325,10 @@ fn download_chunk(
> ));
> }
>
> - let (path, _) = env.datastore.chunk_path(&digest);
> - let path2 = path.clone();
> -
> - env.debug(format!("download chunk {:?}", path));
> -
> - let data =
> - proxmox_async::runtime::block_in_place(|| std::fs::read(path)).map_err(move |err| {
> - http_err!(BAD_REQUEST, "reading file {:?} failed: {}", path2, err)
> - })?;
> -
> - let body = Body::from(data);
> + let body = match &env.backend {
> + DatastoreBackend::Filesystem => load_from_filesystem(env, &digest)?,
> + DatastoreBackend::S3(s3_client) => fetch_from_object_store(s3_client, &digest).await?,
> + };
>
> // fixme: set other headers ?
> Ok(Response::builder()
> @@ -345,6 +340,26 @@ fn download_chunk(
> .boxed()
> }
>
> +async fn fetch_from_object_store(s3_client: &S3Client, digest: &[u8; 32]) -> Result<Body, Error> {
> + let object_key = pbs_datastore::s3::object_key_from_digest(digest)?;
> + if let Some(response) = s3_client.get_object(object_key).await? {
^ Do we maybe want some kind of retry-logic for retrieving objects as well? Disregard
in case you implement it in a later patch, I'm reviewing this series patch by patch.
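Something along these lines maybe, in case we decide we want it (rough
sketch; the response type name is approximate, not the actual client code):

    async fn get_object_with_retry(
        s3_client: &S3Client,
        object_key: S3ObjectKey,
        max_retries: usize,
    ) -> Result<Option<GetObjectResponse>, Error> {
        let mut attempt = 0;
        loop {
            match s3_client.get_object(object_key.clone()).await {
                Ok(response) => return Ok(response),
                Err(err) if attempt < max_retries => {
                    attempt += 1;
                    tracing::warn!(
                        "get object failed (attempt {attempt}/{max_retries}): {err}"
                    );
                }
                Err(err) => return Err(err),
            }
        }
    }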
> + let data = response.content.collect().await?.to_bytes();
> + return Ok(Body::from(data));
> + }
> + bail!("cannot find chunk with digest {}", hex::encode(digest));
> +}
> +
> +fn load_from_filesystem(env: &ReaderEnvironment, digest: &[u8; 32]) -> Result<Body, Error> {
> + let (path, _) = env.datastore.chunk_path(digest);
> + let path2 = path.clone();
> +
> + env.debug(format!("download chunk {path:?}"));
> +
> + let data = proxmox_async::runtime::block_in_place(|| std::fs::read(path))
> + .map_err(move |err| http_err!(BAD_REQUEST, "reading file {path2:?} failed: {err}"))?;
> + Ok(Body::from(data))
> +}
> +
> /* this is too slow
> fn download_chunk_old(
> _parts: Parts,
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs
2025-07-18 7:32 ` Lukas Wagner
@ 2025-07-18 8:40 ` Christian Ebner
2025-07-18 9:07 ` Lukas Wagner
0 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 8:40 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 9:32 AM, Lukas Wagner wrote:
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
>
> On 2025-07-15 14:52, Christian Ebner wrote:
>> +/// Update an s3 client configuration.
>> +#[allow(clippy::too_many_arguments)]
>> +pub fn update_s3_client_config(
>> + id: String,
>> + update: S3ClientConfigUpdater,
>> + update_secrets: S3ClientSecretsConfigUpdater,
>> + delete: Option<Vec<DeletableProperty>>,
>> + digest: Option<String>,
>> + _rpcenv: &mut dyn RpcEnvironment,
>> +) -> Result<(), Error> {
>> + let _lock = s3::lock_config()?;
>> + let (mut config, expected_digest) = s3::config()?;
>> + let (mut secrets, secrets_digest) = s3::secrets_config()?;
>> + let expected_digest = digest_with_secrets(&expected_digest, &secrets_digest);
>> +
>> + // Secrets are not included in digest concurrent changes therefore not detected.
>> + if let Some(ref digest) = digest {
>> + let digest = <[u8; 32]>::from_hex(digest)?;
>> + crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
>> + }
>> +
>> + let mut data: S3ClientConfig = config.lookup("s3client", &id)?;
>> +
>> + if let Some(delete) = delete {
>> + for delete_prop in delete {
>> + match delete_prop {
>> + DeletableProperty::Port => {
>> + data.port = None;
>> + }
>> + DeletableProperty::Region => {
>> + data.region = None;
>> + }
>> + DeletableProperty::Fingerprint => {
>> + data.fingerprint = None;
>> + }
>> + DeletableProperty::PathStyle => {
>> + data.path_style = None;
>> + }
>> + }
>> + }
>> + }
>
> Some time ago I've found that it is quite useful to
> destructure the updater like I did in proxmox-notify [1].
> This ensures that you don't forget to update the
> API handler after adding a new field to the config struct.
> Not a must, just a suggestion, since I like this pattern quite a bit :)
If that's fine by you, I'll keep this for now and do it as a followup,
including the same changes to the sync jobs and other configs where we
have this pattern. I'd like to focus on the other comments first, as
these seem more pressing.
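For reference, the destructuring would look something like this (sketch
based on the current updater fields):

    let S3ClientConfigUpdater {
        endpoint,
        port,
        region,
        access_key,
        fingerprint,
        path_style,
    } = update;

    if let Some(endpoint) = endpoint {
        data.endpoint = endpoint;
    }
    // ... and so on for the remaining fields; adding a new field to the
    // updater then causes a compile error here until it is handled.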
> [1] https://git.proxmox.com/?p=proxmox.git;a=blob;f=proxmox-notify/src/api/webhook.rs;h=9d904d0bf57f9f789bb6723e1d8ca710fcf0cb96;hb=HEAD#l175
>
>> +
>> + if let Some(endpoint) = update.endpoint {
>> + data.endpoint = endpoint;
>> + }
>> + if let Some(port) = update.port {
>> + data.port = Some(port);
>> + }
>> + if let Some(region) = update.region {
>> + data.region = Some(region);
>> + }
>> + if let Some(access_key) = update.access_key {
>> + data.access_key = access_key;
>> + }
>> + if let Some(fingerprint) = update.fingerprint {
>> + data.fingerprint = Some(fingerprint);
>> + }
>> + if let Some(path_style) = update.path_style {
>> + data.path_style = Some(path_style);
>> + }
>> +
>> + let mut secrets_data: S3ClientSecretsConfig = secrets.lookup("s3secrets", &id)?;
>> + if let Some(secret_key) = update_secrets.secret_key {
>> + secrets_data.secret_key = secret_key;
>> + }
>> +
>> + config.set_data(&id, "s3client", &data)?;
>> + secrets.set_data(&id, "s3secrets", &secrets_data)?;
>> + s3::save_config(&config, &secrets)?;
>> +
>> + Ok(())
>> +}
>
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 15/45] datastore: local chunk reader: read chunks based on backend
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 15/45] datastore: local chunk reader: read chunks based on backend Christian Ebner
@ 2025-07-18 8:45 ` Lukas Wagner
2025-07-18 10:11 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:45 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
With the comments addressed:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:53, Christian Ebner wrote:
> Get and store the datastore's backend on local chunk reader
> instantiation and fetch chunks based on the variant from either the
> filesystem or the s3 object store.
>
> By storing the backend variant, the s3 client is instantiated only
> once and reused until the local chunk reader instance is dropped.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> pbs-datastore/Cargo.toml | 1 +
> pbs-datastore/src/local_chunk_reader.rs | 38 +++++++++++++++++++++----
> 2 files changed, 33 insertions(+), 6 deletions(-)
>
> diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
> index 7e56dbd31..8ce930a94 100644
> --- a/pbs-datastore/Cargo.toml
> +++ b/pbs-datastore/Cargo.toml
> @@ -13,6 +13,7 @@ crc32fast.workspace = true
> endian_trait.workspace = true
> futures.workspace = true
> hex = { workspace = true, features = [ "serde" ] }
> +http-body-util.workspace = true
> hyper.workspace = true
> libc.workspace = true
> log.workspace = true
> diff --git a/pbs-datastore/src/local_chunk_reader.rs b/pbs-datastore/src/local_chunk_reader.rs
> index 05a70c068..f5aa217ae 100644
> --- a/pbs-datastore/src/local_chunk_reader.rs
> +++ b/pbs-datastore/src/local_chunk_reader.rs
> @@ -3,17 +3,21 @@ use std::pin::Pin;
> use std::sync::Arc;
>
> use anyhow::{bail, Error};
> +use http_body_util::BodyExt;
>
> use pbs_api_types::CryptMode;
> use pbs_tools::crypt_config::CryptConfig;
> +use proxmox_s3_client::S3Client;
>
> use crate::data_blob::DataBlob;
> +use crate::datastore::DatastoreBackend;
> use crate::read_chunk::{AsyncReadChunk, ReadChunk};
> use crate::DataStore;
>
> #[derive(Clone)]
> pub struct LocalChunkReader {
> store: Arc<DataStore>,
> + backend: DatastoreBackend,
> crypt_config: Option<Arc<CryptConfig>>,
> crypt_mode: CryptMode,
> }
> @@ -24,8 +28,11 @@ impl LocalChunkReader {
> crypt_config: Option<Arc<CryptConfig>>,
> crypt_mode: CryptMode,
> ) -> Self {
> + // TODO: Error handling!
> + let backend = store.backend().unwrap();
> Self {
> store,
> + backend,
> crypt_config,
> crypt_mode,
> }
> @@ -47,10 +54,26 @@ impl LocalChunkReader {
> }
> }
>
> +async fn fetch(s3_client: Arc<S3Client>, digest: &[u8; 32]) -> Result<DataBlob, Error> {
> + let object_key = crate::s3::object_key_from_digest(digest)?;
> + if let Some(response) = s3_client.get_object(object_key).await? {
> + let bytes = response.content.collect().await?.to_bytes();
> + DataBlob::from_raw(bytes.to_vec())
> + } else {
> + bail!("no object with digest {}", hex::encode(digest));
> + }
> +}
> +
> impl ReadChunk for LocalChunkReader {
> fn read_raw_chunk(&self, digest: &[u8; 32]) -> Result<DataBlob, Error> {
> - let chunk = self.store.load_chunk(digest)?;
> + let chunk = match &self.backend {
> + DatastoreBackend::Filesystem => self.store.load_chunk(digest)?,
> + DatastoreBackend::S3(s3_client) => {
> + proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?
rather use Arc::clone(&s3_client) to avoid ambiguity
> + }
> + };
> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
> +
> Ok(chunk)
> }
>
> @@ -69,11 +92,14 @@ impl AsyncReadChunk for LocalChunkReader {
> digest: &'a [u8; 32],
> ) -> Pin<Box<dyn Future<Output = Result<DataBlob, Error>> + Send + 'a>> {
> Box::pin(async move {
> - let (path, _) = self.store.chunk_path(digest);
> -
> - let raw_data = tokio::fs::read(&path).await?;
> -
> - let chunk = DataBlob::load_from_reader(&mut &raw_data[..])?;
> + let chunk = match &self.backend {
> + DatastoreBackend::Filesystem => {
> + let (path, _) = self.store.chunk_path(digest);
> + let raw_data = tokio::fs::read(&path).await?;
> + DataBlob::load_from_reader(&mut &raw_data[..])?
> + }
> + DatastoreBackend::S3(s3_client) => fetch(s3_client.clone(), digest).await?,
rather use Arc::clone(&s3_client) to avoid ambiguity
> + };
> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
>
> Ok(chunk)
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 04/45] api: datastore: check s3 backend bucket access on datastore create
2025-07-18 7:40 ` Lukas Wagner
@ 2025-07-18 8:55 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 8:55 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 9:40 AM, Lukas Wagner wrote:
> With the two string constants moved:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
>
> On 2025-07-15 14:52, Christian Ebner wrote:
>> Check if the configured S3 object store backend can be reached and
>> the provided secrets have the permissions to access the bucket.
>>
>> Perform the check before creating the chunk store, so it is not left
>> behind if the bucket cannot be reached.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> Cargo.toml | 2 +-
>> src/api2/config/datastore.rs | 48 ++++++++++++++++++++++++++++++++----
>> 2 files changed, 44 insertions(+), 6 deletions(-)
>>
>> diff --git a/Cargo.toml b/Cargo.toml
>> index c7a77060e..a5954635a 100644
>> --- a/Cargo.toml
>> +++ b/Cargo.toml
>> @@ -77,7 +77,7 @@ proxmox-rest-server = { version = "1", features = [ "templates" ] }
>> proxmox-router = { version = "3.2.2", default-features = false }
>> proxmox-rrd = "1"
>> proxmox-rrd-api-types = "1.0.2"
>> -proxmox-s3-client = "1.0.0"
>> +proxmox-s3-client = { version = "1.0.0", features = [ "impl" ] }
>> # everything but pbs-config and pbs-client use "api-macro"
>> proxmox-schema = "4"
>> proxmox-section-config = "3"
>> diff --git a/src/api2/config/datastore.rs b/src/api2/config/datastore.rs
>> index b133be707..0fb822c79 100644
>> --- a/src/api2/config/datastore.rs
>> +++ b/src/api2/config/datastore.rs
>> @@ -1,21 +1,22 @@
>> use std::path::{Path, PathBuf};
>>
>> use ::serde::{Deserialize, Serialize};
>> -use anyhow::{bail, Context, Error};
>> +use anyhow::{bail, format_err, Context, Error};
>> use hex::FromHex;
>> use serde_json::Value;
>> use tracing::{info, warn};
>>
>> use proxmox_router::{http_bail, Permission, Router, RpcEnvironment, RpcEnvironmentType};
>> +use proxmox_s3_client::{S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig};
>> use proxmox_schema::{api, param_bail, ApiType};
>> use proxmox_section_config::SectionConfigData;
>> use proxmox_uuid::Uuid;
>>
>> use pbs_api_types::{
>> - Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreNotify, DatastoreTuning, KeepOptions,
>> - MaintenanceMode, PruneJobConfig, PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE,
>> - PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_MODIFY, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA,
>> - UPID_SCHEMA,
>> + Authid, DataStoreConfig, DataStoreConfigUpdater, DatastoreBackendConfig, DatastoreBackendType,
>> + DatastoreNotify, DatastoreTuning, KeepOptions, MaintenanceMode, PruneJobConfig,
>> + PruneJobOptions, DATASTORE_SCHEMA, PRIV_DATASTORE_ALLOCATE, PRIV_DATASTORE_AUDIT,
>> + PRIV_DATASTORE_MODIFY, PRIV_SYS_MODIFY, PROXMOX_CONFIG_DIGEST_SCHEMA, UPID_SCHEMA,
>> };
>> use pbs_config::BackupLockGuard;
>> use pbs_datastore::chunk_store::ChunkStore;
>> @@ -116,6 +117,43 @@ pub(crate) fn do_create_datastore(
>> .parse_property_string(datastore.tuning.as_deref().unwrap_or(""))?,
>> )?;
>>
>> + if let Some(ref backend_config) = datastore.backend {
>> + let backend_config: DatastoreBackendConfig = backend_config.parse()?;
>> + match backend_config.ty.unwrap_or_default() {
>> + DatastoreBackendType::Filesystem => (),
>> + DatastoreBackendType::S3 => {
>> + let s3_client_id = backend_config
>> + .client
>> + .as_ref()
>> + .ok_or_else(|| format_err!("missing required client"))?;
>> + let bucket = backend_config
>> + .bucket
>> + .clone()
>> + .ok_or_else(|| format_err!("missing required bucket"))?;
>> + let (config, _config_digest) =
>> + pbs_config::s3::config().context("failed to get s3 config")?;
>> + let (secrets, _secrets_digest) =
>> + pbs_config::s3::secrets_config().context("failed to get s3 secrets")?;
>> + let config: S3ClientConfig = config
>> + .lookup("s3client", s3_client_id)
>> + .with_context(|| format!("no '{s3_client_id}' in config"))?;
>> + let secrets: S3ClientSecretsConfig = secrets
>> + .lookup("s3secrets", s3_client_id)
>> + .with_context(|| format!("no '{s3_client_id}' in secrets"))?;
>
> The "s3client" and "s3secrets" section type strings should be `pub const` where the the config parser is defined.
We do not do that consistently for other configs either, but I do agree
that this makes sense and is best placed as constants next to the s3
config related code, so defining them there.
>> + let options = S3ClientOptions::from_config(
>> + config,
>> + secrets,
>> + bucket,
>> + datastore.name.to_owned(),
>> + );
>> + let s3_client = S3Client::new(options).context("failed to create s3 client")?;
>> + // Fine to block since this runs in worker task
>> + proxmox_async::runtime::block_on(s3_client.head_bucket())
>> + .context("failed to access bucket")?;
>
> I wonder whether we should add some kind of retry logic not only here, but also for anywhere else
> where we interact with S3. Might of course be easier to implement that right in the s3 client crate.
> Also, no need to add this right away, just some idea for future improvements.
I would refrain from adding retry logic everywhere: while this could
help circumvent some intermittent failures, it would cause additional
requests which we certainly do not want, and might not help to debug
possible issues.
Also, the retry is limited to put requests, where the API itself might
indicate that a retry is required.
quote:
If a conflicting operation occurs during the upload S3 returns a 409
ConditionalRequestConflict response. On a 409 failure you should fetch
the object's ETag and retry the upload.
see: https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html
But I did take a note to have a look at this again.
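For illustration, the put-side retry described above would roughly look
like this (sketch only; `is_conflict()` stands in for however the client
actually detects the 409, and `data` is the `Bytes` payload):

    let mut retried = false;
    loop {
        match self
            .put_object(object_key.clone(), Body::from(data.clone()), replace)
            .await
        {
            Err(err) if !retried && is_conflict(&err) => {
                // 409 ConditionalRequestConflict: refetch the object's
                // ETag and retry the upload once
                retried = true;
            }
            other => break other,
        }
    }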
>
>> + }
>> + }
>> + }
>> +
>> let unmount_guard = if datastore.backing_device.is_some() {
>> do_mount_device(datastore.clone())?;
>> UnmountGuard::new(Some(path.clone()))
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 17/45] verify: implement chunk verification for stores with s3 backend
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 17/45] verify: implement chunk verification for stores with s3 backend Christian Ebner
@ 2025-07-18 8:56 ` Lukas Wagner
2025-07-18 11:45 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 8:56 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> For datastores backed by an S3 compatible object store, rather than
> reading the chunks to be verified from the local filesystem, fetch
> them via the s3 client from the configured bucket.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> src/backup/verify.rs | 89 ++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 77 insertions(+), 12 deletions(-)
>
> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
> index dea10f618..3a4a1d0d5 100644
> --- a/src/backup/verify.rs
> +++ b/src/backup/verify.rs
> @@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
> use std::time::Instant;
>
> use anyhow::{bail, Error};
> +use http_body_util::BodyExt;
> use tracing::{error, info, warn};
>
> use proxmox_worker_task::WorkerTaskContext;
> @@ -89,6 +90,38 @@ impl VerifyWorker {
> }
> }
>
> + if let Ok(DatastoreBackend::S3(s3_client)) = datastore.backend() {
> + let suffix = format!(".{}.bad", counter);
> + let target_key =
> + match pbs_datastore::s3::object_key_from_digest_with_suffix(digest, &suffix) {
> + Ok(target_key) => target_key,
> + Err(err) => {
> + info!("could not generate target key for corrupted chunk {path:?} - {err}");
> + return;
> + }
> + };
> + let object_key = match pbs_datastore::s3::object_key_from_digest(digest) {
> + Ok(object_key) => object_key,
> + Err(err) => {
> + info!("could not generate object key for corrupted chunk {path:?} - {err}");
> + return;
> + }
> + };
> + if proxmox_async::runtime::block_on(
> + s3_client.copy_object(object_key.clone(), target_key),
> + )
> + .is_ok()
> + {
> + if proxmox_async::runtime::block_on(s3_client.delete_object(object_key)).is_err() {
> + info!("failed to delete corrupt chunk on s3 backend: {digest_str}");
> + }
> + } else {
> + info!("failed to copy corrupt chunk on s3 backend: {digest_str}");
> + }
> + } else {
> + info!("failed to get s3 backend while trying to rename bad chunk: {digest_str}");
> + }
> +
> match std::fs::rename(&path, &new_path) {
> Ok(_) => {
> info!("corrupted chunk renamed to {:?}", &new_path);
> @@ -189,18 +222,50 @@ impl VerifyWorker {
> continue; // already verified or marked corrupt
> }
>
> - match self.datastore.load_chunk(&info.digest) {
> - Err(err) => {
> - self.corrupt_chunks.lock().unwrap().insert(info.digest);
> - error!("can't verify chunk, load failed - {err}");
> - errors.fetch_add(1, Ordering::SeqCst);
> - Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
> - }
> - Ok(chunk) => {
> - let size = info.size();
> - read_bytes += chunk.raw_size();
> - decoder_pool.send((chunk, info.digest, size))?;
> - decoded_bytes += size;
> + match &self.backend {
The whole method becomes uncomfortably large, maybe move the entire match &self.backend into a new method?
> + DatastoreBackend::Filesystem => match self.datastore.load_chunk(&info.digest) {
> + Err(err) => {
> + self.corrupt_chunks.lock().unwrap().insert(info.digest);
Maybe add a new method self.add_corrupt_chunk
fn add_corrupt_chunk(&self, digest: [u8; 32]) {
    // Panic on poisoned mutex
    let mut chunks = self.corrupt_chunks.lock().unwrap();
    chunks.insert(digest);
}
or the like
> + error!("can't verify chunk, load failed - {err}");
> + errors.fetch_add(1, Ordering::SeqCst);
> + Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
> + }
> + Ok(chunk) => {
> + let size = info.size();
> + read_bytes += chunk.raw_size();
> + decoder_pool.send((chunk, info.digest, size))?;
> + decoded_bytes += size;
> + }
> + },
> + DatastoreBackend::S3(s3_client) => {
> + let object_key = pbs_datastore::s3::object_key_from_digest(&info.digest)?;
> + match proxmox_async::runtime::block_on(s3_client.get_object(object_key)) {
> + Ok(Some(response)) => {
> + let bytes =
> + proxmox_async::runtime::block_on(response.content.collect())?
> + .to_bytes();
> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
> + let size = info.size();
> + read_bytes += chunk.raw_size();
> + decoder_pool.send((chunk, info.digest, size))?;
> + decoded_bytes += size;
> + }
> + Ok(None) => {
> + self.corrupt_chunks.lock().unwrap().insert(info.digest);
> + error!(
> + "can't verify missing chunk with digest {}",
> + hex::encode(info.digest)
> + );
> + errors.fetch_add(1, Ordering::SeqCst);
> + Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
> + }
> + Err(err) => {
> + self.corrupt_chunks.lock().unwrap().insert(info.digest);
> + error!("can't verify chunk, load failed - {err}");
> + errors.fetch_add(1, Ordering::SeqCst);
> + Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
> + }
> + }
> }
> }
> }
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 05/45] api/cli: add endpoint and command to check s3 client connection
2025-07-18 7:43 ` Lukas Wagner
@ 2025-07-18 9:04 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 9:04 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 9:42 AM, Lukas Wagner wrote:
> With the magic string replaced by constants:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
>
> On 2025-07-15 14:52, Christian Ebner wrote:
>> Adds a dedicated api endpoint and a proxmox-backup-manager command to
>> check if the configured S3 client can reach the bucket.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> src/api2/admin/mod.rs | 2 +
>> src/api2/admin/s3.rs | 80 +++++++++++++++++++++++++++
>> src/bin/proxmox-backup-manager.rs | 1 +
>> src/bin/proxmox_backup_manager/mod.rs | 2 +
>> src/bin/proxmox_backup_manager/s3.rs | 46 +++++++++++++++
>> 5 files changed, 131 insertions(+)
>> create mode 100644 src/api2/admin/s3.rs
>> create mode 100644 src/bin/proxmox_backup_manager/s3.rs
>>
>> diff --git a/src/api2/admin/mod.rs b/src/api2/admin/mod.rs
>> index a1c49f8e2..7694de4b9 100644
>> --- a/src/api2/admin/mod.rs
>> +++ b/src/api2/admin/mod.rs
>> @@ -9,6 +9,7 @@ pub mod gc;
>> pub mod metrics;
>> pub mod namespace;
>> pub mod prune;
>> +pub mod s3;
>> pub mod sync;
>> pub mod traffic_control;
>> pub mod verify;
>> @@ -19,6 +20,7 @@ const SUBDIRS: SubdirMap = &sorted!([
>> ("metrics", &metrics::ROUTER),
>> ("prune", &prune::ROUTER),
>> ("gc", &gc::ROUTER),
>> + ("s3", &s3::ROUTER),
>> ("sync", &sync::ROUTER),
>> ("traffic-control", &traffic_control::ROUTER),
>> ("verify", &verify::ROUTER),
>> diff --git a/src/api2/admin/s3.rs b/src/api2/admin/s3.rs
>> new file mode 100644
>> index 000000000..d20031707
>> --- /dev/null
>> +++ b/src/api2/admin/s3.rs
>> @@ -0,0 +1,80 @@
>> +//! S3 bucket operations
>> +
>> +use anyhow::{Context, Error};
>> +use serde_json::Value;
>> +
>> +use proxmox_http::Body;
>> +use proxmox_router::{list_subdirs_api_method, Permission, Router, RpcEnvironment, SubdirMap};
>> +use proxmox_s3_client::{
>> + S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3_BUCKET_NAME_SCHEMA,
>> + S3_CLIENT_ID_SCHEMA,
>> +};
>> +use proxmox_schema::*;
>> +use proxmox_sortable_macro::sortable;
>> +
>> +use pbs_api_types::PRIV_SYS_MODIFY;
>> +
>> +#[api(
>> + input: {
>> + properties: {
>> + "s3-client-id": {
>> + schema: S3_CLIENT_ID_SCHEMA,
>> + },
>> + bucket: {
>> + schema: S3_BUCKET_NAME_SCHEMA,
>> + },
>> + "store-prefix": {
>> + type: String,
>> + description: "Store prefix within bucket for S3 object keys (commonly datastore name)",
>> + },
>> + },
>> + },
>> + access: {
>> + permission: &Permission::Privilege(&[], PRIV_SYS_MODIFY, false),
>> + },
>> +)]
>> +/// Perform basic sanity check for given s3 client configuration
>> +pub async fn check(
>> + s3_client_id: String,
>> + bucket: String,
>> + store_prefix: String,
>> + _rpcenv: &mut dyn RpcEnvironment,
>> +) -> Result<Value, Error> {
>> + let (config, _digest) = pbs_config::s3::config()?;
>> + let config: S3ClientConfig = config
>> + .lookup("s3client", &s3_client_id)
>> + .context("config lookup failed")?;
>> + let (secrets, _secrets_digest) = pbs_config::s3::secrets_config()?;
>> + let secrets: S3ClientSecretsConfig = secrets
>> + .lookup("s3secrets", &s3_client_id)
>> + .context("secrets lookup failed")?;
>
> Same thing here with regards to the section config type strings.
Adapted both to use the new constants as well.
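For reference, the lookups now boil down to something like this (constant
names are illustrative only, not necessarily the final ones):

    // defined once, e.g. in pbs_config::s3 (hypothetical names)
    pub const CFG_TYPE_ID_S3_CLIENT: &str = "s3client";
    pub const CFG_TYPE_ID_S3_SECRETS: &str = "s3secrets";

    let config: S3ClientConfig = config
        .lookup(pbs_config::s3::CFG_TYPE_ID_S3_CLIENT, &s3_client_id)
        .context("config lookup failed")?;
    let secrets: S3ClientSecretsConfig = secrets
        .lookup(pbs_config::s3::CFG_TYPE_ID_S3_SECRETS, &s3_client_id)
        .context("secrets lookup failed")?;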
>> +
>> + let options = S3ClientOptions::from_config(config, secrets, bucket, store_prefix);
>> +
>> + let test_object_key = ".s3-client-test";
>> + let client = S3Client::new(options).context("client creation failed")?;
>> + client.head_bucket().await.context("head bucket failed")?;
>> + client
>> + .put_object(test_object_key.into(), Body::empty(), true)
>> + .await
>> + .context("put object failed")?;
>> + client
>> + .get_object(test_object_key.into())
>> + .await
>> + .context("get object failed")?;
>> + client
>> + .delete_object(test_object_key.into())
>> + .await
>> + .context("delete object failed")?;
>> +
>> + Ok(Value::Null)
>> +}
>> +
>> +#[sortable]
>> +const S3_OPERATION_SUBDIRS: SubdirMap = &[("check", &Router::new().get(&API_METHOD_CHECK))];
>> +
>> +const S3_OPERATION_ROUTER: Router = Router::new()
>> + .get(&list_subdirs_api_method!(S3_OPERATION_SUBDIRS))
>> + .subdirs(S3_OPERATION_SUBDIRS);
>> +
>> +pub const ROUTER: Router = Router::new().match_all("s3-client-id", &S3_OPERATION_ROUTER);
>> diff --git a/src/bin/proxmox-backup-manager.rs b/src/bin/proxmox-backup-manager.rs
>> index d4363e717..68d87c676 100644
>> --- a/src/bin/proxmox-backup-manager.rs
>> +++ b/src/bin/proxmox-backup-manager.rs
>> @@ -677,6 +677,7 @@ async fn run() -> Result<(), Error> {
>> .insert("garbage-collection", garbage_collection_commands())
>> .insert("acme", acme_mgmt_cli())
>> .insert("cert", cert_mgmt_cli())
>> + .insert("s3", s3_commands())
>> .insert("subscription", subscription_commands())
>> .insert("sync-job", sync_job_commands())
>> .insert("verify-job", verify_job_commands())
>> diff --git a/src/bin/proxmox_backup_manager/mod.rs b/src/bin/proxmox_backup_manager/mod.rs
>> index 9b5c73e9a..312a6db6b 100644
>> --- a/src/bin/proxmox_backup_manager/mod.rs
>> +++ b/src/bin/proxmox_backup_manager/mod.rs
>> @@ -26,6 +26,8 @@ mod prune;
>> pub use prune::*;
>> mod remote;
>> pub use remote::*;
>> +mod s3;
>> +pub use s3::*;
>> mod subscription;
>> pub use subscription::*;
>> mod sync;
>> diff --git a/src/bin/proxmox_backup_manager/s3.rs b/src/bin/proxmox_backup_manager/s3.rs
>> new file mode 100644
>> index 000000000..9bb89ff55
>> --- /dev/null
>> +++ b/src/bin/proxmox_backup_manager/s3.rs
>> @@ -0,0 +1,46 @@
>> +use proxmox_router::{cli::*, RpcEnvironment};
>> +use proxmox_s3_client::{S3_BUCKET_NAME_SCHEMA, S3_CLIENT_ID_SCHEMA};
>> +use proxmox_schema::api;
>> +
>> +use proxmox_backup::api2;
>> +
>> +use anyhow::Error;
>> +use serde_json::Value;
>> +
>> +#[api(
>> + input: {
>> + properties: {
>> + "s3-client-id": {
>> + schema: S3_CLIENT_ID_SCHEMA,
>> + },
>> + bucket: {
>> + schema: S3_BUCKET_NAME_SCHEMA,
>> + },
>> + "store-prefix": {
>> + type: String,
>> + description: "Store prefix within bucket for S3 object keys (commonly datastore name)",
>> + },
>> + },
>> + },
>> +)]
>> +/// Perform basic sanity checks for given S3 client configuration
>> +async fn check(
>> + s3_client_id: String,
>> + bucket: String,
>> + store_prefix: String,
>> + rpcenv: &mut dyn RpcEnvironment,
>> +) -> Result<Value, Error> {
>> + api2::admin::s3::check(s3_client_id, bucket, store_prefix, rpcenv).await?;
>> + Ok(Value::Null)
>> +}
>> +
>> +pub fn s3_commands() -> CommandLineInterface {
>> + let cmd_def = CliCommandMap::new().insert(
>> + "check",
>> + CliCommand::new(&API_METHOD_CHECK)
>> + .arg_param(&["s3-client-id", "bucket"])
>> + .completion_cb("s3-client-id", pbs_config::s3::complete_s3_client_id),
>> + );
>> +
>> + cmd_def.into()
>> +}
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs
2025-07-18 8:40 ` Christian Ebner
@ 2025-07-18 9:07 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 9:07 UTC (permalink / raw)
To: Christian Ebner, Proxmox Backup Server development discussion
On 2025-07-18 10:40, Christian Ebner wrote:
> On 7/18/25 9:32 AM, Lukas Wagner wrote:
>> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>>
>>
>> On 2025-07-15 14:52, Christian Ebner wrote:
>>> +/// Update an s3 client configuration.
>>> +#[allow(clippy::too_many_arguments)]
>>> +pub fn update_s3_client_config(
>>> + id: String,
>>> + update: S3ClientConfigUpdater,
>>> + update_secrets: S3ClientSecretsConfigUpdater,
>>> + delete: Option<Vec<DeletableProperty>>,
>>> + digest: Option<String>,
>>> + _rpcenv: &mut dyn RpcEnvironment,
>>> +) -> Result<(), Error> {
>>> + let _lock = s3::lock_config()?;
>>> + let (mut config, expected_digest) = s3::config()?;
>>> + let (mut secrets, secrets_digest) = s3::secrets_config()?;
>>> + let expected_digest = digest_with_secrets(&expected_digest, &secrets_digest);
>>> +
>>> + // Secrets are not included in the digest; concurrent changes to them are therefore not detected.
>>> + if let Some(ref digest) = digest {
>>> + let digest = <[u8; 32]>::from_hex(digest)?;
>>> + crate::tools::detect_modified_configuration_file(&digest, &expected_digest)?;
>>> + }
>>> +
>>> + let mut data: S3ClientConfig = config.lookup("s3client", &id)?;
>>> +
>>> + if let Some(delete) = delete {
>>> + for delete_prop in delete {
>>> + match delete_prop {
>>> + DeletableProperty::Port => {
>>> + data.port = None;
>>> + }
>>> + DeletableProperty::Region => {
>>> + data.region = None;
>>> + }
>>> + DeletableProperty::Fingerprint => {
>>> + data.fingerprint = None;
>>> + }
>>> + DeletableProperty::PathStyle => {
>>> + data.path_style = None;
>>> + }
>>> + }
>>> + }
>>> + }
>>
>> Some time ago I've found that it is quite useful to
>> destructure the updater like I did in proxmox-notify [1].
>> This ensures that you don't forget to update the
>> API handler after adding a new field to the config struct.
>> Not a must, just a suggestion, since I like this pattern quite a bit :)
>
> If that's fine by you, I'll keep this for now and do this as a followup, including the same changes to the sync jobs and other configs where we have this pattern. Would like to focus on the other comments first, as these seem more pressing.
Sure, no problem for me.
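For the archive, the pattern boils down to something like this, matching the
updater fields used above (untested sketch; the secrets updater would get the
same treatment):

    // Destructuring without `..` forces a compile error as soon as a new
    // field is added to the updater but not handled here:
    let S3ClientConfigUpdater {
        endpoint,
        port,
        region,
        access_key,
        fingerprint,
        path_style,
    } = update;

    if let Some(endpoint) = endpoint {
        data.endpoint = endpoint;
    }
    if let Some(port) = port {
        data.port = Some(port);
    }
    // ... and so on for the remaining fields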
>
>> [1] https://git.proxmox.com/?p=proxmox.git;a=blob;f=proxmox-notify/src/api/webhook.rs;h=9d904d0bf57f9f789bb6723e1d8ca710fcf0cb96;hb=HEAD#l175
>>
>>> +
>>> + if let Some(endpoint) = update.endpoint {
>>> + data.endpoint = endpoint;
>>> + }
>>> + if let Some(port) = update.port {
>>> + data.port = Some(port);
>>> + }
>>> + if let Some(region) = update.region {
>>> + data.region = Some(region);
>>> + }
>>> + if let Some(access_key) = update.access_key {
>>> + data.access_key = access_key;
>>> + }
>>> + if let Some(fingerprint) = update.fingerprint {
>>> + data.fingerprint = Some(fingerprint);
>>> + }
>>> + if let Some(path_style) = update.path_style {
>>> + data.path_style = Some(path_style);
>>> + }
>>> +
>>> + let mut secrets_data: S3ClientSecretsConfig = secrets.lookup("s3secrets", &id)?;
>>> + if let Some(secret_key) = update_secrets.secret_key {
>>> + secrets_data.secret_key = secret_key;
>>> + }
>>> +
>>> + config.set_data(&id, "s3client", &data)?;
>>> + secrets.set_data(&id, "s3secrets", &secrets_data)?;
>>> + s3::save_config(&config, &secrets)?;
>>> +
>>> + Ok(())
>>> +}
>>
>>
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 06/45] datastore: allow to get the backend for a datastore
2025-07-18 7:52 ` Lukas Wagner
@ 2025-07-18 9:10 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 9:10 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 9:52 AM, Lukas Wagner wrote:
> With my feedback addressed:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
>
> On 2025-07-15 14:52, Christian Ebner wrote:
>> Implements an enum with variants Filesystem and S3 to distinguish
>> between available backends. Filesystem will be used as the default if no
>> backend is configured in the datastore's configuration. If the
>> datastore has an s3 backend configured, the backend method will
>> instantiate an s3 client and return it with the S3 variant.
>>
>> This allows to instantiate the client once, keeping and reusing the
>> same open connection to the api for the lifetime of a task or job, e.g.
>> in the backup writer/readers runtime environment.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> pbs-datastore/src/datastore.rs | 52 ++++++++++++++++++++++++++++++++--
>> pbs-datastore/src/lib.rs | 1 +
>> 2 files changed, 51 insertions(+), 2 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 924d8cf9c..90ab80005 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -12,6 +12,7 @@ use pbs_tools::lru_cache::LruCache;
>> use tracing::{info, warn};
>>
>> use proxmox_human_byte::HumanByte;
>> +use proxmox_s3_client::{S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig};
>> use proxmox_schema::ApiType;
>>
>> use proxmox_sys::error::SysError;
>> @@ -23,8 +24,8 @@ use proxmox_worker_task::WorkerTaskContext;
>>
>> use pbs_api_types::{
>> ArchiveType, Authid, BackupGroupDeleteStats, BackupNamespace, BackupType, ChunkOrder,
>> - DataStoreConfig, DatastoreFSyncLevel, DatastoreTuning, GarbageCollectionStatus,
>> - MaintenanceMode, MaintenanceType, Operation, UPID,
>> + DataStoreConfig, DatastoreBackendConfig, DatastoreBackendType, DatastoreFSyncLevel,
>> + DatastoreTuning, GarbageCollectionStatus, MaintenanceMode, MaintenanceType, Operation, UPID,
>> };
>> use pbs_config::BackupLockGuard;
>>
>> @@ -127,6 +128,7 @@ pub struct DataStoreImpl {
>> chunk_order: ChunkOrder,
>> last_digest: Option<[u8; 32]>,
>> sync_level: DatastoreFSyncLevel,
>> + backend_config: DatastoreBackendConfig,
>> }
>>
>> impl DataStoreImpl {
>> @@ -141,6 +143,7 @@ impl DataStoreImpl {
>> chunk_order: Default::default(),
>> last_digest: None,
>> sync_level: Default::default(),
>> + backend_config: Default::default(),
>> })
>> }
>> }
>> @@ -196,6 +199,12 @@ impl Drop for DataStore {
>> }
>> }
>>
>> +#[derive(Clone)]
>> +pub enum DatastoreBackend {
>> + Filesystem,
>> + S3(Arc<S3Client>),
>> +}
>> +
>
> Missing doc comments for this public enum
Added docs for both the enum and the individual variants.
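Roughly along these lines (final wording may differ):

    /// Backend of the datastore, storing the backup contents.
    #[derive(Clone)]
    pub enum DatastoreBackend {
        /// Backup contents are stored on the local filesystem.
        Filesystem,
        /// Backup contents are stored on an S3 compatible object store,
        /// the local datastore acting as cache.
        S3(Arc<S3Client>),
    }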
>> impl DataStore {
>> // This one just panics on everything
>> #[doc(hidden)]
>> @@ -206,6 +215,39 @@ impl DataStore {
>> })
>> }
>>
>> + /// Get the backend for this datastore based on its configuration
>> + pub fn backend(&self) -> Result<DatastoreBackend, Error> {
>> + let backend_type = match self.inner.backend_config.ty.unwrap_or_default() {
>> + DatastoreBackendType::Filesystem => DatastoreBackend::Filesystem,
>> + DatastoreBackendType::S3 => {
>> + let s3_client_id = self
>> + .inner
>> + .backend_config
>> + .client
>> + .as_ref()
>> + .ok_or_else(|| format_err!("missing client for s3 backend"))?;
>> + let bucket = self
>> + .inner
>> + .backend_config
>> + .bucket
>> + .clone()
>> + .ok_or_else(|| format_err!("missing bucket for s3 backend"))?;
>> +
>> + let (config, _config_digest) = pbs_config::s3::config()?;
>> + let (secrets, _secrets_digest) = pbs_config::s3::secrets_config()?;
>> + let config: S3ClientConfig = config.lookup("s3client", s3_client_id)?;
>> + let secrets: S3ClientSecretsConfig = secrets.lookup("s3secrets", s3_client_id)?;
>
> Same thing here with regards to the hard-coded section type names.
Adapted as well, thx for pointing them out!
>
>> +
>> + let options =
>> + S3ClientOptions::from_config(config, secrets, bucket, self.name().to_owned());
>> + let s3_client = S3Client::new(options)?;
>> + DatastoreBackend::S3(Arc::new(s3_client))
>> + }
>> + };
>> +
>> + Ok(backend_type)
>> + }
>> +
>> pub fn lookup_datastore(
>> name: &str,
>> operation: Option<Operation>,
>> @@ -383,6 +425,11 @@ impl DataStore {
>> .parse_property_string(config.tuning.as_deref().unwrap_or(""))?,
>> )?;
>>
>> + let backend_config: DatastoreBackendConfig = serde_json::from_value(
>> + DatastoreBackendConfig::API_SCHEMA
>> + .parse_property_string(config.backend.as_deref().unwrap_or(""))?,
>> + )?;
>> +
>> Ok(DataStoreImpl {
>> chunk_store,
>> gc_mutex: Mutex::new(()),
>> @@ -391,6 +438,7 @@ impl DataStore {
>> chunk_order: tuning.chunk_order.unwrap_or_default(),
>> last_digest,
>> sync_level: tuning.sync_level.unwrap_or_default(),
>> + backend_config,
>> })
>> }
>>
>> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
>> index ffd0d91b2..ca6fdb7d8 100644
>> --- a/pbs-datastore/src/lib.rs
>> +++ b/pbs-datastore/src/lib.rs
>> @@ -204,6 +204,7 @@ pub use store_progress::StoreProgress;
>> mod datastore;
>> pub use datastore::{
>> check_backup_owner, ensure_datastore_is_mounted, get_datastore_mount_status, DataStore,
>> + DatastoreBackend,
>> };
>>
>> mod hierarchy;
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 10/45] api: backup: conditionally upload indices to s3 object store backend
2025-07-18 8:20 ` Lukas Wagner
@ 2025-07-18 9:24 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 9:24 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 10:20 AM, Lukas Wagner wrote:
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
> Two nits inline.
>
> On 2025-07-15 14:52, Christian Ebner wrote:
>> If the datastore is backed by an S3 compatible object store, upload
>> the dynamic or fixed index files to the object store after closing
>> them. The local index files are kept in the local caching datastore
>> to allow for fast and efficient content lookups, avoiding expensive
>> (as in monetary cost and IO latency) requests.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - fix clippy warning and formatting
>>
>> src/api2/backup/environment.rs | 34 ++++++++++++++++++++++++++++++++++
>> 1 file changed, 34 insertions(+)
>>
>> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
>> index 3d4677975..9ad13aeb3 100644
>> --- a/src/api2/backup/environment.rs
>> +++ b/src/api2/backup/environment.rs
>> @@ -2,6 +2,7 @@ use anyhow::{bail, format_err, Context, Error};
>> use pbs_config::BackupLockGuard;
>>
>> use std::collections::HashMap;
>> +use std::io::Read;
>> use std::sync::{Arc, Mutex};
>> use tracing::info;
>>
>> @@ -18,6 +19,7 @@ use pbs_datastore::dynamic_index::DynamicIndexWriter;
>> use pbs_datastore::fixed_index::FixedIndexWriter;
>> use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
>> use proxmox_rest_server::{formatter::*, WorkerTask};
>> +use proxmox_s3_client::S3Client;
>>
>> use crate::backup::VerifyWorker;
>>
>> @@ -479,6 +481,13 @@ impl BackupEnvironment {
>> );
>> }
>>
>> + // For S3 backends, upload the index file to the object store after closing
>> + if let DatastoreBackend::S3(s3_client) = &self.backend {
>> + self.s3_upload_index(s3_client, &data.name)
>> + .context("failed to upload dynamic index to s3 backend")?;
>> + self.log(format!("Uploaded index file to s3 backend: {}", data.name))
>> + }
>> +
>> self.log_upload_stat(
>> &data.name,
>> &csum,
>> @@ -553,6 +562,16 @@ impl BackupEnvironment {
>> );
>> }
>>
>> + // For S3 backends, upload the index file to the object store after closing
>> + if let DatastoreBackend::S3(s3_client) = &self.backend {
>> + self.s3_upload_index(s3_client, &data.name)
>> + .context("failed to upload fixed index to s3 backend")?;
>> + self.log(format!(
>> + "Uploaded fixed index file to object store: {}",
>> + data.name
>> + ))
>> + }
>
> nit: the log message differs between both cases
Ah yes, good catch! Specified for both cases that it is either the
dynamic or the fixed index.
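I.e. along the lines of (exact wording may still change):

    self.log(format!("Uploaded dynamic index file to S3 backend: {}", data.name));
    // and in the fixed index case:
    self.log(format!("Uploaded fixed index file to S3 backend: {}", data.name));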
>
>> +
>> self.log_upload_stat(
>> &data.name,
>> &expected_csum,
>> @@ -753,6 +772,21 @@ impl BackupEnvironment {
>>
>> Ok(())
>> }
>> +
>> + fn s3_upload_index(&self, s3_client: &S3Client, name: &str) -> Result<(), Error> {
>> + let object_key =
>> + pbs_datastore::s3::object_key_from_path(&self.backup_dir.relative_path(), name)
>> + .context("invalid index file object key")?;
>> +
>> + let mut full_path = self.backup_dir.full_path();
>> + full_path.push(name);
>> + let mut file = std::fs::File::open(&full_path)?;
>> + let mut buffer = Vec::new();
>> + file.read_to_end(&mut buffer)?;
>
> nit: You can use std::fs::read() to get the Vec right away :)
Ah, makes this much nicer indeed, thanks for the hint!
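I.e. the helper body now shrinks to something like:

    let mut full_path = self.backup_dir.full_path();
    full_path.push(name);
    // std::fs::read returns the Vec<u8> directly, no manual buffer handling
    let data = hyper::body::Bytes::from(std::fs::read(&full_path)?);
    proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))?;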
>
>> + let data = hyper::body::Bytes::from(buffer);
>> + proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))?;
>> + Ok(())
>> + }
>> }
>>
>> impl RpcEnvironment for BackupEnvironment {
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 21/45] datastore: get and set owner for s3 store backend
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 21/45] datastore: get and set owner for s3 " Christian Ebner
@ 2025-07-18 9:25 ` Lukas Wagner
2025-07-18 12:12 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 9:25 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
With my feedback addressed:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:53, Christian Ebner wrote:
> Read or write the ownership information from/to the corresponding
> object in the S3 object store. Keep that information available if
> the bucket is reused as datastore.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> pbs-datastore/src/datastore.rs | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 265624229..ca099c1d0 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -7,6 +7,7 @@ use std::sync::{Arc, LazyLock, Mutex};
> use std::time::Duration;
>
> use anyhow::{bail, format_err, Context, Error};
> +use http_body_util::BodyExt;
> use nix::unistd::{unlinkat, UnlinkatFlags};
> use pbs_tools::lru_cache::LruCache;
> use tracing::{info, warn};
> @@ -832,6 +833,21 @@ impl DataStore {
> backup_group: &pbs_api_types::BackupGroup,
> ) -> Result<Authid, Error> {
> let full_path = self.owner_path(ns, backup_group);
> +
> + if let DatastoreBackend::S3(s3_client) = self.backend()? {
> + let mut path = ns.path();
> + path.push(format!("{backup_group}"));
nit: you can use .to_string() here, it is a bit easier to read
> + let object_key = crate::s3::object_key_from_path(&path, "owner")
I did not note it for the previously reviewed patches, but I think some (pub) consts for these
'static' key suffixes would be better than repeating the same string in multiple places in the code
(mostly to avoid errors due to spelling mistakes).
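E.g. something like (location and names just illustrative):

    // e.g. in pbs-datastore/src/s3.rs
    /// Object key suffix for the per-group owner object.
    pub const OWNER_MARKER_FILENAME: &str = "owner";
    /// Object key suffix for the per-group notes object.
    pub const NOTES_MARKER_FILENAME: &str = "notes";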
> + .context("invalid owner file object key")?;
> + let response = proxmox_async::runtime::block_on(s3_client.get_object(object_key))?
> + .ok_or_else(|| format_err!("fetching owner failed"))?;
> + let content = proxmox_async::runtime::block_on(response.content.collect())?;
> + let owner = String::from_utf8(content.to_bytes().trim_ascii_end().to_vec())?;
> + return owner
> + .parse()
> + .map_err(|err| format_err!("parsing owner for {backup_group} failed: {err}"));
> + }
> +
> let owner = proxmox_sys::fs::file_read_firstline(full_path)?;
> owner
> .trim_end() // remove trailing newline
> @@ -860,6 +876,18 @@ impl DataStore {
> ) -> Result<(), Error> {
> let path = self.owner_path(ns, backup_group);
>
> + if let DatastoreBackend::S3(s3_client) = self.backend()? {
> + let mut path = ns.path();
> + path.push(format!("{backup_group}"));
> + let object_key = crate::s3::object_key_from_path(&path, "owner")
> + .context("invalid owner file object key")?;
> + let data = hyper::body::Bytes::from(format!("{auth_id}\n"));
> + let _is_duplicate = proxmox_async::runtime::block_on(
> + s3_client.upload_with_retry(object_key, data, true),
> + )
> + .context("failed to set owner on s3 backend")?;
> + }
> +
> let mut open_options = std::fs::OpenOptions::new();
> open_options.write(true);
>
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 11/45] api: backup: conditionally upload manifest to s3 object store backend
2025-07-18 8:26 ` Lukas Wagner
@ 2025-07-18 9:33 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 9:33 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 10:26 AM, Lukas Wagner wrote:
> Two minor suggestions, but nothing that would prohibit my R-b:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
> On 2025-07-15 14:52, Christian Ebner wrote:
>> Reupload the manifest to the S3 object store backend on manifest
>> updates, if s3 is configured as backend.
>> This also triggers the initial manifest upload when finishing backup
>> snapshot in the backup api call handler.
>> Also updates the locally cached version for fast and efficient
>> listing of contents without the need to perform expensive (as in
>> monetary cost and IO latency) requests.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> pbs-datastore/Cargo.toml | 3 +++
>> pbs-datastore/src/backup_info.rs | 12 +++++++++++-
>> src/api2/admin/datastore.rs | 14 ++++++++++++--
>> src/api2/backup/environment.rs | 16 ++++++++--------
>> src/backup/verify.rs | 2 +-
>> 5 files changed, 35 insertions(+), 12 deletions(-)
>>
>> diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
>> index c42eff165..7e56dbd31 100644
>> --- a/pbs-datastore/Cargo.toml
>> +++ b/pbs-datastore/Cargo.toml
>> @@ -13,6 +13,7 @@ crc32fast.workspace = true
>> endian_trait.workspace = true
>> futures.workspace = true
>> hex = { workspace = true, features = [ "serde" ] }
>> +hyper.workspace = true
>> libc.workspace = true
>> log.workspace = true
>> nix.workspace = true
>> @@ -29,8 +30,10 @@ zstd-safe.workspace = true
>> pathpatterns.workspace = true
>> pxar.workspace = true
>>
>> +proxmox-async.workspace = true
>> proxmox-base64.workspace = true
>> proxmox-borrow.workspace = true
>> +proxmox-http.workspace = true
>> proxmox-human-byte.workspace = true
>> proxmox-io.workspace = true
>> proxmox-lang.workspace=true
>> diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
>> index e3ecd437f..46e5b61f0 100644
>> --- a/pbs-datastore/src/backup_info.rs
>> +++ b/pbs-datastore/src/backup_info.rs
>> @@ -19,7 +19,7 @@ use pbs_api_types::{
>> use pbs_config::{open_backup_lockfile, BackupLockGuard};
>>
>> use crate::manifest::{BackupManifest, MANIFEST_LOCK_NAME};
>> -use crate::{DataBlob, DataStore};
>> +use crate::{DataBlob, DataStore, DatastoreBackend};
>>
>> pub const DATASTORE_LOCKS_DIR: &str = "/run/proxmox-backup/locks";
>> const PROTECTED_MARKER_FILENAME: &str = ".protected";
>> @@ -666,6 +666,7 @@ impl BackupDir {
>> /// only use this method - anything else may break locking guarantees.
>> pub fn update_manifest(
>> &self,
>> + backend: &DatastoreBackend,
>> update_fn: impl FnOnce(&mut BackupManifest),
>> ) -> Result<(), Error> {
>> let _guard = self.lock_manifest()?;
>> @@ -678,6 +679,15 @@ impl BackupDir {
>> let blob = DataBlob::encode(manifest.as_bytes(), None, true)?;
>> let raw_data = blob.raw_data();
>>
>> + if let DatastoreBackend::S3(s3_client) = backend {
>> + let object_key =
>> + super::s3::object_key_from_path(&self.relative_path(), MANIFEST_BLOB_NAME.as_ref())
>> + .context("invalid manifest object key")?;
>> + let data = hyper::body::Bytes::copy_from_slice(raw_data);
>> + proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
>> + .context("failed to update manifest on s3 backend")?;
>> + }
>> +
>> let mut path = self.full_path();
>> path.push(MANIFEST_BLOB_NAME.as_ref());
>>
>> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
>> index e24bc1c1b..02666afda 100644
>> --- a/src/api2/admin/datastore.rs
>> +++ b/src/api2/admin/datastore.rs
>> @@ -65,7 +65,7 @@ use pbs_datastore::manifest::BackupManifest;
>> use pbs_datastore::prune::compute_prune_info;
>> use pbs_datastore::{
>> check_backup_owner, ensure_datastore_is_mounted, task_tracking, BackupDir, BackupGroup,
>> - DataStore, LocalChunkReader, StoreProgress,
>> + DataStore, DatastoreBackend, LocalChunkReader, StoreProgress,
>> };
>> use pbs_tools::json::required_string_param;
>> use proxmox_rest_server::{formatter, WorkerTask};
>> @@ -2086,6 +2086,16 @@ pub fn set_group_notes(
>> &backup_group,
>> )?;
>>
>> + if let DatastoreBackend::S3(s3_client) = datastore.backend()? {
>> + let mut path = ns.path();
>> + path.push(format!("{backup_group}"));
>
> You can just use .to_string() here, reads a bit nicer
Indeed, adapted this ...
>> + let object_key = pbs_datastore::s3::object_key_from_path(&path, "notes")
>> + .context("invalid owner file object key")?;
>> + let data = hyper::body::Bytes::copy_from_slice(notes.as_bytes());
>> + let _is_duplicate =
>> + proxmox_async::runtime::block_on(s3_client.upload_with_retry(object_key, data, true))
>> + .context("failed to set notes on s3 backend")?;
>> + }
>> let notes_path = datastore.group_notes_path(&ns, &backup_group);
>> replace_file(notes_path, notes.as_bytes(), CreateOptions::new(), false)?;
>>
>> @@ -2188,7 +2198,7 @@ pub fn set_notes(
>> let backup_dir = datastore.backup_dir(ns, backup_dir)?;
>>
>> backup_dir
>> - .update_manifest(|manifest| {
>> + .update_manifest(&datastore.backend()?, |manifest| {
>> manifest.unprotected["notes"] = notes.into();
>> })
>> .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
>> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
>> index 9ad13aeb3..0017b347d 100644
>> --- a/src/api2/backup/environment.rs
>> +++ b/src/api2/backup/environment.rs
>> @@ -646,14 +646,6 @@ impl BackupEnvironment {
>> bail!("backup does not contain valid files (file count == 0)");
>> }
>>
>> - // check for valid manifest and store stats
>> - let stats = serde_json::to_value(state.backup_stat)?;
>> - self.backup_dir
>> - .update_manifest(|manifest| {
>> - manifest.unprotected["chunk_upload_stats"] = stats;
>> - })
>> - .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
>> -
>> if let Some(base) = &self.last_backup {
>> let path = base.backup_dir.full_path();
>> if !path.exists() {
>> @@ -664,6 +656,14 @@ impl BackupEnvironment {
>> }
>> }
>>
>> + // check for valid manifest and store stats
>> + let stats = serde_json::to_value(state.backup_stat)?;
>> + self.backup_dir
>> + .update_manifest(&self.backend, |manifest| {
>> + manifest.unprotected["chunk_upload_stats"] = stats;
>> + })
>> + .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
>
> nit: you can inline the `err` variable here
... and inlined the error as requested
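I.e.:

    .map_err(|err| format_err!("unable to update manifest blob - {err}"))?;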
>> +
>> self.datastore.try_ensure_sync_level()?;
>>
>> // marks the backup as successful
>> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
>> index 0b954ae23..9344033d8 100644
>> --- a/src/backup/verify.rs
>> +++ b/src/backup/verify.rs
>> @@ -359,7 +359,7 @@ impl VerifyWorker {
>>
>> if let Err(err) = {
>> let verify_state = serde_json::to_value(verify_state)?;
>> - backup_dir.update_manifest(|manifest| {
>> + backup_dir.update_manifest(&self.datastore.backend()?, |manifest| {
>> manifest.unprotected["verify_state"] = verify_state;
>> })
>> } {
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 13/45] sync: pull: conditionally upload content to s3 backend
2025-07-18 8:35 ` Lukas Wagner
@ 2025-07-18 9:43 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 9:43 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 10:35 AM, Lukas Wagner wrote:
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> If the datastore is backed by an S3 object store, not only insert the
>> pulled contents into the local cache store, but also upload them to the
>> S3 backend.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> src/server/pull.rs | 66 +++++++++++++++++++++++++++++++++++++++++++---
>> 1 file changed, 63 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/server/pull.rs b/src/server/pull.rs
>> index b1724c142..fe87359ab 100644
>> --- a/src/server/pull.rs
>> +++ b/src/server/pull.rs
>> @@ -6,8 +6,9 @@ use std::sync::atomic::{AtomicUsize, Ordering};
>> use std::sync::{Arc, Mutex};
>> use std::time::SystemTime;
>>
>> -use anyhow::{bail, format_err, Error};
>> +use anyhow::{bail, format_err, Context, Error};
>> use proxmox_human_byte::HumanByte;
>> +use tokio::io::AsyncReadExt;
>> use tracing::info;
>>
>> use pbs_api_types::{
>> @@ -24,7 +25,7 @@ use pbs_datastore::fixed_index::FixedIndexReader;
>> use pbs_datastore::index::IndexFile;
>> use pbs_datastore::manifest::{BackupManifest, FileInfo};
>> use pbs_datastore::read_chunk::AsyncReadChunk;
>> -use pbs_datastore::{check_backup_owner, DataStore, StoreProgress};
>> +use pbs_datastore::{check_backup_owner, DataStore, DatastoreBackend, StoreProgress};
>> use pbs_tools::sha::sha256;
>>
>> use super::sync::{
>> @@ -167,7 +168,20 @@ async fn pull_index_chunks<I: IndexFile>(
>> move |(chunk, digest, size): (DataBlob, [u8; 32], u64)| {
>> // println!("verify and write {}", hex::encode(&digest));
>> chunk.verify_unencrypted(size as usize, &digest)?;
>> - target2.insert_chunk(&chunk, &digest)?;
>> + match target2.backend()? {
>> + DatastoreBackend::Filesystem => {
>> + target2.insert_chunk(&chunk, &digest)?;
>> + }
>> + DatastoreBackend::S3(s3_client) => {
>> + let data = chunk.raw_data().to_vec();
>> + let upload_data = hyper::body::Bytes::from(data);
>> + let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
>> + let _is_duplicate = proxmox_async::runtime::block_on(
>> + s3_client.upload_with_retry(object_key, upload_data, false),
>> + )
>> + .context("failed to upload chunk to s3 backend")?;
>> + }
>> + }
>> Ok(())
>> },
>> );
>> @@ -331,6 +345,18 @@ async fn pull_single_archive<'a>(
>> if let Err(err) = std::fs::rename(&tmp_path, &path) {
>> bail!("Atomic rename file {:?} failed - {}", path, err);
>> }
>> + if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
>> + let object_key =
>> + pbs_datastore::s3::object_key_from_path(&snapshot.relative_path(), archive_name)
>> + .context("invalid archive object key")?;
>> +
>> + let archive = tokio::fs::File::open(&path).await?;
>> + let mut reader = tokio::io::BufReader::new(archive);
>> + let mut contents = Vec::new();
>> + reader.read_to_end(&mut contents).await?;
>
> You can use tokio::fs::read here
Same as for the sync code, this makes reading the whole file contents
much more concise, thanks!
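I.e. it collapses to:

    let contents = tokio::fs::read(&path).await?;
    let data = hyper::body::Bytes::from(contents);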
>
>> + let data = hyper::body::Bytes::from(contents);
>> + let _is_duplicate = s3_client.upload_with_retry(object_key, data, true).await?;
>
> I might do a review of the already merged s3 client code later, but I really don't like the
> `replace: bool ` parameter for this function very much. I think I'd prefer having
> two separate functions for replace vs. not replace (which might delegate to a common
> fn internally, where a bool param is fine IMO), or alternatively, using an enum
> instead. I think personally I'm gravitating more towards the separate functions.
>
> What do you think?
Yeah, splitting this up into pub fn `upload_with_retry` and
`upload_replace_with_retry` and keeping the common code in an internal
helper method on the client could indeed be more ergonomic.
Will opt for that in this case.
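Rough sketch of what I have in mind (signatures tentative, `Bytes` being
hyper::body::Bytes and the returned bool the current is-duplicate flag):

    impl S3Client {
        /// Upload the object, leaving a pre-existing object in place.
        pub async fn upload_with_retry(
            &self,
            object_key: S3ObjectKey,
            object_data: Bytes,
        ) -> Result<bool, Error> {
            self.do_upload_with_retry(object_key, object_data, false).await
        }

        /// Upload the object, replacing a pre-existing object.
        pub async fn upload_replace_with_retry(
            &self,
            object_key: S3ObjectKey,
            object_data: Bytes,
        ) -> Result<bool, Error> {
            self.do_upload_with_retry(object_key, object_data, true).await
        }

        // internal helper, where the bool parameter is fine
        async fn do_upload_with_retry(
            &self,
            object_key: S3ObjectKey,
            object_data: Bytes,
            replace: bool,
        ) -> Result<bool, Error> {
            // current retry logic moves here unchanged
            todo!()
        }
    }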
>
>> + }
>> Ok(sync_stats)
>> }
>>
>> @@ -401,6 +427,7 @@ async fn pull_snapshot<'a>(
>> }
>> }
>>
>> + let manifest_data = tmp_manifest_blob.raw_data().to_vec();
>> let manifest = BackupManifest::try_from(tmp_manifest_blob)?;
>>
>> if ignore_not_verified_or_encrypted(
>> @@ -467,9 +494,42 @@ async fn pull_snapshot<'a>(
>> if let Err(err) = std::fs::rename(&tmp_manifest_name, &manifest_name) {
>> bail!("Atomic rename file {:?} failed - {}", manifest_name, err);
>> }
>> + if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
>> + let object_key = pbs_datastore::s3::object_key_from_path(
>> + &snapshot.relative_path(),
>> + MANIFEST_BLOB_NAME.as_ref(),
>> + )
>> + .context("invalid manifest object key")?;
>> +
>> + let data = hyper::body::Bytes::from(manifest_data);
>> + let _is_duplicate = s3_client
>> + .upload_with_retry(object_key, data, true)
>> + .await
>> + .context("failed to upload manifest to s3 backend")?;
>> + }
>>
>> if !client_log_name.exists() {
>> reader.try_download_client_log(&client_log_name).await?;
>> + if client_log_name.exists() {
>> + if let DatastoreBackend::S3(s3_client) = snapshot.datastore().backend()? {
>> + let object_key = pbs_datastore::s3::object_key_from_path(
>> + &snapshot.relative_path(),
>> + CLIENT_LOG_BLOB_NAME.as_ref(),
>> + )
>> + .context("invalid archive object key")?;
>> +
>> + let log_file = tokio::fs::File::open(&client_log_name).await?;
>> + let mut reader = tokio::io::BufReader::new(log_file);
>> + let mut contents = Vec::new();
>> + reader.read_to_end(&mut contents).await?;
>
> You can use tokio::fs::read(...) here
same as above :)
>
>> +
>> + let data = hyper::body::Bytes::from(contents);
>> + let _is_duplicate = s3_client
>> + .upload_with_retry(object_key, data, true)
>> + .await
>> + .context("failed to upload client log to s3 backend")?;
>> + }
>> + }
>> };
>> snapshot
>> .cleanup_unreferenced_files(&manifest)
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 22/45] datastore: implement garbage collection for s3 backend
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 22/45] datastore: implement garbage collection for s3 backend Christian Ebner
@ 2025-07-18 9:47 ` Lukas Wagner
2025-07-18 14:31 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 9:47 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> Implements the garbage collection for datastores backed by an s3
> object store.
> Take advantage of the local datastore by placing marker files in the
> chunk store during phase 1 of the garbage collection, updating their
> atime if already present.
> This allows us to avoid making expensive API calls to update object
> metadata, which would only be possible via a copy object operation.
>
> The phase 2 is implemented by fetching a list of all the chunks via
> the ListObjectsV2 API call, filtered by the chunk folder prefix.
> This operation has to be performed in batches of 1000 objects, given
> by the APIs response limits.
> For each object key, lookup the marker file and decide based on the
> marker existence and its atime if the chunk object needs to be
> removed. Deletion happens via the delete objects operation, allowing
> to delete multiple chunks by a single request.
>
> This allows to efficiently lookup chunks which are not in use
> anymore while being performant and cost effective.
>
> Baseline runtime performance tests:
> -----------------------------------
>
> 3 garbage collection runs were performed with hot filesystem caches
> (warmed up by an additional GC run before the test runs). The PBS
> instance was virtualized, with the same virtualized ZFS-backed disk
> used for all the local cache stores:
>
> All datastores contained the same encrypted data, with the following
> content statistics:
> Original data usage: 269.685 GiB
> On-Disk usage: 9.018 GiB (3.34%)
> On-Disk chunks: 6477
> Deduplication factor: 29.90
> Average chunk size: 1.426 MiB
>
> The results demonstrate the overhead caused by the additional
> ListObjectsV2 API calls and their processing, with the magnitude
> depending on the object store backend.
>
> Average garbage collection runtime:
> Local datastore: (2.04 ± 0.01) s
> Local RADOS gateway (Squid): (3.05 ± 0.01) s
> AWS S3: (3.05 ± 0.01) s
> Cloudflare R2: (6.71 ± 0.58) s
>
> After pruning of all datastore contents (therefore including
> DeleteObjects requests):
> Local datastore: 3.04 s
> Local RADOS gateway (Squid): 14.08 s
> AWS S3: 13.06 s
> Cloudflare R2: 78.21 s
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> pbs-datastore/src/chunk_store.rs | 4 +
> pbs-datastore/src/datastore.rs | 211 +++++++++++++++++++++++++++----
> 2 files changed, 190 insertions(+), 25 deletions(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 8c195df54..95f00e8d5 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -353,6 +353,10 @@ impl ChunkStore {
> ProcessLocker::oldest_shared_lock(self.locker.clone().unwrap())
> }
>
> + pub fn mutex(&self) -> &std::sync::Mutex<()> {
> + &self.mutex
> + }
> +
> pub fn sweep_unused_chunks(
> &self,
> oldest_writer: i64,
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index ca099c1d0..6cc7fdbaa 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -4,7 +4,7 @@ use std::os::unix::ffi::OsStrExt;
> use std::os::unix::io::AsRawFd;
> use std::path::{Path, PathBuf};
> use std::sync::{Arc, LazyLock, Mutex};
> -use std::time::Duration;
> +use std::time::{Duration, SystemTime};
>
> use anyhow::{bail, format_err, Context, Error};
> use http_body_util::BodyExt;
> @@ -1209,6 +1209,7 @@ impl DataStore {
> chunk_lru_cache: &mut Option<LruCache<[u8; 32], ()>>,
> status: &mut GarbageCollectionStatus,
> worker: &dyn WorkerTaskContext,
> + s3_client: Option<Arc<S3Client>>,
> ) -> Result<(), Error> {
> status.index_file_count += 1;
> status.index_data_bytes += index.index_bytes();
> @@ -1225,21 +1226,41 @@ impl DataStore {
> }
> }
>
> - if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
> - let hex = hex::encode(digest);
> - warn!(
> - "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
> - );
> -
> - // touch any corresponding .bad files to keep them around, meaning if a chunk is
> - // rewritten correctly they will be removed automatically, as well as if no index
> - // file requires the chunk anymore (won't get to this loop then)
> - for i in 0..=9 {
> - let bad_ext = format!("{}.bad", i);
> - let mut bad_path = PathBuf::new();
> - bad_path.push(self.chunk_path(digest).0);
> - bad_path.set_extension(bad_ext);
> - self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
> + match s3_client {
> + None => {
> + // Filesystem backend
> + if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
> + let hex = hex::encode(digest);
> + warn!(
> + "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
> + );
> +
> + // touch any corresponding .bad files to keep them around, meaning if a chunk is
> + // rewritten correctly they will be removed automatically, as well as if no index
> + // file requires the chunk anymore (won't get to this loop then)
> + for i in 0..=9 {
> + let bad_ext = format!("{}.bad", i);
> + let mut bad_path = PathBuf::new();
> + bad_path.push(self.chunk_path(digest).0);
> + bad_path.set_extension(bad_ext);
> + self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
> + }
> + }
> + }
> + Some(ref _s3_client) => {
> + // Update atime on local cache marker files.
> + if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
> + let (chunk_path, _digest) = self.chunk_path(digest);
> + // Insert empty file as marker to tell GC phase2 that this is
> + // a chunk still in-use, so to keep in the S3 object store.
> + std::fs::File::options()
> + .write(true)
> + .create_new(true)
> + .open(&chunk_path)
> + .with_context(|| {
> + format!("failed to create marker for chunk {}", hex::encode(digest))
> + })?;
> + }
> }
> }
> }
> @@ -1251,6 +1272,7 @@ impl DataStore {
> status: &mut GarbageCollectionStatus,
> worker: &dyn WorkerTaskContext,
> cache_capacity: usize,
> + s3_client: Option<Arc<S3Client>>,
> ) -> Result<(), Error> {
> // Iterate twice over the datastore to fetch index files, even if this comes with an
> // additional runtime cost:
> @@ -1344,6 +1366,7 @@ impl DataStore {
> &mut chunk_lru_cache,
> status,
> worker,
> + s3_client.as_ref().cloned(),
> )?;
>
> if !unprocessed_index_list.remove(&path) {
> @@ -1378,7 +1401,14 @@ impl DataStore {
> continue;
> }
> };
> - self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?;
> + self.index_mark_used_chunks(
> + index,
> + &path,
> + &mut chunk_lru_cache,
> + status,
> + worker,
> + s3_client.as_ref().cloned(),
> + )?;
> warn!("Marked chunks for unexpected index file at '{path:?}'");
> }
> if strange_paths_count > 0 {
> @@ -1476,18 +1506,149 @@ impl DataStore {
> 1024 * 1024
> };
>
> - info!("Start GC phase1 (mark used chunks)");
> + let s3_client = match self.backend()? {
> + DatastoreBackend::Filesystem => None,
> + DatastoreBackend::S3(s3_client) => {
> + proxmox_async::runtime::block_on(s3_client.head_bucket())
> + .context("failed to reach bucket")?;
> + Some(s3_client)
> + }
> + };
>
> - self.mark_used_chunks(&mut gc_status, worker, gc_cache_capacity)
> - .context("marking used chunks failed")?;
> + info!("Start GC phase1 (mark used chunks)");
>
> - info!("Start GC phase2 (sweep unused chunks)");
> - self.inner.chunk_store.sweep_unused_chunks(
> - oldest_writer,
> - min_atime,
> + self.mark_used_chunks(
> &mut gc_status,
> worker,
> - )?;
> + gc_cache_capacity,
> + s3_client.as_ref().cloned(),
> + )
> + .context("marking used chunks failed")?;
> +
> + info!("Start GC phase2 (sweep unused chunks)");
> +
> + if let Some(ref s3_client) = s3_client {
> + let mut chunk_count = 0;
> + let prefix = S3PathPrefix::Some(".chunks/".to_string());
> + // Operates in batches of 1000 objects max per request
> + let mut list_bucket_result =
> + proxmox_async::runtime::block_on(s3_client.list_objects_v2(&prefix, None))
> + .context("failed to list chunk in s3 object store")?;
> +
> + let mut delete_list = Vec::with_capacity(1000);
> + loop {
> + let lock = self.inner.chunk_store.mutex().lock().unwrap();
> +
> + for content in list_bucket_result.contents {
> + // Check object is actually a chunk
> + let digest = match Path::new::<str>(&content.key).file_name() {
> + Some(file_name) => file_name,
> + // should never be the case as objects will have a filename
> + None => continue,
> + };
> + let bytes = digest.as_bytes();
> + if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
> + continue;
> + }
> + if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
> + continue;
> + }
> +
> + let bad = bytes.ends_with(b".bad");
> +
> + // Safe since contains valid ascii hexdigits only as checked above.
> + let digest_str = digest.to_string_lossy();
> + let hexdigit_prefix = unsafe { digest_str.get_unchecked(0..4) };
> + let mut chunk_path = self.base_path();
> + chunk_path.push(".chunks");
> + chunk_path.push(hexdigit_prefix);
> + chunk_path.push(digest);
> +
> + // Check local markers (created or atime updated during phase1) and
> + // keep or delete chunk based on that.
> + let atime = match std::fs::metadata(chunk_path) {
> + Ok(stat) => stat.accessed()?,
> + Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
> + // File not found, delete by setting atime to unix epoch
> + info!("Not found, mark for deletion: {}", content.key);
> + SystemTime::UNIX_EPOCH
> + }
> + Err(err) => return Err(err.into()),
> + };
> + let atime = atime.duration_since(SystemTime::UNIX_EPOCH)?.as_secs() as i64;
> +
> + chunk_count += 1;
> +
> + if atime < min_atime {
> + delete_list.push(content.key);
> + if bad {
> + gc_status.removed_bad += 1;
> + } else {
> + gc_status.removed_chunks += 1;
> + }
> + gc_status.removed_bytes += content.size;
> + } else if atime < oldest_writer {
> + if bad {
> + gc_status.still_bad += 1;
> + } else {
> + gc_status.pending_chunks += 1;
> + }
> + gc_status.pending_bytes += content.size;
> + } else {
> + if !bad {
> + gc_status.disk_chunks += 1;
> + }
> + gc_status.disk_bytes += content.size;
> + }
> + }
> +
> + if !delete_list.is_empty() {
> + let delete_objects_result = proxmox_async::runtime::block_on(
> + s3_client.delete_objects(&delete_list),
> + )?;
> + if let Some(_err) = delete_objects_result.error {
> + bail!("failed to delete some objects");
> + }
> + delete_list.clear();
> + }
> +
> + drop(lock);
> +
> + // Process next batch of chunks if there is more
> + if list_bucket_result.is_truncated {
> + list_bucket_result =
> + proxmox_async::runtime::block_on(s3_client.list_objects_v2(
> + &prefix,
> + list_bucket_result.next_continuation_token.as_deref(),
> + ))?;
> + continue;
> + }
> +
> + break;
> + }
> + info!("processed {chunk_count} total chunks");
> +
> + // Phase 2 GC of Filesystem backed storage is phase 3 for S3 backed GC
> + info!("Start GC phase3 (sweep unused chunk markers)");
> +
> + let mut tmp_gc_status = GarbageCollectionStatus {
> + upid: Some(upid.to_string()),
> + ..Default::default()
> + };
> + self.inner.chunk_store.sweep_unused_chunks(
> + oldest_writer,
> + min_atime,
> + &mut tmp_gc_status,
> + worker,
> + )?;
> + } else {
> + self.inner.chunk_store.sweep_unused_chunks(
> + oldest_writer,
> + min_atime,
> + &mut gc_status,
> + worker,
> + )?;
> + }
I found this big chunk of new code quite hard to follow.
I guess everything between the `loop` start and the `if list_bucket_result.is_truncated` could
maybe be separated out into some `process_objects` (todo: find better name) function. IMO
a good indicator is also the scope where you hold the lock.
Within this block, it might also make sense to split it further, e.g.
- check_if_chunk
- get_local_chunk_path
- get_local_chunk_atime
- ...
(there might be better ways to separate or name things, but you get the idea)
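Just to illustrate, one of the smaller helpers could look roughly like this
(name tentative; relies on the OsStrExt import already present in the module):

    use std::os::unix::ffi::OsStrExt;
    use std::path::Path;

    /// Check whether an S3 object key refers to a chunk or a `.bad` chunk copy.
    fn is_chunk_object_key(key: &str) -> bool {
        let bytes = match Path::new(key).file_name() {
            Some(name) => name.as_bytes(),
            None => return false,
        };
        (bytes.len() == 64 || bytes.len() == 64 + ".0.bad".len())
            && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
    }

plus similar helpers for resolving the local marker path and its atime.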
>
> info!(
> "Removed garbage: {}",
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 23/45] ui: add datastore type selector and reorganize component layout
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 23/45] ui: add datastore type selector and reorganize component layout Christian Ebner
@ 2025-07-18 9:55 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 9:55 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> In preparation for adding the S3 backed datastore variant to the edit
> window, introduce a datastore type selector in order to distinguish
> between creation of regular and removable datastores, instead of
> using the checkbox as is currently the case.
>
> This allows to more easily expand for further datastore type variants
> while keeping the datastore edit window compact.
>
> Since selecting the type is one of the first steps during datastore
> creation, position the component right below the datastore name field
> and re-organize the components related to the removable datastore
> creation, while keeping additional required components for the S3
> backed datastore creation in mind.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> www/window/DataStoreEdit.js | 78 +++++++++++++++++++++----------------
> 1 file changed, 45 insertions(+), 33 deletions(-)
>
> diff --git a/www/window/DataStoreEdit.js b/www/window/DataStoreEdit.js
> index 372984e37..cd94f0335 100644
> --- a/www/window/DataStoreEdit.js
> +++ b/www/window/DataStoreEdit.js
> @@ -52,6 +52,41 @@ Ext.define('PBS.DataStoreEdit', {
> allowBlank: false,
> fieldLabel: gettext('Name'),
> },
> + {
> + xtype: 'proxmoxKVComboBox',
> + name: 'datastore-type',
> + fieldLabel: gettext('Datastore Type'),
> + value: '__default__',
> + submitValue: false,
> + comboItems: [
> + ['__default__', 'Local'],
> + ['removable', 'Removable'],
These should use gettext
> + ],
> + cbind: {
> + disabled: '{!isCreate}',
> + },
> + listeners: {
> + change: function (checkbox, selected) {
> + let isRemovable = selected === 'removable';
> +
> + let inputPanel = checkbox.up('inputpanel');
> + let pathField = inputPanel.down('[name=path]');
> + let uuidEditField = inputPanel.down('[name=backing-device]');
> +
> + uuidEditField.setDisabled(!isRemovable);
> + uuidEditField.allowBlank = !isRemovable;
> + uuidEditField.setValue('');
> +
> + if (isRemovable) {
> + pathField.setFieldLabel(gettext('Path on Device'));
> + pathField.setEmptyText(gettext('A relative path'));
> + } else {
> + pathField.setFieldLabel(gettext('Backing Path'));
> + pathField.setEmptyText(gettext('An absolute path'));
> + }
> + },
> + },
> + },
I think this could be transformed into a viewModel, bound properties, formulas, etc. while at it, but since this
was pre-existing code, it's not too bad IMO.
> {
> xtype: 'pmxDisplayEditField',
> cbind: {
> @@ -63,17 +98,6 @@ Ext.define('PBS.DataStoreEdit', {
> emptyText: gettext('An absolute path'),
> validator: (val) => val?.trim() !== '/',
> },
> - {
> - xtype: 'pbsPartitionSelector',
> - fieldLabel: gettext('Device'),
> - name: 'backing-device',
> - disabled: true,
> - allowBlank: true,
> - cbind: {
> - editable: '{isCreate}',
> - },
> - emptyText: gettext('Device path'),
> - },
> ],
> column2: [
> {
> @@ -97,31 +121,19 @@ Ext.define('PBS.DataStoreEdit', {
> value: '{scheduleValue}',
> },
> },
> - ],
> - columnB: [
> {
> - xtype: 'checkbox',
> - boxLabel: gettext('Removable datastore'),
> - submitValue: false,
> - listeners: {
> - change: function (checkbox, isRemovable) {
> - let inputPanel = checkbox.up('inputpanel');
> - let pathField = inputPanel.down('[name=path]');
> - let uuidEditField = inputPanel.down('[name=backing-device]');
> -
> - uuidEditField.setDisabled(!isRemovable);
> - uuidEditField.allowBlank = !isRemovable;
> - uuidEditField.setValue('');
> - if (isRemovable) {
> - pathField.setFieldLabel(gettext('Path on Device'));
> - pathField.setEmptyText(gettext('A relative path'));
> - } else {
> - pathField.setFieldLabel(gettext('Backing Path'));
> - pathField.setEmptyText(gettext('An absolute path'));
> - }
> - },
> + xtype: 'pbsPartitionSelector',
> + fieldLabel: gettext('Device'),
> + name: 'backing-device',
> + disabled: true,
> + allowBlank: true,
> + cbind: {
> + editable: '{isCreate}',
> },
> + emptyText: gettext('Device path'),
> },
> + ],
> + columnB: [
> {
> xtype: 'textfield',
> name: 'comment',
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend
2025-07-18 8:38 ` Lukas Wagner
@ 2025-07-18 9:58 ` Christian Ebner
2025-07-18 10:03 ` Lukas Wagner
0 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 9:58 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 10:38 AM, Lukas Wagner wrote:
> One comment inline, but nothing prohibitive of a R-b:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Read the chunk based on the datastores backend, reading from local
>> filesystem or fetching from S3 object store.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> src/api2/reader/environment.rs | 12 ++++++----
>> src/api2/reader/mod.rs | 41 +++++++++++++++++++++++-----------
>> 2 files changed, 36 insertions(+), 17 deletions(-)
>>
>> diff --git a/src/api2/reader/environment.rs b/src/api2/reader/environment.rs
>> index 3b2f06f43..8924352b0 100644
>> --- a/src/api2/reader/environment.rs
>> +++ b/src/api2/reader/environment.rs
>> @@ -1,13 +1,14 @@
>> use std::collections::HashSet;
>> use std::sync::{Arc, RwLock};
>>
>> +use anyhow::Error;
>> use serde_json::{json, Value};
>>
>> use proxmox_router::{RpcEnvironment, RpcEnvironmentType};
>>
>> use pbs_api_types::Authid;
>> use pbs_datastore::backup_info::BackupDir;
>> -use pbs_datastore::DataStore;
>> +use pbs_datastore::{DataStore, DatastoreBackend};
>> use proxmox_rest_server::formatter::*;
>> use proxmox_rest_server::WorkerTask;
>> use tracing::info;
>> @@ -23,6 +24,7 @@ pub struct ReaderEnvironment {
>> pub worker: Arc<WorkerTask>,
>> pub datastore: Arc<DataStore>,
>> pub backup_dir: BackupDir,
>> + pub backend: DatastoreBackend,
>> allowed_chunks: Arc<RwLock<HashSet<[u8; 32]>>>,
>> }
>>
>> @@ -33,8 +35,9 @@ impl ReaderEnvironment {
>> worker: Arc<WorkerTask>,
>> datastore: Arc<DataStore>,
>> backup_dir: BackupDir,
>> - ) -> Self {
>> - Self {
>> + ) -> Result<Self, Error> {
>> + let backend = datastore.backend()?;
>> + Ok(Self {
>> result_attributes: json!({}),
>> env_type,
>> auth_id,
>> @@ -43,8 +46,9 @@ impl ReaderEnvironment {
>> debug: tracing::enabled!(tracing::Level::DEBUG),
>> formatter: JSON_FORMATTER,
>> backup_dir,
>> + backend,
>> allowed_chunks: Arc::new(RwLock::new(HashSet::new())),
>> - }
>> + })
>> }
>>
>> pub fn log<S: AsRef<str>>(&self, msg: S) {
>> diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs
>> index a77216043..997d9ca77 100644
>> --- a/src/api2/reader/mod.rs
>> +++ b/src/api2/reader/mod.rs
>> @@ -3,6 +3,7 @@
>> use anyhow::{bail, format_err, Context, Error};
>> use futures::*;
>> use hex::FromHex;
>> +use http_body_util::BodyExt;
>> use hyper::body::Incoming;
>> use hyper::header::{self, HeaderValue, CONNECTION, UPGRADE};
>> use hyper::http::request::Parts;
>> @@ -27,8 +28,9 @@ use pbs_api_types::{
>> };
>> use pbs_config::CachedUserInfo;
>> use pbs_datastore::index::IndexFile;
>> -use pbs_datastore::{DataStore, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
>> +use pbs_datastore::{DataStore, DatastoreBackend, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
>> use pbs_tools::json::required_string_param;
>> +use proxmox_s3_client::S3Client;
>>
>> use crate::api2::backup::optional_ns_param;
>> use crate::api2::helpers;
>> @@ -162,7 +164,7 @@ fn upgrade_to_backup_reader_protocol(
>> worker.clone(),
>> datastore,
>> backup_dir,
>> - );
>> + )?;
>>
>> env.debug = debug;
>>
>> @@ -323,17 +325,10 @@ fn download_chunk(
>> ));
>> }
>>
>> - let (path, _) = env.datastore.chunk_path(&digest);
>> - let path2 = path.clone();
>> -
>> - env.debug(format!("download chunk {:?}", path));
>> -
>> - let data =
>> - proxmox_async::runtime::block_in_place(|| std::fs::read(path)).map_err(move |err| {
>> - http_err!(BAD_REQUEST, "reading file {:?} failed: {}", path2, err)
>> - })?;
>> -
>> - let body = Body::from(data);
>> + let body = match &env.backend {
>> + DatastoreBackend::Filesystem => load_from_filesystem(env, &digest)?,
>> + DatastoreBackend::S3(s3_client) => fetch_from_object_store(s3_client, &digest).await?,
>> + };
>>
>> // fixme: set other headers ?
>> Ok(Response::builder()
>> @@ -345,6 +340,26 @@ fn download_chunk(
>> .boxed()
>> }
>>
>> +async fn fetch_from_object_store(s3_client: &S3Client, digest: &[u8; 32]) -> Result<Body, Error> {
>> + let object_key = pbs_datastore::s3::object_key_from_digest(digest)?;
>> + if let Some(response) = s3_client.get_object(object_key).await? {
>
> ^ Do we maybe want some kind of retry-logic for retrieving objects as well? Disregard
> in case you implement it in a later patch, I'm reviewing this series patch by patch.
While a retry might be of interest in case of intermittent issues, for
the time being I would like to refrain from doing so for the reasons
stated in my reply to proxmox-backup patch 0004. If the need for this
truly arises, adding this later on should be rather simple. If you
already see this as an issue now, I can of course add the retry logic
right away.
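For illustration, a bounded retry around the get could look roughly
like this (attempt count and backoff delay are made-up values, not
part of this series):

    let mut attempt = 0u64;
    let response = loop {
        match s3_client.get_object(object_key.clone()).await {
            Ok(response) => break response,
            Err(_) if attempt < 3 => {
                // transient failure: back off briefly before retrying
                attempt += 1;
                tokio::time::sleep(std::time::Duration::from_millis(100 * attempt)).await;
            }
            Err(err) => return Err(err),
        }
    };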
>
>> + let data = response.content.collect().await?.to_bytes();
>> + return Ok(Body::from(data));
>> + }
>> + bail!("cannot find chunk with digest {}", hex::encode(digest));
>> +}
>> +
>> +fn load_from_filesystem(env: &ReaderEnvironment, digest: &[u8; 32]) -> Result<Body, Error> {
>> + let (path, _) = env.datastore.chunk_path(digest);
>> + let path2 = path.clone();
>> +
>> + env.debug(format!("download chunk {path:?}"));
>> +
>> + let data = proxmox_async::runtime::block_in_place(|| std::fs::read(path))
>> + .map_err(move |err| http_err!(BAD_REQUEST, "reading file {path2:?} failed: {err}"))?;
>> + Ok(Body::from(data))
>> +}
>> +
>> /* this is too slow
>> fn download_chunk_old(
>> _parts: Parts,
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup Christian Ebner
@ 2025-07-18 10:02 ` Lukas Wagner
2025-07-19 12:28 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 10:02 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> In order to be able to create datastores with an s3 object store
> backend, implement an s3 client selector and expose it in the
> datastore edit window, together with the additional bucket name field
> to associate with the datastore's s3 backend.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - use field endpoint instead of host, fixing the selector listing
>
> www/Makefile | 1 +
> www/form/S3ClientSelector.js | 33 +++++++++++++++++++++++++++
> www/window/DataStoreEdit.js | 44 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 78 insertions(+)
> create mode 100644 www/form/S3ClientSelector.js
>
> diff --git a/www/Makefile b/www/Makefile
> index 767713c75..410e9f3e0 100644
> --- a/www/Makefile
> +++ b/www/Makefile
> @@ -42,6 +42,7 @@ JSSRC= \
> Schema.js \
> form/TokenSelector.js \
> form/AuthidSelector.js \
> + form/S3ClientSelector.js \
> form/RemoteSelector.js \
> form/RemoteTargetSelector.js \
> form/DataStoreSelector.js \
> diff --git a/www/form/S3ClientSelector.js b/www/form/S3ClientSelector.js
> new file mode 100644
> index 000000000..243484909
> --- /dev/null
> +++ b/www/form/S3ClientSelector.js
> @@ -0,0 +1,33 @@
> +Ext.define('PBS.form.S3ClientSelector', {
> + extend: 'Proxmox.form.ComboGrid',
> + alias: 'widget.pbsS3ClientSelector',
> +
> + allowBlank: false,
> + autoSelect: false,
> + valueField: 'id',
> + displayField: 'id',
> +
> + store: {
> + model: 'pmx-s3client',
> + autoLoad: true,
> + sorters: 'id',
> + },
> +
> + listConfig: {
> + columns: [
> + {
> + header: gettext('S3 Client ID'),
> + sortable: true,
> + dataIndex: 'id',
> + renderer: Ext.String.htmlEncode,
> + flex: 1,
> + },
> + {
> + header: gettext('Endpoint'),
> + sortable: true,
> + dataIndex: 'endpoint',
> + flex: 1,
> + },
> + ],
> + },
> +});
> diff --git a/www/window/DataStoreEdit.js b/www/window/DataStoreEdit.js
> index cd94f0335..3379bf773 100644
> --- a/www/window/DataStoreEdit.js
> +++ b/www/window/DataStoreEdit.js
> @@ -61,6 +61,7 @@ Ext.define('PBS.DataStoreEdit', {
> comboItems: [
> ['__default__', 'Local'],
> ['removable', 'Removable'],
> + ['s3', 'S3 (experimental)'],
Missing gettext here as well
> ],
> cbind: {
> disabled: '{!isCreate}',
> @@ -68,18 +69,32 @@ Ext.define('PBS.DataStoreEdit', {
> listeners: {
> change: function (checkbox, selected) {
> let isRemovable = selected === 'removable';
> + let isS3 = selected === 's3';
>
> let inputPanel = checkbox.up('inputpanel');
> let pathField = inputPanel.down('[name=path]');
> let uuidEditField = inputPanel.down('[name=backing-device]');
> + let bucketField = inputPanel.down('[name=bucket]');
> + let s3ClientSelector = inputPanel.down('[name=s3client]');
>
> uuidEditField.setDisabled(!isRemovable);
> uuidEditField.allowBlank = !isRemovable;
> uuidEditField.setValue('');
>
> + bucketField.setDisabled(!isS3);
> + bucketField.allowBlank = !isS3;
> + bucketField.setValue('');
> +
> + s3ClientSelector.setDisabled(!isS3);
> + s3ClientSelector.allowBlank = !isS3;
> + s3ClientSelector.setValue('');
> +
> if (isRemovable) {
> pathField.setFieldLabel(gettext('Path on Device'));
> pathField.setEmptyText(gettext('A relative path'));
> + } else if (isS3) {
> + pathField.setFieldLabel(gettext('Store Cache'));
> + pathField.setEmptyText(gettext('An absolute path'));
> } else {
> pathField.setFieldLabel(gettext('Backing Path'));
> pathField.setEmptyText(gettext('An absolute path'));
Yup, with these additional changes I'd definitely prefer the viewModel approach mentioned earlier :)
> @@ -98,6 +113,15 @@ Ext.define('PBS.DataStoreEdit', {
> emptyText: gettext('An absolute path'),
> validator: (val) => val?.trim() !== '/',
> },
> + {
> + xtype: 'pbsS3ClientSelector',
> + name: 's3client',
> + fieldLabel: gettext('S3 Client ID'),
> + disabled: true,
> + cbind: {
> + editable: '{isCreate}',
> + },
> + },
> ],
> column2: [
> {
> @@ -132,6 +156,13 @@ Ext.define('PBS.DataStoreEdit', {
> },
> emptyText: gettext('Device path'),
> },
> + {
> + xtype: 'proxmoxtextfield',
> + name: 'bucket',
> + fieldLabel: gettext('Bucket'),
> + allowBlank: false,
> + disabled: true,
> + },
> ],
> columnB: [
> {
> @@ -154,7 +185,20 @@ Ext.define('PBS.DataStoreEdit', {
> if (me.isCreate) {
> // New datastores default to using the notification system
> values['notification-mode'] = 'notification-system';
> +
> + if (values.s3client) {
> + let s3BackendConf = {
> + type: 's3',
> + client: values.s3client,
> + bucket: values.bucket,
> + };
> + values.backend = PBS.Utils.printPropertyString(s3BackendConf);
> + }
> }
> +
> + delete values.s3client;
> + delete values.bucket;
> +
> return values;
> },
> },
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend
2025-07-18 9:58 ` Christian Ebner
@ 2025-07-18 10:03 ` Lukas Wagner
2025-07-18 10:10 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 10:03 UTC (permalink / raw)
To: Christian Ebner, Proxmox Backup Server development discussion
On 2025-07-18 11:58, Christian Ebner wrote:
> On 7/18/25 10:38 AM, Lukas Wagner wrote:
>> One comment inline, but nothing prohibitive of a R-b:
>>
>> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>>
>>
>> On 2025-07-15 14:53, Christian Ebner wrote:
>>> Read the chunk based on the datastores backend, reading from local
>>> filesystem or fetching from S3 object store.
>>>
>>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>>> ---
>>> changes since version 7:
>>> - no changes
>>>
>>> src/api2/reader/environment.rs | 12 ++++++----
>>> src/api2/reader/mod.rs | 41 +++++++++++++++++++++++-----------
>>> 2 files changed, 36 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/src/api2/reader/environment.rs b/src/api2/reader/environment.rs
>>> index 3b2f06f43..8924352b0 100644
>>> --- a/src/api2/reader/environment.rs
>>> +++ b/src/api2/reader/environment.rs
>>> @@ -1,13 +1,14 @@
>>> use std::collections::HashSet;
>>> use std::sync::{Arc, RwLock};
>>> +use anyhow::Error;
>>> use serde_json::{json, Value};
>>> use proxmox_router::{RpcEnvironment, RpcEnvironmentType};
>>> use pbs_api_types::Authid;
>>> use pbs_datastore::backup_info::BackupDir;
>>> -use pbs_datastore::DataStore;
>>> +use pbs_datastore::{DataStore, DatastoreBackend};
>>> use proxmox_rest_server::formatter::*;
>>> use proxmox_rest_server::WorkerTask;
>>> use tracing::info;
>>> @@ -23,6 +24,7 @@ pub struct ReaderEnvironment {
>>> pub worker: Arc<WorkerTask>,
>>> pub datastore: Arc<DataStore>,
>>> pub backup_dir: BackupDir,
>>> + pub backend: DatastoreBackend,
>>> allowed_chunks: Arc<RwLock<HashSet<[u8; 32]>>>,
>>> }
>>> @@ -33,8 +35,9 @@ impl ReaderEnvironment {
>>> worker: Arc<WorkerTask>,
>>> datastore: Arc<DataStore>,
>>> backup_dir: BackupDir,
>>> - ) -> Self {
>>> - Self {
>>> + ) -> Result<Self, Error> {
>>> + let backend = datastore.backend()?;
>>> + Ok(Self {
>>> result_attributes: json!({}),
>>> env_type,
>>> auth_id,
>>> @@ -43,8 +46,9 @@ impl ReaderEnvironment {
>>> debug: tracing::enabled!(tracing::Level::DEBUG),
>>> formatter: JSON_FORMATTER,
>>> backup_dir,
>>> + backend,
>>> allowed_chunks: Arc::new(RwLock::new(HashSet::new())),
>>> - }
>>> + })
>>> }
>>> pub fn log<S: AsRef<str>>(&self, msg: S) {
>>> diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs
>>> index a77216043..997d9ca77 100644
>>> --- a/src/api2/reader/mod.rs
>>> +++ b/src/api2/reader/mod.rs
>>> @@ -3,6 +3,7 @@
>>> use anyhow::{bail, format_err, Context, Error};
>>> use futures::*;
>>> use hex::FromHex;
>>> +use http_body_util::BodyExt;
>>> use hyper::body::Incoming;
>>> use hyper::header::{self, HeaderValue, CONNECTION, UPGRADE};
>>> use hyper::http::request::Parts;
>>> @@ -27,8 +28,9 @@ use pbs_api_types::{
>>> };
>>> use pbs_config::CachedUserInfo;
>>> use pbs_datastore::index::IndexFile;
>>> -use pbs_datastore::{DataStore, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
>>> +use pbs_datastore::{DataStore, DatastoreBackend, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
>>> use pbs_tools::json::required_string_param;
>>> +use proxmox_s3_client::S3Client;
>>> use crate::api2::backup::optional_ns_param;
>>> use crate::api2::helpers;
>>> @@ -162,7 +164,7 @@ fn upgrade_to_backup_reader_protocol(
>>> worker.clone(),
>>> datastore,
>>> backup_dir,
>>> - );
>>> + )?;
>>> env.debug = debug;
>>> @@ -323,17 +325,10 @@ fn download_chunk(
>>> ));
>>> }
>>> - let (path, _) = env.datastore.chunk_path(&digest);
>>> - let path2 = path.clone();
>>> -
>>> - env.debug(format!("download chunk {:?}", path));
>>> -
>>> - let data =
>>> - proxmox_async::runtime::block_in_place(|| std::fs::read(path)).map_err(move |err| {
>>> - http_err!(BAD_REQUEST, "reading file {:?} failed: {}", path2, err)
>>> - })?;
>>> -
>>> - let body = Body::from(data);
>>> + let body = match &env.backend {
>>> + DatastoreBackend::Filesystem => load_from_filesystem(env, &digest)?,
>>> + DatastoreBackend::S3(s3_client) => fetch_from_object_store(s3_client, &digest).await?,
>>> + };
>>> // fixme: set other headers ?
>>> Ok(Response::builder()
>>> @@ -345,6 +340,26 @@ fn download_chunk(
>>> .boxed()
>>> }
>>> +async fn fetch_from_object_store(s3_client: &S3Client, digest: &[u8; 32]) -> Result<Body, Error> {
>>> + let object_key = pbs_datastore::s3::object_key_from_digest(digest)?;
>>> + if let Some(response) = s3_client.get_object(object_key).await? {
>>
>> ^ Do we maybe want some kind of retry-logic for retrieving objects as well? Disregard
>> in case you implement it in a later patch, I'm reviewing this series patch by patch.
>
> While a retry might be of interest in case of intermittent issues, for the time being I would like to refrain from doing so for the reasons stated in my reply to proxmox-backup patch 0004. If the need for this truly arises, adding this later on should be rather simple. If you already see this as an issue now, I can of course add the retry logic right away.
No, I'm fine with revisiting this later, e.g. after a potential rollout where we have some initial user feedback. It's still
experimental after all :)
>
>>
>>> + let data = response.content.collect().await?.to_bytes();
>>> + return Ok(Body::from(data));
>>> + }
>>> + bail!("cannot find chunk with digest {}", hex::encode(digest));
>>> +}
>>> +
>>> +fn load_from_filesystem(env: &ReaderEnvironment, digest: &[u8; 32]) -> Result<Body, Error> {
>>> + let (path, _) = env.datastore.chunk_path(digest);
>>> + let path2 = path.clone();
>>> +
>>> + env.debug(format!("download chunk {path:?}"));
>>> +
>>> + let data = proxmox_async::runtime::block_in_place(|| std::fs::read(path))
>>> + .map_err(move |err| http_err!(BAD_REQUEST, "reading file {path2:?} failed: {err}"))?;
>>> + Ok(Body::from(data))
>>> +}
>>> +
>>> /* this is too slow
>>> fn download_chunk_old(
>>> _parts: Parts,
>>
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend
2025-07-18 10:03 ` Lukas Wagner
@ 2025-07-18 10:10 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 10:10 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 12:03 PM, Lukas Wagner wrote:
> On 2025-07-18 11:58, Christian Ebner wrote:
>> On 7/18/25 10:38 AM, Lukas Wagner wrote:
>>> One comment inline, but nothing prohibitive of a R-b:
>>>
>>> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>>>
>>>
>>> On 2025-07-15 14:53, Christian Ebner wrote:
>>>> Read the chunk based on the datastores backend, reading from local
>>>> filesystem or fetching from S3 object store.
>>>>
>>>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>>>> ---
>>>> changes since version 7:
>>>> - no changes
>>>>
>>>> src/api2/reader/environment.rs | 12 ++++++----
>>>> src/api2/reader/mod.rs | 41 +++++++++++++++++++++++-----------
>>>> 2 files changed, 36 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/src/api2/reader/environment.rs b/src/api2/reader/environment.rs
>>>> index 3b2f06f43..8924352b0 100644
>>>> --- a/src/api2/reader/environment.rs
>>>> +++ b/src/api2/reader/environment.rs
>>>> @@ -1,13 +1,14 @@
>>>> use std::collections::HashSet;
>>>> use std::sync::{Arc, RwLock};
>>>> +use anyhow::Error;
>>>> use serde_json::{json, Value};
>>>> use proxmox_router::{RpcEnvironment, RpcEnvironmentType};
>>>> use pbs_api_types::Authid;
>>>> use pbs_datastore::backup_info::BackupDir;
>>>> -use pbs_datastore::DataStore;
>>>> +use pbs_datastore::{DataStore, DatastoreBackend};
>>>> use proxmox_rest_server::formatter::*;
>>>> use proxmox_rest_server::WorkerTask;
>>>> use tracing::info;
>>>> @@ -23,6 +24,7 @@ pub struct ReaderEnvironment {
>>>> pub worker: Arc<WorkerTask>,
>>>> pub datastore: Arc<DataStore>,
>>>> pub backup_dir: BackupDir,
>>>> + pub backend: DatastoreBackend,
>>>> allowed_chunks: Arc<RwLock<HashSet<[u8; 32]>>>,
>>>> }
>>>> @@ -33,8 +35,9 @@ impl ReaderEnvironment {
>>>> worker: Arc<WorkerTask>,
>>>> datastore: Arc<DataStore>,
>>>> backup_dir: BackupDir,
>>>> - ) -> Self {
>>>> - Self {
>>>> + ) -> Result<Self, Error> {
>>>> + let backend = datastore.backend()?;
>>>> + Ok(Self {
>>>> result_attributes: json!({}),
>>>> env_type,
>>>> auth_id,
>>>> @@ -43,8 +46,9 @@ impl ReaderEnvironment {
>>>> debug: tracing::enabled!(tracing::Level::DEBUG),
>>>> formatter: JSON_FORMATTER,
>>>> backup_dir,
>>>> + backend,
>>>> allowed_chunks: Arc::new(RwLock::new(HashSet::new())),
>>>> - }
>>>> + })
>>>> }
>>>> pub fn log<S: AsRef<str>>(&self, msg: S) {
>>>> diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs
>>>> index a77216043..997d9ca77 100644
>>>> --- a/src/api2/reader/mod.rs
>>>> +++ b/src/api2/reader/mod.rs
>>>> @@ -3,6 +3,7 @@
>>>> use anyhow::{bail, format_err, Context, Error};
>>>> use futures::*;
>>>> use hex::FromHex;
>>>> +use http_body_util::BodyExt;
>>>> use hyper::body::Incoming;
>>>> use hyper::header::{self, HeaderValue, CONNECTION, UPGRADE};
>>>> use hyper::http::request::Parts;
>>>> @@ -27,8 +28,9 @@ use pbs_api_types::{
>>>> };
>>>> use pbs_config::CachedUserInfo;
>>>> use pbs_datastore::index::IndexFile;
>>>> -use pbs_datastore::{DataStore, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
>>>> +use pbs_datastore::{DataStore, DatastoreBackend, PROXMOX_BACKUP_READER_PROTOCOL_ID_V1};
>>>> use pbs_tools::json::required_string_param;
>>>> +use proxmox_s3_client::S3Client;
>>>> use crate::api2::backup::optional_ns_param;
>>>> use crate::api2::helpers;
>>>> @@ -162,7 +164,7 @@ fn upgrade_to_backup_reader_protocol(
>>>> worker.clone(),
>>>> datastore,
>>>> backup_dir,
>>>> - );
>>>> + )?;
>>>> env.debug = debug;
>>>> @@ -323,17 +325,10 @@ fn download_chunk(
>>>> ));
>>>> }
>>>> - let (path, _) = env.datastore.chunk_path(&digest);
>>>> - let path2 = path.clone();
>>>> -
>>>> - env.debug(format!("download chunk {:?}", path));
>>>> -
>>>> - let data =
>>>> - proxmox_async::runtime::block_in_place(|| std::fs::read(path)).map_err(move |err| {
>>>> - http_err!(BAD_REQUEST, "reading file {:?} failed: {}", path2, err)
>>>> - })?;
>>>> -
>>>> - let body = Body::from(data);
>>>> + let body = match &env.backend {
>>>> + DatastoreBackend::Filesystem => load_from_filesystem(env, &digest)?,
>>>> + DatastoreBackend::S3(s3_client) => fetch_from_object_store(s3_client, &digest).await?,
>>>> + };
>>>> // fixme: set other headers ?
>>>> Ok(Response::builder()
>>>> @@ -345,6 +340,26 @@ fn download_chunk(
>>>> .boxed()
>>>> }
>>>> +async fn fetch_from_object_store(s3_client: &S3Client, digest: &[u8; 32]) -> Result<Body, Error> {
>>>> + let object_key = pbs_datastore::s3::object_key_from_digest(digest)?;
>>>> + if let Some(response) = s3_client.get_object(object_key).await? {
>>>
>>> ^ Do we maybe want some kind of retry-logic for retrieving objects as well? Disregard
>>> in case you implement it in a later patch, I'm reviewing this series patch by patch.
>>
>> While a retry might be of interest in case of intermittent issues, for the time being I would like to refrain from doing so for the reasons stated in my reply to proxmox-backup patch 0004. If the need for this truly arises, adding this later on should be rather simple. If you already see this as an issue now, I can of course add the retry logic right away.
>
> No, I'm fine with revisiting this later, e.g. after a potential rollout where we have some initial user feedback. It's still
> experimental after all :)
Yes, it could well be that this is needed and makes sense; after all, I
added the put retry logic and rate limiting option only after seeing
that this is indeed an issue, when I encountered errors while uploading
too fast to Cloudflare R2 object stores, which it seems are rather
easily overwhelmed.
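For reference, the put path already goes through such a helper, as used
by the upload_chunk patch of this series:

    let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
    let is_duplicate = s3_client
        .upload_with_retry(object_key, data, false)
        .await
        .context("failed to upload chunk to s3 backend")?;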
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 15/45] datastore: local chunk reader: read chunks based on backend
2025-07-18 8:45 ` Lukas Wagner
@ 2025-07-18 10:11 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 10:11 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 10:45 AM, Lukas Wagner wrote:
> With the comments addressed:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Get and store the datastore's backend on local chunk reader
> instantiation and fetch chunks based on the variant from either the
>> filesystem or the s3 object store.
>>
>> By storing the backend variant, the s3 client is instantiated only
>> once and reused until the local chunk reader instance is dropped.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> pbs-datastore/Cargo.toml | 1 +
>> pbs-datastore/src/local_chunk_reader.rs | 38 +++++++++++++++++++++----
>> 2 files changed, 33 insertions(+), 6 deletions(-)
>>
>> diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
>> index 7e56dbd31..8ce930a94 100644
>> --- a/pbs-datastore/Cargo.toml
>> +++ b/pbs-datastore/Cargo.toml
>> @@ -13,6 +13,7 @@ crc32fast.workspace = true
>> endian_trait.workspace = true
>> futures.workspace = true
>> hex = { workspace = true, features = [ "serde" ] }
>> +http-body-util.workspace = true
>> hyper.workspace = true
>> libc.workspace = true
>> log.workspace = true
>> diff --git a/pbs-datastore/src/local_chunk_reader.rs b/pbs-datastore/src/local_chunk_reader.rs
>> index 05a70c068..f5aa217ae 100644
>> --- a/pbs-datastore/src/local_chunk_reader.rs
>> +++ b/pbs-datastore/src/local_chunk_reader.rs
>> @@ -3,17 +3,21 @@ use std::pin::Pin;
>> use std::sync::Arc;
>>
>> use anyhow::{bail, Error};
>> +use http_body_util::BodyExt;
>>
>> use pbs_api_types::CryptMode;
>> use pbs_tools::crypt_config::CryptConfig;
>> +use proxmox_s3_client::S3Client;
>>
>> use crate::data_blob::DataBlob;
>> +use crate::datastore::DatastoreBackend;
>> use crate::read_chunk::{AsyncReadChunk, ReadChunk};
>> use crate::DataStore;
>>
>> #[derive(Clone)]
>> pub struct LocalChunkReader {
>> store: Arc<DataStore>,
>> + backend: DatastoreBackend,
>> crypt_config: Option<Arc<CryptConfig>>,
>> crypt_mode: CryptMode,
>> }
>> @@ -24,8 +28,11 @@ impl LocalChunkReader {
>> crypt_config: Option<Arc<CryptConfig>>,
>> crypt_mode: CryptMode,
>> ) -> Self {
>> + // TODO: Error handling!
>> + let backend = store.backend().unwrap();
>> Self {
>> store,
>> + backend,
>> crypt_config,
>> crypt_mode,
>> }
>> @@ -47,10 +54,26 @@ impl LocalChunkReader {
>> }
>> }
>>
>> +async fn fetch(s3_client: Arc<S3Client>, digest: &[u8; 32]) -> Result<DataBlob, Error> {
>> + let object_key = crate::s3::object_key_from_digest(digest)?;
>> + if let Some(response) = s3_client.get_object(object_key).await? {
>> + let bytes = response.content.collect().await?.to_bytes();
>> + DataBlob::from_raw(bytes.to_vec())
>> + } else {
>> + bail!("no object with digest {}", hex::encode(digest));
>> + }
>> +}
>> +
>> impl ReadChunk for LocalChunkReader {
>> fn read_raw_chunk(&self, digest: &[u8; 32]) -> Result<DataBlob, Error> {
>> - let chunk = self.store.load_chunk(digest)?;
>> + let chunk = match &self.backend {
>> + DatastoreBackend::Filesystem => self.store.load_chunk(digest)?,
>> + DatastoreBackend::S3(s3_client) => {
>> + proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?
>
> rather use Arc::clone(&s3_client) to avoid ambiguity
Addressed this ...
>
>> + }
>> + };
>> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
>> +
>> Ok(chunk)
>> }
>>
>> @@ -69,11 +92,14 @@ impl AsyncReadChunk for LocalChunkReader {
>> digest: &'a [u8; 32],
>> ) -> Pin<Box<dyn Future<Output = Result<DataBlob, Error>> + Send + 'a>> {
>> Box::pin(async move {
>> - let (path, _) = self.store.chunk_path(digest);
>> -
>> - let raw_data = tokio::fs::read(&path).await?;
>> -
>> - let chunk = DataBlob::load_from_reader(&mut &raw_data[..])?;
>> + let chunk = match &self.backend {
>> + DatastoreBackend::Filesystem => {
>> + let (path, _) = self.store.chunk_path(digest);
>> + let raw_data = tokio::fs::read(&path).await?;
>> + DataBlob::load_from_reader(&mut &raw_data[..])?
>> + }
>> + DatastoreBackend::S3(s3_client) => fetch(s3_client.clone(), digest).await?,
>
> rather use Arc::clone(&s3_client) to avoid ambiguity
... and here as well, although in both cases without the reference
>
>> + };
>> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
>>
>> Ok(chunk)
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 30/45] datastore: add local datastore cache for network attached storages
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 30/45] datastore: add local datastore cache for network attached storages Christian Ebner
@ 2025-07-18 11:24 ` Lukas Wagner
2025-07-18 14:59 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 11:24 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
Some rustdoc comments are missing, but otherwise looks fine to me.
As a general remark, applying to this patch but also in general: I think we should put a much
larger focus on writing unit and integration tests for any significant chunk of new code, e.g.
the LocalDatastoreLruCache, and also slowly refactor existing code so that it can be tested.
Naturally, it is additional effort, but IMO it pays off well later. I'd also say that it makes
reviews much easier, since the tests are living proof in the code that it works, and as a reviewer
I also immediately see how the code is supposed to be used. Furthermore, they are a good way
to detect regressions later on, e.g. due to changing third-party dependencies, and of course
also due to changes in the product code itself.
That being said, I won't ask you to write tests for this patch now, since adding them after
the fact is a big pain and might require a big refactor, e.g. to separate out and abstract away
any dependencies on existing code. I just felt the urge to bring this up, since this
is something we can definitely improve on.
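For illustration, a first test at that level could look roughly like
the following; the signatures are mirrored from how this patch uses
AsyncLruCache, so treat them as assumptions rather than the actual API:

    use pbs_tools::async_lru_cache::AsyncLruCache;

    #[test]
    fn lru_cache_evicts_least_recently_used() -> Result<(), anyhow::Error> {
        // capacity of two entries; the callback runs for evicted keys
        let cache: AsyncLruCache<u8, ()> = AsyncLruCache::new(2);
        cache.insert(1, (), |_evicted| Ok(()))?;
        cache.insert(2, (), |_evicted| Ok(()))?;
        // inserting a third key has to evict the least recently used one
        cache.insert(3, (), |evicted| {
            assert_eq!(evicted, 1);
            Ok(())
        })?;
        assert!(!cache.contains(1));
        assert!(cache.contains(2) && cache.contains(3));
        Ok(())
    }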
On 2025-07-15 14:53, Christian Ebner wrote:
> Use a local datastore as cache using an LRU cache replacement policy for
> operations on a datastore backed by a network, e.g. by an S3 object
> store backend. The goal is to reduce the number of requests to the
> backend and thereby save costs (monetary as well as time).
>
> Cached chunks are stored on the local datastore cache, already
> containing the datastore's contents metadata (namespace, group,
> snapshot, owner, index files, etc.), used to perform fast lookups.
> The cache only stores chunk digests, not the raw data itself.
> When payload data is required, contents are looked up and read from
> the local datastore cache filesystem, including fallback to fetch from
> the backend if the presumably cached entry is not found.
>
> The cacher allows fetching cache items on cache misses via the access
> method.
>
> The capacity of the cache is derived from the local datastore cache
> filesystem, or the user-configured value, whichever is smaller.
> The capacity is only set on instantiation of the store, and the current
> value is kept as long as the datastore remains cached in the datastore
> cache. To change the value, the store either has to be set to offline
> mode and back, or the services restarted.
>
> Basic performance tests:
>
> Backup and upload of contents of linux git repository to AWS S3,
> snapshots removed in-between each backup run to avoid other chunk reuse
> optimization of PBS.
>
> no-cache:
> had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.76 s (average 102.258 MiB/s)
> empty-cache:
> had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.42 s (average 102.945 MiB/s)
> all-cached:
> had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 43.78 s (average 118.554 MiB/s)
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - use info instead of warn, as these might end up in the task logs as
> well, possibly causing confusion if warning level
>
> pbs-datastore/src/datastore.rs | 70 ++++++-
> pbs-datastore/src/lib.rs | 3 +
> .../src/local_datastore_lru_cache.rs | 172 ++++++++++++++++++
> 3 files changed, 244 insertions(+), 1 deletion(-)
> create mode 100644 pbs-datastore/src/local_datastore_lru_cache.rs
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 89f45e7f8..cab0f5b4d 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -40,9 +40,10 @@ use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
> use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
> use crate::hierarchy::{ListGroups, ListGroupsType, ListNamespaces, ListNamespacesRecursive};
> use crate::index::IndexFile;
> +use crate::local_datastore_lru_cache::S3Cacher;
> use crate::s3::S3_CONTENT_PREFIX;
> use crate::task_tracking::{self, update_active_operations};
> -use crate::DataBlob;
> +use crate::{DataBlob, LocalDatastoreLruCache};
>
> static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
> LazyLock::new(|| Mutex::new(HashMap::new()));
> @@ -136,6 +137,7 @@ pub struct DataStoreImpl {
> last_digest: Option<[u8; 32]>,
> sync_level: DatastoreFSyncLevel,
> backend_config: DatastoreBackendConfig,
> + lru_store_caching: Option<LocalDatastoreLruCache>,
> }
>
> impl DataStoreImpl {
> @@ -151,6 +153,7 @@ impl DataStoreImpl {
> last_digest: None,
> sync_level: Default::default(),
> backend_config: Default::default(),
> + lru_store_caching: None,
> })
> }
> }
> @@ -255,6 +258,37 @@ impl DataStore {
> Ok(backend_type)
> }
>
> + pub fn cache(&self) -> Option<&LocalDatastoreLruCache> {
> + self.inner.lru_store_caching.as_ref()
> + }
> +
> + /// Check if the digest is present in the local datastore cache.
> + /// Always returns false if there is no cache configured for this datastore.
> + pub fn cache_contains(&self, digest: &[u8; 32]) -> bool {
> + if let Some(cache) = self.inner.lru_store_caching.as_ref() {
> + return cache.contains(digest);
> + }
> + false
> + }
> +
> + /// Insert digest as most recently used in the cache.
> + /// Returns with success if there is no cache configured for this datastore.
> + pub fn cache_insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
> + if let Some(cache) = self.inner.lru_store_caching.as_ref() {
> + return cache.insert(digest, chunk);
> + }
> + Ok(())
> + }
> +
Missing rustdoc comment for this pub fn
> + pub fn cacher(&self) -> Result<Option<S3Cacher>, Error> {
> + self.backend().map(|backend| match backend {
> + DatastoreBackend::S3(s3_client) => {
> + Some(S3Cacher::new(s3_client, self.inner.chunk_store.clone()))
> + }
> + DatastoreBackend::Filesystem => None,
> + })
> + }
> +
> pub fn lookup_datastore(
> name: &str,
> operation: Option<Operation>,
> @@ -437,6 +471,33 @@ impl DataStore {
> .parse_property_string(config.backend.as_deref().unwrap_or(""))?,
> )?;
>
> + let lru_store_caching = if DatastoreBackendType::S3 == backend_config.ty.unwrap_or_default()
> + {
> + let mut cache_capacity = 0;
> + if let Ok(fs_info) = proxmox_sys::fs::fs_info(&chunk_store.base_path()) {
> + cache_capacity = fs_info.available / (16 * 1024 * 1024);
> + }
> + if let Some(max_cache_size) = backend_config.max_cache_size {
> + info!(
> + "Got requested max cache size {max_cache_size} for store {}",
> + config.name
> + );
> + let max_cache_capacity = max_cache_size.as_u64() / (16 * 1024 * 1024);
> + cache_capacity = cache_capacity.min(max_cache_capacity);
> + }
> + let cache_capacity = usize::try_from(cache_capacity).unwrap_or_default();
> +
> + info!(
> + "Using datastore cache with capacity {cache_capacity} for store {}",
> + config.name
> + );
> +
> + let cache = LocalDatastoreLruCache::new(cache_capacity, chunk_store.clone());
> + Some(cache)
> + } else {
> + None
> + };
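(For scale: with the 16 MiB per-slot granularity used above, 100 GiB of
free space on the cache filesystem yields a cache capacity of 6400
entries.)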
> +
> Ok(DataStoreImpl {
> chunk_store,
> gc_mutex: Mutex::new(()),
> @@ -446,6 +507,7 @@ impl DataStore {
> last_digest,
> sync_level: tuning.sync_level.unwrap_or_default(),
> backend_config,
> + lru_store_caching,
> })
> }
>
> @@ -1580,6 +1642,12 @@ impl DataStore {
> chunk_count += 1;
>
> if atime < min_atime {
> + if let Some(cache) = self.cache() {
> + let mut digest_bytes = [0u8; 32];
> + hex::decode_to_slice(digest.as_bytes(), &mut digest_bytes)?;
> + // ignore errors, phase 3 will retry cleanup anyways
> + let _ = cache.remove(&digest_bytes);
> + }
> delete_list.push(content.key);
> if bad {
> gc_status.removed_bad += 1;
> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
> index ca6fdb7d8..b9eb035c2 100644
> --- a/pbs-datastore/src/lib.rs
> +++ b/pbs-datastore/src/lib.rs
> @@ -217,3 +217,6 @@ pub use snapshot_reader::SnapshotReader;
>
> mod local_chunk_reader;
> pub use local_chunk_reader::LocalChunkReader;
> +
> +mod local_datastore_lru_cache;
> +pub use local_datastore_lru_cache::LocalDatastoreLruCache;
> diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
> new file mode 100644
> index 000000000..bb64c52f3
> --- /dev/null
> +++ b/pbs-datastore/src/local_datastore_lru_cache.rs
> @@ -0,0 +1,172 @@
> +//! Use a local datastore as cache for operations on a datastore attached via
> +//! a network layer (e.g. via the S3 backend).
> +
> +use std::future::Future;
> +use std::sync::Arc;
> +
> +use anyhow::{bail, Error};
> +use http_body_util::BodyExt;
> +
> +use pbs_tools::async_lru_cache::{AsyncCacher, AsyncLruCache};
> +use proxmox_s3_client::S3Client;
> +
> +use crate::ChunkStore;
> +use crate::DataBlob;
> +
v missing rustdoc for pub struct
> +#[derive(Clone)]
> +pub struct S3Cacher {
> + client: Arc<S3Client>,
> + store: Arc<ChunkStore>,
> +}
> +
> +impl AsyncCacher<[u8; 32], ()> for S3Cacher {
> + fn fetch(
> + &self,
> + key: [u8; 32],
> + ) -> Box<dyn Future<Output = Result<Option<()>, Error>> + Send + 'static> {
> + let client = self.client.clone();
> + let store = self.store.clone();
rather use Arc::clone(&...) here to avoid ambiguity
> + Box::new(async move {
> + let object_key = crate::s3::object_key_from_digest(&key)?;
> + match client.get_object(object_key).await? {
> + None => bail!("could not fetch object with key {}", hex::encode(key)),
> + Some(response) => {
> + let bytes = response.content.collect().await?.to_bytes();
> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
> + store.insert_chunk(&chunk, &key)?;
> + Ok(Some(()))
> + }
> + }
> + })
> + }
> +}
> +
> +impl S3Cacher {
v missing rustdoc for pub fn
> + pub fn new(client: Arc<S3Client>, store: Arc<ChunkStore>) -> Self {
> + Self { client, store }
> + }
> +}
> +
> +/// LRU cache using local datastore for caching chunks
> +///
> +/// Uses a LRU cache, but without storing the values in-memory but rather
> +/// on the filesystem
> +pub struct LocalDatastoreLruCache {
> + cache: AsyncLruCache<[u8; 32], ()>,
> + store: Arc<ChunkStore>,
> +}
> +
> +impl LocalDatastoreLruCache {
> + pub fn new(capacity: usize, store: Arc<ChunkStore>) -> Self {
> + Self {
> + cache: AsyncLruCache::new(capacity),
> + store,
> + }
> + }
> +
> + /// Insert a new chunk into the local datastore cache.
> + ///
> + /// Fails if the chunk cannot be inserted successfully.
> + pub fn insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
> + self.store.insert_chunk(chunk, digest)?;
> + self.cache.insert(*digest, (), |digest| {
> + let (path, _digest_str) = self.store.chunk_path(&digest);
> + // Truncate to free up space but keep the inode around, since that
> + // is used as marker for chunks in use by garbage collection.
> + if let Err(err) = nix::unistd::truncate(&path, 0) {
> + if err != nix::errno::Errno::ENOENT {
> + return Err(Error::from(err));
> + }
> + }
> + Ok(())
> + })
> + }
> +
> + /// Remove a chunk from the local datastore cache.
> + ///
> + /// Fails if the chunk cannot be deleted successfully.
> + pub fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
> + self.cache.remove(*digest);
> + let (path, _digest_str) = self.store.chunk_path(digest);
> + std::fs::remove_file(path).map_err(Error::from)
> + }
> +
v missing rustdoc
> + pub async fn access(
> + &self,
> + digest: &[u8; 32],
> + cacher: &mut S3Cacher,
> + ) -> Result<Option<DataBlob>, Error> {
> + if self
> + .cache
> + .access(*digest, cacher, |digest| {
> + let (path, _digest_str) = self.store.chunk_path(&digest);
> + // Truncate to free up space but keep the inode around, since that
> + // is used as marker for chunks in use by garbage collection.
> + if let Err(err) = nix::unistd::truncate(&path, 0) {
> + if err != nix::errno::Errno::ENOENT {
> + return Err(Error::from(err));
> + }
> + }
> + Ok(())
> + })
> + .await?
> + .is_some()
> + {
> + let (path, _digest_str) = self.store.chunk_path(digest);
> + let mut file = match std::fs::File::open(&path) {
> + Ok(file) => file,
> + Err(err) => {
> + // Expected chunk to be present since LRU cache has it, but it is missing
> + // locally, try to fetch again
> + if err.kind() == std::io::ErrorKind::NotFound {
> + let object_key = crate::s3::object_key_from_digest(digest)?;
> + match cacher.client.get_object(object_key).await? {
> + None => {
> + bail!("could not fetch object with key {}", hex::encode(digest))
> + }
> + Some(response) => {
> + let bytes = response.content.collect().await?.to_bytes();
> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
> + self.store.insert_chunk(&chunk, digest)?;
> + std::fs::File::open(&path)?
> + }
> + }
> + } else {
> + return Err(Error::from(err));
> + }
> + }
> + };
> + let chunk = match DataBlob::load_from_reader(&mut file) {
> + Ok(chunk) => chunk,
> + Err(err) => {
> + use std::io::Seek;
> + // Check if file is empty marker file, try fetching content if so
> + if file.seek(std::io::SeekFrom::End(0))? == 0 {
> + let object_key = crate::s3::object_key_from_digest(digest)?;
> + match cacher.client.get_object(object_key).await? {
> + None => {
> + bail!("could not fetch object with key {}", hex::encode(digest))
> + }
> + Some(response) => {
> + let bytes = response.content.collect().await?.to_bytes();
> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
> + self.store.insert_chunk(&chunk, digest)?;
> + let mut file = std::fs::File::open(&path)?;
> + DataBlob::load_from_reader(&mut file)?
> + }
> + }
> + } else {
> + return Err(err);
> + }
> + }
> + };
> + Ok(Some(chunk))
> + } else {
> + Ok(None)
> + }
> + }
> +
v missing rustdoc
> + pub fn contains(&self, digest: &[u8; 32]) -> bool {
> + self.cache.contains(*digest)
> + }
> +}
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 33/45] datastore: local chunk reader: get cached chunk from local cache store
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 33/45] datastore: local chunk reader: get cached chunk from local cache store Christian Ebner
@ 2025-07-18 11:36 ` Lukas Wagner
2025-07-18 15:04 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 11:36 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
With my nits addresses:
Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
On 2025-07-15 14:53, Christian Ebner wrote:
> Check if a chunk is contained in the local cache and if so prefer
> fetching it from the cache instead of pulling it via the S3 api. This
> improves performance and reduces the number of requests to the backend.
>
> Basic restore performance tests:
>
> Restored a snapshot containing the linux git repository (on-disk size
> 5.069 GiB, compressed 3.718 GiB) from an AWS S3 backed datastore, with
> and without cached contents:
> non cached: 691.95 s
> all cached: 74.89 s
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> pbs-datastore/src/local_chunk_reader.rs | 31 +++++++++++++++++++++----
> 1 file changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/pbs-datastore/src/local_chunk_reader.rs b/pbs-datastore/src/local_chunk_reader.rs
> index f5aa217ae..7ad44c4fa 100644
> --- a/pbs-datastore/src/local_chunk_reader.rs
> +++ b/pbs-datastore/src/local_chunk_reader.rs
> @@ -2,7 +2,7 @@ use std::future::Future;
> use std::pin::Pin;
> use std::sync::Arc;
>
> -use anyhow::{bail, Error};
> +use anyhow::{bail, format_err, Error};
> use http_body_util::BodyExt;
>
> use pbs_api_types::CryptMode;
> @@ -68,9 +68,18 @@ impl ReadChunk for LocalChunkReader {
> fn read_raw_chunk(&self, digest: &[u8; 32]) -> Result<DataBlob, Error> {
> let chunk = match &self.backend {
> DatastoreBackend::Filesystem => self.store.load_chunk(digest)?,
> - DatastoreBackend::S3(s3_client) => {
> - proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?
better use Arc::clone here :)
> - }
> + DatastoreBackend::S3(s3_client) => match self.store.cache() {
> + None => proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?,
> + Some(cache) => {
> + let mut cacher = self
> + .store
> + .cacher()?
> + .ok_or(format_err!("no cacher for datastore"))?;
> + proxmox_async::runtime::block_on(cache.access(digest, &mut cacher))?.ok_or(
> + format_err!("unable to access chunk with digest {}", hex::encode(digest)),
> + )?
> + }
> + },
> };
> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
>
> @@ -98,7 +107,19 @@ impl AsyncReadChunk for LocalChunkReader {
> let raw_data = tokio::fs::read(&path).await?;
> DataBlob::load_from_reader(&mut &raw_data[..])?
> }
> - DatastoreBackend::S3(s3_client) => fetch(s3_client.clone(), digest).await?,
> + DatastoreBackend::S3(s3_client) => match self.store.cache() {
> + None => fetch(s3_client.clone(), digest).await?,
same here
> + Some(cache) => {
> + let mut cacher = self
> + .store
> + .cacher()?
> + .ok_or(format_err!("no cacher for datastore"))?;
> + cache.access(digest, &mut cacher).await?.ok_or(format_err!(
> + "unable to access chunk with digest {}",
> + hex::encode(digest)
> + ))?
> + }
> + },
> };
> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
>
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 34/45] api: backup: add no-cache flag to bypass local datastore cache
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 34/45] api: backup: add no-cache flag to bypass local datastore cache Christian Ebner
@ 2025-07-18 11:41 ` Lukas Wagner
2025-07-18 15:37 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 11:41 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> Adds the `no-cache` flag so the client can request to bypass the
> local datastore cache for chunk uploads. This is mainly intended for
> debugging and benchmarking, but can be used in cases where the caching is
> known to be ineffective (no possible deduplication).
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> examples/upload-speed.rs | 1 +
> pbs-client/src/backup_writer.rs | 4 +++-
> proxmox-backup-client/src/benchmark.rs | 1 +
> proxmox-backup-client/src/main.rs | 8 ++++++++
> src/api2/backup/environment.rs | 3 +++
> src/api2/backup/mod.rs | 3 +++
> src/api2/backup/upload_chunk.rs | 9 +++++++++
> src/server/push.rs | 1 +
> 8 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/examples/upload-speed.rs b/examples/upload-speed.rs
> index e4b570ec5..8a6594a47 100644
> --- a/examples/upload-speed.rs
> +++ b/examples/upload-speed.rs
> @@ -25,6 +25,7 @@ async fn upload_speed() -> Result<f64, Error> {
> &(BackupType::Host, "speedtest".to_string(), backup_time).into(),
> false,
> true,
> + false,
> )
> .await?;
>
> diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
> index 1253ef561..ce5bd9375 100644
> --- a/pbs-client/src/backup_writer.rs
> +++ b/pbs-client/src/backup_writer.rs
> @@ -82,6 +82,7 @@ impl BackupWriter {
> backup: &BackupDir,
> debug: bool,
> benchmark: bool,
> + no_cache: bool,
> ) -> Result<Arc<BackupWriter>, Error> {
> let mut param = json!({
> "backup-type": backup.ty(),
> @@ -89,7 +90,8 @@ impl BackupWriter {
> "backup-time": backup.time,
> "store": datastore,
> "debug": debug,
> - "benchmark": benchmark
> + "benchmark": benchmark,
> + "no-cache": no_cache,
> });
>
> if !ns.is_root() {
> diff --git a/proxmox-backup-client/src/benchmark.rs b/proxmox-backup-client/src/benchmark.rs
> index a6f24d745..ed21c7a91 100644
> --- a/proxmox-backup-client/src/benchmark.rs
> +++ b/proxmox-backup-client/src/benchmark.rs
> @@ -236,6 +236,7 @@ async fn test_upload_speed(
> &(BackupType::Host, "benchmark".to_string(), backup_time).into(),
> false,
> true,
> + true,
> )
> .await?;
>
> diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
> index 44f4f5db5..83fc9309a 100644
> --- a/proxmox-backup-client/src/main.rs
> +++ b/proxmox-backup-client/src/main.rs
> @@ -742,6 +742,12 @@ fn spawn_catalog_upload(
> optional: true,
> default: false,
> },
> + "no-cache": {
> + type: Boolean,
> + description: "Bypass local datastore cache for network storages.",
> + optional: true,
> + default: false,
> + },
> }
> }
> )]
> @@ -754,6 +760,7 @@ async fn create_backup(
> change_detection_mode: Option<BackupDetectionMode>,
> dry_run: bool,
> skip_e2big_xattr: bool,
> + no_cache: bool,
> limit: ClientRateLimitConfig,
> _info: &ApiMethod,
> _rpcenv: &mut dyn RpcEnvironment,
> @@ -960,6 +967,7 @@ async fn create_backup(
> &snapshot,
> true,
> false,
> + no_cache,
> )
> .await?;
>
> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
> index 369385368..448659e74 100644
> --- a/src/api2/backup/environment.rs
> +++ b/src/api2/backup/environment.rs
> @@ -113,6 +113,7 @@ pub struct BackupEnvironment {
> result_attributes: Value,
> auth_id: Authid,
> pub debug: bool,
> + pub no_cache: bool,
> pub formatter: &'static dyn OutputFormatter,
> pub worker: Arc<WorkerTask>,
> pub datastore: Arc<DataStore>,
> @@ -129,6 +130,7 @@ impl BackupEnvironment {
> worker: Arc<WorkerTask>,
> datastore: Arc<DataStore>,
> backup_dir: BackupDir,
> + no_cache: bool,
> ) -> Result<Self, Error> {
> let state = SharedBackupState {
> finished: false,
> @@ -149,6 +151,7 @@ impl BackupEnvironment {
> worker,
> datastore,
> debug: tracing::enabled!(tracing::Level::DEBUG),
> + no_cache,
> formatter: JSON_FORMATTER,
> backup_dir,
> last_backup: None,
> diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
> index 026f1f106..ae61ff697 100644
> --- a/src/api2/backup/mod.rs
> +++ b/src/api2/backup/mod.rs
> @@ -53,6 +53,7 @@ pub const API_METHOD_UPGRADE_BACKUP: ApiMethod = ApiMethod::new(
> ("backup-time", false, &BACKUP_TIME_SCHEMA),
> ("debug", true, &BooleanSchema::new("Enable verbose debug logging.").schema()),
> ("benchmark", true, &BooleanSchema::new("Job is a benchmark (do not keep data).").schema()),
> + ("no-cache", true, &BooleanSchema::new("Disable local datastore cache for network storages").schema()),
> ]),
> )
> ).access(
> @@ -79,6 +80,7 @@ fn upgrade_to_backup_protocol(
> async move {
> let debug = param["debug"].as_bool().unwrap_or(false);
> let benchmark = param["benchmark"].as_bool().unwrap_or(false);
> + let no_cache = param["no-cache"].as_bool().unwrap_or(false);
>
> let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
>
> @@ -214,6 +216,7 @@ fn upgrade_to_backup_protocol(
> worker.clone(),
> datastore,
> backup_dir,
> + no_cache,
> )?;
>
> env.debug = debug;
> diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
> index d97975b34..623b405dd 100644
> --- a/src/api2/backup/upload_chunk.rs
> +++ b/src/api2/backup/upload_chunk.rs
> @@ -262,6 +262,15 @@ async fn upload_to_backend(
> );
> }
>
> + if env.no_cache {
> + let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
> + let is_duplicate = s3_client
> + .upload_with_retry(object_key, data, false)
> + .await
> + .context("failed to upload chunk to s3 backend")?;
> + return Ok((digest, size, encoded_size, is_duplicate));
> + }
> +
> // Avoid re-upload to S3 if the chunk is either present in the LRU cache or the chunk
> // file exists on filesystem. The latter means that the chunk has been present in the
> // past an was not cleaned up by garbage collection, so contained in the S3 object store.
> diff --git a/src/server/push.rs b/src/server/push.rs
> index e71012ed8..6a31d2abe 100644
> --- a/src/server/push.rs
> +++ b/src/server/push.rs
> @@ -828,6 +828,7 @@ pub(crate) async fn push_snapshot(
> snapshot,
> false,
> false,
> + false,
There is already a FIXME for it above the BackupWriter::start function, but with the *third*
boolean parameter I think it is overdue to use a parameter struct instead of plain parameters
for this function.
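Something along these lines (struct and field names made up for
illustration):

    struct BackupWriterOptions {
        debug: bool,
        benchmark: bool,
        no_cache: bool,
    }

Call sites would then document themselves:

    BackupWriter::start(
        /* client, datastore, namespace, snapshot as before */
        BackupWriterOptions { debug: false, benchmark: false, no_cache: false },
    )
    .await?;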
> )
> .await?;
>
--
- Lukas
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 17/45] verify: implement chunk verification for stores with s3 backend
2025-07-18 8:56 ` Lukas Wagner
@ 2025-07-18 11:45 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 11:45 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 10:56 AM, Lukas Wagner wrote:
> On 2025-07-15 14:53, Christian Ebner wrote:
>> For datastores backed by an S3 compatible object store, rather than
>> reading the chunks to be verified from the local filesystem, fetch
>> them via the s3 client from the configured bucket.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> src/backup/verify.rs | 89 ++++++++++++++++++++++++++++++++++++++------
>> 1 file changed, 77 insertions(+), 12 deletions(-)
>>
>> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
>> index dea10f618..3a4a1d0d5 100644
>> --- a/src/backup/verify.rs
>> +++ b/src/backup/verify.rs
>> @@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
>> use std::time::Instant;
>>
>> use anyhow::{bail, Error};
>> +use http_body_util::BodyExt;
>> use tracing::{error, info, warn};
>>
>> use proxmox_worker_task::WorkerTaskContext;
>> @@ -89,6 +90,38 @@ impl VerifyWorker {
>> }
>> }
>>
>> + if let Ok(DatastoreBackend::S3(s3_client)) = datastore.backend() {
>> + let suffix = format!(".{}.bad", counter);
>> + let target_key =
>> + match pbs_datastore::s3::object_key_from_digest_with_suffix(digest, &suffix) {
>> + Ok(target_key) => target_key,
>> + Err(err) => {
>> + info!("could not generate target key for corrupted chunk {path:?} - {err}");
>> + return;
>> + }
>> + };
>> + let object_key = match pbs_datastore::s3::object_key_from_digest(digest) {
>> + Ok(object_key) => object_key,
>> + Err(err) => {
>> + info!("could not generate object key for corrupted chunk {path:?} - {err}");
>> + return;
>> + }
>> + };
>> + if proxmox_async::runtime::block_on(
>> + s3_client.copy_object(object_key.clone(), target_key),
>> + )
>> + .is_ok()
>> + {
>> + if proxmox_async::runtime::block_on(s3_client.delete_object(object_key)).is_err() {
>> + info!("failed to delete corrupt chunk on s3 backend: {digest_str}");
>> + }
>> + } else {
>> + info!("failed to copy corrupt chunk on s3 backend: {digest_str}");
>> + }
>> + } else {
>> + info!("failed to get s3 backend while trying to rename bad chunk: {digest_str}");
>> + }
>> +
>> match std::fs::rename(&path, &new_path) {
>> Ok(_) => {
>> info!("corrupted chunk renamed to {:?}", &new_path);
>> @@ -189,18 +222,50 @@ impl VerifyWorker {
>> continue; // already verified or marked corrupt
>> }
>>
>> - match self.datastore.load_chunk(&info.digest) {
>> - Err(err) => {
>> - self.corrupt_chunks.lock().unwrap().insert(info.digest);
>> - error!("can't verify chunk, load failed - {err}");
>> - errors.fetch_add(1, Ordering::SeqCst);
>> - Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
>> - }
>> - Ok(chunk) => {
>> - let size = info.size();
>> - read_bytes += chunk.raw_size();
>> - decoder_pool.send((chunk, info.digest, size))?;
>> - decoded_bytes += size;
>> + match &self.backend {
>
> The whole method becomes uncomfortably large, maybe move the entire match &self.backend into a new method?
Okay, took me a bit given all the required interdependencies here, but
now this is all placed in a verify_chunk_by_backend()
>
>> + DatastoreBackend::Filesystem => match self.datastore.load_chunk(&info.digest) {
>> + Err(err) => {
>> + self.corrupt_chunks.lock().unwrap().insert(info.digest);
>
> Maybe add a new method self.add_corrupt_chunk
>
> fn add_corrupt_chunk(&mut self, chunk: ...) {
> // Panic on poisoned mutex
> let mut chunks = self.corrupt_chunks.lock().unwrap();
>
> chunks.insert(chunk);
> }
>
> or the like
and this was moved into a dedicated helper as suggested, but with the
error counting and message included.
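i.e. roughly along these lines (sketch only, the exact shape in v9 may
differ):

    fn add_corrupt_chunk(&self, digest: [u8; 32], errors: &AtomicUsize, message: &str) {
        // Panic on poisoned mutex
        self.corrupt_chunks.lock().unwrap().insert(digest);
        error!("{message}");
        errors.fetch_add(1, Ordering::SeqCst);
    }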
>
>> + error!("can't verify chunk, load failed - {err}");
>> + errors.fetch_add(1, Ordering::SeqCst);
>> + Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
>> + }
>> + Ok(chunk) => {
>> + let size = info.size();
>> + read_bytes += chunk.raw_size();
>> + decoder_pool.send((chunk, info.digest, size))?;
>> + decoded_bytes += size;
>> + }
>> + },
>> + DatastoreBackend::S3(s3_client) => {
>> + let object_key = pbs_datastore::s3::object_key_from_digest(&info.digest)?;
>> + match proxmox_async::runtime::block_on(s3_client.get_object(object_key)) {
>> + Ok(Some(response)) => {
>> + let bytes =
>> + proxmox_async::runtime::block_on(response.content.collect())?
>> + .to_bytes();
>> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
>> + let size = info.size();
>> + read_bytes += chunk.raw_size();
>> + decoder_pool.send((chunk, info.digest, size))?;
>> + decoded_bytes += size;
>> + }
>> + Ok(None) => {
>> + self.corrupt_chunks.lock().unwrap().insert(info.digest);
>> + error!(
>> + "can't verify missing chunk with digest {}",
>> + hex::encode(info.digest)
>> + );
>> + errors.fetch_add(1, Ordering::SeqCst);
>> + Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
>> + }
>> + Err(err) => {
>> + self.corrupt_chunks.lock().unwrap().insert(info.digest);
>> + error!("can't verify chunk, load failed - {err}");
>> + errors.fetch_add(1, Ordering::SeqCst);
>> + Self::rename_corrupted_chunk(self.datastore.clone(), &info.digest);
>> + }
>> + }
>> }
>> }
>> }
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup v8 35/45] api/datastore: implement refresh endpoint for stores with s3 backend
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 35/45] api/datastore: implement refresh endpoint for stores with s3 backend Christian Ebner
@ 2025-07-18 12:01 ` Lukas Wagner
2025-07-18 15:51 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 12:01 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> Allows easily refreshing the contents of the local cache store for
> datastores backed by an S3 object store.
>
> In order to guarantee that no read or write operations are ongoing,
> the store is first set into the maintenance mode `S3Refresh`. Objects
> are then fetched into a temporary directory to avoid losing contents
> and consistency in case of an error. Once all objects have been
> fetched, the existing contents are cleared out and the newly fetched
> contents are moved into place.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - add more error context
> - fix clippy warning
>
> pbs-datastore/src/datastore.rs | 172 ++++++++++++++++++++++++++++++++-
> src/api2/admin/datastore.rs | 34 +++++++
> 2 files changed, 205 insertions(+), 1 deletion(-)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index cab0f5b4d..c63759f9a 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -10,11 +10,13 @@ use anyhow::{bail, format_err, Context, Error};
> use http_body_util::BodyExt;
> use nix::unistd::{unlinkat, UnlinkatFlags};
> use pbs_tools::lru_cache::LruCache;
> +use proxmox_lang::try_block;
> +use tokio::io::AsyncWriteExt;
> use tracing::{info, warn};
>
> use proxmox_human_byte::HumanByte;
> use proxmox_s3_client::{
> - S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3PathPrefix,
> + S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3ObjectKey, S3PathPrefix,
> };
> use proxmox_schema::ApiType;
>
> @@ -2132,4 +2134,172 @@ impl DataStore {
> pub fn old_locking(&self) -> bool {
> *OLD_LOCKING
> }
> +
> + /// Set the datastore's maintenance mode to `S3Refresh`, fetch from S3 object store, clear and
> + /// replace the local cache store contents. Once finished disable the maintenance mode again.
> + /// Returns with error for other datastore backends without setting the maintenance mode.
> + pub async fn s3_refresh(self: &Arc<Self>) -> Result<(), Error> {
> + match self.backend()? {
> + DatastoreBackend::Filesystem => bail!("store '{}' not backed by S3", self.name()),
> + DatastoreBackend::S3(s3_client) => {
> + try_block!({
> + let _lock = pbs_config::datastore::lock_config()?;
> + let (mut section_config, _digest) = pbs_config::datastore::config()?;
> + let mut datastore: DataStoreConfig =
> + section_config.lookup("datastore", self.name())?;
> + datastore.set_maintenance_mode(Some(MaintenanceMode {
> + ty: MaintenanceType::S3Refresh,
> + message: None,
> + }))?;
> + section_config.set_data(self.name(), "datastore", &datastore)?;
> + pbs_config::datastore::save_config(&section_config)?;
> + drop(_lock);
No need to drop the lock, since the block ends anyway, right?
Also this should be done in a tokio::spawn_blocking, if I'm not mistaken?
(the try_block! is only a convenience wrapper that wraps the block in a function,
it doesn't spawn the block on the blocking thread pool)
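Something along these lines maybe (untested sketch, the closure body is just
copied from your patch):

    let store_name = self.name().to_string();
    tokio::task::spawn_blocking(move || {
        let _lock = pbs_config::datastore::lock_config()?;
        let (mut section_config, _digest) = pbs_config::datastore::config()?;
        let mut datastore: DataStoreConfig =
            section_config.lookup("datastore", &store_name)?;
        datastore.set_maintenance_mode(Some(MaintenanceMode {
            ty: MaintenanceType::S3Refresh,
            message: None,
        }))?;
        section_config.set_data(&store_name, "datastore", &datastore)?;
        pbs_config::datastore::save_config(&section_config)?;
        Ok::<(), Error>(())
    })
    .await?
    .context("failed to set maintenance mode")?;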
> + Ok::<(), Error>(())
> + })
> + .context("failed to set maintenance mode")?;
> +
> + let store_base = self.base_path();
> +
> + let tmp_base = proxmox_sys::fs::make_tmp_dir(&store_base, None)
> + .with_context(|| format!("failed to create temporary content folder in {store_base:?}"))?;
> +
> + let backup_user = pbs_config::backup_user().context("failed to get backup user")?;
> + let mode = nix::sys::stat::Mode::from_bits_truncate(0o0644);
> + let file_create_options = CreateOptions::new()
> + .perm(mode)
> + .owner(backup_user.uid)
> + .group(backup_user.gid);
> + let mode = nix::sys::stat::Mode::from_bits_truncate(0o0755);
> + let dir_create_options = CreateOptions::new()
> + .perm(mode)
> + .owner(backup_user.uid)
> + .group(backup_user.gid);
> +
> + let list_prefix = S3PathPrefix::Some(S3_CONTENT_PREFIX.to_string());
> + let store_prefix = format!("{}/{S3_CONTENT_PREFIX}/", self.name());
> + let mut next_continuation_token: Option<String> = None;
> + loop {
> + let list_objects_result = s3_client
> + .list_objects_v2(&list_prefix, next_continuation_token.as_deref())
> + .await
> + .context("failed to list objects")?;
> +
> + let objects_to_fetch: Vec<S3ObjectKey> = list_objects_result
> + .contents
> + .into_iter()
> + .map(|item| item.key)
> + .collect();
> +
> + for object_key in objects_to_fetch {
> + let object_path = format!("{object_key}");
> + let object_path = object_path.strip_prefix(&store_prefix).with_context(||
> + format!("failed to strip store context prefix {store_prefix} for {object_key}")
> + )?;
> + if object_path.ends_with(NAMESPACE_MARKER_FILENAME) {
> + continue;
> + }
> +
> + info!("Fetching object {object_path}");
> +
> + let file_path = tmp_base.join(object_path);
> + if let Some(parent) = file_path.parent() {
> + proxmox_sys::fs::create_path(
> + parent,
> + Some(dir_create_options),
> + Some(dir_create_options),
> + )?;
> + }
> +
> + let mut target_file = tokio::fs::OpenOptions::new()
> + .write(true)
> + .create(true)
> + .truncate(true)
> + .read(true)
> + .open(&file_path)
> + .await
> + .with_context(|| {
> + format!("failed to create target file {file_path:?}")
> + })?;
> +
> + if let Some(response) = s3_client
> + .get_object(object_key)
> + .await
> + .with_context(|| format!("failed to fetch object {object_path}"))?
> + {
> + let data = response
> + .content
> + .collect()
> + .await
> + .context("failed to collect object contents")?;
> + target_file
> + .write_all(&data.to_bytes())
> + .await
> + .context("failed to write to target file")?;
> + file_create_options
> + .apply_to(&mut target_file, &file_path)
> + .context("failed to set target file create options")?;
> + target_file
> + .flush()
> + .await
> + .context("failed to flush target file")?;
> + } else {
> + bail!("failed to download {object_path}, not found");
> + }
> + }
> +
> + if list_objects_result.is_truncated {
> + next_continuation_token = list_objects_result
> + .next_continuation_token
> + .as_ref()
> + .cloned();
> + continue;
> + }
> + break;
> + }
> +
> + for ty in ["vm", "ct", "host", "ns"] {
> + let store_base_clone = store_base.clone();
> + let tmp_base_clone = tmp_base.clone();
> + tokio::task::spawn_blocking(move || {
> + let type_dir = store_base_clone.join(ty);
> + if let Err(err) = std::fs::remove_dir_all(&type_dir) {
> + if err.kind() != io::ErrorKind::NotFound {
> + return Err(err).with_context(|| {
> + format!("failed to remove old contents in {type_dir:?}")
> + });
> + }
> + }
> + let tmp_type_dir = tmp_base_clone.join(ty);
> + if let Err(err) = std::fs::rename(&tmp_type_dir, &type_dir) {
> + if err.kind() != io::ErrorKind::NotFound {
> + return Err(err)
> + .with_context(|| format!("failed to rename {tmp_type_dir:?}"));
> + }
> + }
> + Ok::<(), Error>(())
> + })
> + .await?
> + .with_context(|| format!("failed to refresh {store_base:?}"))?;
> + }
> +
> + std::fs::remove_dir_all(&tmp_base).with_context(|| {
> + format!("failed to cleanup temporary content in {tmp_base:?}")
> + })?;
> +
> + try_block!({
> + let _lock = pbs_config::datastore::lock_config()?;
> + let (mut section_config, _digest) = pbs_config::datastore::config()?;
> + let mut datastore: DataStoreConfig =
> + section_config.lookup("datastore", self.name())?;
> + datastore.set_maintenance_mode(None)?;
> + section_config.set_data(self.name(), "datastore", &datastore)?;
> + pbs_config::datastore::save_config(&section_config)?;
> + drop(_lock);
> + Ok::<(), Error>(())
> + })
> + .context("failed to clear maintenance mode")?;
Same thing here.
> + }
> + }
> + Ok(())
> + }
In general, I think the s3_refresh function is a good candidate to be broken up into multiple smaller functions (rough sketch after the list):
- setting/unsetting maintenance mode
- creating the new temporary dir
- retrieving the objects from S3
- replacing the old contents
- etc.
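Roughly like this, just to sketch the idea (signatures only, the names are made up):

    impl DataStore {
        /// Set or clear the `S3Refresh` maintenance mode in the datastore config.
        fn s3_refresh_set_maintenance_mode(
            self: &Arc<Self>,
            mode: Option<MaintenanceMode>,
        ) -> Result<(), Error> {
            todo!()
        }

        /// Fetch all content objects from the bucket into a temporary directory.
        async fn s3_refresh_fetch_objects(
            self: &Arc<Self>,
            s3_client: &S3Client,
            tmp_base: &Path,
        ) -> Result<(), Error> {
            todo!()
        }

        /// Swap the freshly fetched contents in place of the old ones.
        async fn s3_refresh_replace_contents(
            self: &Arc<Self>,
            tmp_base: &Path,
        ) -> Result<(), Error> {
            todo!()
        }
    }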
> }
> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
> index 80740e3fb..41cbee4de 100644
> --- a/src/api2/admin/datastore.rs
> +++ b/src/api2/admin/datastore.rs
> @@ -2707,6 +2707,39 @@ pub async fn unmount(store: String, rpcenv: &mut dyn RpcEnvironment) -> Result<V
> Ok(json!(upid))
> }
>
> +#[api(
> + protected: true,
> + input: {
> + properties: {
> + store: {
> + schema: DATASTORE_SCHEMA,
> + },
> + }
> + },
> + returns: {
> + schema: UPID_SCHEMA,
> + },
> + access: {
> + permission: &Permission::Privilege(&["datastore", "{store}"], PRIV_DATASTORE_MODIFY, false),
> + },
> +)]
> +/// Refresh datastore contents from S3 to local cache store.
> +pub async fn s3_refresh(store: String, rpcenv: &mut dyn RpcEnvironment) -> Result<Value, Error> {
> + let datastore = DataStore::lookup_datastore(&store, Some(Operation::Lookup))?;
> + let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
> + let to_stdout = rpcenv.env_type() == RpcEnvironmentType::CLI;
> +
> + let upid = WorkerTask::spawn(
> + "s3-refresh",
> + Some(store),
> + auth_id.to_string(),
> + to_stdout,
> + move |_worker| async move { datastore.s3_refresh().await },
> + )?;
> +
> + Ok(json!(upid))
> +}
> +
> #[sortable]
> const DATASTORE_INFO_SUBDIRS: SubdirMap = &[
> (
> @@ -2773,6 +2806,7 @@ const DATASTORE_INFO_SUBDIRS: SubdirMap = &[
> &Router::new().download(&API_METHOD_PXAR_FILE_DOWNLOAD),
> ),
> ("rrd", &Router::new().get(&API_METHOD_GET_RRD_STATS)),
> + ("s3-refresh", &Router::new().put(&API_METHOD_S3_REFRESH)),
> (
> "snapshots",
> &Router::new()
--
- Lukas
* Re: [pbs-devel] [PATCH proxmox-backup v8 21/45] datastore: get and set owner for s3 store backend
2025-07-18 9:25 ` Lukas Wagner
@ 2025-07-18 12:12 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 12:12 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 11:25 AM, Lukas Wagner wrote:
> With my feedback addressed:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Read or write the ownership information from/to the corresponding
>> object in the S3 object store. Keep that information available if
>> the bucket is reused as datastore.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> pbs-datastore/src/datastore.rs | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 265624229..ca099c1d0 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -7,6 +7,7 @@ use std::sync::{Arc, LazyLock, Mutex};
>> use std::time::Duration;
>>
>> use anyhow::{bail, format_err, Context, Error};
>> +use http_body_util::BodyExt;
>> use nix::unistd::{unlinkat, UnlinkatFlags};
>> use pbs_tools::lru_cache::LruCache;
>> use tracing::{info, warn};
>> @@ -832,6 +833,21 @@ impl DataStore {
>> backup_group: &pbs_api_types::BackupGroup,
>> ) -> Result<Authid, Error> {
>> let full_path = self.owner_path(ns, backup_group);
>> +
>> + if let DatastoreBackend::S3(s3_client) = self.backend()? {
>> + let mut path = ns.path();
>> + path.push(format!("{backup_group}"));
>
> nit: you can use .to_string() here, is a bit easier to read
adapted, thanks!
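(i.e. `path.push(backup_group.to_string());` instead of the `format!` variant)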
>
>> + let object_key = crate::s3::object_key_from_path(&path, "owner")
>
> I did not note it for the previously reviewed patches, but I think some (pub) consts for these
> 'static' key suffixes would be better than to repeat the same string in multiple places in the code
> (mostly to avoid errors due to spelling mistakes)
Yeah, agreed. I did this already for some others, but for the owner
filename it was indeed missing. Adapted it here and in the previous patch
as well.
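E.g. something along these lines (exact name and location might still change):

    // in pbs-datastore/src/s3.rs
    pub const S3_OWNER_FILENAME: &str = "owner";

    // call sites then use the constant instead of the string literal:
    let object_key = crate::s3::object_key_from_path(&path, S3_OWNER_FILENAME)
        .context("invalid owner file object key")?;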
>
>> + .context("invalid owner file object key")?;
>> + let response = proxmox_async::runtime::block_on(s3_client.get_object(object_key))?
>> + .ok_or_else(|| format_err!("fetching owner failed"))?;
>> + let content = proxmox_async::runtime::block_on(response.content.collect())?;
>> + let owner = String::from_utf8(content.to_bytes().trim_ascii_end().to_vec())?;
>> + return owner
>> + .parse()
>> + .map_err(|err| format_err!("parsing owner for {backup_group} failed: {err}"));
>> + }
>> +
>> let owner = proxmox_sys::fs::file_read_firstline(full_path)?;
>> owner
>> .trim_end() // remove trailing newline
>> @@ -860,6 +876,18 @@ impl DataStore {
>> ) -> Result<(), Error> {
>> let path = self.owner_path(ns, backup_group);
>>
>> + if let DatastoreBackend::S3(s3_client) = self.backend()? {
>> + let mut path = ns.path();
>> + path.push(format!("{backup_group}"));
>> + let object_key = crate::s3::object_key_from_path(&path, "owner")
>> + .context("invalid owner file object key")?;
>> + let data = hyper::body::Bytes::from(format!("{auth_id}\n"));
>> + let _is_duplicate = proxmox_async::runtime::block_on(
>> + s3_client.upload_with_retry(object_key, data, true),
>> + )
>> + .context("failed to set owner on s3 backend")?;
>> + }
>> +
>> let mut open_options = std::fs::OpenOptions::new();
>> open_options.write(true);
>>
>
* Re: [pbs-devel] [PATCH proxmox-backup v8 38/45] ui: expose s3 refresh button for datastores backed by object store
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 38/45] ui: expose s3 refresh button for datastores backed by object store Christian Ebner
@ 2025-07-18 12:46 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 12:46 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
On 2025-07-15 14:53, Christian Ebner wrote:
> Allows to trigger a refresh of the local datastore contents from
> the WebUI.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - no changes
>
> www/datastore/Summary.js | 44 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 44 insertions(+)
>
> diff --git a/www/datastore/Summary.js b/www/datastore/Summary.js
> index cdb34aea3..d8f59ebc5 100644
> --- a/www/datastore/Summary.js
> +++ b/www/datastore/Summary.js
> @@ -301,6 +301,31 @@ Ext.define('PBS.DataStoreSummary', {
> });
> },
> },
> + {
> + xtype: 'button',
> + text: gettext('S3 Refresh'),
> + hidden: true,
> + itemId: 's3RefreshButton',
> + reference: 's3RefreshButton',
> + handler: function () {
> + let me = this;
> + let datastore = me.up('panel').datastore;
> + Proxmox.Utils.API2Request({
> + url: `/admin/datastore/${datastore}/s3-refresh`,
> + method: 'PUT',
> + failure: (response) => Ext.Msg.alert(gettext('Error'), response.htmlStatus),
> + success: function (response, options) {
> + Ext.create('Proxmox.window.TaskViewer', {
> + upid: response.result.data,
> + taskDone: () => {
> + me.up('panel').statusStore.load();
> + Ext.ComponentQuery.query('navigationtree')[0]?.reloadStore();
> + },
> + }).show();
> + },
> + });
> + },
> + },
Already discussed off-list:
The "S3 Refresh" button should in my opinion be not placed as prominently as it is now.
In normal operation, a user should not need it. I'd maybe create some drop-down menu
in the "Content" page and use also a bit more text, e.g. "Refresh backup snapshots from S3 bucket"
or the like.
--
- Lukas
* Re: [pbs-devel] [PATCH proxmox-backup v8 45/45] docs: Add section describing how to setup s3 backed datastore
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 45/45] docs: Add section describing how to setup s3 backed datastore Christian Ebner
@ 2025-07-18 13:14 ` Maximiliano Sandoval
2025-07-18 14:38 ` Christian Ebner
0 siblings, 1 reply; 109+ messages in thread
From: Maximiliano Sandoval @ 2025-07-18 13:14 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
Documentation looks good to me.
Some small comments below.
Christian Ebner <c.ebner@proxmox.com> writes:
> Describe required basic S3 client setup and possible configuration
> options as well as the actual setup of a datastore using the client and
> a bucket as backend.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 7:
> - new in this version
>
> docs/storage.rst | 68 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 68 insertions(+)
>
> diff --git a/docs/storage.rst b/docs/storage.rst
> index 4a8d8255e..0bac85fc3 100644
> --- a/docs/storage.rst
> +++ b/docs/storage.rst
> @@ -233,6 +233,74 @@ datastore is not mounted when they are scheduled. Sync jobs start, but fail
> with an error saying the datastore was not mounted. The reason is that syncs
> not happening as scheduled should at least be noticeable.
>
> +Datastores with S3 Backend (experimental)
I think we generally use the term "technology preview" in these cases.
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Proxmox Backup Server supports S3 compatible object stores as storage backend for datastores. For
> +this, an S3 client needs to be set up under "Configuration" > "S3 Clients".
> +
> +In the client configuration, provide the REST API endpoint for the object store. The endpoint
> +is provider dependent and allows for bucket and region templating. For example, configuring
> +the endpoint as ``{{bucket}}.s3.{{region}}.amazonaws.com`` will be expanded to
> +``my-pbs-bucket.s3.eu-central-1.amazonaws.com`` with a configured bucket of name ``my-pbs-bucket``
> +located in region ``eu-central-1``.
> +
> +The bucket name is part of the datastore backend configuration rather than the client configuration,
> +as the same client might be reused for multiple buckets. Objects placed in the bucket are prefixed by
> +the datastore name, therefore it is possible to create multiple datastores using the same bucket.
> +
> +.. note:: Proxmox Backup Server does not handle bucket creation and access control. The bucket used
> + to store the datastore's objects as well as the access key have to be set up beforehand in your S3
> + provider interface. The Proxmox Backup Server acts as client and requires permissions to get, put,
> + list and delete objects in the bucket.
> +
> +Most providers allow to access buckets either using vhost style addressing, the bucket name being
> +part of the endpoint address, or via path style addressing, the bucket name being the prefix to
> +the path components of requests. Proxmox Backup Server supports both styles, favoring the vhost
> +style urls over the path style. To use path style addresses, set the corresponding configuration
> +flag.
> +
> +Proxmox Backup Server does not support plain text communication with the S3 API, all communication
> +is excrypted using HTTPS in transit. Therefore, for self-hostsd S3 object stores using a self-signed
s/excrypted/encrypted and s/hostsd/hosted.
> +certificate, the matching fingerprint has to be provided to the client configuration. Otherwise the
> +client refuses connections to the S3 object store.
> +
> +The following example shows the setup of a new s3 client configuration:
> +
> +.. code-block:: console
> +
> + # proxmox-backup-manager s3 client create my-s3-client --secrets-id my-s3-client --access-key 'my-access-key' --secret-key 'my-secret-key' --endpoint '{{bucket}}.s3.{{region}}.amazonaws.com' --region eu-central-1
> +
> +To list your s3 client configuration, run:
> +
> +.. code-block:: console
> +
> + # proxmox-backup-manager s3 client list
> +
> +A new datastore with S3 backend can be created using one of the configured S3 clients. Although
> +all contents are stored on the S3 object store, the datastore nevertheless requires a local cache
> +store, used to increase performance and reduce the number of requests to the backend. For this, a
> +local filesystem path has to be provided during datastore creation, just like for regular datastore
> +setup. A minimum size of a few GiB of storage is recommended, given that the cached datastore
> +contents also include data chunks.
> +
> +To set up a new datastore called ``my-s3-store`` placed in a bucket called ``pbs-s3-bucket``, run:
> +
> +.. code-block:: console
> +
> + # proxmox-backup-manager datastore create my-s3-store /mnt/datastore/my-s3-store-cache --backend type=s3,client=my-s3-client,bucket=pbs-s3-bucket
> +
> +A datastore cannot be shared between multiple instances; only one instance can operate on the
A Backup Server instance? I would personally specify this here instead
of in the next line.
> +datastore at a time. However, datastore contents used on a Proxmox Backup Server instance which is
> +no longer available can be reused on a fresh installation. To recreate the datastore, you must pass
> +the ``reuse-datastore`` and ``overwrite-in-use`` flags. Since the datastore name is used as prefix,
> +the same datastore name must be used.
> +
> +.. code-block:: console
> +
> + # proxmox-backup-manager datastore create my-s3-store /mnt/datastore/my-new-s3-store-cache --backend type=s3,client=my-s3-client,bucket=pbs-s3-bucket --reuse-datastore true --overwrite-in-use true
> +
> +
> Managing Datastores
> ^^^^^^^^^^^^^^^^^^^
Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
* Re: [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (53 preceding siblings ...)
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 45/45] docs: Add section describing how to setup s3 backed datastore Christian Ebner
@ 2025-07-18 13:16 ` Lukas Wagner
2025-07-19 12:52 ` [pbs-devel] superseded: " Christian Ebner
55 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-18 13:16 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner
I'm now done reviewing the 'proxmox-backup' patches (all except the docs patch, I need a bit of a break before that)
Left a couple of comments, but most of that was just relatively minor stuff.
I really liked how you split your work into quite small patches; that made reviewing
this huge series quite manageable (still quite exhausting of course, more than 5 hours in).
I'm not super familiar with some of the inner workings of the backup server, so my review
mostly focused on style and coding practices - so there might be things on the
conceptual level that I might have missed.
Overall, great work Chris!
--
- Lukas
* Re: [pbs-devel] [PATCH proxmox-backup v8 22/45] datastore: implement garbage collection for s3 backend
2025-07-18 9:47 ` Lukas Wagner
@ 2025-07-18 14:31 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 14:31 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 11:47 AM, Lukas Wagner wrote:
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Implements the garbage collection for datastores backed by an s3
>> object store.
>> Take advantage of the local datastore by placing marker files in the
>> chunk store during phase 1 of the garbage collection, updating their
>> atime if already present.
>> This allows us to avoid making expensive API calls to update object
>> metadata, which would only be possible via a copy object operation.
>>
>> The phase 2 is implemented by fetching a list of all the chunks via
>> the ListObjectsV2 API call, filtered by the chunk folder prefix.
>> This operation has to be performed in batches of 1000 objects, given
>> by the APIs response limits.
>> For each object key, look up the marker file and decide based on the
>> marker's existence and its atime if the chunk object needs to be
>> removed. Deletion happens via the delete objects operation, allowing
>> to delete multiple chunks in a single request.
>>
>> This allows to efficiently look up chunks which are not in use
>> anymore while being performant and cost effective.
>>
>> Baseline runtime performance tests:
>> -----------------------------------
>>
>> 3 garbage collection runs were performed with hot filesystem caches
>> (by additional GC run before the test runs). The PBS instance was
>> virtualized, the same virtualized disk using ZFS for all the local
>> cache stores:
>>
>> All datastores contained the same encrypted data, with the following
>> content statistics:
>> Original data usage: 269.685 GiB
>> On-Disk usage: 9.018 GiB (3.34%)
>> On-Disk chunks: 6477
>> Deduplication factor: 29.90
>> Average chunk size: 1.426 MiB
>>
>> The results demonstrate the overhead caused by the additional
>> ListObjectsV2 API calls and their processing, with the impact
>> depending on the object store backend.
>>
>> Average garbage collection runtime:
>> Local datastore: (2.04 ± 0.01) s
>> Local RADOS gateway (Squid): (3.05 ± 0.01) s
>> AWS S3: (3.05 ± 0.01) s
>> Cloudflare R2: (6.71 ± 0.58) s
>>
>> After pruning of all datastore contents (therefore including
>> DeleteObjects requests):
>> Local datastore: 3.04 s
>> Local RADOS gateway (Squid): 14.08 s
>> AWS S3: 13.06 s
>> Cloudflare R2: 78.21 s
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> pbs-datastore/src/chunk_store.rs | 4 +
>> pbs-datastore/src/datastore.rs | 211 +++++++++++++++++++++++++++----
>> 2 files changed, 190 insertions(+), 25 deletions(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index 8c195df54..95f00e8d5 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -353,6 +353,10 @@ impl ChunkStore {
>> ProcessLocker::oldest_shared_lock(self.locker.clone().unwrap())
>> }
>>
>> + pub fn mutex(&self) -> &std::sync::Mutex<()> {
>> + &self.mutex
>> + }
>> +
>> pub fn sweep_unused_chunks(
>> &self,
>> oldest_writer: i64,
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index ca099c1d0..6cc7fdbaa 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -4,7 +4,7 @@ use std::os::unix::ffi::OsStrExt;
>> use std::os::unix::io::AsRawFd;
>> use std::path::{Path, PathBuf};
>> use std::sync::{Arc, LazyLock, Mutex};
>> -use std::time::Duration;
>> +use std::time::{Duration, SystemTime};
>>
>> use anyhow::{bail, format_err, Context, Error};
>> use http_body_util::BodyExt;
>> @@ -1209,6 +1209,7 @@ impl DataStore {
>> chunk_lru_cache: &mut Option<LruCache<[u8; 32], ()>>,
>> status: &mut GarbageCollectionStatus,
>> worker: &dyn WorkerTaskContext,
>> + s3_client: Option<Arc<S3Client>>,
>> ) -> Result<(), Error> {
>> status.index_file_count += 1;
>> status.index_data_bytes += index.index_bytes();
>> @@ -1225,21 +1226,41 @@ impl DataStore {
>> }
>> }
>>
>> - if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
>> - let hex = hex::encode(digest);
>> - warn!(
>> - "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
>> - );
>> -
>> - // touch any corresponding .bad files to keep them around, meaning if a chunk is
>> - // rewritten correctly they will be removed automatically, as well as if no index
>> - // file requires the chunk anymore (won't get to this loop then)
>> - for i in 0..=9 {
>> - let bad_ext = format!("{}.bad", i);
>> - let mut bad_path = PathBuf::new();
>> - bad_path.push(self.chunk_path(digest).0);
>> - bad_path.set_extension(bad_ext);
>> - self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
>> + match s3_client {
>> + None => {
>> + // Filesystem backend
>> + if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
>> + let hex = hex::encode(digest);
>> + warn!(
>> + "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
>> + );
>> +
>> + // touch any corresponding .bad files to keep them around, meaning if a chunk is
>> + // rewritten correctly they will be removed automatically, as well as if no index
>> + // file requires the chunk anymore (won't get to this loop then)
>> + for i in 0..=9 {
>> + let bad_ext = format!("{}.bad", i);
>> + let mut bad_path = PathBuf::new();
>> + bad_path.push(self.chunk_path(digest).0);
>> + bad_path.set_extension(bad_ext);
>> + self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
>> + }
>> + }
>> + }
>> + Some(ref _s3_client) => {
>> + // Update atime on local cache marker files.
>> + if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
>> + let (chunk_path, _digest) = self.chunk_path(digest);
>> + // Insert empty file as marker to tell GC phase2 that this is
>> + // a chunk still in-use, so to keep in the S3 object store.
>> + std::fs::File::options()
>> + .write(true)
>> + .create_new(true)
>> + .open(&chunk_path)
>> + .with_context(|| {
>> + format!("failed to create marker for chunk {}", hex::encode(digest))
>> + })?;
>> + }
>> }
>> }
>> }
>> @@ -1251,6 +1272,7 @@ impl DataStore {
>> status: &mut GarbageCollectionStatus,
>> worker: &dyn WorkerTaskContext,
>> cache_capacity: usize,
>> + s3_client: Option<Arc<S3Client>>,
>> ) -> Result<(), Error> {
>> // Iterate twice over the datastore to fetch index files, even if this comes with an
>> // additional runtime cost:
>> @@ -1344,6 +1366,7 @@ impl DataStore {
>> &mut chunk_lru_cache,
>> status,
>> worker,
>> + s3_client.as_ref().cloned(),
>> )?;
>>
>> if !unprocessed_index_list.remove(&path) {
>> @@ -1378,7 +1401,14 @@ impl DataStore {
>> continue;
>> }
>> };
>> - self.index_mark_used_chunks(index, &path, &mut chunk_lru_cache, status, worker)?;
>> + self.index_mark_used_chunks(
>> + index,
>> + &path,
>> + &mut chunk_lru_cache,
>> + status,
>> + worker,
>> + s3_client.as_ref().cloned(),
>> + )?;
>> warn!("Marked chunks for unexpected index file at '{path:?}'");
>> }
>> if strange_paths_count > 0 {
>> @@ -1476,18 +1506,149 @@ impl DataStore {
>> 1024 * 1024
>> };
>>
>> - info!("Start GC phase1 (mark used chunks)");
>> + let s3_client = match self.backend()? {
>> + DatastoreBackend::Filesystem => None,
>> + DatastoreBackend::S3(s3_client) => {
>> + proxmox_async::runtime::block_on(s3_client.head_bucket())
>> + .context("failed to reach bucket")?;
>> + Some(s3_client)
>> + }
>> + };
>>
>> - self.mark_used_chunks(&mut gc_status, worker, gc_cache_capacity)
>> - .context("marking used chunks failed")?;
>> + info!("Start GC phase1 (mark used chunks)");
>>
>> - info!("Start GC phase2 (sweep unused chunks)");
>> - self.inner.chunk_store.sweep_unused_chunks(
>> - oldest_writer,
>> - min_atime,
>> + self.mark_used_chunks(
>> &mut gc_status,
>> worker,
>> - )?;
>> + gc_cache_capacity,
>> + s3_client.as_ref().cloned(),
>> + )
>> + .context("marking used chunks failed")?;
>> +
>> + info!("Start GC phase2 (sweep unused chunks)");
>> +
>> + if let Some(ref s3_client) = s3_client {
>> + let mut chunk_count = 0;
>> + let prefix = S3PathPrefix::Some(".chunks/".to_string());
>> + // Operates in batches of 1000 objects max per request
>> + let mut list_bucket_result =
>> + proxmox_async::runtime::block_on(s3_client.list_objects_v2(&prefix, None))
>> + .context("failed to list chunks in s3 object store")?;
>> +
>> + let mut delete_list = Vec::with_capacity(1000);
>> + loop {
>> + let lock = self.inner.chunk_store.mutex().lock().unwrap();
>> +
>> + for content in list_bucket_result.contents {
>> + // Check object is actually a chunk
>> + let digest = match Path::new::<str>(&content.key).file_name() {
>> + Some(file_name) => file_name,
>> + // should never be the case as objects will have a filename
>> + None => continue,
>> + };
>> + let bytes = digest.as_bytes();
>> + if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
>> + continue;
>> + }
>> + if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
>> + continue;
>> + }
>> +
>> + let bad = bytes.ends_with(b".bad");
>> +
>> + // Safe since contains valid ascii hexdigits only as checked above.
>> + let digest_str = digest.to_string_lossy();
>> + let hexdigit_prefix = unsafe { digest_str.get_unchecked(0..4) };
>> + let mut chunk_path = self.base_path();
>> + chunk_path.push(".chunks");
>> + chunk_path.push(hexdigit_prefix);
>> + chunk_path.push(digest);
>> +
>> + // Check local markers (created or atime updated during phase1) and
>> + // keep or delete chunk based on that.
>> + let atime = match std::fs::metadata(chunk_path) {
>> + Ok(stat) => stat.accessed()?,
>> + Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
>> + // File not found, delete by setting atime to unix epoch
>> + info!("Not found, mark for deletion: {}", content.key);
>> + SystemTime::UNIX_EPOCH
>> + }
>> + Err(err) => return Err(err.into()),
>> + };
>> + let atime = atime.duration_since(SystemTime::UNIX_EPOCH)?.as_secs() as i64;
>> +
>> + chunk_count += 1;
>> +
>> + if atime < min_atime {
>> + delete_list.push(content.key);
>> + if bad {
>> + gc_status.removed_bad += 1;
>> + } else {
>> + gc_status.removed_chunks += 1;
>> + }
>> + gc_status.removed_bytes += content.size;
>> + } else if atime < oldest_writer {
>> + if bad {
>> + gc_status.still_bad += 1;
>> + } else {
>> + gc_status.pending_chunks += 1;
>> + }
>> + gc_status.pending_bytes += content.size;
>> + } else {
>> + if !bad {
>> + gc_status.disk_chunks += 1;
>> + }
>> + gc_status.disk_bytes += content.size;
>> + }
>> + }
>> +
>> + if !delete_list.is_empty() {
>> + let delete_objects_result = proxmox_async::runtime::block_on(
>> + s3_client.delete_objects(&delete_list),
>> + )?;
>> + if let Some(_err) = delete_objects_result.error {
>> + bail!("failed to delete some objects");
>> + }
>> + delete_list.clear();
>> + }
>> +
>> + drop(lock);
>> +
>> + // Process next batch of chunks if there is more
>> + if list_bucket_result.is_truncated {
>> + list_bucket_result =
>> + proxmox_async::runtime::block_on(s3_client.list_objects_v2(
>> + &prefix,
>> + list_bucket_result.next_continuation_token.as_deref(),
>> + ))?;
>> + continue;
>> + }
>> +
>> + break;
>> + }
>> + info!("processed {chunk_count} total chunks");
>> +
>> + // Phase 2 GC of Filesystem backed storage is phase 3 for S3 backed GC
>> + info!("Start GC phase3 (sweep unused chunk markers)");
>> +
>> + let mut tmp_gc_status = GarbageCollectionStatus {
>> + upid: Some(upid.to_string()),
>> + ..Default::default()
>> + };
>> + self.inner.chunk_store.sweep_unused_chunks(
>> + oldest_writer,
>> + min_atime,
>> + &mut tmp_gc_status,
>> + worker,
>> + )?;
>> + } else {
>> + self.inner.chunk_store.sweep_unused_chunks(
>> + oldest_writer,
>> + min_atime,
>> + &mut gc_status,
>> + worker,
>> + )?;
>> + }
>
> I found this big chunk for new code quite hard to follow.
>
> I guess everything between the `loop` start and the `if list_bucket_result.is_truncated` could
> maybe be separated out into some `process_objects` (todo: find better name) function. IMO
> a good indicator is also the scope where you hold the lock.
>
> Within this block, it might also make sense to split it further, e.g.
> - check_if_chunk
> - get_local_chunk_path
> - get_local_chunk_atime
> - ...
>
> (there might be better ways to separate or name things, but you get the idea)
Okay, this took a bit to do without breaking stuff, because of all the
interdependence here and not all types being pub, but I managed to
restructure this quite a bit, although I did opt for a slightly
different structure than the one suggested.
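For example, the check whether an object key actually refers to a chunk ended
up as a small helper, roughly along these lines (simplified, the actual code
may differ):

    /// Check whether an S3 object's file name looks like a chunk digest,
    /// optionally with a `.N.bad` suffix.
    fn is_chunk_object_key(file_name: &std::ffi::OsStr) -> bool {
        use std::os::unix::ffi::OsStrExt;

        let bytes = file_name.as_bytes();
        if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
            return false;
        }
        bytes.iter().take(64).all(u8::is_ascii_hexdigit)
    }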
>
>
>>
>> info!(
>> "Removed garbage: {}",
>
* Re: [pbs-devel] [PATCH proxmox-backup v8 45/45] docs: Add section describing how to setup s3 backed datastore
2025-07-18 13:14 ` Maximiliano Sandoval
@ 2025-07-18 14:38 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 14:38 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Maximiliano Sandoval
On 7/18/25 4:03 PM, Maximiliano Sandoval wrote:
>
> Documentation looks good to me.
>
> Some small comments below.
>
> Christian Ebner <c.ebner@proxmox.com> writes:
>
>> Describe required basic S3 client setup and possible configuration
>> options as well as the actual setup of a datastore using the client and
>> a bucket as backend.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - new in this version
>>
>> docs/storage.rst | 68 ++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 68 insertions(+)
>>
>> diff --git a/docs/storage.rst b/docs/storage.rst
>> index 4a8d8255e..0bac85fc3 100644
>> --- a/docs/storage.rst
>> +++ b/docs/storage.rst
>> @@ -233,6 +233,74 @@ datastore is not mounted when they are scheduled. Sync jobs start, but fail
>> with an error saying the datastore was not mounted. The reason is that syncs
>> not happening as scheduled should at least be noticeable.
>>
>> +Datastores with S3 Backend (experimental)
>
> I think we generally use the term "technology preview" in these cases.
I would nevertheless still consider this experimental and only adapt it
to technology preview *after* the public beta phase.
Not sure if we gain much by not calling it what it is here, as well as
in the selector of the datastore creation window.
But I'm open to other opinions/naming suggestions.
>
>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +Proxmox Backup Server supports S3 compatible object stores as storage backend for datastores. For
>> +this, an S3 client needs to be set up under "Configuration" > "S3 Clients".
>> +
>> +In the client configuration, provide the REST API endpoint for the object store. The endpoint
>> +is provider dependent and allows for bucket and region templating. For example, configuring
>> +the endpoint as ``{{bucket}}.s3.{{region}}.amazonaws.com`` will be expanded to
>> +``my-pbs-bucket.s3.eu-central-1.amazonaws.com`` with a configured bucket of name ``my-pbs-bucket``
>> +located in region ``eu-central-1``.
>> +
>> +The bucket name is part of the datastore backend configuration rather than the client configuration,
>> +as the same client might be reused for multiple buckets. Objects placed in the bucket are prefixed by
>> +the datastore name, therefore it is possible to create multiple datastores using the same bucket.
>> +
>> +.. note:: Proxmox Backup Server does not handle bucket creation and access control. The bucket used
>> + to store the datastore's objects as well as the access key have to be set up beforehand in your S3
>> + provider interface. The Proxmox Backup Server acts as client and requires permissions to get, put,
>> + list and delete objects in the bucket.
>> +
>> +Most providers allow to access buckets either using vhost style addressing, the bucket name being
>> +part of the endpoint address, or via path style addressing, the bucket name being the prefix to
>> +the path components of requests. Proxmox Backup Server supports both styles, favoring the vhost
>> +style urls over the path style. To use path style addresses, set the corresponding configuration
>> +flag.
>> +
>> +Proxmox Backup Server does not support plain text communication with the S3 API, all communication
>> +is excrypted using HTTPS in transit. Therefore, for self-hostsd S3 object stores using a self-signed
>
> s/excrypted/encrypted and s/hostsd/hosted.
>
>> +certificate, the matching fingerprint has to be provided to the client configuration. Otherwise the
>> +client refuses connections to the S3 object store.
>> +
>> +The following example shows the setup of a new s3 client configuration:
>> +
>> +.. code-block:: console
>> +
>> + # proxmox-backup-manager s3 client create my-s3-client --secrets-id my-s3-client --access-key 'my-access-key' --secret-key 'my-secret-key' --endpoint '{{bucket}}.s3.{{region}}.amazonaws.com' --region eu-central-1
>> +
>> +To list your s3 client configuration, run:
>> +
>> +.. code-block:: console
>> +
>> + # proxmox-backup-manager s3 client list
>> +
>> +A new datastore with S3 backend can be created using one of the configured S3 clients. Although
>> +all contents are stored on the S3 object store, the datastore nevertheless requires a local cache
>> +store, used to increase performance and reduce the number of requests to the backend. For this, a
>> +local filesystem path has to be provided during datastore creation, just like for regular datastore
>> +setup. A minimum size of a few GiB of storage is recommended, given that the cached datastore
>> +contents also include data chunks.
>> +
>> +To set up a new datastore called ``my-s3-store`` placed in a bucket called ``pbs-s3-bucket``, run:
>> +
>> +.. code-block:: console
>> +
>> + # proxmox-backup-manager datastore create my-s3-store /mnt/datastore/my-s3-store-cache --backend type=s3,client=my-s3-client,bucket=pbs-s3-bucket
>> +
>> +A datastore cannot be shared between multiple instances; only one instance can operate on the
>
> A Backup Server instance? I would personally specify this here instead
> of in the next line.
>
>> +datastore at a time. However, datastore contents used on a Proxmox Backup Server instance which is
>> +no longer available can be reused on a fresh installation. To recreate the datastore, you must pass
>> +the ``reuse-datastore`` and ``overwrite-in-use`` flags. Since the datastore name is used as prefix,
>> +the same datastore name must be used.
>> +
>> +.. code-block:: console
>> +
>> + # proxmox-backup-manager datastore create my-s3-store /mnt/datastore/my-new-s3-store-cache --backend type=s3,client=my-s3-client,bucket=pbs-s3-bucket --reuse-datastore true --overwrite-in-use true
>> +
>> +
>> Managing Datastores
>> ^^^^^^^^^^^^^^^^^^^
>
> Reviewed-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
>
>
* Re: [pbs-devel] [PATCH proxmox-backup v8 30/45] datastore: add local datastore cache for network attached storages
2025-07-18 11:24 ` Lukas Wagner
@ 2025-07-18 14:59 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 14:59 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 1:24 PM, Lukas Wagner wrote:
> Some rustdoc comments are missing, but otherwise looks fine to me.
>
> As a general remark, applying to this patch, but also in general: I think we should put a much larger
> focus onto writing unit- and integration tests for any significant chunks for new code, e.g.
> like the LocalDatastoreLruCache, and also slowly refactor existing code in a way so that it can be tested.
>
> Naturally, it is additional effort, but IMO it well pays of later. I'd also say that it makes
> reviews much easier, since the tests are living proof in the code that it works, and as a reviewer
> I also immediately see how the code is supposed to be used. Furthermore, they are a good way
> to detect regressions later on, e.g. due to changing third-party dependencies, and of course
> also changes in the product code itself.
>
> That being said, I won't ask you to write test for this patch now, since adding them after
> the fact is a big pain and might require a big refactor, e.g. to separate out and abstract away
> any dependencies on existing code. I just felt the urge to bring this up, since this
> is something we can definitely improve on.
Agreed, noted this in my todo list. For the time being I added the
missing docstrings as requested.
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Use a local datastore as cache using LRU cache replacement policy for
>> operations on a datastore backed by a network, e.g. by an S3 object
>> store backend. The goal is to reduce number of requests to the
>> backend and thereby save costs (monetary as well as time).
>>
>> Cached chunks are stored on the local datastore cache, already
>> containing the datastore's contents metadata (namespace, group,
>> snapshot, owner, index files, ecc..), used to perform fast lookups.
>> The cache itself only stores chunk digests, not the raw data itself.
>> When payload data is required, contents are looked up and read from
>> the local datastore cache filesystem, including fallback to fetch from
>> the backend if the presumably cached entry is not found.
>>
>> The cacher allows to fetch cache items on cache misses via the access
>> method.
>>
>> The capacity of the cache is derived from the local datastore cache
>> filesystem, or by the user configured value, whichever is smalller.
>> The capacity is only set on instantiation of the store, and the current
>> value kept as long as the datastore remains cached in the datastore
>> cache. To change the value, the store has to be either be set to offline
>> mode and back, or the services restarted.
>>
>> Basic performance tests:
>>
>> Backup and upload of contents of linux git repository to AWS S3,
>> snapshots removed in-between each backup run to avoid other chunk reuse
>> optimization of PBS.
>>
>> no-cache:
>> had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.76 s (average 102.258 MiB/s)
>> empty-cache:
>> had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 50.42 s (average 102.945 MiB/s)
>> all-cached:
>> had to backup 5.069 GiB of 5.069 GiB (compressed 3.718 GiB) in 43.78 s (average 118.554 MiB/s)
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - use info instead of warn, as these might end up in the task logs as
>> well, possibly causing confusion if warning level
>>
>> pbs-datastore/src/datastore.rs | 70 ++++++-
>> pbs-datastore/src/lib.rs | 3 +
>> .../src/local_datastore_lru_cache.rs | 172 ++++++++++++++++++
>> 3 files changed, 244 insertions(+), 1 deletion(-)
>> create mode 100644 pbs-datastore/src/local_datastore_lru_cache.rs
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 89f45e7f8..cab0f5b4d 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -40,9 +40,10 @@ use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
>> use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
>> use crate::hierarchy::{ListGroups, ListGroupsType, ListNamespaces, ListNamespacesRecursive};
>> use crate::index::IndexFile;
>> +use crate::local_datastore_lru_cache::S3Cacher;
>> use crate::s3::S3_CONTENT_PREFIX;
>> use crate::task_tracking::{self, update_active_operations};
>> -use crate::DataBlob;
>> +use crate::{DataBlob, LocalDatastoreLruCache};
>>
>> static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
>> LazyLock::new(|| Mutex::new(HashMap::new()));
>> @@ -136,6 +137,7 @@ pub struct DataStoreImpl {
>> last_digest: Option<[u8; 32]>,
>> sync_level: DatastoreFSyncLevel,
>> backend_config: DatastoreBackendConfig,
>> + lru_store_caching: Option<LocalDatastoreLruCache>,
>> }
>>
>> impl DataStoreImpl {
>> @@ -151,6 +153,7 @@ impl DataStoreImpl {
>> last_digest: None,
>> sync_level: Default::default(),
>> backend_config: Default::default(),
>> + lru_store_caching: None,
>> })
>> }
>> }
>> @@ -255,6 +258,37 @@ impl DataStore {
>> Ok(backend_type)
>> }
>>
>> + pub fn cache(&self) -> Option<&LocalDatastoreLruCache> {
>> + self.inner.lru_store_caching.as_ref()
>> + }
>> +
>> + /// Check if the digest is present in the local datastore cache.
>> + /// Always returns false if there is no cache configured for this datastore.
>> + pub fn cache_contains(&self, digest: &[u8; 32]) -> bool {
>> + if let Some(cache) = self.inner.lru_store_caching.as_ref() {
>> + return cache.contains(digest);
>> + }
>> + false
>> + }
>> +
>> + /// Insert digest as most recently used on in the cache.
>> + /// Returns with success if there is no cache configured for this datastore.
>> + pub fn cache_insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
>> + if let Some(cache) = self.inner.lru_store_caching.as_ref() {
>> + return cache.insert(digest, chunk);
>> + }
>> + Ok(())
>> + }
>> +
>
> Missing rustdoc comment for this pub fn
>
>> + pub fn cacher(&self) -> Result<Option<S3Cacher>, Error> {
>> + self.backend().map(|backend| match backend {
>> + DatastoreBackend::S3(s3_client) => {
>> + Some(S3Cacher::new(s3_client, self.inner.chunk_store.clone()))
>> + }
>> + DatastoreBackend::Filesystem => None,
>> + })
>> + }
>> +
>> pub fn lookup_datastore(
>> name: &str,
>> operation: Option<Operation>,
>> @@ -437,6 +471,33 @@ impl DataStore {
>> .parse_property_string(config.backend.as_deref().unwrap_or(""))?,
>> )?;
>>
>> + let lru_store_caching = if DatastoreBackendType::S3 == backend_config.ty.unwrap_or_default()
>> + {
>> + let mut cache_capacity = 0;
>> + if let Ok(fs_info) = proxmox_sys::fs::fs_info(&chunk_store.base_path()) {
>> + cache_capacity = fs_info.available / (16 * 1024 * 1024);
>> + }
>> + if let Some(max_cache_size) = backend_config.max_cache_size {
>> + info!(
>> + "Got requested max cache size {max_cache_size} for store {}",
>> + config.name
>> + );
>> + let max_cache_capacity = max_cache_size.as_u64() / (16 * 1024 * 1024);
>> + cache_capacity = cache_capacity.min(max_cache_capacity);
>> + }
>> + let cache_capacity = usize::try_from(cache_capacity).unwrap_or_default();
>> +
>> + info!(
>> + "Using datastore cache with capacity {cache_capacity} for store {}",
>> + config.name
>> + );
>> +
>> + let cache = LocalDatastoreLruCache::new(cache_capacity, chunk_store.clone());
>> + Some(cache)
>> + } else {
>> + None
>> + };
>> +
>> Ok(DataStoreImpl {
>> chunk_store,
>> gc_mutex: Mutex::new(()),
>> @@ -446,6 +507,7 @@ impl DataStore {
>> last_digest,
>> sync_level: tuning.sync_level.unwrap_or_default(),
>> backend_config,
>> + lru_store_caching,
>> })
>> }
>>
>> @@ -1580,6 +1642,12 @@ impl DataStore {
>> chunk_count += 1;
>>
>> if atime < min_atime {
>> + if let Some(cache) = self.cache() {
>> + let mut digest_bytes = [0u8; 32];
>> + hex::decode_to_slice(digest.as_bytes(), &mut digest_bytes)?;
>> + // ignore errors, phase 3 will retry cleanup anyways
>> + let _ = cache.remove(&digest_bytes);
>> + }
>> delete_list.push(content.key);
>> if bad {
>> gc_status.removed_bad += 1;
>> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
>> index ca6fdb7d8..b9eb035c2 100644
>> --- a/pbs-datastore/src/lib.rs
>> +++ b/pbs-datastore/src/lib.rs
>> @@ -217,3 +217,6 @@ pub use snapshot_reader::SnapshotReader;
>>
>> mod local_chunk_reader;
>> pub use local_chunk_reader::LocalChunkReader;
>> +
>> +mod local_datastore_lru_cache;
>> +pub use local_datastore_lru_cache::LocalDatastoreLruCache;
>> diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
>> new file mode 100644
>> index 000000000..bb64c52f3
>> --- /dev/null
>> +++ b/pbs-datastore/src/local_datastore_lru_cache.rs
>> @@ -0,0 +1,172 @@
>> +//! Use a local datastore as cache for operations on a datastore attached via
>> +//! a network layer (e.g. via the S3 backend).
>> +
>> +use std::future::Future;
>> +use std::sync::Arc;
>> +
>> +use anyhow::{bail, Error};
>> +use http_body_util::BodyExt;
>> +
>> +use pbs_tools::async_lru_cache::{AsyncCacher, AsyncLruCache};
>> +use proxmox_s3_client::S3Client;
>> +
>> +use crate::ChunkStore;
>> +use crate::DataBlob;
>> +
>
> v missing rustdoc for pub struct
>
>> +#[derive(Clone)]
>> +pub struct S3Cacher {
>> + client: Arc<S3Client>,
>> + store: Arc<ChunkStore>,
>> +}
>> +
>> +impl AsyncCacher<[u8; 32], ()> for S3Cacher {
>> + fn fetch(
>> + &self,
>> + key: [u8; 32],
>> + ) -> Box<dyn Future<Output = Result<Option<()>, Error>> + Send + 'static> {
>> + let client = self.client.clone();
>> + let store = self.store.clone();
>
> rather use Arc::clone(&...) here to avoid ambiguity
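> (i.e. something like `let client = Arc::clone(&self.client);`)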
>
>> + Box::new(async move {
>> + let object_key = crate::s3::object_key_from_digest(&key)?;
>> + match client.get_object(object_key).await? {
>> + None => bail!("could not fetch object with key {}", hex::encode(key)),
>> + Some(response) => {
>> + let bytes = response.content.collect().await?.to_bytes();
>> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
>> + store.insert_chunk(&chunk, &key)?;
>> + Ok(Some(()))
>> + }
>> + }
>> + })
>> + }
>> +}
>> +
>> +impl S3Cacher {
>
> v missing rustdoc for pub fn
>
>> + pub fn new(client: Arc<S3Client>, store: Arc<ChunkStore>) -> Self {
>> + Self { client, store }
>> + }
>> +}
>> +
>> +/// LRU cache using local datastore for caching chunks
>> +///
>> +/// Uses a LRU cache, but without storing the values in-memory but rather
>> +/// on the filesystem
>> +pub struct LocalDatastoreLruCache {
>> + cache: AsyncLruCache<[u8; 32], ()>,
>> + store: Arc<ChunkStore>,
>> +}
>> +
>> +impl LocalDatastoreLruCache {
>> + pub fn new(capacity: usize, store: Arc<ChunkStore>) -> Self {
>> + Self {
>> + cache: AsyncLruCache::new(capacity),
>> + store,
>> + }
>> + }
>> +
>> + /// Insert a new chunk into the local datastore cache.
>> + ///
>> + /// Fails if the chunk cannot be inserted successfully.
>> + pub fn insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
>> + self.store.insert_chunk(chunk, digest)?;
>> + self.cache.insert(*digest, (), |digest| {
>> + let (path, _digest_str) = self.store.chunk_path(&digest);
>> + // Truncate to free up space but keep the inode around, since that
>> + // is used as marker for chunks in use by garbage collection.
>> + if let Err(err) = nix::unistd::truncate(&path, 0) {
>> + if err != nix::errno::Errno::ENOENT {
>> + return Err(Error::from(err));
>> + }
>> + }
>> + Ok(())
>> + })
>> + }
>> +
>> + /// Remove a chunk from the local datastore cache.
>> + ///
>> + /// Fails if the chunk cannot be deleted successfully.
>> + pub fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
>> + self.cache.remove(*digest);
>> + let (path, _digest_str) = self.store.chunk_path(digest);
>> + std::fs::remove_file(path).map_err(Error::from)
>> + }
>> +
>
> v missing rustdoc
>
>> + pub async fn access(
>> + &self,
>> + digest: &[u8; 32],
>> + cacher: &mut S3Cacher,
>> + ) -> Result<Option<DataBlob>, Error> {
>> + if self
>> + .cache
>> + .access(*digest, cacher, |digest| {
>> + let (path, _digest_str) = self.store.chunk_path(&digest);
>> + // Truncate to free up space but keep the inode around, since that
>> + // is used as marker for chunks in use by garbage collection.
>> + if let Err(err) = nix::unistd::truncate(&path, 0) {
>> + if err != nix::errno::Errno::ENOENT {
>> + return Err(Error::from(err));
>> + }
>> + }
>> + Ok(())
>> + })
>> + .await?
>> + .is_some()
>> + {
>> + let (path, _digest_str) = self.store.chunk_path(digest);
>> + let mut file = match std::fs::File::open(&path) {
>> + Ok(file) => file,
>> + Err(err) => {
>> + // Expected chunk to be present since LRU cache has it, but it is missing
>> + // locally, try to fetch again
>> + if err.kind() == std::io::ErrorKind::NotFound {
>> + let object_key = crate::s3::object_key_from_digest(digest)?;
>> + match cacher.client.get_object(object_key).await? {
>> + None => {
>> + bail!("could not fetch object with key {}", hex::encode(digest))
>> + }
>> + Some(response) => {
>> + let bytes = response.content.collect().await?.to_bytes();
>> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
>> + self.store.insert_chunk(&chunk, digest)?;
>> + std::fs::File::open(&path)?
>> + }
>> + }
>> + } else {
>> + return Err(Error::from(err));
>> + }
>> + }
>> + };
>> + let chunk = match DataBlob::load_from_reader(&mut file) {
>> + Ok(chunk) => chunk,
>> + Err(err) => {
>> + use std::io::Seek;
>> + // Check if file is empty marker file, try fetching content if so
>> + if file.seek(std::io::SeekFrom::End(0))? == 0 {
>> + let object_key = crate::s3::object_key_from_digest(digest)?;
>> + match cacher.client.get_object(object_key).await? {
>> + None => {
>> + bail!("could not fetch object with key {}", hex::encode(digest))
>> + }
>> + Some(response) => {
>> + let bytes = response.content.collect().await?.to_bytes();
>> + let chunk = DataBlob::from_raw(bytes.to_vec())?;
>> + self.store.insert_chunk(&chunk, digest)?;
>> + let mut file = std::fs::File::open(&path)?;
>> + DataBlob::load_from_reader(&mut file)?
>> + }
>> + }
>> + } else {
>> + return Err(err);
>> + }
>> + }
>> + };
>> + Ok(Some(chunk))
>> + } else {
>> + Ok(None)
>> + }
>> + }
>> +
>
> v missing rustdoc
>
>> + pub fn contains(&self, digest: &[u8; 32]) -> bool {
>> + self.cache.contains(*digest)
>> + }
>> +}
>
* Re: [pbs-devel] [PATCH proxmox-backup v8 33/45] datastore: local chunk reader: get cached chunk from local cache store
2025-07-18 11:36 ` Lukas Wagner
@ 2025-07-18 15:04 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 15:04 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 1:36 PM, Lukas Wagner wrote:
> With my nits addresses:
>
> Reviewed-by: Lukas Wagner <l.wagner@proxmox.com>
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Check if a chunk is contained in the local cache and if so prefer
>> fetching it from the cache instead of pulling it via the S3 api. This
>> improves performance and reduces number of requests to the backend.
>>
>> Basic restore performance tests:
>>
>> Restored a snapshot containing the linux git repository (on-disk size
>> 5.069 GiB, compressed 3.718 GiB) from an AWS S3 backed datastore, with
>> and without cached contents:
>> non cached: 691.95 s
>> all cached: 74.89 s
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> pbs-datastore/src/local_chunk_reader.rs | 31 +++++++++++++++++++++----
>> 1 file changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/pbs-datastore/src/local_chunk_reader.rs b/pbs-datastore/src/local_chunk_reader.rs
>> index f5aa217ae..7ad44c4fa 100644
>> --- a/pbs-datastore/src/local_chunk_reader.rs
>> +++ b/pbs-datastore/src/local_chunk_reader.rs
>> @@ -2,7 +2,7 @@ use std::future::Future;
>> use std::pin::Pin;
>> use std::sync::Arc;
>>
>> -use anyhow::{bail, Error};
>> +use anyhow::{bail, format_err, Error};
>> use http_body_util::BodyExt;
>>
>> use pbs_api_types::CryptMode;
>> @@ -68,9 +68,18 @@ impl ReadChunk for LocalChunkReader {
>> fn read_raw_chunk(&self, digest: &[u8; 32]) -> Result<DataBlob, Error> {
>> let chunk = match &self.backend {
>> DatastoreBackend::Filesystem => self.store.load_chunk(digest)?,
>> - DatastoreBackend::S3(s3_client) => {
>> - proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?
>
> better use Arc::clone here :)
adapted here ...
>
>> - }
>> + DatastoreBackend::S3(s3_client) => match self.store.cache() {
>> + None => proxmox_async::runtime::block_on(fetch(s3_client.clone(), digest))?,
>> + Some(cache) => {
>> + let mut cacher = self
>> + .store
>> + .cacher()?
>> + .ok_or(format_err!("no cacher for datastore"))?;
>> + proxmox_async::runtime::block_on(cache.access(digest, &mut cacher))?.ok_or(
>> + format_err!("unable to access chunk with digest {}", hex::encode(digest)),
>> + )?
>> + }
>> + },
>> };
>> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
>>
>> @@ -98,7 +107,19 @@ impl AsyncReadChunk for LocalChunkReader {
>> let raw_data = tokio::fs::read(&path).await?;
>> DataBlob::load_from_reader(&mut &raw_data[..])?
>> }
>> - DatastoreBackend::S3(s3_client) => fetch(s3_client.clone(), digest).await?,
>> + DatastoreBackend::S3(s3_client) => match self.store.cache() {
>> + None => fetch(s3_client.clone(), digest).await?,
>
> same here
... and here
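As an aside, the distinction behind this nit can be shown with a tiny
self-contained example, unrelated to the patch itself: `Arc::clone` makes
explicit that only the reference count is bumped, while method-call syntax
leaves open whether the handle or the inner value is being cloned.

    use std::sync::Arc;

    fn main() {
        let client = Arc::new(String::from("s3-client"));
        // Method syntax compiles, but reads ambiguously: is the Arc handle
        // or the inner String being cloned here?
        let a = client.clone();
        // Arc::clone spells out that this is a cheap reference-count bump.
        let b = Arc::clone(&client);
        assert_eq!(Arc::strong_count(&client), 3);
        drop((a, b));
    }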
>
>> + Some(cache) => {
>> + let mut cacher = self
>> + .store
>> + .cacher()?
>> + .ok_or(format_err!("no cacher for datastore"))?;
>> + cache.access(digest, &mut cacher).await?.ok_or(format_err!(
>> + "unable to access chunk with digest {}",
>> + hex::encode(digest)
>> + ))?
>> + }
>> + },
>> };
>> self.ensure_crypt_mode(chunk.crypt_mode()?)?;
>>
>
* Re: [pbs-devel] [PATCH proxmox-backup v8 34/45] api: backup: add no-cache flag to bypass local datastore cache
2025-07-18 11:41 ` Lukas Wagner
@ 2025-07-18 15:37 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 15:37 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 1:41 PM, Lukas Wagner wrote:
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Adds the `no-cache` flag so the client can request to bypass the
>> local datastore cache for chunk uploads. This is mainly intended for
>> debugging and benchmarking, but can be used in cases where the caching
>> is known to be ineffective (no possible deduplication).
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - no changes
>>
>> examples/upload-speed.rs | 1 +
>> pbs-client/src/backup_writer.rs | 4 +++-
>> proxmox-backup-client/src/benchmark.rs | 1 +
>> proxmox-backup-client/src/main.rs | 8 ++++++++
>> src/api2/backup/environment.rs | 3 +++
>> src/api2/backup/mod.rs | 3 +++
>> src/api2/backup/upload_chunk.rs | 9 +++++++++
>> src/server/push.rs | 1 +
>> 8 files changed, 29 insertions(+), 1 deletion(-)
>>
>> diff --git a/examples/upload-speed.rs b/examples/upload-speed.rs
>> index e4b570ec5..8a6594a47 100644
>> --- a/examples/upload-speed.rs
>> +++ b/examples/upload-speed.rs
>> @@ -25,6 +25,7 @@ async fn upload_speed() -> Result<f64, Error> {
>> &(BackupType::Host, "speedtest".to_string(), backup_time).into(),
>> false,
>> true,
>> + false,
>> )
>> .await?;
>>
>> diff --git a/pbs-client/src/backup_writer.rs b/pbs-client/src/backup_writer.rs
>> index 1253ef561..ce5bd9375 100644
>> --- a/pbs-client/src/backup_writer.rs
>> +++ b/pbs-client/src/backup_writer.rs
>> @@ -82,6 +82,7 @@ impl BackupWriter {
>> backup: &BackupDir,
>> debug: bool,
>> benchmark: bool,
>> + no_cache: bool,
>> ) -> Result<Arc<BackupWriter>, Error> {
>> let mut param = json!({
>> "backup-type": backup.ty(),
>> @@ -89,7 +90,8 @@ impl BackupWriter {
>> "backup-time": backup.time,
>> "store": datastore,
>> "debug": debug,
>> - "benchmark": benchmark
>> + "benchmark": benchmark,
>> + "no-cache": no_cache,
>> });
>>
>> if !ns.is_root() {
>> diff --git a/proxmox-backup-client/src/benchmark.rs b/proxmox-backup-client/src/benchmark.rs
>> index a6f24d745..ed21c7a91 100644
>> --- a/proxmox-backup-client/src/benchmark.rs
>> +++ b/proxmox-backup-client/src/benchmark.rs
>> @@ -236,6 +236,7 @@ async fn test_upload_speed(
>> &(BackupType::Host, "benchmark".to_string(), backup_time).into(),
>> false,
>> true,
>> + true,
>> )
>> .await?;
>>
>> diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
>> index 44f4f5db5..83fc9309a 100644
>> --- a/proxmox-backup-client/src/main.rs
>> +++ b/proxmox-backup-client/src/main.rs
>> @@ -742,6 +742,12 @@ fn spawn_catalog_upload(
>> optional: true,
>> default: false,
>> },
>> + "no-cache": {
>> + type: Boolean,
>> + description: "Bypass local datastore cache for network storages.",
>> + optional: true,
>> + default: false,
>> + },
>> }
>> }
>> )]
>> @@ -754,6 +760,7 @@ async fn create_backup(
>> change_detection_mode: Option<BackupDetectionMode>,
>> dry_run: bool,
>> skip_e2big_xattr: bool,
>> + no_cache: bool,
>> limit: ClientRateLimitConfig,
>> _info: &ApiMethod,
>> _rpcenv: &mut dyn RpcEnvironment,
>> @@ -960,6 +967,7 @@ async fn create_backup(
>> &snapshot,
>> true,
>> false,
>> + no_cache,
>> )
>> .await?;
>>
>> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
>> index 369385368..448659e74 100644
>> --- a/src/api2/backup/environment.rs
>> +++ b/src/api2/backup/environment.rs
>> @@ -113,6 +113,7 @@ pub struct BackupEnvironment {
>> result_attributes: Value,
>> auth_id: Authid,
>> pub debug: bool,
>> + pub no_cache: bool,
>> pub formatter: &'static dyn OutputFormatter,
>> pub worker: Arc<WorkerTask>,
>> pub datastore: Arc<DataStore>,
>> @@ -129,6 +130,7 @@ impl BackupEnvironment {
>> worker: Arc<WorkerTask>,
>> datastore: Arc<DataStore>,
>> backup_dir: BackupDir,
>> + no_cache: bool,
>> ) -> Result<Self, Error> {
>> let state = SharedBackupState {
>> finished: false,
>> @@ -149,6 +151,7 @@ impl BackupEnvironment {
>> worker,
>> datastore,
>> debug: tracing::enabled!(tracing::Level::DEBUG),
>> + no_cache,
>> formatter: JSON_FORMATTER,
>> backup_dir,
>> last_backup: None,
>> diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
>> index 026f1f106..ae61ff697 100644
>> --- a/src/api2/backup/mod.rs
>> +++ b/src/api2/backup/mod.rs
>> @@ -53,6 +53,7 @@ pub const API_METHOD_UPGRADE_BACKUP: ApiMethod = ApiMethod::new(
>> ("backup-time", false, &BACKUP_TIME_SCHEMA),
>> ("debug", true, &BooleanSchema::new("Enable verbose debug logging.").schema()),
>> ("benchmark", true, &BooleanSchema::new("Job is a benchmark (do not keep data).").schema()),
>> + ("no-cache", true, &BooleanSchema::new("Disable local datastore cache for network storages").schema()),
>> ]),
>> )
>> ).access(
>> @@ -79,6 +80,7 @@ fn upgrade_to_backup_protocol(
>> async move {
>> let debug = param["debug"].as_bool().unwrap_or(false);
>> let benchmark = param["benchmark"].as_bool().unwrap_or(false);
>> + let no_cache = param["no-cache"].as_bool().unwrap_or(false);
>>
>> let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
>>
>> @@ -214,6 +216,7 @@ fn upgrade_to_backup_protocol(
>> worker.clone(),
>> datastore,
>> backup_dir,
>> + no_cache,
>> )?;
>>
>> env.debug = debug;
>> diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
>> index d97975b34..623b405dd 100644
>> --- a/src/api2/backup/upload_chunk.rs
>> +++ b/src/api2/backup/upload_chunk.rs
>> @@ -262,6 +262,15 @@ async fn upload_to_backend(
>> );
>> }
>>
>> + if env.no_cache {
>> + let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
>> + let is_duplicate = s3_client
>> + .upload_with_retry(object_key, data, false)
>> + .await
>> + .context("failed to upload chunk to s3 backend")?;
>> + return Ok((digest, size, encoded_size, is_duplicate));
>> + }
>> +
>> // Avoid re-upload to S3 if the chunk is either present in the LRU cache or the chunk
>> // file exists on filesystem. The latter means that the chunk has been present in the
>> + // past and was not cleaned up by garbage collection, so it is contained in the S3 object store.
>> diff --git a/src/server/push.rs b/src/server/push.rs
>> index e71012ed8..6a31d2abe 100644
>> --- a/src/server/push.rs
>> +++ b/src/server/push.rs
>> @@ -828,6 +828,7 @@ pub(crate) async fn push_snapshot(
>> snapshot,
>> false,
>> false,
>> + false,
>
> There is already a FIXME for it above the BackupWriter::start function, but with the *third*
> boolean parameter I think it is overdue to use a parameter struct instead of plain parameters
> for this function.
Okay, I factored the backup writer parameters out into a
BackupWriterOptions struct, passing this along wherever needed. Makes
the call sites clearer.
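A rough sketch of what such a parameter struct could look like; the name
BackupWriterOptions is taken from the reply above, while the exact fields
are assumptions based on the current positional booleans:

    /// Hypothetical grouping of the flags currently passed positionally
    /// to BackupWriter::start.
    #[derive(Debug, Default, Clone, Copy)]
    pub struct BackupWriterOptions {
        pub debug: bool,
        pub benchmark: bool,
        pub no_cache: bool,
    }

    fn main() {
        // Call sites name each flag instead of passing bare booleans, and
        // new flags can be added without touching every caller.
        let options = BackupWriterOptions {
            no_cache: true,
            ..Default::default()
        };
        println!("{options:?}");
    }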
>
>> )
>> .await?;
>>
>
* Re: [pbs-devel] [PATCH proxmox-backup v8 35/45] api/datastore: implement refresh endpoint for stores with s3 backend
2025-07-18 12:01 ` Lukas Wagner
@ 2025-07-18 15:51 ` Christian Ebner
0 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-18 15:51 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 2:00 PM, Lukas Wagner wrote:
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> Allows to easily refresh the contents of the local cache store for
>> datastores backed by an S3 object store.
>>
>> In order to guarantee that no read or write operations are ongoing,
>> the store is first set into the maintenance mode `S3Refresh`. Objects
>> are then fetched into a temporary directory to avoid losing contents
>> and consistency in case of an error. Once all objects have been
>> fetched, clears out existing contents and moves the newly fetched
>> contents in place.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - add more error context
>> - fix clippy warning
>>
>> pbs-datastore/src/datastore.rs | 172 ++++++++++++++++++++++++++++++++-
>> src/api2/admin/datastore.rs | 34 +++++++
>> 2 files changed, 205 insertions(+), 1 deletion(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index cab0f5b4d..c63759f9a 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -10,11 +10,13 @@ use anyhow::{bail, format_err, Context, Error};
>> use http_body_util::BodyExt;
>> use nix::unistd::{unlinkat, UnlinkatFlags};
>> use pbs_tools::lru_cache::LruCache;
>> +use proxmox_lang::try_block;
>> +use tokio::io::AsyncWriteExt;
>> use tracing::{info, warn};
>>
>> use proxmox_human_byte::HumanByte;
>> use proxmox_s3_client::{
>> - S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3PathPrefix,
>> + S3Client, S3ClientConfig, S3ClientOptions, S3ClientSecretsConfig, S3ObjectKey, S3PathPrefix,
>> };
>> use proxmox_schema::ApiType;
>>
>> @@ -2132,4 +2134,172 @@ impl DataStore {
>> pub fn old_locking(&self) -> bool {
>> *OLD_LOCKING
>> }
>> +
>> + /// Set the datastore's maintenance mode to `S3Refresh`, fetch from S3 object store, clear and
>> + /// replace the local cache store contents. Once finished disable the maintenance mode again.
>> + /// Returns with error for other datastore backends without setting the maintenance mode.
>> + pub async fn s3_refresh(self: &Arc<Self>) -> Result<(), Error> {
>> + match self.backend()? {
>> + DatastoreBackend::Filesystem => bail!("store '{}' not backed by S3", self.name()),
>> + DatastoreBackend::S3(s3_client) => {
>> + try_block!({
>> + let _lock = pbs_config::datastore::lock_config()?;
>> + let (mut section_config, _digest) = pbs_config::datastore::config()?;
>> + let mut datastore: DataStoreConfig =
>> + section_config.lookup("datastore", self.name())?;
>> + datastore.set_maintenance_mode(Some(MaintenanceMode {
>> + ty: MaintenanceType::S3Refresh,
>> + message: None,
>> + }))?;
>> + section_config.set_data(self.name(), "datastore", &datastore)?;
>> + pbs_config::datastore::save_config(&section_config)?;
>> + drop(_lock);
>
>
> No need to drop the lock, since the block ends anyway, right?
Agreed, dropping that here.
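The point about the explicit drop can be illustrated stand-alone: a lock
guard is released when it goes out of scope, so dropping it on the last
line of a block is redundant.

    use std::sync::Mutex;

    fn main() {
        let config_lock = Mutex::new(());
        {
            let _lock = config_lock.lock().unwrap();
            // ... update the config while holding the lock ...
        } // _lock is dropped here, no explicit drop(_lock) needed
        assert!(config_lock.try_lock().is_ok());
    }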
>
> Also this should be done in a tokio::spawn_blocking, if I'm not mistaken?
> (the try_block! is only a convenience wrapper that wraps the block in a function,
> it doesn't spawn the block on the blocking thread pool)
True, that also allows me to get rid of the try_block after adapting it
here ...
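A minimal sketch of the pattern under discussion, assuming tokio and anyhow
as dependencies; update_config stands in for the lock/lookup/save sequence
from the patch:

    use anyhow::Context;

    // Stand-in for the synchronous config update (file locking and
    // rewriting) that should not run inline on the async executor.
    fn update_config() -> Result<(), anyhow::Error> {
        // ... lock config, set maintenance mode, save config ...
        Ok(())
    }

    #[tokio::main]
    async fn main() -> Result<(), anyhow::Error> {
        // spawn_blocking moves the closure onto tokio's blocking thread
        // pool; the outer `?` handles a panicked or cancelled task, the
        // inner one the config error itself.
        tokio::task::spawn_blocking(update_config)
            .await
            .context("blocking config update failed")??;
        Ok(())
    }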
>
>> + Ok::<(), Error>(())
>> + })
>> + .context("failed to set maintenance mode")?;
>> +
>> + let store_base = self.base_path();
>> +
>> + let tmp_base = proxmox_sys::fs::make_tmp_dir(&store_base, None)
>> + .with_context(|| format!("failed to create temporary content folder in {store_base:?}"))?;
>> +
>> + let backup_user = pbs_config::backup_user().context("failed to get backup user")?;
>> + let mode = nix::sys::stat::Mode::from_bits_truncate(0o0644);
>> + let file_create_options = CreateOptions::new()
>> + .perm(mode)
>> + .owner(backup_user.uid)
>> + .group(backup_user.gid);
>> + let mode = nix::sys::stat::Mode::from_bits_truncate(0o0755);
>> + let dir_create_options = CreateOptions::new()
>> + .perm(mode)
>> + .owner(backup_user.uid)
>> + .group(backup_user.gid);
>> +
>> + let list_prefix = S3PathPrefix::Some(S3_CONTENT_PREFIX.to_string());
>> + let store_prefix = format!("{}/{S3_CONTENT_PREFIX}/", self.name());
>> + let mut next_continuation_token: Option<String> = None;
>> + loop {
>> + let list_objects_result = s3_client
>> + .list_objects_v2(&list_prefix, next_continuation_token.as_deref())
>> + .await
>> + .context("failed to list object")?;
>> +
>> + let objects_to_fetch: Vec<S3ObjectKey> = list_objects_result
>> + .contents
>> + .into_iter()
>> + .map(|item| item.key)
>> + .collect();
>> +
>> + for object_key in objects_to_fetch {
>> + let object_path = format!("{object_key}");
>> + let object_path = object_path.strip_prefix(&store_prefix).with_context(||
>> + format!("failed to strip store context prefix {store_prefix} for {object_key}")
>> + )?;
>> + if object_path.ends_with(NAMESPACE_MARKER_FILENAME) {
>> + continue;
>> + }
>> +
>> + info!("Fetching object {object_path}");
>> +
>> + let file_path = tmp_base.join(object_path);
>> + if let Some(parent) = file_path.parent() {
>> + proxmox_sys::fs::create_path(
>> + parent,
>> + Some(dir_create_options),
>> + Some(dir_create_options),
>> + )?;
>> + }
>> +
>> + let mut target_file = tokio::fs::OpenOptions::new()
>> + .write(true)
>> + .create(true)
>> + .truncate(true)
>> + .read(true)
>> + .open(&file_path)
>> + .await
>> + .with_context(|| {
>> + format!("failed to create target file {file_path:?}")
>> + })?;
>> +
>> + if let Some(response) = s3_client
>> + .get_object(object_key)
>> + .await
>> + .with_context(|| format!("failed to fetch object {object_path}"))?
>> + {
>> + let data = response
>> + .content
>> + .collect()
>> + .await
>> + .context("failed to collect object contents")?;
>> + target_file
>> + .write_all(&data.to_bytes())
>> + .await
>> + .context("failed to write to target file")?;
>> + file_create_options
>> + .apply_to(&mut target_file, &file_path)
>> + .context("failed to set target file create options")?;
>> + target_file
>> + .flush()
>> + .await
>> + .context("failed to flush target file")?;
>> + } else {
>> + bail!("failed to download {object_path}, not found");
>> + }
>> + }
>> +
>> + if list_objects_result.is_truncated {
>> + next_continuation_token = list_objects_result
>> + .next_continuation_token
>> + .as_ref()
>> + .cloned();
>> + continue;
>> + }
>> + break;
>> + }
>> +
>> + for ty in ["vm", "ct", "host", "ns"] {
>> + let store_base_clone = store_base.clone();
>> + let tmp_base_clone = tmp_base.clone();
>> + tokio::task::spawn_blocking(move || {
>> + let type_dir = store_base_clone.join(ty);
>> + if let Err(err) = std::fs::remove_dir_all(&type_dir) {
>> + if err.kind() != io::ErrorKind::NotFound {
>> + return Err(err).with_context(|| {
>> + format!("failed to remove old contents in {type_dir:?}")
>> + });
>> + }
>> + }
>> + let tmp_type_dir = tmp_base_clone.join(ty);
>> + if let Err(err) = std::fs::rename(&tmp_type_dir, &type_dir) {
>> + if err.kind() != io::ErrorKind::NotFound {
>> + return Err(err)
>> + .with_context(|| format!("failed to rename {tmp_type_dir:?}"));
>> + }
>> + }
>> + Ok::<(), Error>(())
>> + })
>> + .await?
>> + .with_context(|| format!("failed to refresh {store_base:?}"))?;
>> + }
>> +
>> + std::fs::remove_dir_all(&tmp_base).with_context(|| {
>> + format!("failed to cleanup temporary content in {tmp_base:?}")
>> + })?;
>> +
>> + try_block!({
>> + let _lock = pbs_config::datastore::lock_config()?;
>> + let (mut section_config, _digest) = pbs_config::datastore::config()?;
>> + let mut datastore: DataStoreConfig =
>> + section_config.lookup("datastore", self.name())?;
>> + datastore.set_maintenance_mode(None)?;
>> + section_config.set_data(self.name(), "datastore", &datastore)?;
>> + pbs_config::datastore::save_config(&section_config)?;
>> + drop(_lock);
>> + Ok::<(), Error>(())
>> + })
>> + .context("failed to clear maintenance mode")?;
>
> Same thing here.
... and here
>
>> + }
>> + }
>> + Ok(())
>> + }
>
> In general, I think the s3_refresh function is a good candidate to be broken up into multiple smaller functions
> - setting/unsetting maintenance mode
> - creating the new temporary dir
> - retrieving the objects from S3
> - replacing the old contents
> - etc.
Okay, I will try to factor out parts of it, although I don't see too much
benefit, as this is rather self-contained at the moment.
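One way the suggested split could look, sketched with stub types so it
compiles on its own; all helper names are assumptions illustrating the
review suggestion, not the final code:

    use std::path::{Path, PathBuf};

    struct DataStore;
    type Error = Box<dyn std::error::Error>;

    impl DataStore {
        fn set_s3_refresh_maintenance_mode(&self, _on: bool) -> Result<(), Error> {
            Ok(()) // stub: lock, update and save the datastore config
        }
        fn create_tmp_content_dir(&self) -> Result<PathBuf, Error> {
            Ok(PathBuf::from("/tmp")) // stub: make_tmp_dir in the store base
        }
        async fn fetch_all_objects(&self, _tmp: &Path) -> Result<(), Error> {
            Ok(()) // stub: paginated list_objects_v2 + get_object loop
        }
        fn replace_local_contents(&self, _tmp: &Path) -> Result<(), Error> {
            Ok(()) // stub: swap type dirs, clean up the temporary dir
        }

        // The entry point then reads as a sequence of named steps, and the
        // maintenance mode is cleared even if a step fails.
        async fn s3_refresh(&self) -> Result<(), Error> {
            self.set_s3_refresh_maintenance_mode(true)?;
            let tmp_base = self.create_tmp_content_dir()?;
            let result = match self.fetch_all_objects(&tmp_base).await {
                Ok(()) => self.replace_local_contents(&tmp_base),
                Err(err) => Err(err),
            };
            self.set_s3_refresh_maintenance_mode(false)?;
            result
        }
    }

    #[tokio::main]
    async fn main() -> Result<(), Error> {
        DataStore.s3_refresh().await
    }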
>> }
>> diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
>> index 80740e3fb..41cbee4de 100644
>> --- a/src/api2/admin/datastore.rs
>> +++ b/src/api2/admin/datastore.rs
>> @@ -2707,6 +2707,39 @@ pub async fn unmount(store: String, rpcenv: &mut dyn RpcEnvironment) -> Result<V
>> Ok(json!(upid))
>> }
>>
>> +#[api(
>> + protected: true,
>> + input: {
>> + properties: {
>> + store: {
>> + schema: DATASTORE_SCHEMA,
>> + },
>> + }
>> + },
>> + returns: {
>> + schema: UPID_SCHEMA,
>> + },
>> + access: {
>> + permission: &Permission::Privilege(&["datastore", "{store}"], PRIV_DATASTORE_MODIFY, false),
>> + },
>> +)]
>> +/// Refresh datastore contents from S3 to local cache store.
>> +pub async fn s3_refresh(store: String, rpcenv: &mut dyn RpcEnvironment) -> Result<Value, Error> {
>> + let datastore = DataStore::lookup_datastore(&store, Some(Operation::Lookup))?;
>> + let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
>> + let to_stdout = rpcenv.env_type() == RpcEnvironmentType::CLI;
>> +
>> + let upid = WorkerTask::spawn(
>> + "s3-refresh",
>> + Some(store),
>> + auth_id.to_string(),
>> + to_stdout,
>> + move |_worker| async move { datastore.s3_refresh().await },
>> + )?;
>> +
>> + Ok(json!(upid))
>> +}
>> +
>> #[sortable]
>> const DATASTORE_INFO_SUBDIRS: SubdirMap = &[
>> (
>> @@ -2773,6 +2806,7 @@ const DATASTORE_INFO_SUBDIRS: SubdirMap = &[
>> &Router::new().download(&API_METHOD_PXAR_FILE_DOWNLOAD),
>> ),
>> ("rrd", &Router::new().get(&API_METHOD_GET_RRD_STATS)),
>> + ("s3-refresh", &Router::new().put(&API_METHOD_S3_REFRESH)),
>> (
>> "snapshots",
>> &Router::new()
>
* Re: [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup
2025-07-18 10:02 ` Lukas Wagner
@ 2025-07-19 12:28 ` Christian Ebner
2025-07-22 9:25 ` Lukas Wagner
0 siblings, 1 reply; 109+ messages in thread
From: Christian Ebner @ 2025-07-19 12:28 UTC (permalink / raw)
To: Lukas Wagner, Proxmox Backup Server development discussion
On 7/18/25 12:02 PM, Lukas Wagner wrote:
>
>
> On 2025-07-15 14:53, Christian Ebner wrote:
>> In order to be able to create datastores with an s3 object store
>> backend, implement an s3 client selector and expose it in the
>> datastore edit window, together with the additional bucket name field
>> to associate with the datastore's s3 backend.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 7:
>> - use field endpoint instead of host, fixing the selector listing
>>
>> www/Makefile | 1 +
>> www/form/S3ClientSelector.js | 33 +++++++++++++++++++++++++++
>> www/window/DataStoreEdit.js | 44 ++++++++++++++++++++++++++++++++++++
>> 3 files changed, 78 insertions(+)
>> create mode 100644 www/form/S3ClientSelector.js
>>
>> diff --git a/www/Makefile b/www/Makefile
>> index 767713c75..410e9f3e0 100644
>> --- a/www/Makefile
>> +++ b/www/Makefile
>> @@ -42,6 +42,7 @@ JSSRC= \
>> Schema.js \
>> form/TokenSelector.js \
>> form/AuthidSelector.js \
>> + form/S3ClientSelector.js \
>> form/RemoteSelector.js \
>> form/RemoteTargetSelector.js \
>> form/DataStoreSelector.js \
>> diff --git a/www/form/S3ClientSelector.js b/www/form/S3ClientSelector.js
>> new file mode 100644
>> index 000000000..243484909
>> --- /dev/null
>> +++ b/www/form/S3ClientSelector.js
>> @@ -0,0 +1,33 @@
>> +Ext.define('PBS.form.S3ClientSelector', {
>> + extend: 'Proxmox.form.ComboGrid',
>> + alias: 'widget.pbsS3ClientSelector',
>> +
>> + allowBlank: false,
>> + autoSelect: false,
>> + valueField: 'id',
>> + displayField: 'id',
>> +
>> + store: {
>> + model: 'pmx-s3client',
>> + autoLoad: true,
>> + sorters: 'id',
>> + },
>> +
>> + listConfig: {
>> + columns: [
>> + {
>> + header: gettext('S3 Client ID'),
>> + sortable: true,
>> + dataIndex: 'id',
>> + renderer: Ext.String.htmlEncode,
>> + flex: 1,
>> + },
>> + {
>> + header: gettext('Endpoint'),
>> + sortable: true,
>> + dataIndex: 'endpoint',
>> + flex: 1,
>> + },
>> + ],
>> + },
>> +});
>> diff --git a/www/window/DataStoreEdit.js b/www/window/DataStoreEdit.js
>> index cd94f0335..3379bf773 100644
>> --- a/www/window/DataStoreEdit.js
>> +++ b/www/window/DataStoreEdit.js
>> @@ -61,6 +61,7 @@ Ext.define('PBS.DataStoreEdit', {
>> comboItems: [
>> ['__default__', 'Local'],
>> ['removable', 'Removable'],
>> + ['s3', 'S3 (experimental)'],
>
> Missing gettext here as well
added, thanks!
>> ],
>> cbind: {
>> disabled: '{!isCreate}',
>> @@ -68,18 +69,32 @@ Ext.define('PBS.DataStoreEdit', {
>> listeners: {
>> change: function (checkbox, selected) {
>> let isRemovable = selected === 'removable';
>> + let isS3 = selected === 's3';
>>
>> let inputPanel = checkbox.up('inputpanel');
>> let pathField = inputPanel.down('[name=path]');
>> let uuidEditField = inputPanel.down('[name=backing-device]');
>> + let bucketField = inputPanel.down('[name=bucket]');
>> + let s3ClientSelector = inputPanel.down('[name=s3client]');
>>
>> uuidEditField.setDisabled(!isRemovable);
>> uuidEditField.allowBlank = !isRemovable;
>> uuidEditField.setValue('');
>>
>> + bucketField.setDisabled(!isS3);
>> + bucketField.allowBlank = !isS3;
>> + bucketField.setValue('');
>> +
>> + s3ClientSelector.setDisabled(!isS3);
>> + s3ClientSelector.allowBlank = !isS3;
>> + s3ClientSelector.setValue('');
>> +
>> if (isRemovable) {
>> pathField.setFieldLabel(gettext('Path on Device'));
>> pathField.setEmptyText(gettext('A relative path'));
>> + } else if (isS3) {
>> + pathField.setFieldLabel(gettext('Store Cache'));
>> + pathField.setEmptyText(gettext('An absolute path'));
>> } else {
>> pathField.setFieldLabel(gettext('Backing Path'));
>> pathField.setEmptyText(gettext('An absolute path'));
>
> Yup, with these additional changes I'd definitely prefer the viewModel approach mentioned earlier :)
Well... I did check this out, also based on the off-list input you gave
me on this one, but unfortunately the viewModel and binding approach
does not work just yet, since the `pmxDisplayField` xtype used for the
`path` does not implement the logic for bindings of the `fieldLabel` and
`emtpyText`. Therefore these cannot be dynamically adapted by the use of
formulas and/or view model data.
So I would rather do this as a follow-up instead, to not further
delay the patch series until I've figured out how to correctly add these
as bindable values in the corresponding component.
But I do see the benefits of the viewModel and formulas approach, thanks
a lot for your input on that.
>> @@ -98,6 +113,15 @@ Ext.define('PBS.DataStoreEdit', {
>> emptyText: gettext('An absolute path'),
>> validator: (val) => val?.trim() !== '/',
>> },
>> + {
>> + xtype: 'pbsS3ClientSelector',
>> + name: 's3client',
>> + fieldLabel: gettext('S3 Client ID'),
>> + disabled: true,
>> + cbind: {
>> + editable: '{isCreate}',
>> + },
>> + },
>> ],
>> column2: [
>> {
>> @@ -132,6 +156,13 @@ Ext.define('PBS.DataStoreEdit', {
>> },
>> emptyText: gettext('Device path'),
>> },
>> + {
>> + xtype: 'proxmoxtextfield',
>> + name: 'bucket',
>> + fieldLabel: gettext('Bucket'),
>> + allowBlank: false,
>> + disabled: true,
>> + },
>> ],
>> columnB: [
>> {
>> @@ -154,7 +185,20 @@ Ext.define('PBS.DataStoreEdit', {
>> if (me.isCreate) {
>> // New datastores default to using the notification system
>> values['notification-mode'] = 'notification-system';
>> +
>> + if (values.s3client) {
>> + let s3BackendConf = {
>> + type: 's3',
>> + client: values.s3client,
>> + bucket: values.bucket,
>> + };
>> + values.backend = PBS.Utils.printPropertyString(s3BackendConf);
>> + }
>> }
>> +
>> + delete values.s3client;
>> + delete values.bucket;
>> +
>> return values;
>> },
>> },
>
* [pbs-devel] superseded: [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
` (54 preceding siblings ...)
2025-07-18 13:16 ` [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Lukas Wagner
@ 2025-07-19 12:52 ` Christian Ebner
55 siblings, 0 replies; 109+ messages in thread
From: Christian Ebner @ 2025-07-19 12:52 UTC (permalink / raw)
To: pbs-devel
superseded-by version 9:
https://lore.proxmox.com/pbs-devel/20250719125035.9926-1-c.ebner@proxmox.com/T/
* Re: [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup
2025-07-19 12:28 ` Christian Ebner
@ 2025-07-22 9:25 ` Lukas Wagner
0 siblings, 0 replies; 109+ messages in thread
From: Lukas Wagner @ 2025-07-22 9:25 UTC (permalink / raw)
To: Christian Ebner, pbs-devel
On Sat Jul 19, 2025 at 2:28 PM CEST, Christian Ebner wrote:
> On 7/18/25 12:02 PM, Lukas Wagner wrote:
>>> listeners: {
>>> change: function (checkbox, selected) {
>>> let isRemovable = selected === 'removable';
>>> + let isS3 = selected === 's3';
>>>
>>> let inputPanel = checkbox.up('inputpanel');
>>> let pathField = inputPanel.down('[name=path]');
>>> let uuidEditField = inputPanel.down('[name=backing-device]');
>>> + let bucketField = inputPanel.down('[name=bucket]');
>>> + let s3ClientSelector = inputPanel.down('[name=s3client]');
>>>
>>> uuidEditField.setDisabled(!isRemovable);
>>> uuidEditField.allowBlank = !isRemovable;
>>> uuidEditField.setValue('');
>>>
>>> + bucketField.setDisabled(!isS3);
>>> + bucketField.allowBlank = !isS3;
>>> + bucketField.setValue('');
>>> +
>>> + s3ClientSelector.setDisabled(!isS3);
>>> + s3ClientSelector.allowBlank = !isS3;
>>> + s3ClientSelector.setValue('');
>>> +
>>> if (isRemovable) {
>>> pathField.setFieldLabel(gettext('Path on Device'));
>>> pathField.setEmptyText(gettext('A relative path'));
>>> + } else if (isS3) {
>>> + pathField.setFieldLabel(gettext('Store Cache'));
>>> + pathField.setEmptyText(gettext('An absolute path'));
>>> } else {
>>> pathField.setFieldLabel(gettext('Backing Path'));
>>> pathField.setEmptyText(gettext('An absolute path'));
>>
>> Yup, with these additional changes I'd definitely prefer the viewModel approach mentioned earlier :)
>
> Well... I did check this out, also based on the off-list input you gave
> me on this one, but unfortunately the viewModel and binding approach
> does not work just yet, since the `pmxDisplayField` xtype used for the
> `path` does not implement the logic for bindings of the `fieldLabel` and
> `emtpyText`. Therefore these cannot be dynamically adapted by the use of
> formulas and/or view model data.
>
> So I would rather do this as a follow-up instead, to not further
> delay the patch series until I've figured out how to correctly add these
> as bindable values in the corresponding component.
>
> But I do see the benefits of the viewModel and formulas approach, thanks
> a lot for your input on that.
As a workaround we could just keep the fieldLabel and emptyText static
and use separate fields that are hidden/shown on demand, supported by
the view model. Might need some additional code in onSetValues and/or
onGetValues to then map these fields to the correct parameter in the end.
But as you said, this is definitely material for a follow-up, no
pressure now.
Thread overview: 109+ messages
2025-07-15 12:52 [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 1/9] s3 client: add crate for AWS s3 compatible object store client Christian Ebner
2025-07-15 21:13 ` [pbs-devel] partially-applied-series: " Thomas Lamprecht
2025-07-15 21:13 ` [pve-devel] partially-applied-series: [pbs-devel] " Thomas Lamprecht
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 2/9] s3 client: implement AWS signature v4 request authentication Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 3/9] s3 client: add dedicated type for s3 object keys Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 4/9] s3 client: add type for last modified timestamp in responses Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 5/9] s3 client: add helper to parse http date headers Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 6/9] s3 client: implement methods to operate on s3 objects in bucket Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 7/9] s3 client: add example usage for basic operations Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 8/9] pbs-api-types: extend datastore config by backend config enum Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox v8 9/9] pbs-api-types: maintenance: add new maintenance mode S3 refresh Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 01/45] datastore: add helpers for path/digest to s3 object key conversion Christian Ebner
2025-07-18 7:24 ` Lukas Wagner
2025-07-18 8:34 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 02/45] config: introduce s3 object store client configuration Christian Ebner
2025-07-18 7:22 ` Lukas Wagner
2025-07-18 8:37 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 03/45] api: config: implement endpoints to manipulate and list s3 configs Christian Ebner
2025-07-18 7:32 ` Lukas Wagner
2025-07-18 8:40 ` Christian Ebner
2025-07-18 9:07 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 04/45] api: datastore: check s3 backend bucket access on datastore create Christian Ebner
2025-07-18 7:40 ` Lukas Wagner
2025-07-18 8:55 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 05/45] api/cli: add endpoint and command to check s3 client connection Christian Ebner
2025-07-18 7:43 ` Lukas Wagner
2025-07-18 9:04 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 06/45] datastore: allow to get the backend for a datastore Christian Ebner
2025-07-18 7:52 ` Lukas Wagner
2025-07-18 9:10 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 07/45] api: backup: store datastore backend in runtime environment Christian Ebner
2025-07-18 7:54 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 08/45] api: backup: conditionally upload chunks to s3 object store backend Christian Ebner
2025-07-18 8:11 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 09/45] api: backup: conditionally upload blobs " Christian Ebner
2025-07-18 8:13 ` Lukas Wagner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 10/45] api: backup: conditionally upload indices " Christian Ebner
2025-07-18 8:20 ` Lukas Wagner
2025-07-18 9:24 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 11/45] api: backup: conditionally upload manifest " Christian Ebner
2025-07-18 8:26 ` Lukas Wagner
2025-07-18 9:33 ` Christian Ebner
2025-07-15 12:52 ` [pbs-devel] [PATCH proxmox-backup v8 12/45] api: datastore: conditionally upload client log to s3 backend Christian Ebner
2025-07-18 8:28 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 13/45] sync: pull: conditionally upload content " Christian Ebner
2025-07-18 8:35 ` Lukas Wagner
2025-07-18 9:43 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 14/45] api: reader: fetch chunks based on datastore backend Christian Ebner
2025-07-18 8:38 ` Lukas Wagner
2025-07-18 9:58 ` Christian Ebner
2025-07-18 10:03 ` Lukas Wagner
2025-07-18 10:10 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 15/45] datastore: local chunk reader: read chunks based on backend Christian Ebner
2025-07-18 8:45 ` Lukas Wagner
2025-07-18 10:11 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 16/45] verify worker: add datastore backed to verify worker Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 17/45] verify: implement chunk verification for stores with s3 backend Christian Ebner
2025-07-18 8:56 ` Lukas Wagner
2025-07-18 11:45 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 18/45] datastore: create namespace marker in " Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 19/45] datastore: create/delete protected marker file on s3 storage backend Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 20/45] datastore: prune groups/snapshots from s3 object store backend Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 21/45] datastore: get and set owner for s3 " Christian Ebner
2025-07-18 9:25 ` Lukas Wagner
2025-07-18 12:12 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 22/45] datastore: implement garbage collection for s3 backend Christian Ebner
2025-07-18 9:47 ` Lukas Wagner
2025-07-18 14:31 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 23/45] ui: add datastore type selector and reorganize component layout Christian Ebner
2025-07-18 9:55 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 24/45] ui: add s3 client edit window for configuration create/edit Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 25/45] ui: add s3 client view for configuration Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 26/45] ui: expose the s3 client view in the navigation tree Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 27/45] ui: add s3 client selector and bucket field for s3 backend setup Christian Ebner
2025-07-18 10:02 ` Lukas Wagner
2025-07-19 12:28 ` Christian Ebner
2025-07-22 9:25 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 28/45] tools: lru cache: add removed callback for evicted cache nodes Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 29/45] tools: async lru cache: implement insert, remove and contains methods Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 30/45] datastore: add local datastore cache for network attached storages Christian Ebner
2025-07-18 11:24 ` Lukas Wagner
2025-07-18 14:59 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 31/45] api: backup: use local datastore cache on s3 backend chunk upload Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 32/45] api: reader: use local datastore cache on s3 backend chunk fetching Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 33/45] datastore: local chunk reader: get cached chunk from local cache store Christian Ebner
2025-07-18 11:36 ` Lukas Wagner
2025-07-18 15:04 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 34/45] api: backup: add no-cache flag to bypass local datastore cache Christian Ebner
2025-07-18 11:41 ` Lukas Wagner
2025-07-18 15:37 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 35/45] api/datastore: implement refresh endpoint for stores with s3 backend Christian Ebner
2025-07-18 12:01 ` Lukas Wagner
2025-07-18 15:51 ` Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 36/45] cli: add dedicated subcommand for datastore s3 refresh Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 37/45] ui: render s3 refresh as valid maintenance type and task description Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 38/45] ui: expose s3 refresh button for datastores backed by object store Christian Ebner
2025-07-18 12:46 ` Lukas Wagner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 39/45] datastore: conditionally upload atime marker chunk to s3 backend Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 40/45] bin: implement client subcommands for s3 configuration manipulation Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 41/45] bin: expose reuse-datastore flag for proxmox-backup-manager Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 42/45] datastore: mark store as in-use by setting marker on s3 backend Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 43/45] datastore: run s3-refresh when reusing a datastore with " Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 44/45] api/ui: add flag to allow overwriting in-use marker for " Christian Ebner
2025-07-15 12:53 ` [pbs-devel] [PATCH proxmox-backup v8 45/45] docs: Add section describing how to setup s3 backed datastore Christian Ebner
2025-07-18 13:14 ` Maximiliano Sandoval
2025-07-18 14:38 ` Christian Ebner
2025-07-18 13:16 ` [pbs-devel] [PATCH proxmox{, -backup} v8 00/54] fix #2943: S3 storage backend for datastores Lukas Wagner
2025-07-19 12:52 ` [pbs-devel] superseded: " Christian Ebner