From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [RFC v2 proxmox/bookworm-stable proxmox-backup 00/42] S3 storage backend for datastores
Date: Thu, 29 May 2025 16:31:25 +0200
Message-ID: <20250529143207.694497-1-c.ebner@proxmox.com>
Disclaimer: These patches are in a development state and are not
intended for production use.
This patch series aims to add S3 compatible object stores as storage
backend for PBS datastores. A PBS local cache store using the regular
datastore layout is used for faster operation, bypassing requests to
the S3 API when possible. Further, the local cache store allows
keeping frequently used chunks and is used to avoid expensive metadata
updates on the object store, e.g. by using local marker files during
garbage collection.
Backups are created by uploading chunks to the corresponding S3 bucket,
while keeping the index files in the local cache store. On backup
finish, the snapshot metadata is persisted to the S3 storage backend.
Snapshot restores preferably read chunks from the local cache store.
Chunks not present there are downloaded from the S3 object store and
inserted into the cache.
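The restore read path described above can be sketched roughly as
follows. This is only an illustration under simplifying assumptions,
not the code from this series: `RemoteStore`, `ChunkReader` and
`read_chunk` are hypothetical names, and in-memory maps stand in for
both the S3 bucket and the on-disk cache store.

```rust
use std::collections::HashMap;

/// Chunk digest used as lookup key (hypothetical alias for this sketch).
type Digest = [u8; 32];

/// Stand-in for the S3 object store; counts how often it is queried.
struct RemoteStore {
    objects: HashMap<Digest, Vec<u8>>,
    requests: usize,
}

impl RemoteStore {
    /// Simulates a download; every call counts as one remote request.
    fn fetch(&mut self, digest: &Digest) -> Option<Vec<u8>> {
        self.requests += 1;
        self.objects.get(digest).cloned()
    }
}

/// Stand-in for a reader backed by the local cache store.
struct ChunkReader {
    cache: HashMap<Digest, Vec<u8>>,
    remote: RemoteStore,
}

impl ChunkReader {
    /// Serve the chunk from the local cache if present, otherwise
    /// download it from the remote store and insert it into the cache.
    fn read_chunk(&mut self, digest: &Digest) -> Option<Vec<u8>> {
        if let Some(data) = self.cache.get(digest) {
            return Some(data.clone());
        }
        let data = self.remote.fetch(digest)?;
        self.cache.insert(*digest, data.clone());
        Some(data)
    }
}
```

In this sketch, reading the same chunk twice results in a single
request to the remote store, since the second read is served from the
cache.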
Listing and snapshot metadata operations currently rely solely on the
local cache store, with the intention to provide a mechanism to
re-sync and merge with objects stored on the S3 backend if requested.
Sending this patch series as RFC to get some initial feedback, mostly
on the S3 client implementation part and the corresponding
configuration integration with PBS, which is already in an advanced
stage and warrants initial review and real-world testing.
Datastore operations on the S3 backend are still work in progress,
but feedback on that is appreciated very much as well.
Among the open points still being worked on are:
- Consistency between local cache and S3 store.
- Sync and merge of namespace, group, snapshot and index files when
required or requested.
- Advanced packing mechanism for chunks to significantly reduce the
number of API requests and therefore be more cost-effective.
- Reduction of in-memory copies for chunks/blobs and recalculation of
checksums.
Testing:
For testing, an S3 compatible object store provided via Ceph RADOS
gateway can be set up as follows. This was performed on a
pre-existing Ceph Reef 18.2 cluster.
Install radosgw on all the nodes:
```
apt install radosgw
```
On one node, generate client keyring:
```
ceph-authtool --create-keyring /etc/pve/priv/ceph.client.radosgw.keyring
```
For each node, generate key and add it to the keyring (adapt name
accordingly):
```
ceph-authtool /etc/pve/priv/ceph.client.radosgw.keyring -n client.radosgw.pve-c0-n1 --gen-key
```
Set up capabilities for the client keys:
```
ceph-authtool -n client.radosgw.pve-c0-n1 --cap osd 'allow rwx' --cap mon 'allow rwx' /etc/pve/priv/ceph.client.radosgw.keyring
```
Add the keys (repeat for each) to the cluster:
```
ceph -k /etc/pve/priv/ceph.client.admin.keyring auth add client.radosgw.pve-c0-n1 -i /etc/pve/priv/ceph.client.radosgw.keyring
```
For each client, add a config based on the one below to /etc/ceph/ceph.conf:
```
[client.radosgw.pve-c0-n1]
host = pve-c0-n1
keyring = /etc/pve/priv/ceph.client.radosgw.keyring
log file = /var/log/ceph/client.radosgw.$host.log
rgw_dns_name = s3.pve-c0-n1.local
```
Restart the service on each node, e.g.:
```
systemctl daemon-reload
systemctl restart radosgw.service
```
Set up a new user, generating the access key and secret key shown in
the output:
```
radosgw-admin user create --uid=testuser --display-name="TestUser" --email=your@mail.com
```
Since the configuration and keyring are located on the pmxcfs, the
gateway service must only be started after pve-cluster. To ensure
this, add the following override to
`/etc/systemd/system/radosgw.service.d/override.conf`:
```
[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/init.d/radosgw
Description=LSB: radosgw RESTful rados gateway
After=pve-cluster.service
Wants=pve-cluster.service
```
Since the client enforces TLS, a custom certificate must be added by
extending the config with the paths to a custom generated certificate
and key:
```
[client.radosgw.pve-c0-n1]
host = pve-c0-n1
keyring = /etc/pve/priv/ceph.client.radosgw.keyring
log file = /var/log/ceph/client.radosgw.$host.log
rgw_dns_name = s3.pve-c0-n1.local
rgw_frontends = "beast ssl_port=7480 ssl_certificate=/etc/pve/ceph/server-cert.pem ssl_private_key=/etc/pve/ceph/server-key.pem"
```
Finally, a new bucket can be created using the `s3cmd` CLI tool after
its initial configuration.
Most notable changes since the previous RFC version 1 [0]:
- Improved and fixed various issues with consistency and locking,
especially with respect to backup group/snapshot pruning
- Fix and improve listing and deletion of multiple objects, also taking
S3 API object count limits into account.
- Fix namespace handling, especially with respect to prune.
- Fix pull sync jobs not uploading chunks to S3 object store backend
- Fix permissions for s3 config api endpoints
- Fixed issues with hyper::Body not being consumed when skipping cached
chunks, resulting in stream errors on upload
- Use md5 checksum for consistency checks, as crc32 is not yet
implemented and is ignored by many S3 compatible APIs, e.g. RADOS gateway.
- Use iso8601 parser for the last modified timestamp instead of the
limited own implementation.
- Rework proxmox-backup-manager s3 command for basic sanity checking
- Smaller bugfixes, code style cleanups and refactoring
[0] https://lore.proxmox.com/pbs-devel/20250519114640.303640-1-c.ebner@proxmox.com/T/
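To illustrate the object count limits mentioned in the changelog
above: a single S3 DeleteObjects request accepts at most 1000 keys, so
larger deletions have to be split into multiple requests. A minimal
sketch of such batching is shown below. The function name
`delete_batches` and the constant name are made up for this example
and are not taken from the patches.

```rust
/// Maximum number of keys accepted by a single S3 DeleteObjects request.
const MAX_KEYS_PER_DELETE: usize = 1000;

/// Split object keys into batches, each small enough to be sent in one
/// DeleteObjects request (hypothetical helper for illustration only).
fn delete_batches(keys: &[String]) -> Vec<Vec<String>> {
    keys.chunks(MAX_KEYS_PER_DELETE)
        .map(|batch| batch.to_vec())
        .collect()
}
```

With 2500 keys, this yields three batches of 1000, 1000 and 500 keys,
i.e. three delete requests instead of one oversized request.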
proxmox:
Christian Ebner (2):
pbs-api-types: add types for S3 client configs and secrets
pbs-api-types: extend datastore config by backend config enum
pbs-api-types/src/datastore.rs | 58 +++++++++++++-
pbs-api-types/src/lib.rs | 3 +
pbs-api-types/src/s3.rs | 138 +++++++++++++++++++++++++++++++++
3 files changed, 198 insertions(+), 1 deletion(-)
create mode 100644 pbs-api-types/src/s3.rs
proxmox-backup:
Christian Ebner (40):
api: fix minor formatting issues
bin: sort submodules alphabetically
datastore: ignore missing owner file when removing group directory
verify: refactor verify related functions to be methods of worker
s3 client: add crate for AWS S3 compatible object store client
s3 client: implement AWS signature v4 request authentication
s3 client: add dedicated type for s3 object keys
s3 client: add type for last modified timestamp in responses
s3 client: add helper to parse http date headers
s3 client: implement methods to operate on s3 objects in bucket
config: introduce s3 object store client configuration
api: config: implement endpoints to manipulate and list s3 configs
api: datastore: check S3 backend bucket access on datastore create
api/bin: add endpoint and command to check s3 client connection
datastore: allow to get the backend for a datastore
api: backup: store datastore backend in runtime environment
api: backup: conditionally upload chunks to S3 object store backend
api: backup: conditionally upload blobs to S3 object store backend
api: backup: conditionally upload indices to S3 object store backend
api: backup: conditionally upload manifest to S3 object store backend
sync: pull: conditionally upload content to S3 backend
api: reader: fetch chunks based on datastore backend
datastore: local chunk reader: read chunks based on backend
verify worker: add datastore backed to verify worker
verify: implement chunk verification for stores with s3 backend
datastore: create namespace marker in S3 backend
datastore: create/delete protected marker file on S3 storage backend
datastore: prune groups/snapshots from S3 object store backend
datastore: get and set owner for S3 store backend
datastore: implement garbage collection for s3 backend
ui: add S3 client edit window for configuration create/edit
ui: add S3 client view for configuration
ui: expose the S3 client view in the navigation tree
ui: add s3 bucket selector and allow to set s3 backend
tools: lru cache: add removed callback for evicted cache nodes
tools: async lru cache: implement insert, remove and contains methods
datastore: add local datastore cache for network attached storages
api: backup: use local datastore cache on S3 backend chunk upload
api: reader: use local datastore cache on S3 backend chunk fetching
api: backup: add no-cache flag to bypass local datastore cache
Cargo.toml | 8 +
examples/upload-speed.rs | 1 +
pbs-client/src/backup_writer.rs | 4 +-
pbs-config/src/lib.rs | 1 +
pbs-config/src/s3.rs | 82 ++
pbs-datastore/Cargo.toml | 3 +
pbs-datastore/src/backup_info.rs | 53 +-
pbs-datastore/src/cached_chunk_reader.rs | 6 +-
pbs-datastore/src/datastore.rs | 435 ++++++++-
pbs-datastore/src/dynamic_index.rs | 1 +
pbs-datastore/src/lib.rs | 4 +
pbs-datastore/src/local_chunk_reader.rs | 37 +-
.../src/local_datastore_lru_cache.rs | 116 +++
pbs-s3-client/Cargo.toml | 29 +
pbs-s3-client/src/aws_sign_v4.rs | 140 +++
pbs-s3-client/src/client.rs | 594 ++++++++++++
pbs-s3-client/src/lib.rs | 122 +++
pbs-s3-client/src/object_key.rs | 64 ++
pbs-s3-client/src/response_reader.rs | 343 +++++++
pbs-tools/src/async_lru_cache.rs | 46 +-
pbs-tools/src/lru_cache.rs | 42 +-
proxmox-backup-client/src/benchmark.rs | 1 +
proxmox-backup-client/src/main.rs | 8 +
src/api2/admin/datastore.rs | 52 +-
src/api2/admin/mod.rs | 2 +
src/api2/admin/s3.rs | 72 ++
src/api2/backup/environment.rs | 145 ++-
src/api2/backup/mod.rs | 107 +--
src/api2/backup/upload_chunk.rs | 93 +-
src/api2/config/datastore.rs | 41 +-
src/api2/config/mod.rs | 2 +
src/api2/config/s3.rs | 305 ++++++
src/api2/reader/environment.rs | 12 +-
src/api2/reader/mod.rs | 59 +-
src/backup/verify.rs | 879 +++++++++---------
src/bin/proxmox-backup-manager.rs | 1 +
src/bin/proxmox_backup_manager/mod.rs | 30 +-
src/bin/proxmox_backup_manager/s3.rs | 34 +
src/server/pull.rs | 62 +-
src/server/push.rs | 1 +
src/server/verify_job.rs | 12 +-
www/Makefile | 3 +
www/NavigationTree.js | 6 +
www/config/S3BucketView.js | 144 +++
www/form/S3BucketSelector.js | 40 +
www/window/DataStoreEdit.js | 35 +
www/window/S3BucketEdit.js | 125 +++
47 files changed, 3753 insertions(+), 649 deletions(-)
create mode 100644 pbs-config/src/s3.rs
create mode 100644 pbs-datastore/src/local_datastore_lru_cache.rs
create mode 100644 pbs-s3-client/Cargo.toml
create mode 100644 pbs-s3-client/src/aws_sign_v4.rs
create mode 100644 pbs-s3-client/src/client.rs
create mode 100644 pbs-s3-client/src/lib.rs
create mode 100644 pbs-s3-client/src/object_key.rs
create mode 100644 pbs-s3-client/src/response_reader.rs
create mode 100644 src/api2/admin/s3.rs
create mode 100644 src/api2/config/s3.rs
create mode 100644 src/bin/proxmox_backup_manager/s3.rs
create mode 100644 www/config/S3BucketView.js
create mode 100644 www/form/S3BucketSelector.js
create mode 100644 www/window/S3BucketEdit.js
--
2.39.5
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
Thread overview: 47+ messages
2025-06-04 11:58 ` [pbs-devel] [RFC v2 proxmox/bookworm-stable proxmox-backup 00/42] S3 storage backend for datastores Lukas Wagner
2025-06-06 7:40 ` Christian Ebner
2025-06-06 11:12 ` Lukas Wagner
2025-06-16 14:27 ` [pbs-devel] superseded: " Christian Ebner