From: "Hannes Laimer" <h.laimer@proxmox.com>
To: "Christian Ebner" <c.ebner@proxmox.com>,
	"Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Hannes Laimer" <h.laimer@proxmox.com>
Cc: pbs-devel <pbs-devel-bounces@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v9 01/46] datastore: add helpers for path/digest to s3 object key conversion
Date: Mon, 21 Jul 2025 14:55:14 +0200
Message-ID: <DBHQZ5061F1T.WQLXASSCFGG6@proxmox.com>
In-Reply-To: <68faffd5-eb4e-4618-8f83-5d239b5ccea2@proxmox.com>

On Mon Jul 21, 2025 at 2:51 PM CEST, Christian Ebner wrote:
> On 7/21/25 2:29 PM, Hannes Laimer wrote:
>> On Sat Jul 19, 2025 at 2:49 PM CEST, Christian Ebner wrote:
>>> Adds helper methods to generate the s3 object keys given a relative
>>> path and filename for datastore contents or digest in case of chunk
>>> files.
>>>
>>> Regular datastore contents are stored by grouping them with a content
>>> prefix in the object key. In order to keep the object key length
>>> small, given the max limit of 1024 bytes [0], `.cnt` is used as
>>> content prefix. Chunks on the other hand are prefixed by `.chunks`,
>>> same as on regular datastores.
>>>
>>> The prefix allows for selective listing of either contents or chunks
>>> by providing the prefix to the respective api calls.
>>>
>>> [0] https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
>>>
>>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>>> ---
>>> changes since version 8:
>>> - added unit tests for helper functions
>>>
>>>   Cargo.toml               |   1 +
>>>   pbs-datastore/Cargo.toml |   1 +
>>>   pbs-datastore/src/lib.rs |   1 +
>>>   pbs-datastore/src/s3.rs  | 114 +++++++++++++++++++++++++++++++++++++++
>>>   4 files changed, 117 insertions(+)
>>>   create mode 100644 pbs-datastore/src/s3.rs
>>>
>>> diff --git a/Cargo.toml b/Cargo.toml
>>> index adfa427d1..97783ddd5 100644
>>> --- a/Cargo.toml
>>> +++ b/Cargo.toml
>>> @@ -77,6 +77,7 @@ proxmox-rest-server = { version = "1", features = [ "templates" ] }
>>>   proxmox-router = { version = "3.2.2", default-features = false }
>>>   proxmox-rrd = "1"
>>>   proxmox-rrd-api-types = "1.0.2"
>>> +proxmox-s3-client = "1.0.0"
>>>   # everything but pbs-config and pbs-client use "api-macro"
>>>   proxmox-schema = "4"
>>>   proxmox-section-config = "3"
>>> diff --git a/pbs-datastore/Cargo.toml b/pbs-datastore/Cargo.toml
>>> index 56f6e9094..c42eff165 100644
>>> --- a/pbs-datastore/Cargo.toml
>>> +++ b/pbs-datastore/Cargo.toml
>>> @@ -34,6 +34,7 @@ proxmox-borrow.workspace = true
>>>   proxmox-human-byte.workspace = true
>>>   proxmox-io.workspace = true
>>>   proxmox-lang.workspace=true
>>> +proxmox-s3-client = { workspace = true, features = [ "impl" ] }
>>>   proxmox-schema = { workspace = true, features = [ "api-macro" ] }
>>>   proxmox-serde = { workspace = true, features = [ "serde_json" ] }
>>>   proxmox-sys.workspace = true
>>> diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
>>> index 5014b6c09..ffd0d91b2 100644
>>> --- a/pbs-datastore/src/lib.rs
>>> +++ b/pbs-datastore/src/lib.rs
>>> @@ -182,6 +182,7 @@ pub mod manifest;
>>>   pub mod paperkey;
>>>   pub mod prune;
>>>   pub mod read_chunk;
>>> +pub mod s3;
>>>   pub mod store_progress;
>>>   pub mod task_tracking;
>>>   
>>> diff --git a/pbs-datastore/src/s3.rs b/pbs-datastore/src/s3.rs
>>> new file mode 100644
>>> index 000000000..79e7548fb
>>> --- /dev/null
>>> +++ b/pbs-datastore/src/s3.rs
>>> @@ -0,0 +1,114 @@
>>> +use std::path::{Path, PathBuf};
>>> +
>>> +use anyhow::{bail, format_err, Error};
>>> +
>>> +use proxmox_s3_client::S3ObjectKey;
>>> +
>>> +/// Object key prefix to group regular datastore contents (not chunks)
>>> +pub const S3_CONTENT_PREFIX: &str = ".cnt";
>>> +
>>> +/// Generate a relative object key with content prefix from given path and filename
>>> +pub fn object_key_from_path(path: &Path, filename: &str) -> Result<S3ObjectKey, Error> {
>>> +    // Force the use of relative paths, otherwise this would lose the content prefix
>>> +    if path.is_absolute() {
>>> +        bail!("cannot generate object key from absolute path");
>>> +    }
>>> +    if filename.contains('/') {
>>> +        bail!("invalid filename containing slashes");
>>> +    }
>>> +    let mut object_path = PathBuf::from(S3_CONTENT_PREFIX);
>>> +    object_path.push(path);
>>> +    object_path.push(filename);
>>> +
>>> +    let object_key_str = object_path
>>> +        .to_str()
>>> +        .ok_or_else(|| format_err!("unexpected object key path"))?;
>>> +    Ok(S3ObjectKey::from(object_key_str))
>>> +}
>>> +
>>> +/// Generate a relative object key with chunk prefix from given digest
>>> +pub fn object_key_from_digest(digest: &[u8; 32]) -> Result<S3ObjectKey, Error> {
>>> +    let object_key = hex::encode(digest);
>>> +    let digest_prefix = &object_key[..4];
>>> +    let object_key_string = format!(".chunks/{digest_prefix}/{object_key}");
>> 
>> I just skimmed the S3 key specs, but I was wondering if having the
>> `digest_prefix` in the key actually adds anything. For FSs sure, but S3?
>> They say this is just chars for them, they don't infer hierarchy on `/`s,
>> so whatever optimisation they do with the prefix present, they should
>> also do without it, no?
>
> Yes, however the intention was to keep this analogous to the filesystem
> based datastores in order to be able to fetch the contents by external
> tooling without the need to have a running PBS instance. So you could 
> recreate a datastore locally if needed.
>
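For illustration, mapping keys back to a local layout could look roughly
like the sketch below. The function name and the rule that content keys
simply drop their `.cnt/` prefix are assumptions made for this example,
not something taken from the patch; chunk keys are kept verbatim since
they already mirror the `.chunks/<prefix>/<digest>` layout.

```rust
use std::path::{Path, PathBuf};

/// Map an object key to a path below a local datastore base directory.
///
/// Assumption (not from the patch): content keys lose their `.cnt/`
/// prefix, chunk keys are joined to the base directory unchanged.
fn local_path_for_key(base: &Path, key: &str) -> Option<PathBuf> {
    // Reject keys that could escape the base directory.
    if key.starts_with('/') || key.split('/').any(|c| c == "..") {
        return None;
    }
    if let Some(rest) = key.strip_prefix(".cnt/") {
        // Regular datastore contents, e.g. `.cnt/vm/100/<snapshot>/<file>`
        Some(base.join(rest))
    } else if key.starts_with(".chunks/") {
        // Chunks, e.g. `.chunks/bb9f/bb9f8d...`
        Some(base.join(key))
    } else {
        // Unknown prefix: leave the decision to the caller.
        None
    }
}
```

With this, a listing of all object keys in the bucket would be enough to
decide where each downloaded object belongs on disk.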

makes sense :) just didn't know if longer/shorter keys have any
performance implications (probably not, I assume)
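
For reference, a quick sketch of the size difference between the two
layouts. `chunk_key_flat` is a hypothetical variant without the digest
prefix, and `hex_encode` is a local stand-in for `hex::encode` so the
snippet has no dependencies; the 1024 byte constant is the limit quoted
in the commit message.

```rust
/// Object key length limit in bytes, as quoted in the commit message.
const S3_MAX_KEY_LEN: usize = 1024;

/// Hex-encode a digest without external crates (stand-in for `hex::encode`).
fn hex_encode(digest: &[u8; 32]) -> String {
    digest.iter().map(|b| format!("{b:02x}")).collect()
}

/// Key layout from the patch: `.chunks/<first 4 hex chars>/<hex digest>`.
fn chunk_key_with_prefix(digest: &[u8; 32]) -> String {
    let hex = hex_encode(digest);
    format!(".chunks/{}/{}", &hex[..4], hex)
}

/// Hypothetical flat layout without the digest prefix component.
fn chunk_key_flat(digest: &[u8; 32]) -> String {
    format!(".chunks/{}", hex_encode(digest))
}

/// Check a key against the length limit (keys here are ASCII, so
/// `len()` in bytes equals the character count).
fn fits_key_limit(key: &str) -> bool {
    key.len() <= S3_MAX_KEY_LEN
}
```

The prefixed key comes out at 77 bytes versus 72 bytes for the flat one,
so both are far below the limit and the extra 5 bytes are unlikely to
matter.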

>> 
>>> +    Ok(S3ObjectKey::from(object_key_string.as_str()))
>>> +}
>>> +
>>> +/// Generate a relative object key with chunk prefix from given digest, extended by suffix
>>> +pub fn object_key_from_digest_with_suffix(
>>> +    digest: &[u8; 32],
>>> +    suffix: &str,
>>> +) -> Result<S3ObjectKey, Error> {
>>> +    if suffix.contains('/') {
>>> +        bail!("invalid suffix containing slashes");
>>> +    }
>>> +    let object_key = hex::encode(digest);
>>> +    let digest_prefix = &object_key[..4];
>>> +    let object_key_string = format!(".chunks/{digest_prefix}/{object_key}{suffix}");
>>> +    Ok(S3ObjectKey::from(object_key_string.as_str()))
>>> +}
>>> +
>>> +#[test]
>>> +fn test_object_key_from_path() {
>>> +    let path = Path::new("vm/100/2025-07-14T14:20:02Z");
>>> +    let filename = "drive-scsci0.img.fidx";
>>> +    assert_eq!(
>>> +        object_key_from_path(path, filename).unwrap().to_string(),
>>> +        ".cnt/vm/100/2025-07-14T14:20:02Z/drive-scsci0.img.fidx",
>>> +    );
>>> +}
>>> +
>>> +#[test]
>>> +fn test_object_key_from_empty_path() {
>>> +    let path = Path::new("");
>>> +    let filename = ".marker";
>>> +    assert_eq!(
>>> +        object_key_from_path(path, filename).unwrap().to_string(),
>>> +        ".cnt/.marker",
>>> +    );
>>> +}
>>> +
>>> +#[test]
>>> +fn test_object_key_from_absolute_path() {
>>> +    assert!(object_key_from_path(Path::new("/"), ".marker").is_err());
>>> +}
>>> +
>>> +#[test]
>>> +fn test_object_key_from_path_incorrect_filename() {
>>> +    assert!(object_key_from_path(Path::new(""), "/.marker").is_err());
>>> +}
>>> +
>>> +#[test]
>>> +fn test_object_key_from_digest() {
>>> +    use hex::FromHex;
>>> +    let digest =
>>> +        <[u8; 32]>::from_hex("bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8")
>>> +            .unwrap();
>>> +    assert_eq!(
>>> +        object_key_from_digest(&digest).unwrap().to_string(),
>>> +        ".chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8",
>>> +    );
>>> +}
>>> +
>>> +#[test]
>>> +fn test_object_key_from_digest_with_suffix() {
>>> +    use hex::FromHex;
>>> +    let digest =
>>> +        <[u8; 32]>::from_hex("bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8")
>>> +            .unwrap();
>>> +    assert_eq!(
>>> +        object_key_from_digest_with_suffix(&digest, ".0.bad")
>>> +            .unwrap()
>>> +            .to_string(),
>>> +        ".chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8.0.bad",
>>> +    );
>>> +}
>>> +
>>> +#[test]
>>> +fn test_object_key_from_digest_with_invalid_suffix() {
>>> +    use hex::FromHex;
>>> +    let digest =
>>> +        <[u8; 32]>::from_hex("bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8")
>>> +            .unwrap();
>>> +    assert!(object_key_from_digest_with_suffix(&digest, "/.0.bad").is_err());
>>> +}
>> 
>> 
>> 
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>> 
>> 
