From: Christian Ebner <c.ebner@proxmox.com>
To: Lukas Wagner <l.wagner@proxmox.com>,
Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox{, -backup} v7 00/47] fix #2943: S3 storage backend for datastores
Date: Mon, 14 Jul 2025 17:40:30 +0200
Message-ID: <208b3247-cc11-46b0-8e2e-603cbcffe763@proxmox.com>
In-Reply-To: <75170e22-a2f0-44b2-b612-5ce10bf85e49@proxmox.com>
On 7/14/25 16:33, Lukas Wagner wrote:
> On 2025-07-10 19:06, Christian Ebner wrote:
>> Disclaimer: These patches are still in an experimental state and not
>> intended for production use.
>>
>> This patch series aims to add S3 compatible object stores as storage
>> backend for PBS datastores. A PBS local cache store using the regular
>> datastore layout is used for faster operation, bypassing requests to
>> the S3 API when possible. Further, the local cache store allows
>> keeping frequently used chunks and is used to avoid expensive metadata
>> updates on the object store, e.g. by using local marker files during
>> garbage collection.
>>
>> Backups are created by uploading chunks to the corresponding S3 bucket,
>> while keeping the index files in the local cache store. On backup
>> finish, the snapshot metadata is persisted to the S3 storage backend.
>>
>> Snapshot restores read chunks preferably from the local cache store,
>> downloading them from the S3 object store and inserting them if not
>> present. Listing and snapshot metadata operations currently rely solely
>> on the local cache store.
>>
>> Currently chunks use a 1:1 mapping to S3 objects. An advanced packing
>> mechanism for chunks to significantly reduce the number of API
>> requests and therefore be more cost effective will be implemented in
>> follow-up patches.
>>
>
> Applied these patches on top of the latest proxmox and proxmox-backup master
> branches and tried to thoroughly test this new feature.
>
> Here's what I tested:
> - Backup
> - Restore
> - Prune jobs
> - GC
> - Local sync from/to the S3 datastore with some namespace variations
> - Delete datastore
> - Tried to add the same S3 bucket as a new datastore
>
> I ran into an issue when I attempted to run a verify job, which Chris and I already
> debugged off-list:
>
> - An all-zero, 4MB chunk (hash: bb9f...) will not be uploaded to S3 due to its special usage
> during the atime check during datastore creation.
> This can be easily triggered by backing up a VM with some unused disk space
> to an *unencrypted* S3 datastore. The error surfaces when attempting to run a
> verification job.
> If the chunk is uploaded manually (e.g. using some kind of S3 client CLI), the verification
> job goes through without any problems.
Thanks a lot for testing and for your debugging efforts. I was able to
fix this for the upcoming version of the patches!
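
For reference, the digest of the zero chunk discussed above can be
reproduced locally. This is a minimal sketch, assuming that "4MB" means
exactly 4 MiB of zero bytes and that the digest of an unencrypted chunk is
the SHA-256 of its raw data. The function name is made up for illustration
and is not part of the patches.

```python
import hashlib

# Assumption: the "4MB" chunk from the report is 4 MiB of zero bytes.
CHUNK_SIZE = 4 * 1024 * 1024


def zero_chunk_digest(size: int = CHUNK_SIZE) -> str:
    """Return the hex SHA-256 digest of `size` zero bytes."""
    return hashlib.sha256(bytes(size)).hexdigest()


if __name__ == "__main__":
    # Compare the printed digest with the "bb9f..." prefix from the report.
    print(zero_chunk_digest())
```

The printed digest can then be compared against the object keys present in
the bucket to check whether the chunk was uploaded.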
> Some UI/UX observations:
> - Minor: It would be easier to understand if "Unique Identifier" in the S3 client view
> and "S3 Client ID" when adding the datastore were unified (I prefer the latter, it seems clearer to me)
Okay, adapted this as well for the S3 client view and create window.
Also added the still missing CLI commands for S3 client manipulation.
> - Minor: The "Host" column in the "Add Datastore" -> S3 Client ID picker does not show
> anything for me.
Ah, the field here got renamed from host to endpoint, as this is a
better fit. Fixed this as well, thanks.
> - It might make sense to make it a bit easier to re-add an existing S3 bucket that was already
> used as a datastore before - right now, it is a bit unintuitive.
> Right now, to "import" an existing bucket, one has to:
> - Use the same datastore name (since it is used in the object key)
> - Enter the same bucket name (makes sense)
> - Make sure that "reuse existing datastore" is *not* ticked (confusing)
> - Press "S3 sync" after adding the datastore (could be automatic)
>
> I think we might be able to reuse the 'reuse datastore' flag and change its behavior
> for S3 datastores to do the right thing automatically, which would be to
> recreate a local cache and then do the S3 sync to get the list of snapshots
> from the bucket.
Okay, will have a go at this tomorrow and see if I manage to adapt this
as well. I agree that reusing the "reuse existing datastore" flag and an
automatic s3-refresh might be more intuitive here.
> In the long term it could be nice to actually try to list the contents of
> a bucket and use some heuristics to "find" existing datastores in the bucket
> (could be as easy as trying to find some key that contains ".chunks" in the
> second level, e.g. somestore/.chunks/...)
> and show them in some drop-down in the dialog.
I will keep this in mind, but it is out of scope for this series. I would
rather focus on consolidating the current patches for now.
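
The heuristic suggested above could be sketched roughly as follows. This is
only an illustration, assuming the object keys of a bucket have already been
listed by some S3 client and follow the somestore/.chunks/... layout
mentioned in the quote. The function name and the sample keys are
hypothetical.

```python
def find_datastores(keys):
    """Return the sorted names of all top-level prefixes that have
    '.chunks' as their second-level key component."""
    stores = set()
    for key in keys:
        parts = key.split("/")
        # parts[0] is the candidate datastore name, parts[1] the second level
        if len(parts) >= 2 and parts[0] and parts[1] == ".chunks":
            stores.add(parts[0])
    return sorted(stores)


if __name__ == "__main__":
    # Hypothetical listing of object keys from a single bucket
    sample_keys = [
        "somestore/.chunks/bb9f/example-chunk",
        "somestore/vm/100/example-snapshot/index.json.blob",
        "otherstore/.chunks/0000/example-chunk",
        "unrelated/file.txt",
    ]
    print(find_datastores(sample_keys))
```

The result would be the candidate list for a drop-down in the dialog; listing
every key of a large bucket is costly, so a real implementation would likely
need a cheaper way to enumerate the prefixes.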
> Keeping the use case of 'reusing' an S3 bucket in mind, maybe it would make
> sense to mark 'ownership' of a datastore in the bucket, e.g. in some special marker
> object (could contain the host name, host key fingerprint, machine-id, etc.),
> so as to make it harder to accidentally use the same datastore from multiple PBS servers.
> There could be an "export" mechanism, effectively giving up the ownership by clearing
> the marker, signalling it to be safe to re-add it to another PBS server.
> Just capturing some thoughts here. :)
Hmm, will keep this in mind as well, although I do not see the benefit
of storing the ownership per se.
Ownership and permissions on the bucket and sub-objects are best handled
by the provider and their ACLs on tokens.
But adding a marker which flags the store as in use seems a good idea
and I will see if it makes sense to add this already. If the user wants
to reuse a datastore for a PBS instance which is no longer available or
failed, removing the marker by some other means (e.g. provider tooling)
first should be acceptable as a fail-safe, I think.
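
To make the in-use marker idea more concrete, here is a minimal sketch of
the intended semantics. Everything in it is an assumption for illustration:
a plain dict stands in for the bucket, and the marker name, the function
names and the stored value are hypothetical and not taken from the patches.

```python
MARKER_NAME = ".in-use"  # hypothetical marker object name


class StoreInUseError(Exception):
    """Raised when a datastore is already flagged as in use by someone else."""


def marker_key(store: str) -> str:
    # The datastore name is part of the object key, as noted in the thread.
    return f"{store}/{MARKER_NAME}"


def claim_store(bucket: dict, store: str, claimant: str) -> None:
    """Flag `store` as in use by `claimant`, refusing if another holder exists."""
    key = marker_key(store)
    holder = bucket.get(key)
    if holder is not None and holder != claimant:
        raise StoreInUseError(f"{store} is already in use by {holder}")
    bucket[key] = claimant


def release_store(bucket: dict, store: str) -> None:
    """Clear the marker, e.g. on an explicit export or via provider tooling."""
    bucket.pop(marker_key(store), None)


if __name__ == "__main__":
    bucket = {}  # in-memory stand-in for the S3 bucket
    claim_store(bucket, "somestore", "pbs-a")
    try:
        claim_store(bucket, "somestore", "pbs-b")
    except StoreInUseError as err:
        print(err)
    release_store(bucket, "somestore")
    claim_store(bucket, "somestore", "pbs-b")
```

Note that the check followed by the write is not atomic in this sketch, so
two instances claiming the same store at the same time could both succeed. A
real implementation would have to take that into account.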