From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v4 proxmox-backup 4/7] fix #4182: server: sync: allow pulling groups concurrently
Date: Mon, 07 Apr 2025 09:21:04 +0200
Message-ID: <1744009968.ip6vyc3mbq.astroid@yuna.none>
In-Reply-To: <D8Y1V8BY8F6I.25I1HV13SAN63@proxmox.com>
On April 4, 2025 8:02 pm, Max Carrara wrote:
> On Fri Apr 4, 2025 at 3:49 PM CEST, Christian Ebner wrote:
>> Currently, a sync job pulls the backup groups and the snapshots
>> contained within them sequentially, so for remote syncs the download
>> speed is limited by the single http2 connection of the source reader
>> instance. This is especially noticeable on high-latency networks.
>>
>> Improve throughput by pulling up to a configured number of backup
>> groups concurrently, each in its own task with its own connection to
>> the remote source.
>>
>> Make the error handling and accounting logic for each group pull
>> reusable by moving it into its own helper function that returns the
>> corresponding future.
>>
>> The store progress is placed behind a mutex inside an atomic reference
>> counter (`Arc<Mutex<_>>`) to allow concurrent status updates.
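(As a rough, self-contained illustration of the bounded-concurrency pattern
described above - not the actual patch code - this is what limiting the
number of in-flight group pulls could look like, assuming a stand-in
`pull_group` helper and the `buffer_unordered` adapter from the futures
crate; the `parallel` argument corresponds roughly to the configured
`parallel-groups` limit:)

    // Sketch only: run at most `parallel` group pulls concurrently.
    use futures::stream::{self, StreamExt, TryStreamExt};

    // Stand-in for the real per-group helper that connects to the remote
    // source and pulls one backup group.
    async fn pull_group(group: String) -> Result<(), String> {
        println!("pulling group {group}");
        Ok(())
    }

    async fn pull_groups(groups: Vec<String>, parallel: usize) -> Result<(), String> {
        stream::iter(groups)
            .map(pull_group)            // each call yields a future
            .buffer_unordered(parallel) // keep at most `parallel` in flight
            .try_collect::<Vec<_>>()    // propagate the first error, if any
            .await?;
        Ok(())
    }

    #[tokio::main]
    async fn main() -> Result<(), String> {
        let groups = vec!["vm/100".into(), "vm/101".into(), "ct/200".into()];
        pull_groups(groups, 2).await
    }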
>
> Yeah, so... I've got some thoughts about this:
>
> First of all, I think that's *fine* here, as the
> `Arc<Mutex<StoreProgress>>` probably isn't going to face much lock
> contention anyway. So, to get that out of the way: IMO we can keep it
> here as it is right now.
>
> But in the future I do think we should review the locations where we
> have that kind of concurrent data access / modification, because I feel
> the number of mutexes is only going to keep growing. That's not a bad
> thing per se, but it does come with a few risks / drawbacks (e.g. a
> higher risk of deadlocks).
>
> Without going too deep into the whole "how to avoid deadlocks" discussion
> and other things, here's an alternative I want to propose that could
> perhaps be done in (or as part of a) different series, since it's a bit
> out of scope for this one here. (Though, if you do want to do it in this
> one, I certainly won't complain!)
>
> First, since `StoreProgress` only contains four plain integer counters,
> it should be fairly easy to convert the ones being modified into
> `AtomicUsize`s and perhaps add helper methods to increment their
> respective values; something like this:
>
> use std::sync::atomic::{AtomicUsize, Ordering};
>
> /// Tracker for progress of operations iterating over `Datastore` contents.
> #[derive(Debug, Default)]
> pub struct StoreProgress {
>     /// Completed groups
>     pub done_groups: AtomicUsize,
>     /// Total groups
>     pub total_groups: u64,
>     /// Completed snapshots within current group
>     pub done_snapshots: AtomicUsize,
>     /// Total snapshots in current group
>     pub group_snapshots: u64,
> }
>
> // [...]
>
> impl StoreProgress {
>     pub fn add_done_group(&self) {
>         let _ = self.done_groups.fetch_add(1, Ordering::Relaxed);
>     }
>
>     // [...]
> }
>
> (of course, what it all should look like in detail is up to bikeshedding :P)
>
> Something like that would probably be nicer here, because:
>
> - You won't need to wrap `StoreProgress` within an `Arc<Mutex<T>>`
> anymore -- a shared reference is enough, since ...
> - Operations on atomics take &self (that's the whole point of them ofc ;p )
>
> This means that:
> - Cloning an `Arc<T>` is not necessary anymore
>   --> saves roughly two atomic ops per `Arc` clone
>       (one on clone, one on drop)
> - Locking the `Mutex<T>` is also not necessary anymore
>   --> saves two atomic ops for each call to `.lock()`
>       (acquire and release)
>
> In turn, this is replaced by a single atomic call with
> `Ordering::Relaxed` (which is fine for counters [0]). So, something like
>
>     progress.lock().unwrap().done_groups += 1;
>
> would just become
>
>     progress.add_done_group();
>
> which is also quite neat.
>
> Note however that we might have to split that struct into a "local" and
> "shared" version or whatever in order to adapt it all to the current
> code (a locally-used struct ofc doesn't need atomics).
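(Purely as an illustration of such a split - the names and field choice
here are made up - the "shared" variant could expose a method that reads
the counters into a plain, non-atomic struct for places that only need a
local snapshot, e.g. for logging:)

    use std::sync::atomic::{AtomicUsize, Ordering};

    /// Shared, concurrently updated progress counters.
    #[derive(Debug, Default)]
    pub struct SharedStoreProgress {
        pub done_groups: AtomicUsize,
        pub done_snapshots: AtomicUsize,
    }

    /// Plain, locally used snapshot of the counters.
    #[derive(Debug, Default, Clone, Copy)]
    pub struct LocalStoreProgress {
        pub done_groups: usize,
        pub done_snapshots: usize,
    }

    impl SharedStoreProgress {
        /// Read the current counter values into a plain snapshot.
        pub fn snapshot(&self) -> LocalStoreProgress {
            LocalStoreProgress {
                done_groups: self.done_groups.load(Ordering::Relaxed),
                done_snapshots: self.done_snapshots.load(Ordering::Relaxed),
            }
        }
    }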
>
> Again, I think what you're doing here is perfectly fine; I just think
> that we should have a look at all of those concurrent data accesses and
> see whether we can slim some stuff down or perhaps have some kind of
> statically enforced mutex ordering for deadlock prevention [1].
>
> [0]: https://doc.rust-lang.org/nomicon/atomics.html#relaxed
> [1]: https://www.youtube.com/watch?v=Ba7fajt4l1M
We do have a similar construct in the client (although still wrapped in an
Arc<> - maybe we could eliminate it there? ;)):
https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-client/src/backup_stats.rs;h=f0563a0011b1117e32027a8a88f9d6f65db591f0;hb=HEAD#l45
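(For context, a minimal sketch of that kind of shared atomic counter, not
the actual backup_stats.rs code: once the counter is handed to spawned
tasks, the Arc is what gives each task its own 'static handle to the same
value, since tokio::spawn requires owned, 'static data:)

    use std::sync::{
        atomic::{AtomicU64, Ordering},
        Arc,
    };

    #[tokio::main]
    async fn main() {
        let counter = Arc::new(AtomicU64::new(0));

        let mut handles = Vec::new();
        for _ in 0..4 {
            // Each task gets its own clone of the Arc, all pointing at the
            // same atomic counter.
            let counter = Arc::clone(&counter);
            handles.push(tokio::spawn(async move {
                counter.fetch_add(1, Ordering::Relaxed);
            }));
        }
        for handle in handles {
            handle.await.unwrap();
        }

        println!("count: {}", counter.load(Ordering::Relaxed));
    }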