From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Adam Kalisz <adam.kalisz@notnullmakers.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
Date: Tue, 24 Jun 2025 12:43:19 +0200 (CEST)
Message-ID: <644731967.7191.1750761799812@webmail.proxmox.com>
In-Reply-To: <c11ce09eb42ad0ef66e1196a1c33462647ead381.camel@notnullmakers.com>


> Adam Kalisz <adam.kalisz@notnullmakers.com> wrote on 24.06.2025 12:22 CEST:
> Hi Fabian,

CCing the list again, assuming it got dropped by accident.

> the CPU usage is higher; I see about 400% for the restore process. I
> didn't investigate the original much because it's unbearably slow.
> 
> Yes, having configurable CONCURRENT_REQUESTS and max_blocking_threads
> would be great. However, we would need to wire it up all the way to
> qmrestore or similar, or ensure it is read from some environment
> variables. I didn't feel confident introducing this kind of
> infrastructure as a first-time contribution.

we can guide you if you want, but it's also possible to follow up on that
on our end as part of applying the change.
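
For illustration, reading those two knobs from the environment could look
roughly like the sketch below. This is a hypothetical example, not part of
the current patch; the variable names (PBS_RESTORE_CONCURRENCY,
PBS_RESTORE_BLOCKING_THREADS) and the fallback values are placeholders:

    use std::env;

    const DEFAULT_CONCURRENT_REQUESTS: usize = 12;
    const DEFAULT_MAX_BLOCKING_THREADS: usize = 8; // placeholder default

    /// Parse a usize from the environment, falling back to a default.
    fn env_usize(name: &str, default: usize) -> usize {
        env::var(name)
            .ok()
            .and_then(|v| v.parse().ok())
            .unwrap_or(default)
    }

    fn concurrency_settings() -> (usize, usize) {
        (
            env_usize("PBS_RESTORE_CONCURRENCY", DEFAULT_CONCURRENT_REQUESTS),
            env_usize("PBS_RESTORE_BLOCKING_THREADS", DEFAULT_MAX_BLOCKING_THREADS),
        )
    }

The second value could then be fed to tokio's runtime builder via
max_blocking_threads() wherever the runtime is constructed.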

> The writer to disk is still single-threaded, so a CPU that can ramp up
> a single core to a high frequency/IPC will usually do better in the
> benchmarks.

I think that limitation is no longer there on the QEMU side nowadays,
but it would likely require some more changes to actually make use of
multiple threads submitting IO.
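
One possible direction on the library side (just a sketch, not how
proxmox-backup-qemu currently writes data; the callback into QEMU is a
placeholder) would be to offload the writes onto tokio's blocking pool,
which is exactly what max_blocking_threads would then bound:

    use tokio::task;

    async fn write_chunk_offloaded(offset: u64, data: Vec<u8>) -> std::io::Result<()> {
        // spawn_blocking moves the write onto the bounded blocking thread
        // pool, so several writes can be in flight without blocking the
        // async reactor
        task::spawn_blocking(move || {
            // placeholder: would call back into QEMU / write to the target image
            let _ = (offset, &data);
            Ok(())
        })
        .await
        .expect("blocking write task panicked")
    }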

> What are the chances of this getting accepted more or less as is?

proper review and discussion of potential follow-ups (no matter who ends
up doing them) would require submitting a properly signed-off patch
and a CLA - see https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

Fabian

> On Tue, 2025-06-24 at 09:28 +0200, Fabian Grünbichler wrote:
> > 
> > > Adam Kalisz via pve-devel <pve-devel@lists.proxmox.com> wrote on
> > > 23.06.2025 18:10 CEST:
> > > Hi list,
> > 
> > Hi!
> > 
> > > before I go through all the hoops to submit a patch, I wanted to
> > > discuss the current form of the patch, which can be found here:
> > > 
> > > https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
> > > 
> > > The speedup process was discussed here:
> > > 
> > > https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
> > > 
> > > The current numbers are:
> > > 
> > > With the most recent snapshot of a VM with a 10 GiB system disk and
> > > 2x 100 GiB disks with random data:
> > > 
> > > Original as of 1.5.1:
> > > 10 GiB system:    duration=11.78s,  speed=869.34MB/s
> > > 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> > > 100 GiB random 2: duration=422.42s, speed=242.41MB/s
> > > 
> > > With the 12-way concurrent fetching:
> > > 
> > > 10 GiB system:    duration=2.05s,   speed=4991.99MB/s
> > > 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> > > 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
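
A minimal sketch of what the concurrent fetching boils down to
(illustrative only, not the code from the linked commit; fetch_chunk and
write_chunk are placeholders for the real PBS client and writer calls):

    use futures::stream::{self, StreamExt};

    const CONCURRENT_REQUESTS: usize = 12;

    async fn fetch_chunk(_idx: u64) -> std::io::Result<Vec<u8>> {
        // placeholder: would download and decode one chunk from the PBS server
        Ok(vec![0u8; 4 * 1024 * 1024])
    }

    async fn write_chunk(_idx: u64, _data: Vec<u8>) -> std::io::Result<()> {
        // placeholder: would hand the chunk to the (still single) writer
        Ok(())
    }

    async fn restore_image(chunk_count: u64) -> std::io::Result<()> {
        // keep up to CONCURRENT_REQUESTS fetches in flight and write the
        // results as they complete
        let mut chunks = stream::iter(0..chunk_count)
            .map(|idx| async move { fetch_chunk(idx).await.map(|data| (idx, data)) })
            .buffer_unordered(CONCURRENT_REQUESTS);

        while let Some(res) = chunks.next().await {
            let (idx, data) = res?;
            write_chunk(idx, data).await?;
        }
        Ok(())
    }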
> > 
> > Those numbers do look good - do you also have CPU usage stats before
> > and after?
> > 
> > > The hardware on the PVE side:
> > > 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x Samsung
> > > 3.8 TB NVMe drives in RAID10 using mdadm/LVM-thin.
> > > 
> > > On the PBS side:
> > > 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
> > > NVMe in RAID using 4 ZFS mirrors with recordsize 1M, lz4
> > > compression.
> > > 
> > > Similar or slightly better speeds were achieved on a Hetzner AX52
> > > (AMD Ryzen 7 7700, 64 GB RAM, 2x 1 TB NVMe striped, recordsize 16k)
> > > running PVE, connected to another Hetzner AX52 over a 10 Gbps link.
> > > The PBS side again has a plain NVMe ZFS mirror with recordsize 1M.
> > > 
> > > On bigger servers, 16-way concurrency was even better; on smaller
> > > servers with high-frequency CPUs, 8-way concurrency performed
> > > better. The 12-way concurrency is a compromise. We seem to hit a
> > > bottleneck somewhere in the realm of the TLS connection and shallow
> > > buffers. The network on the 100 Gbps servers can support up to
> > > about 3 GBps (almost 20 Gbps) of traffic in a single TCP connection
> > > using mbuffer, and the storage can keep up with such a speed.
> > 
> > This sounds like it would make sense to make the number of threads
> > configurable (the second, lower count can probably be derived from
> > it?), to allow high-end systems to make the most of it without
> > overloading smaller setups. Or maybe deriving it from the host CPU
> > count would also work?
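
A possible sketch of that derivation (the clamp bounds are an assumption
based on the 8/12/16-way observations above, not a tested
recommendation):

    use std::thread;

    /// Derive a default fetch concurrency from the host CPU count,
    /// clamped to the range that looked reasonable in the benchmarks.
    fn default_concurrency() -> usize {
        let cpus = thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(4); // conservative fallback if detection fails
        cpus.clamp(8, 16)
    }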
> > 
> > > Before I submit the patch, I would also like to build against the
> > > most up-to-date sources, but I have trouble updating my build
> > > environment to reflect the latest commits. What do I have to put in
> > > my /etc/apt/sources.list to be able to install e.g.
> > > librust-cbindgen-0.27+default-dev,
> > > librust-http-body-util-0.1+default-dev,
> > > librust-hyper-1+default-dev and all the rest?
> > 
> > We are currently in the process of rebasing all our repositories on
> > top
> > of the upcoming Debian Trixie release. The built packages are not yet
> > available for public testing, so you'd either need to wait a bit (in
> > the
> > order of a few weeks at most), or submit the patches for the current
> > stable Bookworm-based version and let us forward port them.
> > 
> > > This work was sponsored by ČMIS s.r.o. and developed in
> > > consultation with General Manager Václav Svátek (ČMIS), Daniel
> > > Škarda (NOT NULL Makers s.r.o.) and Linux team leader Roman Müller
> > > (ČMIS).
> > 
> > Nice! Looking forward to the "official" patch submission!
> > Fabian


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
