* [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
@ 2025-06-23 16:10 Adam Kalisz via pve-devel
2025-06-24 7:28 ` Fabian Grünbichler
0 siblings, 1 reply; 8+ messages in thread
From: Adam Kalisz via pve-devel @ 2025-06-23 16:10 UTC (permalink / raw)
To: pve-devel; +Cc: Adam Kalisz
From: Adam Kalisz <adam.kalisz@notnullmakers.com>
To: pve-devel@lists.proxmox.com
Subject: Discussion of major PBS restore speedup in proxmox-backup-qemu
Date: Mon, 23 Jun 2025 18:10:01 +0200
Message-ID: <9995c68d9c0d6e699578f5a45edb2731b5724ef1.camel@notnullmakers.com>
Hi list,
before I go through all the hoops to submit a patch, I wanted to discuss
its current form, which can be found here:
https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
The speedup process was discussed here:
https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
The current numbers are:
With the most recent snapshot of a VM with a 10 GiB system disk and 2x
100 GiB disks filled with random data:
Original as of 1.5.1:
10 GiB system: duration=11.78s, speed=869.34MB/s
100 GiB random 1: duration=412.85s, speed=248.03MB/s
100 GiB random 2: duration=422.42s, speed=242.41MB/s
With the 12-way concurrent fetching:
10 GiB system: duration=2.05s, speed=4991.99MB/s
100 GiB random 1: duration=100.54s, speed=1018.48MB/s
100 GiB random 2: duration=100.10s, speed=1022.97MB/s
The hardware on the PVE side:
2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x Samsung
NVMe 3.8 TB drives in RAID10 using mdadm/LVM-thin.
On the PBS side:
2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
NVMe in a RAID of 4 ZFS mirrors with recordsize 1M and lz4 compression.
Similar or slightly better speeds were achieved on a Hetzner AX52 with
an AMD Ryzen 7 7700, 64 GB RAM and 2x 1 TB NVMe in a stripe (recordsize
16k) on the PVE side, connected to another Hetzner AX52 over a 10 Gbps
link. The PBS again has a plain NVMe ZFS mirror with recordsize 1M.
On bigger servers, 16-way concurrency was even better; on smaller
servers with high-frequency CPUs, 8-way concurrency performed better.
The 12-way concurrency is a compromise. We seem to hit a bottleneck
somewhere in the realm of the TLS connection and shallow buffers. The
network on the 100 Gbps servers can support up to about 3 GB/s (almost
20 Gbps) of traffic in a single TCP connection using mbuffer, and the
storage can keep up with such a speed.
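
To make the pattern concrete without opening the commit: the idea is to
keep a bounded number of chunk downloads in flight while a single writer
consumes them as they complete. A minimal sketch of that shape (using
tokio/futures; all names below are placeholders, not the actual patch
code):

// Bounded concurrent chunk fetching with a single writer. This is NOT
// the patch itself; fetch_chunk/write_chunk_at are hypothetical
// stand-ins for the proxmox-backup-qemu internals.
use futures::stream::{self, StreamExt};

const CONCURRENCY: usize = 12; // the compromise value discussed above

async fn fetch_chunk(idx: u64) -> (u64, Vec<u8>) {
    // stand-in for downloading and decoding one chunk from PBS
    (idx, vec![0u8; 4 * 1024 * 1024])
}

async fn write_chunk_at(_idx: u64, _data: &[u8]) {
    // stand-in for the single writer that fills the target image
}

async fn restore(chunk_indices: Vec<u64>) {
    let mut chunks = stream::iter(chunk_indices)
        .map(|idx| tokio::spawn(fetch_chunk(idx))) // fetches run on worker threads
        .buffer_unordered(CONCURRENCY);            // at most 12 fetches in flight

    while let Some(Ok((idx, data))) = chunks.next().await {
        // chunks are written as they arrive; the writer stays single-threaded
        write_chunk_at(idx, &data).await;
    }
}

#[tokio::main]
async fn main() {
    restore((0..256).collect()).await;
}

The real change of course goes through the existing chunk reader and the
write callback in proxmox-backup-qemu; the sketch only shows where a
concurrency limit like the 12-way one plugs in.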
Before I submit the patch, I would also like to do the most up-to-date
build, but I have trouble updating my build environment to reflect the
latest commits. What do I have to put in my /etc/apt/sources.list to be
able to install e.g. librust-cbindgen-0.27+default-dev,
librust-http-body-util-0.1+default-dev, librust-hyper-1+default-dev and
all the rest?
This work was sponsored by ČMIS s.r.o. and done in consultation with
General Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL Makers
s.r.o.) and Linux team leader Roman Müller (ČMIS).
Best regards
Adam Kalisz
* Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
2025-06-23 16:10 [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu Adam Kalisz via pve-devel
@ 2025-06-24 7:28 ` Fabian Grünbichler
[not found] ` <c11ce09eb42ad0ef66e1196a1c33462647ead381.camel@notnullmakers.com>
0 siblings, 1 reply; 8+ messages in thread
From: Fabian Grünbichler @ 2025-06-24 7:28 UTC (permalink / raw)
To: Proxmox VE development discussion
> Adam Kalisz via pve-devel <pve-devel@lists.proxmox.com> wrote on 23.06.2025 18:10 CEST:
> Hi list,
Hi!
> before I go through all the hoops to submit a patch I wanted to discuss
> the current form of the patch that can be found here:
>
> https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
>
> The speedup process was discussed here:
>
> https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
>
> The current numbers are:
>
> With the most current snapshot of a VM with 10 GiB system disk and 2x
> 100 GiB disks with random data:
>
> Original as of 1.5.1:
> 10 GiB system: duration=11.78s, speed=869.34MB/s
> 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> 100 GiB random 2: duration=422.42s, speed=242.41MB/s
>
> With the 12-way concurrent fetching:
>
> 10 GiB system: duration=2.05s, speed=4991.99MB/s
> 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
Those numbers do look good - do you also have CPU usage stats before
and after?
> The hardware is on the PVE side:
> 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x Samsung
> NVMe 3,8 TB drives in RAID10 using mdadm/ LVM-thin.
>
> On the PBS side:
> 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
> NVMe in RAID using 4 ZFS mirrors with recordsize 1M, lz4 compression.
>
> Similar or slightly better speeds were achieved on Hetzner AX52 with
> AMD Ryzen 7 7700 with 64 GB RAM and 2x 1 TB NVMe in stripe on PVE with
> recordsize 16k connected to another Hetzner AX52 using a 10 Gbps
> connection. The PBS has normal NVMe ZFS mirror again with recordsize
> 1M.
>
> On bigger servers a 16-way concurrency was even better on smaller
> servers with high frequency CPUs 8-way concurrency performed better.
> The 12-way concurrency is a compromise. We seem to hit a bottleneck
> somewhere in the realm of TLS connection and shallow buffers. The
> network on the 100 Gbps servers can support up to about 3 GBps (almost
> 20 Gbps) of traffic in a single TCP connection using mbuffer. The
> storage can keep up with such a speed.
This sounds like it might make sense to make the number of threads
configurable (the second, lower count can probably be derived from it?)
to allow high-end systems to make the most of it without overloading
smaller setups. Or maybe deriving it from the host CPU count would
also work?
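
To illustrate, something along these lines (the variable name and the
bounds are made up here, not an existing interface):

// Sketch only: explicit override via an environment variable, otherwise
// a default derived from the host CPU count, clamped to the 8..16 range
// suggested by the benchmarks. PBS_RESTORE_CONCURRENCY is a made-up name.
use std::{env, thread};

fn restore_concurrency() -> usize {
    if let Ok(val) = env::var("PBS_RESTORE_CONCURRENCY") {
        if let Ok(n) = val.parse::<usize>() {
            return n.clamp(1, 32);
        }
    }
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(4);
    cpus.clamp(8, 16) // 8-way on small hosts, up to 16-way on big ones
}

fn main() {
    println!("using {} concurrent chunk fetches", restore_concurrency());
}

The second, lower count could then simply be derived from this value.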
> Before I submit the patch, I would also like to do the most up to date
> build but I have trouble updating my build environment to reflect the
> latest commits. What do I have to put in my /etc/apt/sources.list to be
> able to install e.g. librust-cbindgen-0.27+default-dev librust-http-
> body-util-0.1+default-dev librust-hyper-1+default-dev and all the rest?
We are currently in the process of rebasing all our repositories on top
of the upcoming Debian Trixie release. The built packages are not yet
available for public testing, so you'd either need to wait a bit (on the
order of a few weeks at most) or submit the patches for the current
stable Bookworm-based version and let us forward-port them.
> This work was sponsored by ČMIS s.r.o. and consulted with the General
> Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL Makers s.r.o.)
> and Linux team leader Roman Müller (ČMIS).
Nice! Looking forward to the "official" patch submission!
Fabian
* Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
[not found] ` <c11ce09eb42ad0ef66e1196a1c33462647ead381.camel@notnullmakers.com>
@ 2025-06-24 10:43 ` Fabian Grünbichler
2025-07-03 8:29 ` Adam Kalisz via pve-devel
0 siblings, 1 reply; 8+ messages in thread
From: Fabian Grünbichler @ 2025-06-24 10:43 UTC (permalink / raw)
To: Adam Kalisz, Proxmox VE development discussion
> Adam Kalisz <adam.kalisz@notnullmakers.com> wrote on 24.06.2025 12:22 CEST:
> Hi Fabian,
CCing the list again, assuming it got dropped by accident.
> The CPU usage is higher; I see about 400% for the restore process. I
> didn't investigate the original much because it's unbearably slow.
>
> Yes, having configurable CONCURRENT_REQUESTS and max_blocking_threads
> would be great. However, we would need to wire it up all the way to
> qmrestore or similar, or ensure it is read from some environment
> variables. I didn't feel confident introducing this kind of
> infrastructure as a first-time contribution.
we can guide you if you want, but it's also possible to follow up on our
end with that as part of applying the change.
> The writer to disk is still single-threaded, so a CPU that can ramp up
> a single core to a high frequency/IPC will usually do better in the
> benchmarks.
I think that limitation is no longer there on the QEMU side nowadays,
but it would likely require some more changes to actually make use of
multiple threads submitting IO.
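
Just to sketch the direction (plain Rust positional writes, nothing to
do with QEMU's actual block layer): multiple writers can submit IO for
disjoint offsets without sharing a file cursor, roughly like this:

// Illustration only: a handful of writer threads doing positional writes
// into the same target file. It just shows that multiple IO submitters
// don't need a shared cursor; the real work would be in QEMU/the library.
use std::fs::OpenOptions;
use std::io;
use std::os::unix::fs::FileExt;
use std::sync::{mpsc, Arc};
use std::thread;

const WRITERS: usize = 4;
const CHUNK: u64 = 4 * 1024 * 1024;

fn main() -> io::Result<()> {
    let file = Arc::new(
        OpenOptions::new().create(true).write(true).open("restore.img")?,
    );

    // one channel per writer; chunks are dealt out round-robin
    let mut senders = Vec::new();
    let mut handles = Vec::new();
    for _ in 0..WRITERS {
        let (tx, rx) = mpsc::channel::<(u64, Vec<u8>)>();
        let file = Arc::clone(&file);
        senders.push(tx);
        handles.push(thread::spawn(move || {
            for (offset, data) in rx {
                // pwrite-style write at an absolute offset, no shared cursor
                file.write_all_at(&data, offset).unwrap();
            }
        }));
    }

    // pretend these are decoded chunks coming from the concurrent fetchers
    for i in 0..16u64 {
        senders[(i % WRITERS as u64) as usize]
            .send((i * CHUNK, vec![0u8; CHUNK as usize]))
            .unwrap();
    }
    drop(senders);

    for h in handles {
        h.join().unwrap();
    }
    Ok(())
}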
> What are the chances of this getting accepted more or less as is?
proper review and discussion of potential follow-ups (no matter who ends
up doing them) would require submitting a properly signed-off patch
and a CLA - see https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright
Fabian
> On Tue, 2025-06-24 at 09:28 +0200, Fabian Grünbichler wrote:
> >
> > > Adam Kalisz via pve-devel <pve-devel@lists.proxmox.com> hat am
> > > 23.06.2025 18:10 CEST geschrieben:
> > > Hi list,
> >
> > Hi!
> >
> > > before I go through all the hoops to submit a patch I wanted to
> > > discuss
> > > the current form of the patch that can be found here:
> > >
> > > https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
> > >
> > > The speedup process was discussed here:
> > >
> > > https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
> > >
> > > The current numbers are:
> > >
> > > With the most current snapshot of a VM with 10 GiB system disk and
> > > 2x
> > > 100 GiB disks with random data:
> > >
> > > Original as of 1.5.1:
> > > 10 GiB system: duration=11.78s, speed=869.34MB/s
> > > 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> > > 100 GiB random 2: duration=422.42s, speed=242.41MB/s
> > >
> > > With the 12-way concurrent fetching:
> > >
> > > 10 GiB system: duration=2.05s, speed=4991.99MB/s
> > > 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> > > 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
> >
> > Those numbers do look good - do you also have CPU usage stats before
> > and after?
> >
> > > The hardware is on the PVE side:
> > > 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x
> > > Samsung
> > > NVMe 3,8 TB drives in RAID10 using mdadm/ LVM-thin.
> > >
> > > On the PBS side:
> > > 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
> > > NVMe in RAID using 4 ZFS mirrors with recordsize 1M, lz4
> > > compression.
> > >
> > > Similar or slightly better speeds were achieved on Hetzner AX52
> > > with
> > > AMD Ryzen 7 7700 with 64 GB RAM and 2x 1 TB NVMe in stripe on PVE
> > > with
> > > recordsize 16k connected to another Hetzner AX52 using a 10 Gbps
> > > connection. The PBS has normal NVMe ZFS mirror again with
> > > recordsize
> > > 1M.
> > >
> > > On bigger servers a 16-way concurrency was even better on smaller
> > > servers with high frequency CPUs 8-way concurrency performed
> > > better.
> > > The 12-way concurrency is a compromise. We seem to hit a bottleneck
> > > somewhere in the realm of TLS connection and shallow buffers. The
> > > network on the 100 Gbps servers can support up to about 3 GBps
> > > (almost
> > > 20 Gbps) of traffic in a single TCP connection using mbuffer. The
> > > storage can keep up with such a speed.
> >
> > This sounds like it might make sense to make the number of threads
> > configurable (the second lower count can probably be derived from
> > it?)
> > to allow high-end systems to make the most of it, without overloading
> > smaller setups. Or maybe deriving it from the host CPU count would
> > also work?
> >
> > > Before I submit the patch, I would also like to do the most up to
> > > date
> > > build but I have trouble updating my build environment to reflect
> > > the
> > > latest commits. What do I have to put in my /etc/apt/sources.list
> > > to be
> > > able to install e.g. librust-cbindgen-0.27+default-dev librust-
> > > http-
> > > body-util-0.1+default-dev librust-hyper-1+default-dev and all the
> > > rest?
> >
> > We are currently in the process of rebasing all our repositories on
> > top
> > of the upcoming Debian Trixie release. The built packages are not yet
> > available for public testing, so you'd either need to wait a bit (in
> > the
> > order of a few weeks at most), or submit the patches for the current
> > stable Bookworm-based version and let us forward port them.
> >
> > > This work was sponsored by ČMIS s.r.o. and consulted with the
> > > General
> > > Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL Makers
> > > s.r.o.)
> > > and Linux team leader Roman Müller (ČMIS).
> >
> > Nice! Looking forward to the "official" patch submission!
> > Fabian
* Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
2025-06-24 10:43 ` Fabian Grünbichler
@ 2025-07-03 8:29 ` Adam Kalisz via pve-devel
2025-07-03 8:57 ` Dominik Csapak
0 siblings, 1 reply; 8+ messages in thread
From: Adam Kalisz via pve-devel @ 2025-07-03 8:29 UTC (permalink / raw)
To: Proxmox VE development discussion; +Cc: Adam Kalisz
From: Adam Kalisz <adam.kalisz@notnullmakers.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
Date: Thu, 03 Jul 2025 10:29:43 +0200
Message-ID: <204986487f1d360ba068f5a1301ee809fd2846f0.camel@notnullmakers.com>
Hi,
On Friday I submitted the patch, with a slight edit to allow setting
the number of threads from an environment variable.
On Tue, 2025-06-24 at 12:43 +0200, Fabian Grünbichler wrote:
>
> > Adam Kalisz <adam.kalisz@notnullmakers.com> hat am 24.06.2025 12:22
> > CEST geschrieben:
> > Hi Fabian,
>
> CCing the list again, assuming it got dropped by accident.
>
> > the CPU usage is higher, I see about 400% for the restore process.
> > I
> > didn't investigate the original much because it's unbearably slow.
> >
> > Yes, having configurable CONCURRENT_REQUESTS and
> > max_blocking_threads
> > would be great. However we would need to wire it up all the way to
> > qmrestore or similar or ensure it is read from some env vars. I
> > didn't
> > feel confident to introduce this kind of infrastructure as a first
> > time
> > contribution.
>
> we can guide you if you want, but it's also possible to follow-up on
> our end with that as part of applying the change.
That would be great; it shouldn't be too much work for somebody more
familiar with the project structure and with where everything needs to go.
> > The writer to disk is single thread still so a CPU that can ramp up
> > a
> > single core to a high frequency/ IPC will usually do better on the
> > benchmarks.
>
> I think that limitation is no longer there on the QEMU side nowadays,
> but it would likely require some more changes to actually make use of
> multiple threads submitting IO.
The storage writing seemed to be less of a bottleneck than the fetching
of chunks. It seems to me there is still a bottleneck in the network
part, because I haven't seen an instance with a substantially higher
speed than 1.1 GB/s.
Perhaps, once we have taken the intermediate step of improving the
restore speed as proposed and gathered more feedback from the field, we
could have a discussion about backup, restore and synchronization speeds
and about strategies for debugging and improving the situation?
> > What are the chances of this getting accepted more or less as is?
>
> proper review and discussion of potential follow-ups (no matter who
> ends
> up doing them) would require submitting a properly signed-off patch
> and a CLA - see
> https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright
I cleared the CLA with the Proxmox office last week.
> Fabian
Adam
> > On Tue, 2025-06-24 at 09:28 +0200, Fabian Grünbichler wrote:
> > >
> > > > Adam Kalisz via pve-devel <pve-devel@lists.proxmox.com> hat am
> > > > 23.06.2025 18:10 CEST geschrieben:
> > > > Hi list,
> > >
> > > Hi!
> > >
> > > > before I go through all the hoops to submit a patch I wanted to
> > > > discuss
> > > > the current form of the patch that can be found here:
> > > >
> > > > https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
> > > >
> > > > The speedup process was discussed here:
> > > >
> > > > https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
> > > >
> > > > The current numbers are:
> > > >
> > > > With the most current snapshot of a VM with 10 GiB system disk
> > > > and
> > > > 2x
> > > > 100 GiB disks with random data:
> > > >
> > > > Original as of 1.5.1:
> > > > 10 GiB system: duration=11.78s, speed=869.34MB/s
> > > > 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> > > > 100 GiB random 2: duration=422.42s, speed=242.41MB/s
> > > >
> > > > With the 12-way concurrent fetching:
> > > >
> > > > 10 GiB system: duration=2.05s, speed=4991.99MB/s
> > > > 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> > > > 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
> > >
> > > Those numbers do look good - do you also have CPU usage stats
> > > before
> > > and after?
> > >
> > > > The hardware is on the PVE side:
> > > > 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x
> > > > Samsung
> > > > NVMe 3,8 TB drives in RAID10 using mdadm/ LVM-thin.
> > > >
> > > > On the PBS side:
> > > > 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x
> > > > Samsung
> > > > NVMe in RAID using 4 ZFS mirrors with recordsize 1M, lz4
> > > > compression.
> > > >
> > > > Similar or slightly better speeds were achieved on Hetzner AX52
> > > > with
> > > > AMD Ryzen 7 7700 with 64 GB RAM and 2x 1 TB NVMe in stripe on
> > > > PVE
> > > > with
> > > > recordsize 16k connected to another Hetzner AX52 using a 10
> > > > Gbps
> > > > connection. The PBS has normal NVMe ZFS mirror again with
> > > > recordsize
> > > > 1M.
> > > >
> > > > On bigger servers a 16-way concurrency was even better on
> > > > smaller
> > > > servers with high frequency CPUs 8-way concurrency performed
> > > > better.
> > > > The 12-way concurrency is a compromise. We seem to hit a
> > > > bottleneck
> > > > somewhere in the realm of TLS connection and shallow buffers.
> > > > The
> > > > network on the 100 Gbps servers can support up to about 3 GBps
> > > > (almost
> > > > 20 Gbps) of traffic in a single TCP connection using mbuffer.
> > > > The
> > > > storage can keep up with such a speed.
> > >
> > > This sounds like it might make sense to make the number of
> > > threads
> > > configurable (the second lower count can probably be derived from
> > > it?)
> > > to allow high-end systems to make the most of it, without
> > > overloading
> > > smaller setups. Or maybe deriving it from the host CPU count
> > > would
> > > also work?
> > >
> > > > Before I submit the patch, I would also like to do the most up
> > > > to
> > > > date
> > > > build but I have trouble updating my build environment to
> > > > reflect
> > > > the
> > > > latest commits. What do I have to put in my
> > > > /etc/apt/sources.list
> > > > to be
> > > > able to install e.g. librust-cbindgen-0.27+default-dev librust-
> > > > http-
> > > > body-util-0.1+default-dev librust-hyper-1+default-dev and all
> > > > the
> > > > rest?
> > >
> > > We are currently in the process of rebasing all our repositories
> > > on
> > > top
> > > of the upcoming Debian Trixie release. The built packages are not
> > > yet
> > > available for public testing, so you'd either need to wait a bit
> > > (in
> > > the
> > > order of a few weeks at most), or submit the patches for the
> > > current
> > > stable Bookworm-based version and let us forward port them.
> > >
> > > > This work was sponsored by ČMIS s.r.o. and consulted with the
> > > > General
> > > > Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL Makers
> > > > s.r.o.)
> > > > and Linux team leader Roman Müller (ČMIS).
> > >
> > > Nice! Looking forward to the "official" patch submission!
> > > Fabian
* Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
2025-07-03 8:29 ` Adam Kalisz via pve-devel
@ 2025-07-03 8:57 ` Dominik Csapak
2025-07-03 14:27 ` Adam Kalisz via pve-devel
0 siblings, 1 reply; 8+ messages in thread
From: Dominik Csapak @ 2025-07-03 8:57 UTC (permalink / raw)
To: pve-devel
Hi,
On 7/3/25 10:29, Adam Kalisz via pve-devel wrote:
> Hi,
>
> On Friday I have submitted the patch with a slight edit to allow
> setting the number of threads from an environment variable.
>
Yes, we saw it; thanks for tackling this.
> On Tue, 2025-06-24 at 12:43 +0200, Fabian Grünbichler wrote:
>>> Adam Kalisz<adam.kalisz@notnullmakers.com> hat am 24.06.2025 12:22
>>> CEST geschrieben:
>>> Hi Fabian,
>> CCing the list again, assuming it got dropped by accident.
>>
>>> the CPU usage is higher, I see about 400% for the restore process.
>>> I
>>> didn't investigate the original much because it's unbearably slow.
>>>
>>> Yes, having configurable CONCURRENT_REQUESTS and
>>> max_blocking_threads
>>> would be great. However we would need to wire it up all the way to
>>> qmrestore or similar or ensure it is read from some env vars. I
>>> didn't
>>> feel confident to introduce this kind of infrastructure as a first
>>> time
>>> contribution.
>> we can guide you if you want, but it's also possible to follow-up on
>> our end with that as part of applying the change.
> That would be great, it shouldn't be too much work for somebody more
> familiar with the project structure where everything needs to be.
Just to clarify: is it OK (and preferred?) for you if we continue working
with this patch? In that case I'd take a swing at it.
>
>>> The writer to disk is single thread still so a CPU that can ramp up
>>> a
>>> single core to a high frequency/ IPC will usually do better on the
>>> benchmarks.
>> I think that limitation is no longer there on the QEMU side nowadays,
>> but it would likely require some more changes to actually make use of
>> multiple threads submitting IO.
> The storage writing seemed to be less of a bottleneck than the fetching
> of chunks. It seems to me there still is a bottleneck in the network
> part because I haven't seen an instance with substantially higher speed
> than 1.1 GBps.
I guess this largely depends on the actual storage and network config;
e.g. if the target storage IO depth is the bottleneck, multiple
writers will speed that up too.
>
> Perhaps we could have a discussion about the backup, restore and
> synchronization speeds and strategies for debugging and improving the
> situation after we have taken the intermediate step of improving the
> restore speed as proposed to gather more feedback from the field?
I'd at least like to take a very short glance at how hard it would
be to add multiple writers to the image before deciding. If
it's not trivial, then IMHO yes, we can increase the fetching threads for now.
Though I have to look into how we'd want to limit/configure that from
the outside. E.g. a valid way to view that would maybe be to limit the
threads from exceeding what the VM config says + some extra?
(have to think about that)
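
Roughly what I have in mind, purely as a sketch (hypothetical helper,
not an existing option):

// Cap the restore concurrency at the vCPU count from the VM config plus
// a small allowance, unless an explicit override is given. Illustrative only.
fn restore_thread_limit(vm_cores: usize, explicit: Option<usize>) -> usize {
    const EXTRA: usize = 2;    // "some extra", since fetchers mostly wait on IO
    const HARD_CAP: usize = 16;
    explicit.unwrap_or(vm_cores + EXTRA).clamp(1, HARD_CAP)
}

fn main() {
    assert_eq!(restore_thread_limit(4, None), 6);
    assert_eq!(restore_thread_limit(64, None), 16);
    assert_eq!(restore_thread_limit(2, Some(12)), 12);
}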
>
>>> What are the chances of this getting accepted more or less as is?
>> proper review and discussion of potential follow-ups (no matter who
>> ends
>> up doing them) would require submitting a properly signed-off patch
>> and a CLA - see
>> https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright
> I have cleared the CLA with the Proxmox office last week.
>
Thanks
>> Fabian
> Adam
>
Best Regards
Dominik
* Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
2025-07-03 8:57 ` Dominik Csapak
@ 2025-07-03 14:27 ` Adam Kalisz via pve-devel
0 siblings, 0 replies; 8+ messages in thread
From: Adam Kalisz via pve-devel @ 2025-07-03 14:27 UTC (permalink / raw)
To: Proxmox VE development discussion; +Cc: Adam Kalisz
From: Adam Kalisz <adam.kalisz@notnullmakers.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
Date: Thu, 03 Jul 2025 16:27:35 +0200
Message-ID: <e7dfd27340a63fdb3e92bf0cb8c01a8f44ab7468.camel@notnullmakers.com>
Hi,
On Thu, 2025-07-03 at 10:57 +0200, Dominik Csapak wrote:
> Hi,
>
> On 7/3/25 10:29, Adam Kalisz via pve-devel wrote:
> > Hi,
> >
> > On Friday I have submitted the patch with a slight edit to allow
> > setting the number of threads from an environment variable.
> >
>
> Yes, we saw, thanks for tackling this.
>
> > On Tue, 2025-06-24 at 12:43 +0200, Fabian Grünbichler wrote:
> > > > Adam Kalisz<adam.kalisz@notnullmakers.com> hat am 24.06.2025
> > > > 12:22
> > > > CEST geschrieben:
> > > > Hi Fabian,
> > > CCing the list again, assuming it got dropped by accident.
> > >
> > > > the CPU usage is higher, I see about 400% for the restore
> > > > process.
> > > > I
> > > > didn't investigate the original much because it's unbearably
> > > > slow.
> > > >
> > > > Yes, having configurable CONCURRENT_REQUESTS and
> > > > max_blocking_threads
> > > > would be great. However we would need to wire it up all the way
> > > > to
> > > > qmrestore or similar or ensure it is read from some env vars. I
> > > > didn't
> > > > feel confident to introduce this kind of infrastructure as a
> > > > first
> > > > time
> > > > contribution.
> > > we can guide you if you want, but it's also possible to follow-up
> > > on
> > > our end with that as part of applying the change.
> > That would be great, it shouldn't be too much work for somebody
> > more familiar with the project structure where everything needs to
> > be.
>
> Just to clarify, it's OK (and preferred?) for you if we continue
> working with this patch? In that case I'd take a swing at it.
Yes, please do. If it improves performance/efficiency further, why not?
> >
> > > > The writer to disk is single thread still so a CPU that can
> > > > ramp up a single core to a high frequency/ IPC will usually do
> > > > better on the benchmarks.
> > > I think that limitation is no longer there on the QEMU side
> > > nowadays, but it would likely require some more changes to
> > > actually make use of multiple threads submitting IO.
> > The storage writing seemed to be less of a bottleneck than the
> > fetching of chunks. It seems to me there still is a bottleneck in
> > the network part because I haven't seen an instance with
> > substantially higher speed than 1.1 GBps.
>
> I guess this largely depends on the actual storage and network
> config, e.g. if the target storage IO depth is the bottle neck,
> multiple writers will speed up that too.
That's possible, but feeding the single writer thread better would most
likely lead to a big speed improvement too.
> > Perhaps we could have a discussion about the backup, restore and
> > synchronization speeds and strategies for debugging and improving
> > the situation after we have taken the intermediate step of
> > improving the restore speed as proposed to gather more feedback
> > from the field?
>
> I'd at least like to take a very short glance at how hard it would
> be to add multiple writers to the image before deciding. If
> it's not trivial, then IMHO yes, we can increase the fetching threads
> for now.
Sure, please have at it. I tried to make both the fetching and the
writing concurrent, but ended up in a corner trying to convince the
borrow checker that I would not overwrite data, given my limited
knowledge of Rust -> C interop. And fortunately, the fetch concurrency
was a big bottleneck too.
> Though I have to look in how we'd want to limit/configure that from
> outside. E.g. a valid way to view that would maybe to limit the
> threads from exceeding what the vm config says + some extra?
>
> (have to think about that)
Yes, the CPU count from the VM config might be a great default. However,
most of the time the CPU is blocked on IO and could do other work, so
having an option to configure something else, or a different setting for
the case of a critical recovery, might be suitable.
> > >
>
> Thanks
>
> > > Fabian
> > Adam
> >
>
>
> Best Regards
> Dominik
Thanks / Danke
Adam
* Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
[not found] ` <32645c96c7a1e247202d9d34e6102f08a7f08c97.camel@groupe-cyllene.com>
@ 2025-06-24 10:11 ` Adam Kalisz via pve-devel
0 siblings, 0 replies; 8+ messages in thread
From: Adam Kalisz via pve-devel @ 2025-06-24 10:11 UTC (permalink / raw)
To: pve-devel; +Cc: Adam Kalisz
From: Adam Kalisz <adam.kalisz@notnullmakers.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: Discussion of major PBS restore speedup in proxmox-backup-qemu
Date: Tue, 24 Jun 2025 12:11:28 +0200
Message-ID: <191bc676b6e091844291b2752a7b13fba0f3d4b6.camel@notnullmakers.com>
Hi Alexandre,
Yes, having configurable CONCURRENT_REQUESTS and max_blocking_threads
would be great. However, we would need to wire it up all the way to
qmrestore or similar, or ensure it is read from some environment
variables. I didn't feel confident introducing this kind of
infrastructure as a first-time contribution. Btw, in this case the
concurrency applies mostly to the fetching requests; the writer is
single-threaded and there should still be reasonable locality.
If you have a spinning-rust setup, I would be very glad if you could do
a performance test against the current implementation.
Best regards
Adam Kalisz
On Tue, 2025-06-24 at 09:09 +0000, DERUMIER, Alexandre wrote:
> Hi,
>
> nice work !
>
> Could it be possible to have an option to configure the
> CONCURRENT_REQUESTS ?
>
> (to avoid to put too much load on slow spinning storage)
>
>
>
>
> -------- Original message --------
> From: Adam Kalisz <adam.kalisz@notnullmakers.com>
> To: pve-devel@lists.proxmox.com
> Subject: Discussion of major PBS restore speedup in proxmox-backup-qemu
> Date: 23/06/2025 18:10:01
>
> Hi list,
>
> before I go through all the hoops to submit a patch I wanted to
> discuss
> the current form of the patch that can be found here:
>
> https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
>
> The speedup process was discussed here:
>
> https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
>
> The current numbers are:
>
> With the most current snapshot of a VM with 10 GiB system disk and 2x
> 100 GiB disks with random data:
>
> Original as of 1.5.1:
> 10 GiB system: duration=11.78s, speed=869.34MB/s
> 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> 100 GiB random 2: duration=422.42s, speed=242.41MB/s
>
> With the 12-way concurrent fetching:
>
> 10 GiB system: duration=2.05s, speed=4991.99MB/s
> 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
>
> The hardware is on the PVE side:
> 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x Samsung
> NVMe 3,8 TB drives in RAID10 using mdadm/ LVM-thin.
>
> On the PBS side:
> 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
> NVMe in RAID using 4 ZFS mirrors with recordsize 1M, lz4 compression.
>
> Similar or slightly better speeds were achieved on Hetzner AX52 with
> AMD Ryzen 7 7700 with 64 GB RAM and 2x 1 TB NVMe in stripe on PVE
> with
> recordsize 16k connected to another Hetzner AX52 using a 10 Gbps
> connection. The PBS has normal NVMe ZFS mirror again with recordsize
> 1M.
>
> On bigger servers a 16-way concurrency was even better on smaller
> servers with high frequency CPUs 8-way concurrency performed better.
> The 12-way concurrency is a compromise. We seem to hit a bottleneck
> somewhere in the realm of TLS connection and shallow buffers. The
> network on the 100 Gbps servers can support up to about 3 GBps
> (almost
> 20 Gbps) of traffic in a single TCP connection using mbuffer. The
> storage can keep up with such a speed.
>
> Before I submit the patch, I would also like to do the most up to
> date
> build but I have trouble updating my build environment to reflect the
> latest commits. What do I have to put in my /etc/apt/sources.list to
> be
> able to install e.g. librust-cbindgen-0.27+default-dev librust-http-
> body-util-0.1+default-dev librust-hyper-1+default-dev and all the
> rest?
>
> This work was sponsored by ČMIS s.r.o. and consulted with the General
> Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL Makers s.r.o.)
> and Linux team leader Roman Müller (ČMIS).
>
> Best regards
> Adam Kalisz
* Re: [pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
[not found] <9995c68d9c0d6e699578f5a45edb2731b5724ef1.camel@notnullmakers.com>
@ 2025-06-24 9:09 ` DERUMIER, Alexandre via pve-devel
[not found] ` <32645c96c7a1e247202d9d34e6102f08a7f08c97.camel@groupe-cyllene.com>
1 sibling, 0 replies; 8+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-06-24 9:09 UTC (permalink / raw)
To: pve-devel, adam.kalisz; +Cc: DERUMIER, Alexandre
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>, "adam.kalisz@notnullmakers.com" <adam.kalisz@notnullmakers.com>
Subject: Re: Discussion of major PBS restore speedup in proxmox-backup-qemu
Date: Tue, 24 Jun 2025 09:09:49 +0000
Message-ID: <32645c96c7a1e247202d9d34e6102f08a7f08c97.camel@groupe-cyllene.com>
Hi,
nice work!
Could it be possible to have an option to configure CONCURRENT_REQUESTS?
(to avoid putting too much load on slow spinning storage)
-------- Original message --------
From: Adam Kalisz <adam.kalisz@notnullmakers.com>
To: pve-devel@lists.proxmox.com
Subject: Discussion of major PBS restore speedup in proxmox-backup-qemu
Date: 23/06/2025 18:10:01
Hi list,
before I go through all the hoops to submit a patch I wanted to discuss
the current form of the patch that can be found here:
https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
The speedup process was discussed here:
https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
The current numbers are:
With the most current snapshot of a VM with 10 GiB system disk and 2x
100 GiB disks with random data:
Original as of 1.5.1:
10 GiB system: duration=11.78s, speed=869.34MB/s
100 GiB random 1: duration=412.85s, speed=248.03MB/s
100 GiB random 2: duration=422.42s, speed=242.41MB/s
With the 12-way concurrent fetching:
10 GiB system: duration=2.05s, speed=4991.99MB/s
100 GiB random 1: duration=100.54s, speed=1018.48MB/s
100 GiB random 2: duration=100.10s, speed=1022.97MB/s
The hardware is on the PVE side:
2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x Samsung
NVMe 3,8 TB drives in RAID10 using mdadm/ LVM-thin.
On the PBS side:
2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
NVMe in RAID using 4 ZFS mirrors with recordsize 1M, lz4 compression.
Similar or slightly better speeds were achieved on Hetzner AX52 with
AMD Ryzen 7 7700 with 64 GB RAM and 2x 1 TB NVMe in stripe on PVE with
recordsize 16k connected to another Hetzner AX52 using a 10 Gbps
connection. The PBS has normal NVMe ZFS mirror again with recordsize
1M.
On bigger servers a 16-way concurrency was even better on smaller
servers with high frequency CPUs 8-way concurrency performed better.
The 12-way concurrency is a compromise. We seem to hit a bottleneck
somewhere in the realm of TLS connection and shallow buffers. The
network on the 100 Gbps servers can support up to about 3 GBps (almost
20 Gbps) of traffic in a single TCP connection using mbuffer. The
storage can keep up with such a speed.
Before I submit the patch, I would also like to do the most up to date
build but I have trouble updating my build environment to reflect the
latest commits. What do I have to put in my /etc/apt/sources.list to be
able to install e.g. librust-cbindgen-0.27+default-dev librust-http-
body-util-0.1+default-dev librust-hyper-1+default-dev and all the rest?
This work was sponsored by ČMIS s.r.o. and consulted with the General
Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL Makers s.r.o.)
and Linux team leader Roman Müller (ČMIS).
Best regards
Adam Kalisz