* Re: [pve-devel] applied: [RFC pve-qemu] disable jemalloc
  From: DERUMIER, Alexandre
  Date: 2023-03-11 13:14 UTC
  To: pve-devel

On Saturday 11 March 2023 at 10:01 +0100, Thomas Lamprecht wrote:
> Hi,
>
> On 10/03/2023 at 19:05, DERUMIER, Alexandre wrote:
> > I'm currently benching qemu again with librbd and memory allocators.
> >
> > It seems there are still performance problems with the default glibc
> > allocator: around 20-25% fewer iops and higher latency.
>
> Are those numbers compared to jemalloc or tcmalloc?

Oh sorry, tcmalloc. (I'm getting almost the same result with jemalloc,
maybe a little worse/less stable.)

> Also, a key problem with allocator tuning is that it's heavily
> dependent on the workload of each specific library (i.e., not only
> QEMU itself but also the specific block backend library).

Yes, it should mainly help librbd; I don't think it helps other storage
backends.

> > From my bench, I'm around 60k iops vs 80-90k iops with 4k randread.
> >
> > Red Hat has also noticed it.
> >
> > I know that jemalloc was buggy with the Rust lib && the PBS block
> > driver, but have you evaluated tcmalloc?
>
> Yes, for PBS once - it was way worse in how it generally worked than
> either jemalloc or the default glibc, IIRC. But I don't think I
> checked for latency, as back then we tracked down freed memory that
> the allocator did not give back to the system to how they internally
> try to keep a pool of available memory around.

I know that jemalloc can have strange effects on memory. (Ceph was
using jemalloc some years ago with this kind of side effect, and they
migrated to tcmalloc later.)

> So for latency it might be a win, but IMO I'm not too sure if the
> other effects it has are worth that.

Yes, latency is my main objective, mainly for Ceph synchronous writes
with low iodepth; they are pretty slow, so a 20% improvement is really
big.

> > Note that it's possible to load it dynamically with LD_PRELOAD,
> > so maybe we could add an option in the VM config to enable it?
>
> I'm not 100% sure if QEMU copes well with preloading it via the
> dynlinker as is, or if we need to hard-disable malloc_trim support
> for it then. As currently, with the "system" allocator (glibc),
> malloc_trim is called (semi-)periodically via call_rcu_thread - and
> at least QEMU's meson build system config disables malloc_trim for
> tcmalloc or jemalloc.
>
> Or did you already test this directly on QEMU, not just rbd bench?
> As then I'd be open to add some tuning config with an allocator
> sub-property in there to our CFGs.

I have tried it directly in qemu, with:

    my $run_qemu = sub {
        PVE::Tools::run_fork sub {

            $ENV{LD_PRELOAD} = "/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4";

            PVE::Systemd::enter_systemd_scope($vmid, "Proxmox VE VM $vmid", %systemd_properties);

I really don't know about malloc_trim; the initial discussion about it
is here:
https://patchwork.ozlabs.org/project/qemu-devel/patch/1510899814-19372-1-git-send-email-yang.zhong@intel.com/
and indeed, it's disabled when building with tcmalloc/jemalloc, but I
don't know about dynamic loading.

But I don't have any crash or segfault.
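The `$ENV{LD_PRELOAD}` change above boils down to setting the variable in the child's environment before exec. A minimal Python sketch of the same mechanism (the tcmalloc path is the one from the mail and may differ per distribution; the helper name is made up for illustration):

```python
import os
import subprocess

# Path from the mail above; an assumption that depends on the installed
# libtcmalloc package and architecture.
TCMALLOC = "/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"

def run_with_preload(cmd, preload=TCMALLOC, **kwargs):
    """Run `cmd` with `preload` injected via LD_PRELOAD into the child
    process only; the parent's environment stays untouched."""
    env = dict(os.environ)
    env["LD_PRELOAD"] = preload
    return subprocess.run(cmd, env=env, **kwargs)
```

Note that if the library path is wrong, the dynamic linker only prints a warning and runs the program with the default allocator, so a benchmark can silently measure glibc instead of tcmalloc.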
* Re: [pve-devel] applied: [RFC pve-qemu] disable jemalloc
  From: DERUMIER, Alexandre
  Date: 2023-03-13 7:17 UTC
  To: pve-devel; +Cc: t.lamprecht

I have done tests writing a small C program calling malloc_trim(0), and
it doesn't break/segfault with LD_PRELOAD tcmalloc. I don't think that
tcmalloc overrides this specific glibc function, but maybe malloc_trim
is trimming empty glibc malloc memory.

I have done 2 days of continuous fio benchmarking in a VM with the
tcmalloc preload, and I don't have any problem.

But the speed difference is really night & day: with iodepth=64 4k
randread, it's something like 85-90k iops on average (with some spikes
at 120k) vs 50k iops (with spikes to 60k iops).

If it's ok for you, I'll send a patch with something like:

vmid.conf
---------
memory_allocator: glibc|tcmalloc

and simply add the LD_PRELOAD in the systemd unit when the VM is
starting?

On Saturday 11 March 2023 at 13:14 +0000, DERUMIER, Alexandre wrote:
> On Saturday 11 March 2023 at 10:01 +0100, Thomas Lamprecht wrote:
> > Hi,
> >
> > On 10/03/2023 at 19:05, DERUMIER, Alexandre wrote:
> > > I'm currently benching qemu again with librbd and memory
> > > allocators.
> > >
> > > It seems there are still performance problems with the default
> > > glibc allocator: around 20-25% fewer iops and higher latency.
> >
> > Are those numbers compared to jemalloc or tcmalloc?
>
> Oh sorry, tcmalloc. (I'm getting almost the same result with
> jemalloc, maybe a little worse/less stable.)
>
> > Also, a key problem with allocator tuning is that it's heavily
> > dependent on the workload of each specific library (i.e., not only
> > QEMU itself but also the specific block backend library).
>
> Yes, it should mainly help librbd; I don't think it helps other
> storage backends.
>
> > > From my bench, I'm around 60k iops vs 80-90k iops with 4k
> > > randread.
> > >
> > > Red Hat has also noticed it.
> > >
> > > I know that jemalloc was buggy with the Rust lib && the PBS block
> > > driver, but have you evaluated tcmalloc?
> >
> > Yes, for PBS once - it was way worse in how it generally worked
> > than either jemalloc or the default glibc, IIRC. But I don't think
> > I checked for latency, as back then we tracked down freed memory
> > that the allocator did not give back to the system to how they
> > internally try to keep a pool of available memory around.
>
> I know that jemalloc can have strange effects on memory. (Ceph was
> using jemalloc some years ago with this kind of side effect, and
> they migrated to tcmalloc later.)
>
> > So for latency it might be a win, but IMO I'm not too sure if the
> > other effects it has are worth that.
>
> Yes, latency is my main objective, mainly for Ceph synchronous
> writes with low iodepth; they are pretty slow, so a 20% improvement
> is really big.
>
> > > Note that it's possible to load it dynamically with LD_PRELOAD,
> > > so maybe we could add an option in the VM config to enable it?
> >
> > I'm not 100% sure if QEMU copes well with preloading it via the
> > dynlinker as is, or if we need to hard-disable malloc_trim support
> > for it then. As currently, with the "system" allocator (glibc),
> > malloc_trim is called (semi-)periodically via call_rcu_thread -
> > and at least QEMU's meson build system config disables malloc_trim
> > for tcmalloc or jemalloc.
> >
> > Or did you already test this directly on QEMU, not just rbd bench?
> > As then I'd be open to add some tuning config with an allocator
> > sub-property in there to our CFGs.
>
> I have tried it directly in qemu, with:
>
>     my $run_qemu = sub {
>         PVE::Tools::run_fork sub {
>
>             $ENV{LD_PRELOAD} = "/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4";
>
>             PVE::Systemd::enter_systemd_scope($vmid, "Proxmox VE VM $vmid", %systemd_properties);
>
> I really don't know about malloc_trim; the initial discussion about
> it is here:
> https://patchwork.ozlabs.org/project/qemu-devel/patch/1510899814-19372-1-git-send-email-yang.zhong@intel.com/
> and indeed, it's disabled when building with tcmalloc/jemalloc, but
> I don't know about dynamic loading.
>
> But I don't have any crash or segfault.

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
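The small C test described above - calling malloc_trim(0) with tcmalloc preloaded - can be approximated from Python via ctypes for a quick check (a sketch assuming a glibc-based Linux; malloc_trim is a glibc extension, not portable):

```python
import ctypes

def trim_glibc_heap():
    """Call glibc's malloc_trim(0), which asks the allocator to return
    free heap memory to the OS. Returns 1 if some memory was released,
    0 otherwise. A preloaded tcmalloc does not replace this symbol, so
    the call stays harmless either way."""
    libc = ctypes.CDLL("libc.so.6")
    libc.malloc_trim.argtypes = [ctypes.c_size_t]
    libc.malloc_trim.restype = ctypes.c_int
    return libc.malloc_trim(0)
```

Running the same script twice, once plain and once under `LD_PRELOAD=.../libtcmalloc.so.4`, reproduces the no-crash observation from the mail.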
* [pve-devel] [RFC pve-qemu] disable jemalloc
  From: Stefan Reiter
  Date: 2020-12-10 15:23 UTC
  To: pve-devel

jemalloc does not play nice with our Rust library (proxmox-backup-qemu),
specifically it never releases memory allocated from Rust to the OS.
This leads to a problem with larger caches (e.g. for the PBS block
driver).

It appears to be related to this GitHub issue:
https://github.com/jemalloc/jemalloc/issues/1398

The background_thread solution seems weirdly hacky, so let's disable
jemalloc entirely for now.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---

@Alexandre: you were the one to introduce jemalloc into our QEMU builds
a long time ago - does it still provide a measurable benefit? If the
performance loss would be too great in removing it, we could maybe
figure out some workarounds as well.

Its current behaviour does seem rather broken to me though...

 debian/rules | 1 -
 1 file changed, 1 deletion(-)

diff --git a/debian/rules b/debian/rules
index c73d6a1..57e1c91 100755
--- a/debian/rules
+++ b/debian/rules
@@ -60,7 +60,6 @@ config.status: configure
 	--enable-docs \
 	--enable-glusterfs \
 	--enable-gnutls \
-	--enable-jemalloc \
 	--enable-libiscsi \
 	--enable-libusb \
 	--enable-linux-aio \
-- 
2.20.1
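For context, the background_thread workaround discussed in that GitHub issue is a jemalloc run-time tunable set through the MALLOC_CONF environment variable; roughly like this (the decay value is an arbitrary example for illustration, not a recommendation from this thread):

```
MALLOC_CONF="background_thread:true,dirty_decay_ms:1000"
```

With background threads enabled, jemalloc purges unused dirty pages asynchronously instead of holding on to them - the "weirdly hacky" part being that it needs an extra thread just to give memory back.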
* [pve-devel] applied: [RFC pve-qemu] disable jemalloc
  From: Thomas Lamprecht
  Date: 2020-12-15 13:43 UTC
  To: Proxmox VE development discussion, Stefan Reiter

On 10.12.20 16:23, Stefan Reiter wrote:
> jemalloc does not play nice with our Rust library
> (proxmox-backup-qemu), specifically it never releases memory
> allocated from Rust to the OS. This leads to a problem with larger
> caches (e.g. for the PBS block driver).
>
> It appears to be related to this GitHub issue:
> https://github.com/jemalloc/jemalloc/issues/1398
>
> The background_thread solution seems weirdly hacky, so let's disable
> jemalloc entirely for now.
>
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
>
> @Alexandre: you were the one to introduce jemalloc into our QEMU
> builds a long time ago - does it still provide a measurable benefit?
> If the performance loss would be too great in removing it, we could
> maybe figure out some workarounds as well.
>
> Its current behaviour does seem rather broken to me though...
>
>  debian/rules | 1 -
>  1 file changed, 1 deletion(-)
>

applied, thanks!
* Re: [pve-devel] applied: [RFC pve-qemu] disable jemalloc
  From: DERUMIER, Alexandre
  Date: 2023-03-10 18:05 UTC
  To: pve-devel; +Cc: t.lamprecht

Hi, sorry for bumping this old thread.

I'm currently benching qemu again with librbd and memory allocators.

It seems there are still performance problems with the default glibc
allocator: around 20-25% fewer iops and higher latency.

From my bench, I'm around 60k iops vs 80-90k iops with 4k randread.

Red Hat has also noticed it:
https://bugzilla.redhat.com/show_bug.cgi?id=1717414
https://sourceware.org/bugzilla/show_bug.cgi?id=28050

I know that jemalloc was buggy with the Rust lib && the PBS block
driver, but have you evaluated tcmalloc?

Note that it's possible to load it dynamically with LD_PRELOAD,
so maybe we could add an option in the VM config to enable it?

On Tuesday 15 December 2020 at 14:43 +0100, Thomas Lamprecht wrote:
> On 10.12.20 16:23, Stefan Reiter wrote:
> > jemalloc does not play nice with our Rust library
> > (proxmox-backup-qemu), specifically it never releases memory
> > allocated from Rust to the OS. This leads to a problem with larger
> > caches (e.g. for the PBS block driver).
> >
> > It appears to be related to this GitHub issue:
> > https://github.com/jemalloc/jemalloc/issues/1398
> >
> > The background_thread solution seems weirdly hacky, so let's
> > disable jemalloc entirely for now.
> >
> > Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> > ---
> >
> > @Alexandre: you were the one to introduce jemalloc into our QEMU
> > builds a long time ago - does it still provide a measurable
> > benefit? If the performance loss would be too great in removing
> > it, we could maybe figure out some workarounds as well.
> >
> > Its current behaviour does seem rather broken to me though...
> >
> >  debian/rules | 1 -
> >  1 file changed, 1 deletion(-)
>
> applied, thanks!
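The randread numbers quoted in this thread come from fio; a job file of this general shape reproduces the 4k random-read pattern (every value below is an assumption for illustration - it is not the actual job file used in the benchmark):

```ini
[randread-4k]
ioengine=libaio
direct=1
rw=randread
bs=4k
iodepth=64
runtime=60
time_based=1
; assumption: the virtio disk inside the guest
filename=/dev/vda
```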
* Re: [pve-devel] applied: [RFC pve-qemu] disable jemalloc
  From: Thomas Lamprecht
  Date: 2023-03-11 9:01 UTC
  To: DERUMIER, Alexandre, pve-devel

Hi,

On 10/03/2023 at 19:05, DERUMIER, Alexandre wrote:
> I'm currently benching qemu again with librbd and memory allocators.
>
> It seems there are still performance problems with the default glibc
> allocator: around 20-25% fewer iops and higher latency.

Are those numbers compared to jemalloc or tcmalloc?

Also, a key problem with allocator tuning is that it's heavily
dependent on the workload of each specific library (i.e., not only
QEMU itself but also the specific block backend library).

> From my bench, I'm around 60k iops vs 80-90k iops with 4k randread.
>
> Red Hat has also noticed it:
> https://bugzilla.redhat.com/show_bug.cgi?id=1717414
> https://sourceware.org/bugzilla/show_bug.cgi?id=28050
>
> I know that jemalloc was buggy with the Rust lib && the PBS block
> driver, but have you evaluated tcmalloc?

Yes, for PBS once - it was way worse in how it generally worked than
either jemalloc or the default glibc, IIRC. But I don't think I checked
for latency, as back then we tracked down freed memory that the
allocator did not give back to the system to how they internally try to
keep a pool of available memory around.

So for latency it might be a win, but IMO I'm not too sure if the other
effects it has are worth that.

> Note that it's possible to load it dynamically with LD_PRELOAD,
> so maybe we could add an option in the VM config to enable it?

I'm not 100% sure if QEMU copes well with preloading it via the
dynlinker as is, or if we need to hard-disable malloc_trim support for
it then. As currently, with the "system" allocator (glibc), malloc_trim
is called (semi-)periodically via call_rcu_thread - and at least QEMU's
meson build system config disables malloc_trim for tcmalloc or
jemalloc.

Or did you already test this directly on QEMU, not just rbd bench? As
then I'd be open to add some tuning config with an allocator
sub-property in there to our CFGs.
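One cheap sanity check for the preload concern above is verifying that the dynamic linker actually mapped the allocator into the QEMU process at all - a bad LD_PRELOAD path is only a warning, not an error. A sketch reading the Linux /proc maps:

```python
def library_is_mapped(libname, pid="self"):
    """Return True if a shared object whose path contains `libname`
    is mapped into the given process (Linux-only, reads /proc)."""
    with open(f"/proc/{pid}/maps") as maps:
        return any(libname in line for line in maps)

# e.g. library_is_mapped("libtcmalloc", pid=12345) against a running
# QEMU process (the PID 12345 is a placeholder) after starting the VM
```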
End of thread (newest: ~2023-03-13 7:18 UTC)

Thread overview: 5+ messages
  [not found] <1c4d80a05d8328a52b9d15e991fd4d348bce1327.camel@groupe-cyllene.com>
    2023-03-11 13:14 ` [pve-devel] applied: [RFC pve-qemu] disable jemalloc - DERUMIER, Alexandre
      2023-03-13  7:17 `   - DERUMIER, Alexandre
  2020-12-10 15:23 [pve-devel] [RFC pve-qemu] disable jemalloc - Stefan Reiter
    2020-12-15 13:43 ` [pve-devel] applied: - Thomas Lamprecht
      2023-03-10 18:05 `   - DERUMIER, Alexandre
        2023-03-11  9:01 `     - Thomas Lamprecht