public inbox for pve-devel@lists.proxmox.com
* [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator
@ 2026-04-14  5:46 Kefu Chai
  2026-04-14  5:46 ` [PATCH v2 pve-qemu 1/2] add patch to support using " Kefu Chai
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Kefu Chai @ 2026-04-14  5:46 UTC (permalink / raw)
  To: pve-devel

Changes since v1:

* Rename patch 1/2 commit title to "add patch to support using
  tcmalloc as the memory allocator" per Fiona's suggestion, since
  this patch only adds the QEMU source patch without enabling
  tcmalloc yet.
* Rename the inner patch (0048) Subject to "PVE-Backup: support
  using tcmalloc as the memory allocator" to reflect that the code
  change is specific to the backup cleanup path.
* Add Acked-by from Fiona.

No code changes -- only titles and trailers.

Kefu Chai (2):
  add patch to support using tcmalloc as the memory allocator
  d/rules: enable tcmalloc as the memory allocator

 debian/control                                |  1 +
 ...use-tcmalloc-as-the-memory-allocator.patch | 77 +++++++++++++++++++
 debian/patches/series                         |  1 +
 debian/rules                                  |  1 +
 4 files changed, 80 insertions(+)
 create mode 100644 debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch

--
2.47.3






* [PATCH v2 pve-qemu 1/2] add patch to support using tcmalloc as the memory allocator
  2026-04-14  5:46 [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator Kefu Chai
@ 2026-04-14  5:46 ` Kefu Chai
  2026-04-14  9:53   ` Fiona Ebner
  2026-04-14  5:46 ` [PATCH v2 pve-qemu 2/2] d/rules: enable " Kefu Chai
  2026-04-14 11:08 ` [PATCH v2 pve-qemu 0/2] Re-enable " Fiona Ebner
  2 siblings, 1 reply; 7+ messages in thread
From: Kefu Chai @ 2026-04-14  5:46 UTC (permalink / raw)
  To: pve-devel

Add allocator-aware memory release in the backup completion path:
since tcmalloc does not provide glibc's malloc_trim(), use the
tcmalloc-specific MallocExtension_ReleaseFreeMemory() API instead.
This function walks tcmalloc's page heap free span lists and calls
madvise(MADV_DONTNEED) -- it does not walk allocated memory or compact
the heap, so latency impact is negligible.

Also adds a CONFIG_TCMALLOC meson define so the conditional compilation
in pve-backup.c can detect the allocator choice.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
Acked-by: Fiona Ebner <f.ebner@proxmox.com>
---
 ...use-tcmalloc-as-the-memory-allocator.patch | 77 +++++++++++++++++++
 debian/patches/series                         |  1 +
 2 files changed, 78 insertions(+)
 create mode 100644 debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch

diff --git a/debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch b/debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch
new file mode 100644
index 0000000..f8171ae
--- /dev/null
+++ b/debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch
@@ -0,0 +1,77 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Kefu Chai <k.chai@proxmox.com>
+Date: Thu, 9 Apr 2026 17:29:10 +0800
+Subject: [PATCH] PVE-Backup: support using tcmalloc as the memory allocator
+
+Use tcmalloc (from gperftools) as the memory allocator for improved
+performance with workloads that create many small, short-lived
+allocations -- particularly Ceph/librbd I/O paths.
+
+tcmalloc uses per-thread caches and size-class freelists that handle
+this allocation pattern more efficiently than glibc's allocator. Ceph
+benchmarks show ~50% IOPS improvement on 16KB random reads.
+
+Since tcmalloc does not provide glibc's malloc_trim(), use the
+tcmalloc-specific MallocExtension_ReleaseFreeMemory() API to release
+cached memory back to the OS after backup completion. This function
+walks tcmalloc's page heap free span lists and calls
+madvise(MADV_DONTNEED) -- it does not walk allocated memory or compact
+the heap, so latency impact is negligible.
+
+Historical context:
+- tcmalloc was originally enabled in 2015 but removed due to
+  performance issues with gperftools 2.2's default settings (low
+  TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES and aggressive decommit
+  disabled). These issues were resolved in gperftools 2.4+.
+- jemalloc replaced tcmalloc but was removed in 2020 because it didn't
+  release memory allocated from Rust (proxmox-backup-qemu) back to the
+  OS. The allocator-specific release API addresses this.
+- PVE 9 ships gperftools 2.16, so the old tuning issues are moot.
+
+Signed-off-by: Kefu Chai <k.chai@proxmox.com>
+---
+ meson.build  | 1 +
+ pve-backup.c | 8 ++++++++
+ 2 files changed, 9 insertions(+)
+
+diff --git a/meson.build b/meson.build
+index 0b28d2ec39..c6de2464d6 100644
+--- a/meson.build
++++ b/meson.build
+@@ -2567,6 +2567,7 @@ config_host_data.set('CONFIG_CRYPTO_SM4', crypto_sm4.found())
+ config_host_data.set('CONFIG_CRYPTO_SM3', crypto_sm3.found())
+ config_host_data.set('CONFIG_HOGWEED', hogweed.found())
+ config_host_data.set('CONFIG_MALLOC_TRIM', has_malloc_trim)
++config_host_data.set('CONFIG_TCMALLOC', get_option('malloc') == 'tcmalloc')
+ config_host_data.set('CONFIG_ZSTD', zstd.found())
+ config_host_data.set('CONFIG_QPL', qpl.found())
+ config_host_data.set('CONFIG_UADK', uadk.found())
+diff --git a/pve-backup.c b/pve-backup.c
+index ad0f8668fd..d5556f152b 100644
+--- a/pve-backup.c
++++ b/pve-backup.c
+@@ -19,6 +19,8 @@
+ 
+ #if defined(CONFIG_MALLOC_TRIM)
+ #include <malloc.h>
++#elif defined(CONFIG_TCMALLOC)
++#include <gperftools/malloc_extension_c.h>
+ #endif
+ 
+ #include <proxmox-backup-qemu.h>
+@@ -303,6 +305,12 @@ static void coroutine_fn pvebackup_co_cleanup(void)
+      * Won't happen by default if there is fragmentation.
+      */
+     malloc_trim(4 * 1024 * 1024);
++#elif defined(CONFIG_TCMALLOC)
++    /*
++     * Release free memory from tcmalloc's page cache back to the OS. This is
++     * allocator-aware and efficiently returns cached spans via madvise().
++     */
++    MallocExtension_ReleaseFreeMemory();
+ #endif
+ }
+ 
+-- 
+2.47.3
+
diff --git a/debian/patches/series b/debian/patches/series
index 8ed0c52..468df6c 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -81,3 +81,4 @@ pve/0044-PVE-backup-get-device-info-allow-caller-to-specify-f.patch
 pve/0045-PVE-backup-implement-backup-access-setup-and-teardow.patch
 pve/0046-PVE-backup-prepare-for-the-switch-to-using-blockdev-.patch
 pve/0047-savevm-async-reuse-migration-blocker-check-for-snaps.patch
+pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch
-- 
2.47.3






* [PATCH v2 pve-qemu 2/2] d/rules: enable tcmalloc as the memory allocator
  2026-04-14  5:46 [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator Kefu Chai
  2026-04-14  5:46 ` [PATCH v2 pve-qemu 1/2] add patch to support using " Kefu Chai
@ 2026-04-14  5:46 ` Kefu Chai
  2026-04-14 11:08 ` [PATCH v2 pve-qemu 0/2] Re-enable " Fiona Ebner
  2 siblings, 0 replies; 7+ messages in thread
From: Kefu Chai @ 2026-04-14  5:46 UTC (permalink / raw)
  To: pve-devel

Use tcmalloc (from gperftools) instead of glibc's allocator for
improved performance with workloads that create many small, short-lived
allocations -- particularly Ceph/librbd I/O paths. Ceph benchmarks
show ~50% IOPS improvement on 16KB random reads.

tcmalloc was originally used in 2015 but removed due to tuning issues
with gperftools 2.2. PVE 9 ships gperftools 2.16, where those issues
are long resolved.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
Acked-by: Fiona Ebner <f.ebner@proxmox.com>
---
 debian/control | 1 +
 debian/rules   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/debian/control b/debian/control
index 81cc026..a3121e1 100644
--- a/debian/control
+++ b/debian/control
@@ -14,6 +14,7 @@ Build-Depends: debhelper-compat (= 13),
                libfuse3-dev,
                libgbm-dev,
                libgnutls28-dev,
+               libgoogle-perftools-dev,
                libiscsi-dev (>= 1.12.0),
                libjpeg-dev,
                libjson-perl,
diff --git a/debian/rules b/debian/rules
index c90db29..a63e3a5 100755
--- a/debian/rules
+++ b/debian/rules
@@ -70,6 +70,7 @@ endif
 	    --enable-libusb \
 	    --enable-linux-aio \
 	    --enable-linux-io-uring \
+	    --enable-malloc=tcmalloc \
 	    --enable-numa \
 	    --enable-opengl \
 	    --enable-rbd \
-- 
2.47.3






* Re: [PATCH v2 pve-qemu 1/2] add patch to support using tcmalloc as the memory allocator
  2026-04-14  5:46 ` [PATCH v2 pve-qemu 1/2] add patch to support using " Kefu Chai
@ 2026-04-14  9:53   ` Fiona Ebner
  0 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2026-04-14  9:53 UTC (permalink / raw)
  To: Kefu Chai, pve-devel

On 14.04.26 at 7:45 AM, Kefu Chai wrote:
> Add allocator-aware memory release in the backup completion path:
> since tcmalloc does not provide glibc's malloc_trim(), use the
> tcmalloc-specific MallocExtension_ReleaseFreeMemory() API instead.
> This function walks tcmalloc's page heap free span lists and calls
> madvise(MADV_DONTNEED) -- it does not walk allocated memory or compact
> the heap, so latency impact is negligible.
> 
> Also adds a CONFIG_TCMALLOC meson define so the conditional compilation
> in pve-backup.c can detect the allocator choice.
> 
> Signed-off-by: Kefu Chai <k.chai@proxmox.com>
> Acked-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  ...use-tcmalloc-as-the-memory-allocator.patch | 77 +++++++++++++++++++
>  debian/patches/series                         |  1 +
>  2 files changed, 78 insertions(+)
>  create mode 100644 debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch

The name of the patch file does not reflect the updated commit title.

> 
> diff --git a/debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch b/debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch
> new file mode 100644
> index 0000000..f8171ae
> --- /dev/null
> +++ b/debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch
> @@ -0,0 +1,77 @@
> +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> +From: Kefu Chai <k.chai@proxmox.com>
> +Date: Thu, 9 Apr 2026 17:29:10 +0800
> +Subject: [PATCH] PVE-Backup: support using tcmalloc as the memory allocator





* Re: [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator
  2026-04-14  5:46 [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator Kefu Chai
  2026-04-14  5:46 ` [PATCH v2 pve-qemu 1/2] add patch to support using " Kefu Chai
  2026-04-14  5:46 ` [PATCH v2 pve-qemu 2/2] d/rules: enable " Kefu Chai
@ 2026-04-14 11:08 ` Fiona Ebner
  2026-04-14 15:36   ` Fiona Ebner
  2 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2026-04-14 11:08 UTC (permalink / raw)
  To: Kefu Chai, pve-devel

On 14.04.26 at 7:45 AM, Kefu Chai wrote:
> Changes since v1:
> 
> * Rename patch 1/2 commit title to "add patch to support using
>   tcmalloc as the memory allocator" per Fiona's suggestion, since
>   this patch only adds the QEMU source patch without enabling
>   tcmalloc yet.
> * Rename the inner patch (0048) Subject to "PVE-Backup: support
>   using tcmalloc as the memory allocator" to reflect that the code
>   change is specific to the backup cleanup path.
> * Add Acked-by from Fiona.
> 
> No code changes -- only titles and trailers.
> 
> Kefu Chai (2):
>   add patch to support using tcmalloc as the memory allocator
>   d/rules: enable tcmalloc as the memory allocator
> 
>  debian/control                                |  1 +
>  ...use-tcmalloc-as-the-memory-allocator.patch | 77 +++++++++++++++++++
>  debian/patches/series                         |  1 +
>  debian/rules                                  |  1 +
>  4 files changed, 80 insertions(+)
>  create mode 100644 debian/patches/pve/0048-PVE-use-tcmalloc-as-the-memory-allocator.patch
> 
> --
> 2.47.3

I ran into a segmentation fault in tc_memalign() while doing a snapshot
with RAM and RBD volumes now:

[I] root@pve9a1 ~# cat /etc/pve/qemu-server/103.conf
balloon: 4992
bios: ovmf
boot: order=scsi0;net0;ide1
cores: 6
cpu: host
efidisk0: rbd:vm-103-disk-3,efitype=4m,ms-cert=2023k,pre-enrolled-keys=1,size=1M
hotplug: disk,network,usb,memory
machine: pc-q35-10.1
memory: 7168
meta: creation-qemu=9.1.2,ctime=1736951759
name: win11-machine-ver
net0: e1000=BC:24:11:BC:50:E8,bridge=vnet0,firewall=1
net1: virtio=BC:24:11:75:BE:22,bridge=vnet0,firewall=1
numa: 1
ostype: win11
parent: pre-enroll
scsi0: rbd:vm-103-disk-1,iothread=1,size=52G
scsihw: virtio-scsi-single
smbios1: uuid=28e3302d-4489-466a-8a4c-835c79a8f2a0
sockets: 1
tpmstate0: rbd:vm-103-disk-2,size=4M,version=v2.0
unused0: rbd:vm-103-disk-0
vga: qxl
vmgenid: 6787d0c6-7ef5-498f-b9a1-7519262b661c

[I] root@pve9a1 ~# cat /etc/pve/storage.cfg | grep -A4 "rbd: rbd"
rbd: rbd
	content images,rootdir
	krbd 0
	pool rbd

[I] root@pve9a1 ~# uname -a
Linux pve9a1 7.0.0-1-rc6-pve #1 SMP PREEMPT_DYNAMIC PMX 7.0.0-1~rc6+1 (2026-03-30T09:17Z) x86_64 GNU/Linux

Note that I did play around with memory hotplug and ballooning before as
well, not sure if related.

Unfortunately, I don't have the debug symbols for librbd.so.1 right now:

> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007ea8da6442d0 in tc_memalign () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
> [Current thread is 1 (Thread 0x7ea8ca66a6c0 (LWP 109157))]
> (gdb) bt
> #0  0x00007ea8da6442d0 in tc_memalign () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
> #1  0x00007ea8da644412 in tc_posix_memalign () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
> #2  0x00007ea8da8b81e6 in ceph::buffer::v15_2_0::list::refill_append_space(unsigned int) () from /lib/librados.so.2
> #3  0x00007ea8da8b83b2 in ceph::buffer::v15_2_0::list::append(char const*, unsigned int) () from /lib/librados.so.2
> #4  0x00007ea8da897335 in librados::v14_2_0::ObjectOperation::exec(char const*, char const*, ceph::buffer::v15_2_0::list&) ()
>    from /lib/librados.so.2
> #5  0x00007ea8db03dca9 in ?? () from /lib/librbd.so.1
> #6  0x00007ea8dad6f96a in ?? () from /lib/librbd.so.1
> #7  0x00007ea8daba84c2 in ?? () from /lib/librbd.so.1
> #8  0x00007ea8dae25db4 in ?? () from /lib/librbd.so.1
> #9  0x00007ea8dae2d9e3 in ?? () from /lib/librbd.so.1
> #10 0x00007ea8dae20a54 in ?? () from /lib/librbd.so.1
> #11 0x00007ea8dacdf8c0 in ?? () from /lib/librbd.so.1
> #12 0x00007ea8dacdfc0f in ?? () from /lib/librbd.so.1
> #13 0x00007ea8dae1a788 in ?? () from /lib/librbd.so.1
> #14 0x00007ea8dae1af2f in ?? () from /lib/librbd.so.1
> #15 0x00007ea8dae1c9f6 in ?? () from /lib/librbd.so.1
> #16 0x00007ea8dae12e62 in ?? () from /lib/librbd.so.1
> #17 0x00007ea8dacdc932 in ?? () from /lib/librbd.so.1
> #18 0x00007ea8dacdcdb7 in ?? () from /lib/librbd.so.1
> #19 0x00007ea8dacdcf03 in ?? () from /lib/librbd.so.1
> #20 0x00007ea8dadcb09c in ?? () from /lib/librbd.so.1
> #21 0x00007ea8da8f2598 in ?? () from /lib/librados.so.2
> #22 0x00007ea8da8dfa71 in ?? () from /lib/librados.so.2
> #23 0x00007ea8da8f5f63 in ?? () from /lib/librados.so.2
> #24 0x00007ea8d9ce1224 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
> #25 0x00007ea8da4aab7b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #26 0x00007ea8da5287f8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6

With the main thread doing:

> Thread 12 (Thread 0x7ea8d72f2900 (LWP 109123)):
> #0  0x00007ea8da4b29ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007ea8da4a7668 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x00007ea8da4a7c8c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x00007ea8da4aa158 in pthread_cond_wait () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x00007ea8dabbc89b in ?? () from /lib/librbd.so.1
> #5  0x00007ea8dac4a9c5 in ?? () from /lib/librbd.so.1
> #6  0x00007ea8daafafbc in rbd_snap_create () from /lib/librbd.so.1
> #7  0x000055e3a051fb00 in qemu_rbd_snap_create (bs=<optimized out>, sn_info=0x55e3ddfb61c8) at ../block/rbd.c:1693
> #8  0x000055e3a0478268 in internal_snapshot_action (internal=<optimized out>, tran=tran@entry=0x55e3dd24b438, errp=errp@entry=0x7ffc43f674c8) at ../blockdev.c:1301
> #9  0x000055e3a047b67b in transaction_action (act=0x7ffc43f67540, block_job_txn=<optimized out>, tran=<optimized out>, errp=0x7ffc43f674c8) at ../blockdev.c:2177
> #10 qmp_transaction (actions=actions@entry=0x7ffc43f67550, properties=properties@entry=0x0, errp=0x7ffc43f67588, errp@entry=0x7ffc43f67528) at ../blockdev.c:2267
> #11 0x000055e3a047c0de in blockdev_do_action (action=0x7ffc43f67540, errp=0x7ffc43f67528) at ../blockdev.c:1072
> #12 qmp_blockdev_snapshot_internal_sync (device=<optimized out>, name=<optimized out>, errp=errp@entry=0x7ffc43f67588) at ../blockdev.c:1123
> #13 0x000055e3a056234b in qmp_marshal_blockdev_snapshot_internal_sync (args=<optimized out>, ret=<optimized out>, errp=0x7ea8d6a9aed0) at qapi/qapi-commands-block-core.c:2164
> #14 0x000055e3a05cd11c in do_qmp_dispatch_bh (opaque=0x7ea8d6a9aee0) at ../qapi/qmp-dispatch.c:136
> #15 0x000055e3a05ede13 in aio_bh_poll (ctx=ctx@entry=0x55e3db31c000) at ../util/async.c:219
> #16 0x000055e3a05d73cf in aio_dispatch (ctx=0x55e3db31c000) at ../util/aio-posix.c:390
> #17 0x000055e3a05edb76 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:364
> #18 0x00007ea8dcdc0385 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #19 0x00007ea8dcdc2c78 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #20 0x000055e3a05ef278 in glib_pollfds_poll () at ../util/main-loop.c:290
> #21 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:313
> #22 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:592
> #23 0x000055e3a0211680 in qemu_main_loop () at ../system/runstate.c:904
> #24 0x000055e3a0538450 in qemu_default_main (opaque=opaque@entry=0x0) at ../system/main.c:50
> #25 0x000055e39ff7eb09 in main (argc=<optimized out>, argv=<optimized out>) at ../system/main.c:93







* Re: [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator
  2026-04-14 11:08 ` [PATCH v2 pve-qemu 0/2] Re-enable " Fiona Ebner
@ 2026-04-14 15:36   ` Fiona Ebner
  2026-04-15 10:22     ` Kefu Chai
  0 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2026-04-14 15:36 UTC (permalink / raw)
  To: Kefu Chai, pve-devel

On 14.04.26 at 1:08 PM, Fiona Ebner wrote:
> Note that I did play around with memory hotplug and ballooning before as
> well, not sure if related.
> 
> Unfortunately, I don't have the debug symbols for librbd.so.1 right now:
> 
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  0x00007ea8da6442d0 in tc_memalign () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
>> [Current thread is 1 (Thread 0x7ea8ca66a6c0 (LWP 109157))]

I had added malloc_stats(); calls around
MallocExtension_ReleaseFreeMemory(); to better see the effects, which
also requires including malloc.h in pve-backup.c when building for
tcmalloc. I also did a few backups before, so I can't rule out that it's
related to that. I did a build of librbd1 and librados2 with the debug
symbols now, but haven't been able to reproduce the issue yet. Will try
more tomorrow.
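
For reference, the instrumentation described above could look roughly
like the sketch below. It is a standalone illustration, not the code
that was actually built: toy_release_with_stats() is a hypothetical
helper, and it assumes that malloc_stats() resolves to the active
allocator's implementation. Without CONFIG_TCMALLOC it falls back to the
malloc_trim() call that pve-backup.c already uses.

```c
#include <malloc.h> /* malloc_stats(); on glibc also malloc_trim() */

#if defined(CONFIG_TCMALLOC)
#include <gperftools/malloc_extension_c.h>
#endif

/* Hypothetical helper, not part of the patch. Always returns 0. */
int toy_release_with_stats(void)
{
    malloc_stats(); /* allocator statistics before the release */

#if defined(CONFIG_TCMALLOC)
    /* Hand tcmalloc's free spans back to the OS. */
    MallocExtension_ReleaseFreeMemory();
#else
    /* glibc path, same argument as the existing call in pve-backup.c. */
    malloc_trim(4 * 1024 * 1024);
#endif

    malloc_stats(); /* allocator statistics after the release */
    return 0;
}
```

Comparing the two statistics dumps shows how much memory the release
call handed back.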





* Re: [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator
  2026-04-14 15:36   ` Fiona Ebner
@ 2026-04-15 10:22     ` Kefu Chai
  0 siblings, 0 replies; 7+ messages in thread
From: Kefu Chai @ 2026-04-15 10:22 UTC (permalink / raw)
  To: Fiona Ebner, pve-devel

On Tue Apr 14, 2026 at 11:36 PM CST, Fiona Ebner wrote:
> On 14.04.26 at 1:08 PM, Fiona Ebner wrote:
>> Note that I did play around with memory hotplug and ballooning before as
>> well, not sure if related.
>> 
>> Unfortunately, I don't have the debug symbols for librbd.so.1 right now:
>> 
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  0x00007ea8da6442d0 in tc_memalign () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
>>> [Current thread is 1 (Thread 0x7ea8ca66a6c0 (LWP 109157))]

Hi Fiona,

Thank you for the backtrace.

I dug into the segfault, but was not able to reproduce it locally after
performing over 3300 snapshot operations on two RBD drives, including
concurrent and batch operations.

I also searched online to see whether others have hit this. Here is
what I found:

The crash site (SLL_Next in linked_list.h) is a known pattern in
gperftools. It's what happens when a *prior* operation corrupts a freed
block's embedded freelist pointer, and a later allocation follows the
garbage and segfaults. Essentially, tc_memalign() is the victim, not the
culprit. RHBZ #1430223 [1] and gperftools issues #1036 [2] and #1096 [3]
all describe the same crash pattern with Ceph. RHBZ #1494309 [4] is also
worth noting -- tcmalloc didn't intercept aligned_alloc() until
gperftools 2.6.1-5, causing a mixed-allocator situation where glibc
allocated but tcmalloc freed. That one's long fixed in our 2.16, but it
shows this corner of the allocator has had real bugs before.
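
To make the failure mode concrete, here is a toy sketch in C. It is not
gperftools code: the toy_* names and the 0xdeadbeef value are made up
for illustration. A free block stores the next pointer in its own first
bytes, so a stale write into a freed block goes unnoticed until a later
allocation follows that pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy intrusive freelist: each free block holds "next" in its first bytes. */
typedef struct {
    void *head;
} toy_freelist;

static void toy_push(toy_freelist *fl, void *block)
{
    memcpy(block, &fl->head, sizeof(void *)); /* block->next = head */
    fl->head = block;
}

static void *toy_pop(toy_freelist *fl)
{
    void *block = fl->head;
    if (block != NULL) {
        memcpy(&fl->head, block, sizeof(void *)); /* head = block->next */
    }
    return block;
}

/* Returns the list head that is left behind after a stale write. */
uintptr_t toy_demo_corruption(void)
{
    static unsigned char a[64], b[64];
    toy_freelist fl = { NULL };
    uintptr_t garbage = 0xdeadbeef;
    void *first;

    toy_push(&fl, a);
    toy_push(&fl, b); /* list is now: b -> a -> NULL */

    /* The actual bug: a write into b after it was freed. */
    memcpy(b, &garbage, sizeof(garbage));

    /* This pop still returns a valid block, so nothing looks wrong yet. */
    first = toy_pop(&fl);
    assert(first == (void *)b);

    /* The head is now garbage; the next pop would dereference it. */
    return (uintptr_t)fl.head;
}
```

Calling toy_demo_corruption() returns the garbage value as the new list
head. A following pop would dereference it and crash, far away from the
write that caused the damage.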

If it happens again, the most direct way to catch the corruption at its
source would probably be:

  LD_PRELOAD=libtcmalloc_debug.so.4 qemu-system-x86_64 ...

This adds guard words around allocations and checks them on free,
so it should point straight at whatever is doing the corrupting write.
It comes with 2-5x overhead, but that should be fine for debugging.
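
As a rough picture of the guard-word idea, here is a simplified sketch.
It is not the actual libtcmalloc_debug layout: the toy_* names, the
guard value and the header format are all hypothetical. The payload is
bracketed by two known values, and the free path reports when either of
them was overwritten.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical guard value and header layout, for illustration only. */
static const uint64_t TOY_GUARD = 0xA5A5A5A5A5A5A5A5ull;

typedef struct {
    size_t size;    /* payload size, needed to locate the trailing guard */
    uint64_t guard; /* leading guard word */
} toy_header;

/* Layout: [toy_header][payload of `size` bytes][trailing guard word] */
void *toy_guarded_alloc(size_t size)
{
    toy_header h = { size, TOY_GUARD };
    unsigned char *raw = malloc(sizeof(h) + size + sizeof(TOY_GUARD));

    if (raw == NULL) {
        return NULL;
    }
    memcpy(raw, &h, sizeof(h));
    memcpy(raw + sizeof(h) + size, &TOY_GUARD, sizeof(TOY_GUARD));
    return raw + sizeof(h);
}

/* Frees the block. Returns 0 if both guards are intact, -1 otherwise. */
int toy_guarded_free(void *payload)
{
    unsigned char *raw = (unsigned char *)payload - sizeof(toy_header);
    toy_header h;
    uint64_t tail;
    int ok;

    memcpy(&h, raw, sizeof(h));
    ok = (h.guard == TOY_GUARD);
    if (ok) {
        /* Only trust h.size if the leading guard is intact. */
        memcpy(&tail, raw + sizeof(h) + h.size, sizeof(tail));
        ok = (tail == TOY_GUARD);
    }
    free(raw);
    return ok ? 0 : -1;
}
```

Writing one byte past an 8-byte payload from toy_guarded_alloc(8)
changes the trailing guard, so toy_guarded_free() returns -1 for that
block instead of letting the damage surface in a later allocation.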

If you manage to reproduce it, I am more than happy to debug it with
your reproducer.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1430223
[2] https://github.com/gperftools/gperftools/issues/1036
[3] https://github.com/gperftools/gperftools/issues/1096
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1494309

>
> I had added malloc_stats(); calls around
> MallocExtension_ReleaseFreeMemory(); to better see the effects, which
> also requires including malloc.h in pve-backup.c when building for
> tcmalloc. I also did a few backups before, so I can't rule out that it's
> related to that. I did a build of librbd1 and librados2 with the debug
> symbols now, but haven't been able to reproduce the issue yet. Will try
> more tomorrow.






end of thread, other threads:[~2026-04-15 10:23 UTC | newest]

Thread overview: 7+ messages
2026-04-14  5:46 [PATCH v2 pve-qemu 0/2] Re-enable tcmalloc as the memory allocator Kefu Chai
2026-04-14  5:46 ` [PATCH v2 pve-qemu 1/2] add patch to support using " Kefu Chai
2026-04-14  9:53   ` Fiona Ebner
2026-04-14  5:46 ` [PATCH v2 pve-qemu 2/2] d/rules: enable " Kefu Chai
2026-04-14 11:08 ` [PATCH v2 pve-qemu 0/2] Re-enable " Fiona Ebner
2026-04-14 15:36   ` Fiona Ebner
2026-04-15 10:22     ` Kefu Chai
