From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fiona Ebner
To: pve-devel@lists.proxmox.com
Date: Thu, 9 Feb 2023 15:17:49 +0100
Message-Id: <20230209141749.607994-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [RFC-NOT-TO-BE-APPLIED qemu] log on writes to sector zero
List-Id: Proxmox VE development discussion

The idea is to make a QEMU build
with this available to users affected by bug #2874, in the hope of
catching an actual buggy write, should it happen again and should it
actually come from QEMU.

The logging in block-backend covers writes coming from the virtual disk
protocol (e.g. SATA), while the logging in block/io should cover most
writes coming from both guest and block jobs (AFAICT,
bdrv_co_pwritev_part() should cover most paths leading to writes on the
device, e.g. bdrv_co_pwrite_zeroes() calls it too).

Note that there are false positives we can't filter, because they are
valid writes to sector 0, for example:
* A drive with just an ext4 filesystem (no partitions) seems to have a
  write to the first sector on every Linux boot and shutdown.
* Any other guest write that should go to sector 0.
* Live move disk/mirror operations.
* Live restore from PBS.

In bdrv_co_pwritev_part():
False positives with qemu-img, pbs-restore, etc. are avoided by checking
the program's path, read via the /proc/self/exe link.

If there is no filename for the block driver state, nothing is printed,
to avoid false positives for the backup target (and other such special
devices). Drives on LVM(-Thin), ZFS, RBD (with and without krbd),
file-based storages and even iSCSI all seem to have the filename
property set. Sometimes the filename will be a bit lengthy, e.g. for a
Ceph storage without krbd, but it's better to still catch these:
    json:{"pool": "rbdkvm", "image": "vm-168-disk-0", "conf": "/etc/pve/ceph.conf", "driver": "rbd", "namespace": "", "user": "admin"}

backtrace_symbols() is used to get the offset relative to the binary
rather than the full address, making it easy to use addr2line afterwards
to get the file name and line number. Unfortunately, there seems to be a
slight mismatch in line numbers, but it should be enough to figure out
the call path.
(Compiling with -rdynamic would allow resolving the symbols themselves,
but would not provide line numbers either and would print only offsets
relative to the resolved symbol, making the output harder to use with
addr2line.)

Signed-off-by: Fiona Ebner
---
 block/block-backend.c | 18 ++++++++++++++++++
 block/io.c            | 20 ++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 1b563e628b..3233403d2e 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -28,6 +28,8 @@
 #include "trace.h"
 #include "migration/misc.h"
 
+#include <execinfo.h>
+
 /* Number of coroutines to reserve per attached device model */
 #define COROUTINE_POOL_RESERVATION 64
 
@@ -1625,6 +1627,22 @@ BlockAIOCB *blk_aio_pwritev(BlockBackend *blk, int64_t offset,
 {
     IO_CODE();
     assert((uint64_t)qiov->size <= INT64_MAX);
+
+    if (offset < 512) {
+        void *trace[20];
+        const char *name = blk_name(blk) ?: "unnamed";
+        int line_count = backtrace(trace, 20);
+        char **trace_lines = backtrace_symbols(trace, line_count);
+
+        warn_report("write to first sector on device '%s':", name);
+        if (trace_lines != NULL) {
+            for (int i = 0; i < line_count; i++) {
+                warn_report("%s", trace_lines[i]);
+            }
+            free(trace_lines);
+        }
+    }
+
     return blk_aio_prwv(blk, offset, qiov->size, qiov,
                         blk_aio_write_entry, flags, cb, opaque);
 }
diff --git a/block/io.c b/block/io.c
index 531b3b7a2d..2b552c4d12 100644
--- a/block/io.c
+++ b/block/io.c
@@ -38,6 +38,8 @@
 #include "qemu/main-loop.h"
 #include "sysemu/replay.h"
 
+#include <execinfo.h>
+
 /* Maximum bounce buffer for copy-on-read and write zeroes, in bytes */
 #define MAX_BOUNCE_BUFFER (32768 << BDRV_SECTOR_BITS)
 
@@ -2222,6 +2224,24 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
     bool padded = false;
     IO_CODE();
 
+    if (offset < 512) {
+        char path[PATH_MAX] = { 0 }; /* readlink() does not NUL-terminate */
+        ssize_t len = readlink("/proc/self/exe", path, sizeof(path) - 1);
+        if (len > 0 && (g_strrstr(path, "/qemu-system") || g_strrstr(path, "/kvm")) && bs->filename[0]) {
+            void *trace[20];
+            int line_count = backtrace(trace, 20);
+            char **trace_lines = backtrace_symbols(trace, line_count);
+            warn_report("write to first sector on device '%s':", bs->filename);
+
+            if (trace_lines != NULL) {
+                for (int i = 0; i < line_count; i++) {
+                    warn_report("%s", trace_lines[i]);
+                }
+                free(trace_lines);
+            }
+        }
+    }
+
     trace_bdrv_co_pwritev_part(child->bs, offset, bytes, flags);
 
     if (!bdrv_is_inserted(bs)) {
-- 
2.30.2
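
For anyone post-processing the resulting log, below is a rough sketch of
how the binary-relative offset could be pulled out of a logged
backtrace_symbols() line and turned into an addr2line invocation. It is
not part of the patch. It assumes the glibc layout
"path(+0xOFFSET) [0xADDRESS]" that is printed when no symbol can be
resolved. The helper names parse_backtrace_offset() and
format_addr2line_cmd() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the binary-relative offset from one backtrace_symbols() line.
 * Assumed layout: "path(+0xOFFSET) [0xADDRESS]" (no symbol resolved).
 * Returns true and stores the offset on success, false if the line does
 * not match, e.g. "path(symbol+0x10) [0xADDRESS]".
 */
static bool parse_backtrace_offset(const char *line,
                                   unsigned long long *offset)
{
    assert(line != NULL && offset != NULL);

    /* Use the last '(' so parentheses in the path itself are skipped. */
    const char *open = strrchr(line, '(');
    if (open == NULL || open[1] != '+') {
        return false;
    }

    const char *digits = open + 2;
    char *end = NULL;
    /* Base 16 accepts the optional "0x" prefix. */
    unsigned long long value = strtoull(digits, &end, 16);
    if (end == digits || *end != ')') {
        return false;
    }

    *offset = value;
    return true;
}

/*
 * Build an "addr2line -e BINARY 0xOFFSET" command line in buf.
 * Returns the snprintf() result, i.e. the length the full string needs.
 */
static int format_addr2line_cmd(char *buf, size_t size, const char *binary,
                                unsigned long long offset)
{
    assert(buf != NULL && binary != NULL);
    return snprintf(buf, size, "addr2line -e %s 0x%llx", binary, offset);
}
```

For a made-up log line such as "/usr/bin/kvm(+0x5b1c3f) [0x55d0c6e1ac3f]",
the first helper yields 0x5b1c3f and the second produces
"addr2line -e /usr/bin/kvm 0x5b1c3f". addr2line needs the matching debug
symbols to resolve that to a file name and line number.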