From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 2A1D39A70E for ; Tue, 9 May 2023 10:47:32 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 02D5E1C7D2 for ; Tue, 9 May 2023 10:47:02 +0200 (CEST) Received: from maui.proxmox.com (unknown [94.136.29.99]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Tue, 9 May 2023 10:47:01 +0200 (CEST) Received: from maui.proxmox.com (localhost.localdomain [127.0.0.1]) by maui.proxmox.com (Proxmox) with ESMTP id 101334000A8 for ; Tue, 9 May 2023 10:47:01 +0200 (CEST) Received: from maui.proxmox.com (localhost.localdomain [127.0.0.1]) by maui.proxmox.com (Proxmox) with ESMTP id 56F0A40029E for ; Tue, 9 May 2023 10:47:00 +0200 (CEST) From: Christoph Heiss To: pve-devel@lists.proxmox.com Date: Tue, 9 May 2023 10:46:54 +0200 Message-Id: <20230509084654.666207-1-c.heiss@proxmox.com> X-Mailer: git-send-email 2.39.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL -0.498 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RDNS_NONE 0.793 Delivered to internal network by a host with no rDNS SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record T_SCC_BODY_TEXT_LINE -0.01 - URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [proxmox.com, qemumigrate.pm, qemuserver.pm] Subject: [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 May 2023 08:47:32 -0000 See [0]. The issue is that currently, all messages from qemu get directly sent to stdout by run_command(). In turn, this output was missing from mail notifications. Fix this by unifying it with the migration logging, properly capturing the output and including it in the error message, which then gets further passed on. Also has the benefit that messages from qemu are now prefixed with 'QEMU: ' in all cases, making it more uniform. As for phase2_start_local_cluster(), this now gets caught by the `errfunc`, thus the special handling can be removed. For `vzdump`, testing was done by manually setting a non-existent bridge on a network interface. For (live-)migration, a bridge was created which exists on the source node but not on the target. Both cases trigger an qemu error when it tries to start up, very similarly to the bug report. [0] https://bugzilla.proxmox.com/show_bug.cgi?id=4549 Signed-off-by: Christoph Heiss --- PVE/QemuMigrate.pm | 2 -- PVE/QemuServer.pm | 16 ++++++++-------- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index 09cc1d8..b1c28b3 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -998,8 +998,6 @@ sub phase2_start_local_cluster { my $drive = $1; my $volid = $2; $target_replicated_volumes->{$volid} = $drive; - } elsif ($line =~ m/^QEMU: (.*)$/) { - $self->log('info', "[$self->{node}] $1\n"); } }, errfunc => sub { my $line = shift; diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm index c1d0fd2..8375b2a 100644 --- a/PVE/QemuServer.pm +++ b/PVE/QemuServer.pm @@ -5885,15 +5885,9 @@ sub vm_start_nolock { timeout => $statefile ? undef : $start_timeout, umask => 0077, noerr => 1, + quiet => 1, ); - # when migrating, prefix QEMU output so other side can pick up any - # errors that might occur and show the user - if ($migratedfrom) { - $run_params{quiet} = 1; - $run_params{logfunc} = sub { print "QEMU: $_[0]\n" }; - } - my %systemd_properties = ( Slice => 'qemu.slice', KillMode => 'process', @@ -5922,13 +5916,19 @@ sub vm_start_nolock { $tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom); } + # '\r\n' needs to be used as a line separator here, as that is what run_command() splits + # lines on. If just a newline is used, all error message lines end up on a single line + # in the migration tasklog. + my $err = ''; + $run_params{logfunc} = sub { $err .= "QEMU: $1\r\n"; }; + my $exitcode = run_command($cmd, %run_params); if ($exitcode) { if ($tpmpid) { warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n"; kill 'TERM', $tpmpid; } - die "QEMU exited with code $exitcode\n"; + die "QEMU exited with code $exitcode\r\n$err"; } }; }; -- 2.39.2