public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly
@ 2023-05-09  8:46 Christoph Heiss
  2023-06-12 11:22 ` Fiona Ebner
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Heiss @ 2023-05-09  8:46 UTC (permalink / raw)
  To: pve-devel

See [0]. The issue is that currently, all messages from qemu get
directly sent to stdout by run_command(). In turn, this output was
missing from mail notifications.

Fix this by unifying it with the migration logging, properly capturing
the output and including it in the error message, which then gets
further passed on. Also has the benefit that messages from qemu are now
prefixed with 'QEMU: ' in all cases, making it more uniform.

As for phase2_start_local_cluster(), this now gets caught by the
`errfunc`, thus the special handling can be removed.

For `vzdump`, testing was done by manually setting a non-existent bridge
on a network interface. For (live-)migration, a bridge was created which
exists on the source node but not on the target. Both cases trigger an
qemu error when it tries to start up, very similarly to the bug report.

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=4549

Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
---
 PVE/QemuMigrate.pm |  2 --
 PVE/QemuServer.pm  | 16 ++++++++--------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 09cc1d8..b1c28b3 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -998,8 +998,6 @@ sub phase2_start_local_cluster {
 	    my $drive = $1;
 	    my $volid = $2;
 	    $target_replicated_volumes->{$volid} = $drive;
-	} elsif ($line =~ m/^QEMU: (.*)$/) {
-	    $self->log('info', "[$self->{node}] $1\n");
 	}
     }, errfunc => sub {
 	my $line = shift;
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index c1d0fd2..8375b2a 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5885,15 +5885,9 @@ sub vm_start_nolock {
 	timeout => $statefile ? undef : $start_timeout,
 	umask => 0077,
 	noerr => 1,
+	quiet => 1,
     );
 
-    # when migrating, prefix QEMU output so other side can pick up any
-    # errors that might occur and show the user
-    if ($migratedfrom) {
-	$run_params{quiet} = 1;
-	$run_params{logfunc} = sub { print "QEMU: $_[0]\n" };
-    }
-
     my %systemd_properties = (
 	Slice => 'qemu.slice',
 	KillMode => 'process',
@@ -5922,13 +5916,19 @@ sub vm_start_nolock {
 		$tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom);
 	    }
 
+	    # '\r\n' needs to be used as a line separator here, as that is what run_command() splits
+	    # lines on. If just a newline is used, all error message lines end up on a single line
+	    # in the migration tasklog.
+	    my $err = '';
+	    $run_params{logfunc} = sub { $err .= "QEMU: $1\r\n"; };
+
 	    my $exitcode = run_command($cmd, %run_params);
 	    if ($exitcode) {
 		if ($tpmpid) {
 		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
 		    kill 'TERM', $tpmpid;
 		}
-		die "QEMU exited with code $exitcode\n";
+		die "QEMU exited with code $exitcode\r\n$err";
 	    }
 	};
     };
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly
  2023-05-09  8:46 [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly Christoph Heiss
@ 2023-06-12 11:22 ` Fiona Ebner
  2023-06-12 11:26   ` Fiona Ebner
  2023-06-12 14:54   ` Thomas Lamprecht
  0 siblings, 2 replies; 5+ messages in thread
From: Fiona Ebner @ 2023-06-12 11:22 UTC (permalink / raw)
  To: Proxmox VE development discussion, Christoph Heiss

Am 09.05.23 um 10:46 schrieb Christoph Heiss:
> @@ -5922,13 +5916,19 @@ sub vm_start_nolock {
>  		$tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom);
>  	    }
>  
> +	    # '\r\n' needs to be used as a line separator here, as that is what run_command() splits
> +	    # lines on. If just a newline is used, all error message lines end up on a single line
> +	    # in the migration tasklog.

So this is essentially a hack to work around the fact that we don't call
print once for each line anymore ;)

> +	    my $err = '';
> +	    $run_params{logfunc} = sub { $err .= "QEMU: $1\r\n"; };

Style nit: please use $run_params->{logfunc}

So you're eating up all the log lines here...

> +
>  	    my $exitcode = run_command($cmd, %run_params);
>  	    if ($exitcode) {
>  		if ($tpmpid) {
>  		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
>  		    kill 'TERM', $tpmpid;
>  		}
> -		die "QEMU exited with code $exitcode\n";
> +		die "QEMU exited with code $exitcode\r\n$err";

...and only print them in the failure case. This means all non-critical
errors/warnings/info will never see the light of day anymore? To get a
warning for testing you can use e.g.
qm set 100 --args '-no-hpet'
with QEMU >= 8.0.

The current approach also produces a leading space in the task logs for
the follow-up lines:

> TASK ERROR: start failed: QEMU exited with code 1
>  QEMU: kvm: -no-hpet: warning: -no-hpet is deprecated, use '-machine hpet=off' instead
>  QEMU: bridge 'vmbr1' does not exist
>  QEMU: kvm: -netdev type=tap,id=net1,ifname=tap100i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on: network script /var/lib/qemu-server/pve-bridge failed with status 512

Instead of getting rid of the "print once per line" handling, can't we
rather adapt the invocation for backup to bubble up the other log lines too?

Always prefixing the messages with "QEMU: " seems fine to me (even
things like the "bridge 'vmbr1' does not exist" message technically come
from the pve-bridge script, it's still part of the QEMU invocation ;)).




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly
  2023-06-12 11:22 ` Fiona Ebner
@ 2023-06-12 11:26   ` Fiona Ebner
  2023-06-12 14:54   ` Thomas Lamprecht
  1 sibling, 0 replies; 5+ messages in thread
From: Fiona Ebner @ 2023-06-12 11:26 UTC (permalink / raw)
  To: Proxmox VE development discussion, Christoph Heiss

Am 12.06.23 um 13:22 schrieb Fiona Ebner:
> Am 09.05.23 um 10:46 schrieb Christoph Heiss:
>> +	    my $err = '';
>> +	    $run_params{logfunc} = sub { $err .= "QEMU: $1\r\n"; };
> 
> Style nit: please use $run_params->{logfunc}

Please ignore me here, didn't realize it wasn't a hash reference, but a
hash at first. I guess it wouldn't just be style otherwise :)




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly
  2023-06-12 11:22 ` Fiona Ebner
  2023-06-12 11:26   ` Fiona Ebner
@ 2023-06-12 14:54   ` Thomas Lamprecht
  2023-06-13  7:05     ` Fiona Ebner
  1 sibling, 1 reply; 5+ messages in thread
From: Thomas Lamprecht @ 2023-06-12 14:54 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner, Christoph Heiss

Am 12/06/2023 um 13:22 schrieb Fiona Ebner:
> Am 09.05.23 um 10:46 schrieb Christoph Heiss:
>> @@ -5922,13 +5916,19 @@ sub vm_start_nolock {
>>  		$tpmpid = start_swtpm($storecfg, $vmid, $tpm, $migratedfrom);
>>  	    }
>>  
>> +	    # '\r\n' needs to be used as a line separator here, as that is what run_command() splits
>> +	    # lines on. If just a newline is used, all error message lines end up on a single line
>> +	    # in the migration tasklog.
> 
> So this is essentially a hack to work around the fact that we don't call
> print once for each line anymore ;)
> 
>> +	    my $err = '';
>> +	    $run_params{logfunc} = sub { $err .= "QEMU: $1\r\n"; };
> 
> Style nit: please use $run_params->{logfunc}

does not work here as this is an actual hash value, not the (for our perl code more common)
hash ref.

my %foo = (a => 1);
$foo{b} = 2;

vs.

my $foo = {a => 1};
$foo->{b} = 2;

> 
> So you're eating up all the log lines here...
> 
>> +
>>  	    my $exitcode = run_command($cmd, %run_params);
>>  	    if ($exitcode) {
>>  		if ($tpmpid) {
>>  		    warn "stopping swtpm instance (pid $tpmpid) due to QEMU startup error\n";
>>  		    kill 'TERM', $tpmpid;
>>  		}
>> -		die "QEMU exited with code $exitcode\n";
>> +		die "QEMU exited with code $exitcode\r\n$err";
> 
> ...and only print them in the failure case. This means all non-critical
> errors/warnings/info will never see the light of day anymore? To get a
> warning for testing you can use e.g.
> qm set 100 --args '-no-hpet'
> with QEMU >= 8.0.
> 
> The current approach also produces a leading space in the task logs for
> the follow-up lines:
> 
>> TASK ERROR: start failed: QEMU exited with code 1
>>  QEMU: kvm: -no-hpet: warning: -no-hpet is deprecated, use '-machine hpet=off' instead
>>  QEMU: bridge 'vmbr1' does not exist
>>  QEMU: kvm: -netdev type=tap,id=net1,ifname=tap100i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on: network script /var/lib/qemu-server/pve-bridge failed with status 512
> 
> Instead of getting rid of the "print once per line" handling, can't we
> rather adapt the invocation for backup to bubble up the other log lines too?

would it make sense to differ between stderr/stdour and do log_warn and print,
respectively?

> 
> Always prefixing the messages with "QEMU: " seems fine to me (even
> things like the "bridge 'vmbr1' does not exist" message technically come
> from the pve-bridge script, it's still part of the QEMU invocation ;)).

not sure if we really need a prefix at all here..




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly
  2023-06-12 14:54   ` Thomas Lamprecht
@ 2023-06-13  7:05     ` Fiona Ebner
  0 siblings, 0 replies; 5+ messages in thread
From: Fiona Ebner @ 2023-06-13  7:05 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox VE development discussion, Christoph Heiss

Am 12.06.23 um 16:54 schrieb Thomas Lamprecht:
> Am 12/06/2023 um 13:22 schrieb Fiona Ebner:
>> Instead of getting rid of the "print once per line" handling, can't we
>> rather adapt the invocation for backup to bubble up the other log lines too?
> 
> would it make sense to differ between stderr/stdour and do log_warn and print,
> respectively?
> 

Could also be done, but from a quick look in the code, QEMU also prints
its info_report() to stderr. Could lead to a bit of over-reporting as
warnings in our task logs, but might be fine for most cases; there's not
too many info_report instances.

>>
>> Always prefixing the messages with "QEMU: " seems fine to me (even
>> things like the "bridge 'vmbr1' does not exist" message technically come
>> from the pve-bridge script, it's still part of the QEMU invocation ;)).
> 
> not sure if we really need a prefix at all here..

Well, the migration code currently expects it and it helps to recognize
where the message comes from in complex logs like migration (and after
this bug is fixed, backup).




^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-06-13  7:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-09  8:46 [pve-devel] [PATCH qemu-server] fix #4549: capture and bubble up qemu errors correctly Christoph Heiss
2023-06-12 11:22 ` Fiona Ebner
2023-06-12 11:26   ` Fiona Ebner
2023-06-12 14:54   ` Thomas Lamprecht
2023-06-13  7:05     ` Fiona Ebner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal