* [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU
@ 2025-10-07 12:24 Fiona Ebner
2025-10-07 12:25 ` [pve-devel] [PATCH common v3 1/2] systemd: add sd_notify() helper Fiona Ebner
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-10-07 12:24 UTC (permalink / raw)
To: pve-devel
Changes in v3 (thanks to Thomas!):
* Expand commit message for sd_notify() helper.
* Use $socket->{send,shutdown} methods.
* Print $IO::Socket::errstr in case of error.
* Unset NOTIFY_SOCKET environment variable only after sending the
message.
Changes in v2:
* Dropped already applied patches.
* Introduce sd_notify() helper.
* Different approach, make the service type=notify instead of waiting
in a sleep+check-loop until the object shows up via QMP 'qom-list'.
As reported in the community forum [0], it might happen that the
dbus-vmstate object is not added (quickly enough) to the target QEMU
instance, before the migration state is loaded. This would result in
a crash of the target instance.
[0]: https://forum.proxmox.com/threads/172588/
Dependency bump qemu-server -> pve-common needed.
pve-common:
Fiona Ebner (1):
systemd: add sd_notify() helper
src/PVE/Systemd.pm | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
qemu-server:
Fiona Ebner (1):
migration: conntrack: avoid crash when dbus-vmstate object cannot be
added (quickly enough)
src/usr/dbus-vmstate | 3 +++
src/usr/pve-dbus-vmstate@.service | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
Summary over all repositories:
3 files changed, 35 insertions(+), 1 deletions(-)
--
Generated by git-murpp 0.5.0
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
* [pve-devel] [PATCH common v3 1/2] systemd: add sd_notify() helper
2025-10-07 12:24 [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU Fiona Ebner
@ 2025-10-07 12:25 ` Fiona Ebner
2025-10-28 14:58 ` Wolfgang Bumiller
2025-10-07 12:25 ` [pve-devel] [PATCH qemu-server v3 2/2] migration: conntrack: avoid crash when dbus-vmstate object cannot be added (quickly enough) Fiona Ebner
2025-10-15 8:15 ` [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU Fiona Ebner
2 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2025-10-07 12:25 UTC (permalink / raw)
To: pve-devel

Implement a pure Perl reimplementation of systemd's sd_notify() as
defined in systemd/sd-daemon.h, see also 'man 3 sd_notify'.

The initial user of this helper is intended to be the pve-dbus-vmstate
service, so it can notify startup completion only once the
dbus-vmstate QEMU object is ready to be used.

EAGAIN is not checked for, because it does not occur for blocking
Unix domain sockets, see 'man 2 send'.

Co-developed-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* Expand commit message.
* Use $socket->{send,shutdown} methods.
* Print $IO::Socket::errstr in case of error.
* Unset NOTIFY_SOCKET environment variable only after sending the
  message.

 src/PVE/Systemd.pm | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/src/PVE/Systemd.pm b/src/PVE/Systemd.pm
index e6d6f88..d0a291d 100644
--- a/src/PVE/Systemd.pm
+++ b/src/PVE/Systemd.pm
@@ -3,9 +3,12 @@ package PVE::Systemd;
 use strict;
 use warnings;
 
+use IO::Socket::UNIX;
 use Net::DBus qw(dbus_uint32 dbus_uint64 dbus_boolean);
 use Net::DBus::Callback;
 use Net::DBus::Reactor;
+use POSIX qw(EINTR);
+use Socket qw(SOCK_DGRAM);
 
 use PVE::Tools qw(file_set_contents file_get_contents trim);
 
@@ -282,4 +285,32 @@ sub write_ini {
     file_set_contents($filename, $content);
 }
 
+# This is a pure Perl reimplementation of systemd's sd_notify() as defined in systemd/sd-daemon.h
+sub sd_notify {
+    my ($unset_environment, $state) = @_;
+
+    my $socket_path = $ENV{NOTIFY_SOCKET};
+
+    my $socket = IO::Socket::UNIX->new(
+        Type => SOCK_DGRAM(),
+        Peer => $socket_path,
+    ) or die "unable to connect to socket $socket_path to notify systemd - $IO::Socket::errstr\n";
+
+    # we won't be reading from the socket
+    $socket->shutdown(SHUT_RD);
+
+    my $sent = 0;
+    my $total = length($state);
+    while ($sent < $total) {
+        my $res = $socket->send($state);
+        die "sending to $socket_path failed - $!" if !$res && $! != EINTR;
+        $sent += $res if $res;
+    }
+    $socket->flush();
+
+    close($socket);
+
+    delete($ENV{NOTIFY_SOCKET}) if $unset_environment;
+}
+
 1;
-- 
2.47.3

* Re: [pve-devel] [PATCH common v3 1/2] systemd: add sd_notify() helper
2025-10-07 12:25 ` [pve-devel] [PATCH common v3 1/2] systemd: add sd_notify() helper Fiona Ebner
@ 2025-10-28 14:58 ` Wolfgang Bumiller
2025-10-29 9:16 ` Fiona Ebner
0 siblings, 1 reply; 7+ messages in thread
From: Wolfgang Bumiller @ 2025-10-28 14:58 UTC (permalink / raw)
To: Fiona Ebner; +Cc: pve-devel

On Tue, Oct 07, 2025 at 02:25:00PM +0200, Fiona Ebner wrote:
> Implement a pure Perl reimplementation of systemd's sd_notify() as
> defined in systemd/sd-daemon.h, see also 'man 3 sd_notify'.
>
> The initial user of this helper is intended to be the pve-dbus-vmstate
> service, so it can notify startup completion only once the
> dbus-vmstate QEMU object is ready to be used.
>
> EAGAIN is not checked for, because it does not occur for blocking
> Unix domain sockets, see 'man 2 send'.
>
> Co-developed-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> Changes in v3:
> * Expand commit message.
> * Use $socket->{send,shutdown} methods.
> * Print $IO::Socket::errstr in case of error.
> * Unset NOTIFY_SOCKET environment variable only after sending the
>   message.
>
>  src/PVE/Systemd.pm | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/src/PVE/Systemd.pm b/src/PVE/Systemd.pm
> index e6d6f88..d0a291d 100644
> --- a/src/PVE/Systemd.pm
> +++ b/src/PVE/Systemd.pm
> @@ -3,9 +3,12 @@ package PVE::Systemd;
>  use strict;
>  use warnings;
>
> +use IO::Socket::UNIX;
>  use Net::DBus qw(dbus_uint32 dbus_uint64 dbus_boolean);
>  use Net::DBus::Callback;
>  use Net::DBus::Reactor;
> +use POSIX qw(EINTR);
> +use Socket qw(SOCK_DGRAM);
>
>  use PVE::Tools qw(file_set_contents file_get_contents trim);
>
> @@ -282,4 +285,32 @@ sub write_ini {
>      file_set_contents($filename, $content);
>  }
>
> +# This is a pure Perl reimplementation of systemd's sd_notify() as defined in systemd/sd-daemon.h
> +sub sd_notify {
> +    my ($unset_environment, $state) = @_;
> +
> +    my $socket_path = $ENV{NOTIFY_SOCKET};

Technically this could be an abstract socket. Should be enough to just

    $socket_path =~ s/^@/\0/;

> +
> +    my $socket = IO::Socket::UNIX->new(
> +        Type => SOCK_DGRAM(),
> +        Peer => $socket_path,
> +    ) or die "unable to connect to socket $socket_path to notify systemd - $IO::Socket::errstr\n";
> +
> +    # we won't be reading from the socket
> +    $socket->shutdown(SHUT_RD);
> +
> +    my $sent = 0;
> +    my $total = length($state);
> +    while ($sent < $total) {
> +        my $res = $socket->send($state);
> +        die "sending to $socket_path failed - $!" if !$res && $! != EINTR;
> +        $sent += $res if $res;

^ This is a datagram socket. Systemd expects a single datagram.
The code sort of makes it look like you're trying to do a `write_all()`
style send (without actually changing what is sent in between calls
which wouldn't return zero).
Trying to continue sending in a fragmented way won't work anyway.
(Otherwise it would be rather cumbersome, since the protocol also allows
adding things like file descriptors to store in the fd registry; data
and metadata need to come in one nice bundle)

The example code in the referenced man page errors out with `-EPROTO` if
the length does not match, so we could do that as well.

So basically, only an EINTR loop makes sense here.

> +    }
> +    $socket->flush();

This should not be necessary, this is not buffered I/O.

> +
> +    close($socket);
> +
> +    delete($ENV{NOTIFY_SOCKET}) if $unset_environment;

Why is this part of this function, though?

> +}
> +
>  1;
> --
> 2.47.3

* Re: [pve-devel] [PATCH common v3 1/2] systemd: add sd_notify() helper
2025-10-28 14:58 ` Wolfgang Bumiller
@ 2025-10-29 9:16 ` Fiona Ebner
0 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-10-29 9:16 UTC (permalink / raw)
To: Wolfgang Bumiller; +Cc: pve-devel

On 28.10.25 at 3:58 PM, Wolfgang Bumiller wrote:
> On Tue, Oct 07, 2025 at 02:25:00PM +0200, Fiona Ebner wrote:
>> @@ -282,4 +285,32 @@ sub write_ini {
>>      file_set_contents($filename, $content);
>>  }
>>
>> +# This is a pure Perl reimplementation of systemd's sd_notify() as defined in systemd/sd-daemon.h
>> +sub sd_notify {
>> +    my ($unset_environment, $state) = @_;
>> +
>> +    my $socket_path = $ENV{NOTIFY_SOCKET};
>
> Technically this could be an abstract socket. Should be enough to just
>
>     $socket_path =~ s/^@/\0/;

Will do in v4!

>
>> +
>> +    my $socket = IO::Socket::UNIX->new(
>> +        Type => SOCK_DGRAM(),
>> +        Peer => $socket_path,
>> +    ) or die "unable to connect to socket $socket_path to notify systemd - $IO::Socket::errstr\n";
>> +
>> +    # we won't be reading from the socket
>> +    $socket->shutdown(SHUT_RD);
>> +
>> +    my $sent = 0;
>> +    my $total = length($state);
>> +    while ($sent < $total) {
>> +        my $res = $socket->send($state);
>> +        die "sending to $socket_path failed - $!" if !$res && $! != EINTR;
>> +        $sent += $res if $res;
>
> ^ This is a datagram socket. Systemd expects a single datagram.
> The code sort of makes it look like you're trying to do a `write_all()`
> style send (without actually changing what is sent in between calls
> which wouldn't return zero).
> Trying to continue sending in a fragmented way won't work anyway.
> (Otherwise it would be rather cumbersome, since the protocol also allows
> adding things like file descriptors to store in the fd registry; data
> and metadata need to come in one nice bundle)
>
> The example code in the referenced man page errors out with `-EPROTO` if
> the length does not match, so we could do that as well.
>
> So basically, only an EINTR loop makes sense here.

Yes, I messed that up. Thank you for the explanation and suggestions!

>
>> +    }
>> +    $socket->flush();
>
> This should not be necessary, this is not buffered I/O.

Ack!

>> +
>> +    close($socket);
>> +
>> +    delete($ENV{NOTIFY_SOCKET}) if $unset_environment;
>
> Why is this part of this function, though?

Because I wanted to match the signature of the original sd_notify(), but
actually it already doesn't match because of the return value. I guess
I'll drop it for now since the examples in 'man 3 sd_notify' also don't
have it. If a future caller needs it, it could also unset the variable
itself.

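The points agreed on in this exchange (map a leading '@' to an abstract socket address, send exactly one datagram, retry only on EINTR, drop the flush() call and the environment handling) could be combined roughly as follows. This is only a sketch under stated assumptions, not the actual v4 patch: the name sd_notify_sketch, the early return when NOTIFY_SOCKET is unset, and the wording of the short-send error are made up for illustration. The demo part at the end uses a local datagram socket in a temporary directory as a stand-in for systemd's end of the notification socket.

```perl
use strict;
use warnings;

use File::Temp qw(tempdir);
use IO::Socket::UNIX;
use POSIX qw(EINTR);
use Socket qw(SOCK_DGRAM);

# Hypothetical revision of the helper discussed above, NOT the actual v4 patch.
sub sd_notify_sketch {
    my ($state) = @_;

    # Assumption: silently do nothing when not started by a service manager.
    my $socket_path = $ENV{NOTIFY_SOCKET} or return 0;

    # A leading '@' denotes an abstract socket, which is addressed with a
    # leading NUL byte instead.
    (my $peer = $socket_path) =~ s/^@/\0/;

    my $socket = IO::Socket::UNIX->new(Type => SOCK_DGRAM, Peer => $peer)
        or die "unable to connect to socket $socket_path to notify systemd - $IO::Socket::errstr\n";

    # The message must arrive as a single datagram, so only retry on EINTR.
    my $res;
    do {
        $res = $socket->send($state);
    } while (!defined($res) && $! == EINTR);

    die "sending to $socket_path failed - $!\n" if !defined($res);
    die "short send to $socket_path\n" if $res != length($state);

    close($socket);
    return 1;
}

# Demo: a local datagram socket stands in for systemd's end.
my $dir = tempdir(CLEANUP => 1);
my $path = "$dir/notify.sock";
my $receiver = IO::Socket::UNIX->new(Type => SOCK_DGRAM, Local => $path)
    or die "unable to bind $path - $IO::Socket::errstr\n";

local $ENV{NOTIFY_SOCKET} = $path;
sd_notify_sketch("READY=1\n");

my $buf = '';
$receiver->recv($buf, 4096);
print $buf;
```

Leaving the decision to unset NOTIFY_SOCKET to the caller keeps the helper free of side effects on the environment, which matches the outcome of the discussion above.
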
* [pve-devel] [PATCH qemu-server v3 2/2] migration: conntrack: avoid crash when dbus-vmstate object cannot be added (quickly enough)
2025-10-07 12:24 [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU Fiona Ebner
2025-10-07 12:25 ` [pve-devel] [PATCH common v3 1/2] systemd: add sd_notify() helper Fiona Ebner
@ 2025-10-07 12:25 ` Fiona Ebner
2025-10-15 8:15 ` [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU Fiona Ebner
2 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-10-07 12:25 UTC (permalink / raw)
To: pve-devel

As reported in the community forum [0], it might happen that the
dbus-vmstate object is not added (quickly enough) to the target QEMU
instance, before the migration state is loaded. This would result in
a crash of the target instance:

> kvm: Unknown savevm section or instance 'dbus-vmstate/dbus-vmstate'
> 0. Make sure that your current VM setup matches your saved VM setup,
> including any hotplugged devices
> kvm: load of migration failed: Invalid argument

This is after the configuration is already moved and thus there also
is no source instance running anymore.

Change the type of the 'pve-dbus-vmstate@' service to 'notify', so
that starting the service returns success only after the
'dbus-vmstate' object has been added to the QEMU instance.

[0]: https://forum.proxmox.com/threads/172588/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Dependency bump qemu-server -> pve-common needed.

No changes in v3.

 src/usr/dbus-vmstate              | 3 +++
 src/usr/pve-dbus-vmstate@.service | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/usr/dbus-vmstate b/src/usr/dbus-vmstate
index ac6f8cfb..b2baa840 100755
--- a/src/usr/dbus-vmstate
+++ b/src/usr/dbus-vmstate
@@ -15,6 +15,7 @@ use Net::DBus::Reactor;
 use PVE::QemuServer::Helpers;
 use PVE::QemuServer::QMPHelpers qw(qemu_objectadd qemu_objectdel);
 use PVE::SafeSyslog;
+use PVE::Systemd;
 use PVE::Tools;
 
 use base qw(Net::DBus::Object);
@@ -165,4 +166,6 @@ qemu_objectadd($vmid, 'pve-vmstate', 'dbus-vmstate',
     'id-list' => "pve-vmstate-$vmid",
 );
 
+PVE::Systemd::sd_notify(0, "READY=1\n");
+
 Net::DBus::Reactor->main()->run();
diff --git a/src/usr/pve-dbus-vmstate@.service b/src/usr/pve-dbus-vmstate@.service
index 56b4e285..616f6979 100644
--- a/src/usr/pve-dbus-vmstate@.service
+++ b/src/usr/pve-dbus-vmstate@.service
@@ -6,5 +6,5 @@ PartOf=%i.scope
 
 [Service]
 Slice=qemu.slice
-Type=simple
+Type=notify
 ExecStart=/usr/libexec/qemu-server/dbus-vmstate %i
-- 
2.47.3

* Re: [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU
2025-10-07 12:24 [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU Fiona Ebner
2025-10-07 12:25 ` [pve-devel] [PATCH common v3 1/2] systemd: add sd_notify() helper Fiona Ebner
2025-10-07 12:25 ` [pve-devel] [PATCH qemu-server v3 2/2] migration: conntrack: avoid crash when dbus-vmstate object cannot be added (quickly enough) Fiona Ebner
@ 2025-10-15 8:15 ` Fiona Ebner
2025-10-28 13:41 ` Fiona Ebner
2 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2025-10-15 8:15 UTC (permalink / raw)
To: pve-devel

Ping

* Re: [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU
2025-10-15 8:15 ` [pve-devel] [PATCH-SERIES common/qemu-server v3 0/2] migration: conntrack: fix race adding dbus-vmstate object to QEMU Fiona Ebner
@ 2025-10-28 13:41 ` Fiona Ebner
0 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-10-28 13:41 UTC (permalink / raw)
To: pve-devel

Ping

On 15.10.25 at 10:15 AM, Fiona Ebner wrote:
> Ping
