From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate001.proxmox.com (gate001.proxmox.com [IPv6:2a0f:8001:1:32::40]) by lore.proxmox.com (Postfix) with ESMTPS id 3C4DE1FF142 for ; Fri, 03 Jul 2026 09:05:32 +0200 (CEST) Received: from gate001.proxmox.com (localhost.localdomain [127.0.0.1]) by gate001.proxmox.com (Proxmox) with ESMTP id 61075212A3; Fri, 03 Jul 2026 09:05:31 +0200 (CEST) Message-ID: <423528c4-007d-435b-b42a-445b76ecaea3@proxmox.com> Date: Fri, 3 Jul 2026 09:04:54 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Beta Subject: Re: [PATCH qemu-server v3 3/3] fix #5032: agent: sync guest time via QGA when the clock falls behind To: Maximiliano Sandoval References: <20260702103201.164567-1-j.klocker@proxmox.com> <20260702103201.164567-4-j.klocker@proxmox.com> Content-Language: en-US From: Jakob Klocker In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1783062287796 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.759 Adjusted score from AWL reputation of From: address DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment (newer systems) SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Message-ID-Hash: ZRPDXMN5GHXICILGHK6DSTHDWQTHVST7 X-Message-ID-Hash: ZRPDXMN5GHXICILGHK6DSTHDWQTHVST7 X-MailFrom: j.klocker@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: "pve-devel@lists.proxmox.com" X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox VE development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: The initial idea for the override was that you can't opt out of the time sync when resuming from hibernate, since the config is locked. To let users stay flexible, Fiona and I decided an override is a good fit here. If there's an override for hibernate, it feels natural that there would be one for rollback and resume as well. Some comments inline. On 7/2/26 12:56 PM, Maximiliano Sandoval wrote: > Jakob Klocker writes: [snip] >> + 'sync-time-on-resume' => { >> + type => 'boolean', >> + description => > > Please have both a `description` and a `verbose_description`. The > verbose description should mention all cases where this can happen and > if possible mention the exact QGA command used (guest-set-time). > > There is a more complete verbose_description at Agent.pm below, but imho > all doc strings should be complete enough to be read on their own, > unless they reference another doc. > I noticed no API property in `Qemu.pm` has a `verbose_description`, therefore I didn't add one either. Is it actually desired to have verbose descriptions for properties here? >> + "Override the VM's configured 'sync-time-on-resume' agent setting for this" > > This is an agent config so: "The VM QGA's configured..." or similar. > >> + . " start only. Controls whether the guest clock is synchronized after the" >> + . " RAM state is restored. The guest agent must be enabled.", >> + optional => 1, >> + }, >> }, >> }, >> returns => { >> @@ -3525,6 +3533,7 @@ __PACKAGE__->register_method({ >> my $vmid = extract_param($param, 'vmid'); >> my $timeout = extract_param($param, 'timeout'); >> my $machine = extract_param($param, 'machine'); >> + my $sync_time = extract_param($param, 'sync-time-on-resume'); >> >> my $get_root_param = sub { >> my $value = extract_param($param, $_[0]); >> @@ -3631,6 +3640,7 @@ __PACKAGE__->register_method({ >> timeout => $timeout, >> forcecpu => $force_cpu, >> 'nets-host-mtu' => $nets_host_mtu, >> + 'sync-time-on-resume' => $sync_time, >> }; >> >> PVE::QemuServer::vm_start($storecfg, $vmid, $params, $migrate_opts); >> @@ -4115,6 +4125,14 @@ __PACKAGE__->register_method({ >> description => "Do not check whether the VM is running, used" >> . " internally during migration. Only root may use this option.", >> }, >> + 'sync-time-on-resume' => { >> + type => 'boolean', >> + description => >> + "Override the VM's configured 'sync-time-on-resume' agent setting for this" >> + . " resume only. Controls whether the guest clock is synchronized after the" >> + . " RAM state is restored. The guest agent must be enabled.", > > Same considerations here. > >> + optional => 1, >> + }, >> }, >> }, >> returns => { >> @@ -4131,6 +4149,8 @@ __PACKAGE__->register_method({ >> >> my $vmid = extract_param($param, 'vmid'); >> >> + my $sync_time = extract_param($param, 'sync-time-on-resume'); >> + >> my $skiplock = extract_param($param, 'skiplock'); >> raise_param_exc({ skiplock => "Only root may use this option." }) >> if $skiplock && $authuser ne 'root@pam'; >> @@ -4161,10 +4181,14 @@ __PACKAGE__->register_method({ >> syslog('info', "resume VM $vmid: $upid\n"); >> >> if (!$to_disk_suspended) { >> - PVE::QemuServer::RunState::vm_resume($vmid, $skiplock, $nocheck); >> + PVE::QemuServer::RunState::vm_resume($vmid, $skiplock, $nocheck, $sync_time); >> } else { >> my $storecfg = PVE::Storage::config(); >> - PVE::QemuServer::vm_start($storecfg, $vmid, { skiplock => $skiplock }); >> + PVE::QemuServer::vm_start( >> + $storecfg, >> + $vmid, >> + { skiplock => $skiplock, 'sync-time-on-resume' => $sync_time }, >> + ); >> } >> >> return; >> @@ -6323,6 +6347,14 @@ __PACKAGE__->register_method({ >> optional => 1, >> default => 0, >> }, >> + 'sync-time-on-resume' => { >> + type => 'boolean', >> + description => >> + "Override the VM's configured sync-time-on-resume agent setting for this" >> + . " rollback only. Controls whether the guest clock is synchronized" >> + . " after the RAM state is restored. The guest agent must be enabled.", >> + optional => 1, > > Ditto. > >> + }, >> }, >> }, >> returns => { >> @@ -6342,6 +6374,8 @@ __PACKAGE__->register_method({ >> >> my $snapname = extract_param($param, 'snapname'); >> >> + my $sync_time = extract_param($param, 'sync-time-on-resume'); >> + >> # vm_start is invoked directly from the worker, so its own permissions >> # predicate doesn't fire here - check VM.PowerMgmt up front so a user >> # with only VM.Snapshot.Rollback can't power the VM on as a side effect. >> @@ -6350,7 +6384,11 @@ __PACKAGE__->register_method({ >> >> my $realcmd = sub { >> PVE::Cluster::log_msg('info', $authuser, "rollback snapshot VM $vmid: $snapname"); >> - PVE::QemuConfig->snapshot_rollback($vmid, $snapname); >> + PVE::QemuConfig->snapshot_rollback( >> + $vmid, >> + $snapname, >> + { 'sync-time-on-resume' => $sync_time }, >> + ); >> >> if ($param->{start} && !PVE::QemuServer::Helpers::vm_running_locally($vmid)) { >> PVE::API2::Qemu->vm_start({ vmid => $vmid, node => $node }); >> @@ -6915,7 +6953,7 @@ __PACKAGE__->register_method({ >> }, >> 'resume' => sub { >> if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) { >> - PVE::QemuServer::RunState::vm_resume($state->{vmid}, 1, 1); >> + PVE::QemuServer::RunState::vm_resume($state->{vmid}, 1, 1, undef, 1); >> } else { >> die "VM $state->{vmid} not running\n"; >> } >> diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm >> index b903c1f1..25351f64 100755 >> --- a/src/PVE/CLI/qm.pm >> +++ b/src/PVE/CLI/qm.pm >> @@ -473,7 +473,7 @@ __PACKAGE__->register_method({ >> if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) { >> # vm_resume with nocheck, since local node might not have processed config >> # move/rename yet >> - eval { PVE::QemuServer::RunState::vm_resume($vmid, 1, 1); }; >> + eval { PVE::QemuServer::RunState::vm_resume($vmid, 1, 1, undef, 1); }; >> if ($@) { >> $tunnel_write->("ERR: resume failed - $@"); >> } else { >> diff --git a/src/PVE/QemuConfig.pm b/src/PVE/QemuConfig.pm >> index 26f0fda2..e95a8c61 100644 >> --- a/src/PVE/QemuConfig.pm >> +++ b/src/PVE/QemuConfig.pm >> @@ -383,6 +383,11 @@ sub __snapshot_create_vol_snapshots_hook { >> next; >> } >> } >> + my $conf = $class->load_config($vmid); >> + if (PVE::QemuServer::Agent::should_sync_time_on_resume($conf->{agent})) { >> + eval { PVE::QemuServer::Agent::guest_set_time($vmid); }; >> + warn "could not sync guest time after snapshot - $@" if $@; >> + } >> } >> } >> } >> @@ -444,8 +449,9 @@ sub __snapshot_rollback_hook { >> my ($class, $vmid, $conf, $snap, $prepare, $data) = @_; >> >> if ($prepare) { >> - # we save the machine of the current config >> + # we save the machine and agent of the current config >> $data->{oldmachine} = $conf->{machine}; >> + $data->{agent} = $conf->{agent}; >> } else { >> # if we have a 'runningmachine' entry in the snapshot we use that >> # for the forcemachine parameter, else we use the old logic >> @@ -509,7 +515,7 @@ sub __snapshot_rollback_vm_stop { >> } >> >> sub __snapshot_rollback_vm_start { >> - my ($class, $vmid, $vmstate, $data) = @_; >> + my ($class, $vmid, $vmstate, $data, $opts) = @_; >> >> my $storecfg = PVE::Storage::config(); >> my $params = { >> @@ -517,6 +523,8 @@ sub __snapshot_rollback_vm_start { >> forcemachine => $data->{forcemachine}, >> forcecpu => $data->{forcecpu}, >> 'nets-host-mtu' => $data->{'nets-host-mtu'}, >> + agent => $data->{agent}, >> + 'sync-time-on-resume' => $opts->{'sync-time-on-resume'}, >> }; >> PVE::QemuServer::vm_start($storecfg, $vmid, $params); >> } >> diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm >> index 8da6f15d..5846578d 100644 >> --- a/src/PVE/QemuMigrate.pm >> +++ b/src/PVE/QemuMigrate.pm >> @@ -1770,6 +1770,8 @@ sub phase3_cleanup { >> $self->{errors} = 1; >> } >> } >> + $self->log('info', "synced guest clock via guest agent") >> + if PVE::QemuServer::Agent::should_sync_time_on_resume($conf->{agent}); >> } >> >> if ( >> diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm >> index cdf66e89..e30e89e7 100644 >> --- a/src/PVE/QemuServer.pm >> +++ b/src/PVE/QemuServer.pm >> @@ -5179,7 +5179,10 @@ sub vmconfig_update_agent { >> >> die "skip\n" if !$conf->{$opt}; >> >> - my $hotplug_options = { fstrim_cloned_disks => 1 }; >> + my $hotplug_options = { >> + fstrim_cloned_disks => 1, >> + 'sync-time-on-resume' => 1, >> + }; >> >> my $old_agent = parse_guest_agent($conf->{agent}); >> my $agent = parse_guest_agent($value); >> @@ -6006,6 +6009,19 @@ sub vm_start_nolock { >> ); >> } >> >> + # $resume = suspend-to-disk, $statefile = rollback with RAM. Rollback carries the snapshot's >> + # agent setting in params; hibernate resume falls back to the live config >> + if ($resume || ($statefile && !$migratedfrom)) { >> + my $agent = $params->{agent} // $conf->{agent}; >> + if (PVE::QemuServer::Agent::should_sync_time_on_resume( >> + $agent, >> + $params->{'sync-time-on-resume'}, >> + )) { >> + eval { PVE::QemuServer::Agent::guest_set_time($vmid); 1 }; >> + warn "could not sync guest time after resume from saved state - $@" if $@; >> + } >> + } >> + >> return $res; >> } >> >> diff --git a/src/PVE/QemuServer/Agent.pm b/src/PVE/QemuServer/Agent.pm >> index be6df443..77cd2bd6 100644 >> --- a/src/PVE/QemuServer/Agent.pm >> +++ b/src/PVE/QemuServer/Agent.pm >> @@ -4,6 +4,7 @@ use v5.36; >> >> use JSON; >> use MIME::Base64 qw(decode_base64 encode_base64); >> +use Time::HiRes qw(time); >> >> use PVE::JSONSchema; >> >> @@ -52,6 +53,24 @@ our $agent_fmt = { >> optional => 1, >> default => 1, >> }, >> + # FIXME: MAJOR VERSION: 10.0.0 - reconsider enabling this option as default > > Unless it was already discussed that this should be done, I would say > this is a "TODO". > >>From what I've seen TODO and FIXME is used in the codebase, since Fiona suggested FIXME in v2 I've sticked to it. >> + 'sync-time-on-resume' => { >> + description => "Synchronize the guest clock through QGA after operations that can leave" > > I think this should be "the guest's clock". Same in other places. This reads nicer, I'll make sure to change it. > >> + . " the guest clock behind.", >> + verbose_description => >> + "Whether to issue the guest-set-time QEMU guest agent command to synchronize the" > s/guest-set-time/'guest-set-time'/. >> + . " guest clock to the host's current time after operations that can leave the guest" >> + . " clock behind. This happens when resuming from hibernation, resuming from a paused" >> + . " state, taking a snapshot, after a migration, and after rolling back to a snapshot" >> + . " that includes RAM. In these cases the guest's clock may still reflect an earlier" >> + . " time. The time is only synchronized when the QEMU Guest Agent option is enabled" >> + . " in the guest's configuration and the agent is running inside of the guest. For" >> + . " resume, hibernation resume and snapshot rollback, the synchronization can be" >> + . " overridden per operation.", >> + type => 'boolean', >> + optional => 1, >> + default => 0, >> + }, >> type => { >> description => "Select the agent type", >> type => 'string', >> @@ -332,4 +351,39 @@ sub guest_fs_freeze_applicable($agent_str, $vmid, $logfunc = undef) { >> return 1; >> } >> >> +=head3 should_sync_time_on_resume >> + >> +Returns whether the guest's clock should be synchronized to the host's via the QEMU Guest Agent >> +when the VM is resumed from saved state. Requires the guest agent to be enabled. An explicit >> +C<$override> (from an API parameter) takes precedence over the C agent >> +config property when defined; otherwise the config value is used (defaulting to off). Does B >> +check whether the agent is actually running. >> + >> +=cut >> + >> +sub should_sync_time_on_resume($agent_str, $override = undef) { >> + my $agent = parse_guest_agent($agent_str); >> + return 0 if !$agent->{enabled}; >> + return $override ? 1 : 0 if defined($override); >> + return $agent->{'sync-time-on-resume'} // 0; >> +} >> + >> +=head3 guest_set_time >> + >> +Sets the guest's clock to the current host time via the QEMU Guest Agent's >> +C command. The host time is passed explicitly (as nanoseconds >> +since the UNIX epoch) because the argument-less form reads the time from the >> +guest's RTC, which isn't supported on all platforms (e.g. Windows); passing >> +the value directly works regardless of guest OS. >> + >> +=cut >> + >> +sub guest_set_time($vmid, $quiet = undef) { >> + my $time_ns = Time::HiRes::time() * 1_000_000_000; >> + my $res = PVE::QemuServer::Monitor::mon_cmd($vmid, 'guest-set-time', time => int($time_ns)); >> + check_agent_error($res, "unable to set guest time"); >> + print "synced guest clock via guest agent\n" if !$quiet; >> + return; > > Just an observation, this `return` seems unnecessary to me, however the > thaw fn also has one. I've added it to stay consistent with the function above, as you mentioned thaw also has one. > [snip]