public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Maximiliano Sandoval <m.sandoval@proxmox.com>
To: Jakob Klocker <j.klocker@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [PATCH qemu-server v3 3/3] fix #5032: agent: sync guest time via QGA when the clock falls behind
Date: Thu, 02 Jul 2026 12:56:50 +0200	[thread overview]
Message-ID: <s8o7bnd7j7x.fsf@toolbox> (raw)
In-Reply-To: <20260702103201.164567-4-j.klocker@proxmox.com> (Jakob Klocker's message of "Thu, 2 Jul 2026 12:32:01 +0200")

Jakob Klocker <j.klocker@proxmox.com> writes:

Some comments below.

One design comment: I am not sure if the functions should have an
$override parameter, it could be argued than it is the job of the
call-site to determine whether how to proceeded based on API+config
parameters.

disclaimer: I did not check v1 and v2's discussions.

> Add a 'sync-time-on-resume' agent config property (default off) that,
> when enabled, issues the guest-set-time QGA command to set the guest
> clock to the host's current time after operations that leave the clock
> behind:
> - resuming from hibernation (suspend-to-disk)
> - resuming from pause/suspend
> - rolling back to a snapshot that includes RAM
> - live migration
> - taking a snapshot, during which the guest is briefly frozen
>
> The start, resume and rollback API endpoints gain a
> 'sync-time-on-resume' parameter (e.g. 'qm start <vmid>
> --sync-time-on-resume 0') that overrides the configured agent value for
> a single operation, forcing the sync on or off regardless of config.
>
> guest_set_time prints a confirmation line to stdout on success. During
> live migration the resume step runs over the migration tunnel, which
> also uses stdout, so an unconditional print corrupts the tunnel
> protocol. vm_resume therefore takes a 'quiet' parameter to suppress the
> print on the migration path; the confirmation is logged by the migration
> code instead.
>
> Signed-off-by: Jakob Klocker <j.klocker@proxmox.com>
> ---
> changes since v1:
> - adapt warning message in the resume-from-saved-state path
> changes since v2:
> - add sync on pause/resume
> - add override options
> - add print task 
> - adapt formating and naming
>
>  src/PVE/API2/Qemu.pm           | 46 ++++++++++++++++++++++++++---
>  src/PVE/CLI/qm.pm              |  2 +-
>  src/PVE/QemuConfig.pm          | 12 ++++++--
>  src/PVE/QemuMigrate.pm         |  2 ++
>  src/PVE/QemuServer.pm          | 18 +++++++++++-
>  src/PVE/QemuServer/Agent.pm    | 54 ++++++++++++++++++++++++++++++++++
>  src/PVE/QemuServer/RunState.pm | 11 ++++++-
>  7 files changed, 136 insertions(+), 9 deletions(-)
>
> diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
> index 9023c344..f58151e8 100644
> --- a/src/PVE/API2/Qemu.pm
> +++ b/src/PVE/API2/Qemu.pm
> @@ -3510,6 +3510,14 @@ __PACKAGE__->register_method({
>                      . ' the migration. A value of 0 means that the host_mtu parameter is to be'
>                      . ' avoided for the corresponding device.',
>              },
> +            'sync-time-on-resume' => {
> +                type => 'boolean',
> +                description =>

Please have both a `description` and a `verbose_description`. The
verbose description should mention all cases where this can happen and
if possible mention the exact QGA command used (guest-set-time).

There is a more complete verbose_description at Agent.pm below, but imho
all doc strings should be complete enough to be read on their own,
unless they reference another doc.

> +                    "Override the VM's configured 'sync-time-on-resume' agent setting for this"

This is an agent config so: "The VM QGA's configured..." or similar.

> +                    . " start only. Controls whether the guest clock is synchronized after the"
> +                    . " RAM state is restored. The guest agent must be enabled.",
> +                optional => 1,
> +            },
>          },
>      },
>      returns => {
> @@ -3525,6 +3533,7 @@ __PACKAGE__->register_method({
>          my $vmid = extract_param($param, 'vmid');
>          my $timeout = extract_param($param, 'timeout');
>          my $machine = extract_param($param, 'machine');
> +        my $sync_time = extract_param($param, 'sync-time-on-resume');
>  
>          my $get_root_param = sub {
>              my $value = extract_param($param, $_[0]);
> @@ -3631,6 +3640,7 @@ __PACKAGE__->register_method({
>                      timeout => $timeout,
>                      forcecpu => $force_cpu,
>                      'nets-host-mtu' => $nets_host_mtu,
> +                    'sync-time-on-resume' => $sync_time,
>                  };
>  
>                  PVE::QemuServer::vm_start($storecfg, $vmid, $params, $migrate_opts);
> @@ -4115,6 +4125,14 @@ __PACKAGE__->register_method({
>                  description => "Do not check whether the VM is running, used"
>                      . " internally during migration. Only root may use this option.",
>              },
> +            'sync-time-on-resume' => {
> +                type => 'boolean',
> +                description =>
> +                    "Override the VM's configured 'sync-time-on-resume' agent setting for this"
> +                    . " resume only. Controls whether the guest clock is synchronized after the"
> +                    . " RAM state is restored. The guest agent must be enabled.",

Same considerations here.

> +                optional => 1,
> +            },
>          },
>      },
>      returns => {
> @@ -4131,6 +4149,8 @@ __PACKAGE__->register_method({
>  
>          my $vmid = extract_param($param, 'vmid');
>  
> +        my $sync_time = extract_param($param, 'sync-time-on-resume');
> +
>          my $skiplock = extract_param($param, 'skiplock');
>          raise_param_exc({ skiplock => "Only root may use this option." })
>              if $skiplock && $authuser ne 'root@pam';
> @@ -4161,10 +4181,14 @@ __PACKAGE__->register_method({
>              syslog('info', "resume VM $vmid: $upid\n");
>  
>              if (!$to_disk_suspended) {
> -                PVE::QemuServer::RunState::vm_resume($vmid, $skiplock, $nocheck);
> +                PVE::QemuServer::RunState::vm_resume($vmid, $skiplock, $nocheck, $sync_time);
>              } else {
>                  my $storecfg = PVE::Storage::config();
> -                PVE::QemuServer::vm_start($storecfg, $vmid, { skiplock => $skiplock });
> +                PVE::QemuServer::vm_start(
> +                    $storecfg,
> +                    $vmid,
> +                    { skiplock => $skiplock, 'sync-time-on-resume' => $sync_time },
> +                );
>              }
>  
>              return;
> @@ -6323,6 +6347,14 @@ __PACKAGE__->register_method({
>                  optional => 1,
>                  default => 0,
>              },
> +            'sync-time-on-resume' => {
> +                type => 'boolean',
> +                description =>
> +                    "Override the VM's configured sync-time-on-resume agent setting for this"
> +                    . " rollback only. Controls whether the guest clock is synchronized"
> +                    . " after the RAM state is restored. The guest agent must be enabled.",
> +                optional => 1,

Ditto.

> +            },
>          },
>      },
>      returns => {
> @@ -6342,6 +6374,8 @@ __PACKAGE__->register_method({
>  
>          my $snapname = extract_param($param, 'snapname');
>  
> +        my $sync_time = extract_param($param, 'sync-time-on-resume');
> +
>          # vm_start is invoked directly from the worker, so its own permissions
>          # predicate doesn't fire here - check VM.PowerMgmt up front so a user
>          # with only VM.Snapshot.Rollback can't power the VM on as a side effect.
> @@ -6350,7 +6384,11 @@ __PACKAGE__->register_method({
>  
>          my $realcmd = sub {
>              PVE::Cluster::log_msg('info', $authuser, "rollback snapshot VM $vmid: $snapname");
> -            PVE::QemuConfig->snapshot_rollback($vmid, $snapname);
> +            PVE::QemuConfig->snapshot_rollback(
> +                $vmid,
> +                $snapname,
> +                { 'sync-time-on-resume' => $sync_time },
> +            );
>  
>              if ($param->{start} && !PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
>                  PVE::API2::Qemu->vm_start({ vmid => $vmid, node => $node });
> @@ -6915,7 +6953,7 @@ __PACKAGE__->register_method({
>                  },
>                  'resume' => sub {
>                      if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
> -                        PVE::QemuServer::RunState::vm_resume($state->{vmid}, 1, 1);
> +                        PVE::QemuServer::RunState::vm_resume($state->{vmid}, 1, 1, undef, 1);
>                      } else {
>                          die "VM $state->{vmid} not running\n";
>                      }
> diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
> index b903c1f1..25351f64 100755
> --- a/src/PVE/CLI/qm.pm
> +++ b/src/PVE/CLI/qm.pm
> @@ -473,7 +473,7 @@ __PACKAGE__->register_method({
>                  if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
>                      # vm_resume with nocheck, since local node might not have processed config
>                      # move/rename yet
> -                    eval { PVE::QemuServer::RunState::vm_resume($vmid, 1, 1); };
> +                    eval { PVE::QemuServer::RunState::vm_resume($vmid, 1, 1, undef, 1); };
>                      if ($@) {
>                          $tunnel_write->("ERR: resume failed - $@");
>                      } else {
> diff --git a/src/PVE/QemuConfig.pm b/src/PVE/QemuConfig.pm
> index 26f0fda2..e95a8c61 100644
> --- a/src/PVE/QemuConfig.pm
> +++ b/src/PVE/QemuConfig.pm
> @@ -383,6 +383,11 @@ sub __snapshot_create_vol_snapshots_hook {
>                      next;
>                  }
>              }
> +            my $conf = $class->load_config($vmid);
> +            if (PVE::QemuServer::Agent::should_sync_time_on_resume($conf->{agent})) {
> +                eval { PVE::QemuServer::Agent::guest_set_time($vmid); };
> +                warn "could not sync guest time after snapshot - $@" if $@;
> +            }
>          }
>      }
>  }
> @@ -444,8 +449,9 @@ sub __snapshot_rollback_hook {
>      my ($class, $vmid, $conf, $snap, $prepare, $data) = @_;
>  
>      if ($prepare) {
> -        # we save the machine of the current config
> +        # we save the machine and agent of the current config
>          $data->{oldmachine} = $conf->{machine};
> +        $data->{agent} = $conf->{agent};
>      } else {
>          # if we have a 'runningmachine' entry in the snapshot we use that
>          # for the forcemachine parameter, else we use the old logic
> @@ -509,7 +515,7 @@ sub __snapshot_rollback_vm_stop {
>  }
>  
>  sub __snapshot_rollback_vm_start {
> -    my ($class, $vmid, $vmstate, $data) = @_;
> +    my ($class, $vmid, $vmstate, $data, $opts) = @_;
>  
>      my $storecfg = PVE::Storage::config();
>      my $params = {
> @@ -517,6 +523,8 @@ sub __snapshot_rollback_vm_start {
>          forcemachine => $data->{forcemachine},
>          forcecpu => $data->{forcecpu},
>          'nets-host-mtu' => $data->{'nets-host-mtu'},
> +        agent => $data->{agent},
> +        'sync-time-on-resume' => $opts->{'sync-time-on-resume'},
>      };
>      PVE::QemuServer::vm_start($storecfg, $vmid, $params);
>  }
> diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
> index 8da6f15d..5846578d 100644
> --- a/src/PVE/QemuMigrate.pm
> +++ b/src/PVE/QemuMigrate.pm
> @@ -1770,6 +1770,8 @@ sub phase3_cleanup {
>                      $self->{errors} = 1;
>                  }
>              }
> +            $self->log('info', "synced guest clock via guest agent")
> +                if PVE::QemuServer::Agent::should_sync_time_on_resume($conf->{agent});
>          }
>  
>          if (
> diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
> index cdf66e89..e30e89e7 100644
> --- a/src/PVE/QemuServer.pm
> +++ b/src/PVE/QemuServer.pm
> @@ -5179,7 +5179,10 @@ sub vmconfig_update_agent {
>  
>      die "skip\n" if !$conf->{$opt};
>  
> -    my $hotplug_options = { fstrim_cloned_disks => 1 };
> +    my $hotplug_options = {
> +        fstrim_cloned_disks => 1,
> +        'sync-time-on-resume' => 1,
> +    };
>  
>      my $old_agent = parse_guest_agent($conf->{agent});
>      my $agent = parse_guest_agent($value);
> @@ -6006,6 +6009,19 @@ sub vm_start_nolock {
>          );
>      }
>  
> +    # $resume = suspend-to-disk, $statefile = rollback with RAM. Rollback carries the snapshot's
> +    # agent setting in params; hibernate resume falls back to the live config
> +    if ($resume || ($statefile && !$migratedfrom)) {
> +        my $agent = $params->{agent} // $conf->{agent};
> +        if (PVE::QemuServer::Agent::should_sync_time_on_resume(
> +            $agent,
> +            $params->{'sync-time-on-resume'},
> +        )) {
> +            eval { PVE::QemuServer::Agent::guest_set_time($vmid); 1 };
> +            warn "could not sync guest time after resume from saved state - $@" if $@;
> +        }
> +    }
> +
>      return $res;
>  }
>  
> diff --git a/src/PVE/QemuServer/Agent.pm b/src/PVE/QemuServer/Agent.pm
> index be6df443..77cd2bd6 100644
> --- a/src/PVE/QemuServer/Agent.pm
> +++ b/src/PVE/QemuServer/Agent.pm
> @@ -4,6 +4,7 @@ use v5.36;
>  
>  use JSON;
>  use MIME::Base64 qw(decode_base64 encode_base64);
> +use Time::HiRes qw(time);
>  
>  use PVE::JSONSchema;
>  
> @@ -52,6 +53,24 @@ our $agent_fmt = {
>          optional => 1,
>          default => 1,
>      },
> +    # FIXME: MAJOR VERSION: 10.0.0 - reconsider enabling this option as default

Unless it was already discussed that this should be done, I would say
this is a "TODO".

> +    'sync-time-on-resume' => {
> +        description => "Synchronize the guest clock through QGA after operations that can leave"

I think this should be "the guest's clock". Same in other places.

> +            . " the guest clock behind.",
> +        verbose_description =>
> +            "Whether to issue the guest-set-time QEMU guest agent command to synchronize the"
s/guest-set-time/'guest-set-time'/.
> +            . " guest clock to the host's current time after operations that can leave the guest"
> +            . " clock behind. This happens when resuming from hibernation, resuming from a paused"
> +            . " state, taking a snapshot, after a migration, and after rolling back to a snapshot"
> +            . " that includes RAM. In these cases the guest's clock may still reflect an earlier"
> +            . " time. The time is only synchronized when the QEMU Guest Agent option is enabled"
> +            . " in the guest's configuration and the agent is running inside of the guest. For"
> +            . " resume, hibernation resume and snapshot rollback, the synchronization can be"
> +            . " overridden per operation.",
> +        type => 'boolean',
> +        optional => 1,
> +        default => 0,
> +    },
>      type => {
>          description => "Select the agent type",
>          type => 'string',
> @@ -332,4 +351,39 @@ sub guest_fs_freeze_applicable($agent_str, $vmid, $logfunc = undef) {
>      return 1;
>  }
>  
> +=head3 should_sync_time_on_resume
> +
> +Returns whether the guest's clock should be synchronized to the host's via the QEMU Guest Agent
> +when the VM is resumed from saved state. Requires the guest agent to be enabled. An explicit
> +C<$override> (from an API parameter) takes precedence over the C<sync-time-on-resume> agent
> +config property when defined; otherwise the config value is used (defaulting to off). Does B<not>
> +check whether the agent is actually running.
> +
> +=cut
> +
> +sub should_sync_time_on_resume($agent_str, $override = undef) {
> +    my $agent = parse_guest_agent($agent_str);
> +    return 0 if !$agent->{enabled};
> +    return $override ? 1 : 0 if defined($override);
> +    return $agent->{'sync-time-on-resume'} // 0;
> +}
> +
> +=head3 guest_set_time
> +
> +Sets the guest's clock to the current host time via the QEMU Guest Agent's
> +C<guest-set-time> command. The host time is passed explicitly (as nanoseconds
> +since the UNIX epoch) because the argument-less form reads the time from the
> +guest's RTC, which isn't supported on all platforms (e.g. Windows); passing
> +the value directly works regardless of guest OS.
> +
> +=cut
> +
> +sub guest_set_time($vmid, $quiet = undef) {
> +    my $time_ns = Time::HiRes::time() * 1_000_000_000;
> +    my $res = PVE::QemuServer::Monitor::mon_cmd($vmid, 'guest-set-time', time => int($time_ns));
> +    check_agent_error($res, "unable to set guest time");
> +    print "synced guest clock via guest agent\n" if !$quiet;
> +    return;

Just an observation, this `return` seems unnecessary to me, however the
thaw fn also has one.

> +}
> +
>  1;
> diff --git a/src/PVE/QemuServer/RunState.pm b/src/PVE/QemuServer/RunState.pm
> index bbbcc88e..a72aed4d 100644
> --- a/src/PVE/QemuServer/RunState.pm
> +++ b/src/PVE/QemuServer/RunState.pm
> @@ -128,7 +128,7 @@ sub vm_suspend {
>  # location of the config file (source or target node) is not deterministic,
>  # since migration cannot wait for pmxcfs to process the rename
>  sub vm_resume {
> -    my ($vmid, $skiplock, $nocheck) = @_;
> +    my ($vmid, $skiplock, $nocheck, $sync_time, $quiet) = @_;
>  
>      PVE::QemuConfig->lock_config(
>          $vmid,
> @@ -180,6 +180,15 @@ sub vm_resume {
>                  if $resume_cmd eq 'cont';
>  
>              mon_cmd($vmid, $resume_cmd);
> +
> +            if (
> +                $resume_cmd eq 'cont'
> +                && PVE::QemuServer::Agent::should_sync_time_on_resume($conf->{agent},
> +                    $sync_time)
> +            ) {
> +                eval { PVE::QemuServer::Agent::guest_set_time($vmid, $quiet); };
> +                warn "could not sync guest time after resume - $@" if $@;
> +            }
>          },
>      );
>  }

-- 
Maximiliano




  reply	other threads:[~2026-07-02 10:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-02 10:31 [PATCH guest-common/manager/qemu-server v3 0/3] fix #5032: add guest time sync via QGA Jakob Klocker
2026-07-02 10:31 ` [PATCH pve-guest-common v3 1/3] AbstractConfig: allow passing options to snapshot_rollback Jakob Klocker
2026-07-02 10:32 ` [PATCH pve-manager v3 2/3] fix #5032: ui: qemu agent: add sync-time-on-resume option Jakob Klocker
2026-07-02 11:00   ` Maximiliano Sandoval
2026-07-02 10:32 ` [PATCH qemu-server v3 3/3] fix #5032: agent: sync guest time via QGA when the clock falls behind Jakob Klocker
2026-07-02 10:56   ` Maximiliano Sandoval [this message]
2026-07-02 15:14 ` [PATCH guest-common/manager/qemu-server v3 0/3] fix #5032: add guest time sync via QGA Maximiliano Sandoval
2026-07-03  6:33   ` Jakob Klocker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=s8o7bnd7j7x.fsf@toolbox \
    --to=m.sandoval@proxmox.com \
    --cc=j.klocker@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal