From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id A999173454 for ; Thu, 7 Oct 2021 10:27:37 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 9D00016633 for ; Thu, 7 Oct 2021 10:27:37 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id CDBFF16612 for ; Thu, 7 Oct 2021 10:27:33 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id DF15D458D2; Thu, 7 Oct 2021 10:27:28 +0200 (CEST) From: Dominik Csapak To: pve-devel@lists.proxmox.com Cc: Thomas Lamprecht Date: Thu, 7 Oct 2021 10:27:21 +0200 Message-Id: <20211007082727.1385888-3-d.csapak@proxmox.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211007082727.1385888-1-d.csapak@proxmox.com> References: <20211007082727.1385888-1-d.csapak@proxmox.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL -0.709 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment LONGWORDS 2.035 Long string of long words SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [pvescheduler.pid, n.sh, timers.target, spiceproxy.pm, pvedaemon.pm, multi-user.target, pve-storage.target, pvescheduler.pm, pveproxy.pm, defines.mk, pve-doc-generator.mk, pvestatd.pm] Subject: [pve-devel] [PATCH manager 1/7] replace systemd timer with pvescheduler daemon X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Oct 2021 08:27:37 -0000 From: Thomas Lamprecht The whole thing is already prepared for this, the systemd timer was just a fixed periodic timer with a frequency of one minute. And we just introduced it as the assumption was made that less memory usage would be generated with this approach, AFAIK. But logging 4+ lines just about that the timer was started, even if it does nothing, and that 24/7 is not to cheap and a bit annoying. So in a first step add a simple daemon, which forks of a child for running jobs once a minute. This could be made still a bit more intelligent, i.e., look if we have jobs tor run before forking - as forking is not the cheapest syscall. Further, we could adapt the sleep interval to the next time we actually need to run a job (and sending a SIGUSR to the daemon if a job interval changes such, that this interval got narrower) We try to sync running on minute-change boundaries at start, this emulates systemd.timer behaviour, we had until now. Also user can configure jobs on minute precision, so they probably expect that those also start really close to a minute change event. Could be adapted to resync during running, to factor in time drift. But, as long as enough cpu cycles are available we run in correct monotonic intervalls, so this isn't a must, IMO. Another improvement could be locking a bit more fine grained, i.e. not on a per-all-local-job-runs basis, but per-job (per-guest?) basis, which would improve temporary starvement of small high-periodic jobs through big, less peridoci jobs. We argued that it's the user fault if such situations arise, but they can evolve over time without noticing, especially in compolexer setups. Signed-off-by: Thomas Lamprecht Signed-off-by: Dominik Csapak --- PVE/Service/Makefile | 2 +- PVE/Service/pvescheduler.pm | 102 ++++++++++++++++++++++++++++++++++ bin/Makefile | 6 +- bin/pvescheduler | 28 ++++++++++ debian/postinst | 3 +- services/Makefile | 3 +- services/pvescheduler.service | 16 ++++++ services/pvesr.service | 8 --- services/pvesr.timer | 12 ---- 9 files changed, 155 insertions(+), 25 deletions(-) create mode 100644 PVE/Service/pvescheduler.pm create mode 100755 bin/pvescheduler create mode 100644 services/pvescheduler.service delete mode 100644 services/pvesr.service delete mode 100644 services/pvesr.timer diff --git a/PVE/Service/Makefile b/PVE/Service/Makefile index fc1cdb14..f66ceb81 100644 --- a/PVE/Service/Makefile +++ b/PVE/Service/Makefile @@ -1,6 +1,6 @@ include ../../defines.mk -SOURCES=pvestatd.pm pveproxy.pm pvedaemon.pm spiceproxy.pm +SOURCES=pvestatd.pm pveproxy.pm pvedaemon.pm spiceproxy.pm pvescheduler.pm all: diff --git a/PVE/Service/pvescheduler.pm b/PVE/Service/pvescheduler.pm new file mode 100644 index 00000000..ce55c45a --- /dev/null +++ b/PVE/Service/pvescheduler.pm @@ -0,0 +1,102 @@ +package PVE::Service::pvescheduler; + +use strict; +use warnings; + +use POSIX qw(WNOHANG); +use PVE::SafeSyslog; +use PVE::API2::Replication; + +use PVE::Daemon; +use base qw(PVE::Daemon); + +my $cmdline = [$0, @ARGV]; +my %daemon_options = (stop_wait_time => 180, max_workers => 0); +my $daemon = __PACKAGE__->new('pvescheduler', $cmdline, %daemon_options); + +my $finish_jobs = sub { + my ($self) = @_; + foreach my $cpid (keys %{$self->{jobs}}) { + my $waitpid = waitpid($cpid, WNOHANG); + if (defined($waitpid) && ($waitpid == $cpid)) { + delete ($self->{jobs}->{$cpid}); + } + } +}; + +sub run { + my ($self) = @_; + + my $jobs= {}; + $self->{jobs} = $jobs; + + my $old_sig_chld = $SIG{CHLD}; + local $SIG{CHLD} = sub { + local ($@, $!, $?); # do not overwrite error vars + $finish_jobs->($self); + $old_sig_chld->(@_) if $old_sig_chld; + }; + + my $logfunc = sub { syslog('info', $_[0]) }; + + my $run_jobs = sub { + my $child = fork(); + if (!defined($child)) { + die "fork failed: $!\n"; + } elsif ($child == 0) { + $self->after_fork_cleanup(); + PVE::API2::Replication::run_jobs(undef, $logfunc, 0, 1); + POSIX::_exit(0); + } + + $jobs->{$child} = 1; + }; + + # try to run near minute boundaries, makes more sense to the user as he + # configures jobs with minute precision + my ($current_seconds) = localtime; + sleep(60 - $current_seconds) if (60 - $current_seconds >= 5); + + for (;;) { + last if $self->{shutdown_request}; + + $run_jobs->(); + + my $sleep_time = 60; + my $slept = 0; # SIGCHLD interrupts sleep, so we need to keep track + while ($slept < $sleep_time) { + last if $self->{shutdown_request}; + $slept += sleep($sleep_time - $slept); + } + } + + # jobs have a lock timeout of 60s, wait a bit more for graceful termination + my $timeout = 0; + while (keys %$jobs > 0 && $timeout < 75) { + kill 'TERM', keys %$jobs; + $timeout += sleep(5); + } + # ensure the rest gets stopped + kill 'KILL', keys %$jobs if (keys %$jobs > 0); +} + +sub shutdown { + my ($self) = @_; + + syslog('info', 'got shutdown request, signal running jobs to stop'); + + kill 'TERM', keys %{$self->{jobs}}; + $self->{shutdown_request} = 1; +} + +$daemon->register_start_command(); +$daemon->register_stop_command(); +$daemon->register_status_command(); + +our $cmddef = { + start => [ __PACKAGE__, 'start', []], + stop => [ __PACKAGE__, 'stop', []], + status => [ __PACKAGE__, 'status', [], undef, sub { print shift . "\n";} ], +}; + +1; diff --git a/bin/Makefile b/bin/Makefile index df30b27c..12cb4671 100644 --- a/bin/Makefile +++ b/bin/Makefile @@ -6,7 +6,7 @@ export NOVIEW=1 PERL_DOC_INC_DIRS=.. include /usr/share/pve-doc-generator/pve-doc-generator.mk -SERVICES = pvestatd pveproxy pvedaemon spiceproxy +SERVICES = pvestatd pveproxy pvedaemon spiceproxy pvescheduler CLITOOLS = vzdump pvesubscription pveceph pveam pvesr pvenode pvesh pve6to7 SCRIPTS = \ @@ -52,6 +52,10 @@ pve6to7.1: printf ".TH PVE6TO7 1\n.SH NAME\npve6to7 \- Proxmox VE upgrade checker script for 6.4 to 7.x\n" > $@ printf ".SH SYNOPSIS\npve6to7 [--full]\n" >> $@ +pvescheduler.8: + # FIXME: add to doc-generator + echo ".TH pvescheduler 8" > $@ + pveversion.1.pod: pveversion pveupgrade.1.pod: pveupgrade pvereport.1.pod: pvereport diff --git a/bin/pvescheduler b/bin/pvescheduler new file mode 100755 index 00000000..5f2bde52 --- /dev/null +++ b/bin/pvescheduler @@ -0,0 +1,28 @@ +#!/usr/bin/perl + +use strict; +use warnings; + +use PVE::Service::pvescheduler; + +use PVE::RPCEnvironment; +use PVE::SafeSyslog; + +$SIG{'__WARN__'} = sub { + my $err = $@; + my $t = $_[0]; + chomp $t; + print STDERR "$t\n"; + syslog('warning', "%s", $t); + $@ = $err; +}; + +my $prepare = sub { + my $rpcenv = PVE::RPCEnvironment->init('priv'); + + $rpcenv->init_request(); + $rpcenv->set_language($ENV{LANG}); + $rpcenv->set_user('root@pam'); +}; + +PVE::Service::pvescheduler->run_cli_handler(prepare => $prepare); diff --git a/debian/postinst b/debian/postinst index aed4da3f..f6407178 100755 --- a/debian/postinst +++ b/debian/postinst @@ -79,6 +79,7 @@ case "$1" in deb-systemd-invoke reload-or-try-restart pvestatd.service deb-systemd-invoke reload-or-try-restart pveproxy.service deb-systemd-invoke reload-or-try-restart spiceproxy.service + deb-systemd-invoke reload-or-try-restart pvescheduler.service exit 0;; @@ -102,7 +103,7 @@ case "$1" in # same as dh_systemd_enable (code copied) - UNITS="pvedaemon.service pveproxy.service spiceproxy.service pvestatd.service pvebanner.service pvesr.timer pve-daily-update.timer" + UNITS="pvedaemon.service pveproxy.service spiceproxy.service pvestatd.service pvebanner.service pvescheduler.service pve-daily-update.timer" NO_RESTART_UNITS="pvenetcommit.service pve-guests.service" for unit in ${UNITS} ${NO_RESTART_UNITS}; do diff --git a/services/Makefile b/services/Makefile index a3d04a2f..b46c7119 100644 --- a/services/Makefile +++ b/services/Makefile @@ -13,8 +13,7 @@ SERVICES= \ pve-storage.target \ pve-daily-update.service\ pve-daily-update.timer \ - pvesr.service \ - pvesr.timer + pvescheduler.service .PHONY: install install: ${SERVICES} diff --git a/services/pvescheduler.service b/services/pvescheduler.service new file mode 100644 index 00000000..11769e80 --- /dev/null +++ b/services/pvescheduler.service @@ -0,0 +1,16 @@ +[Unit] +Description=Proxmox VE scheduler +ConditionPathExists=/usr/bin/pvescheduler +Wants=pve-cluster.service +After=pve-cluster.service +After=pve-storage.target + +[Service] +ExecStart=/usr/bin/pvescheduler start +ExecStop=/usr/bin/pvescheduler stop +PIDFile=/var/run/pvescheduler.pid +KillMode=process +Type=forking + +[Install] +WantedBy=multi-user.target diff --git a/services/pvesr.service b/services/pvesr.service deleted file mode 100644 index cbaed1ca..00000000 --- a/services/pvesr.service +++ /dev/null @@ -1,8 +0,0 @@ -[Unit] -Description=Proxmox VE replication runner -ConditionPathExists=/usr/bin/pvesr -After=pve-cluster.service - -[Service] -Type=oneshot -ExecStart=/usr/bin/pvesr run --mail 1 diff --git a/services/pvesr.timer b/services/pvesr.timer deleted file mode 100644 index 01d7b9c7..00000000 --- a/services/pvesr.timer +++ /dev/null @@ -1,12 +0,0 @@ -[Unit] -Description=Proxmox VE replication runner - -[Timer] -AccuracySec=1 -RemainAfterElapse=no - -[Timer] -OnCalendar=minutely - -[Install] -WantedBy=timers.target \ No newline at end of file -- 2.30.2