public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
@ 2021-11-08 13:07 Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH cluster v2 1/1] add 'jobs.cfg' to observed files Dominik Csapak
                   ` (10 more replies)
  0 siblings, 11 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel

With this series, we implement a new daemon (pvescheduler) that takes
over from pvesr's systemd timer (original patch from Thomas [0]) and
extends it with a generic job handling mechanism.

Then I convert the vzdump cron jobs to these jobs; the immediate
gain is that users can use calendar-event schedules instead of
dow + starttime.
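
(For example, a schedule like 'mon,tue,wed,thu,fri 02:30' or simply
'21:00'; these two are just what the dow + starttime conversion itself
produces, and the calendar-event format allows more than that.)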

For now, this 'jobs.cfg' only handles vzdump jobs, but it should be easily
extendable to other types of recurring jobs (like auth realm sync, etc.).
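
For illustration, a vzdump entry in jobs.cfg could look roughly like
this (only a sketch with made-up values; the exact property set comes
from the PVE::Jobs::VZDump plugin):

    vzdump: backup-daily
        schedule mon,tue,wed,thu,fri 02:30
        storage local
        vmid 100,101
        enabled 1
        compress zstd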

Also, I did not yet convert the replication jobs to this job system,
but that could probably be done without too much effort (though
I did not look too deeply into it).

If some version of this gets applied, the further plan would be
to remove the vzdump.cron handling completely with 8.0, but until then
we must at least list/parse it.

What's currently missing, but not too hard to add, is a calculated
'next-run' column in the GUI.
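
A rough sketch of how that could be computed, reusing the same helpers
that run_jobs() already uses (here $id, $type and $schedule are assumed
to come from the parsed jobs.cfg entry):

    # sketch only: compute a 'next-run' timestamp for one job
    my $last_run = PVE::Jobs::get_last_runtime($id, $type);
    my $calspec = PVE::CalendarEvent::parse_calendar_event($schedule);
    my $next_run = PVE::CalendarEvent::compute_next_event($calspec, $last_run) // 0;
    # $next_run is a unix timestamp that the GUI could render as a date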

changes from v1:
* do not log replication into the syslog
* readjust the loop to start at the full minute every 1000 loops
* rework the job state locking/handling:
  - introduce a new 'starting' state that is set before we start the
    job, and set it to 'started' after the start.
    We sadly cannot start the job while we hold the lock, since the open
    file descriptor would still be open in the worker, and then we could
    not get the flock again. It is now modeled more after how we do
    long-running qm/ct locks (by writing 'starting' into the state while
    holding the lock).
  - the stop check is now its own call at the beginning of the job handling
  - handle created/removed jobs properly:
    I did not think of state handling on other nodes in my previous
    iteration. Now, on every loop, I sync the state files with the config
    (create/remove) so that the files get created/removed on all nodes.
* incorporated Fabian's feedback for the API (thanks!)

0: https://lists.proxmox.com/pipermail/pve-devel/2018-April/031357.html

pve-cluster:

Dominik Csapak (1):
  add 'jobs.cfg' to observed files

 data/PVE/Cluster.pm | 1 +
 data/src/status.c   | 1 +
 2 files changed, 2 insertions(+)

pve-manager:

Dominik Csapak (5):
  add PVE/Jobs to handle VZDump jobs
  pvescheduler: run jobs from jobs.cfg
  api/backup: refactor string for all days
  api/backup: handle new vzdump jobs
  ui: dc/backup: show id+schedule instead of dow+starttime

Thomas Lamprecht (1):
  replace systemd timer with pvescheduler daemon

 PVE/API2/Backup.pm                 | 235 +++++++++++++++++++-----
 PVE/API2/Cluster/BackupInfo.pm     |   9 +
 PVE/Jobs.pm                        | 286 +++++++++++++++++++++++++++++
 PVE/Jobs/Makefile                  |  16 ++
 PVE/Jobs/Plugin.pm                 |  61 ++++++
 PVE/Jobs/VZDump.pm                 |  54 ++++++
 PVE/Makefile                       |   3 +-
 PVE/Service/Makefile               |   2 +-
 PVE/Service/pvescheduler.pm        | 131 +++++++++++++
 bin/Makefile                       |   6 +-
 bin/pvescheduler                   |  28 +++
 debian/postinst                    |   3 +-
 services/Makefile                  |   3 +-
 services/pvescheduler.service      |  16 ++
 services/pvesr.service             |   8 -
 services/pvesr.timer               |  12 --
 www/manager6/dc/Backup.js          |  46 +++--
 www/manager6/dc/BackupJobDetail.js |  10 +-
 18 files changed, 823 insertions(+), 106 deletions(-)
 create mode 100644 PVE/Jobs.pm
 create mode 100644 PVE/Jobs/Makefile
 create mode 100644 PVE/Jobs/Plugin.pm
 create mode 100644 PVE/Jobs/VZDump.pm
 create mode 100755 PVE/Service/pvescheduler.pm
 create mode 100755 bin/pvescheduler
 create mode 100644 services/pvescheduler.service
 delete mode 100644 services/pvesr.service
 delete mode 100644 services/pvesr.timer

-- 
2.30.2






* [pve-devel] [PATCH cluster v2 1/1] add 'jobs.cfg' to observed files
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
@ 2021-11-08 13:07 ` Dominik Csapak
  2021-11-09 17:18   ` [pve-devel] applied: " Thomas Lamprecht
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 1/6] replace systemd timer with pvescheduler daemon Dominik Csapak
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 data/PVE/Cluster.pm | 1 +
 data/src/status.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/data/PVE/Cluster.pm b/data/PVE/Cluster.pm
index 4d09c60..ab8f713 100644
--- a/data/PVE/Cluster.pm
+++ b/data/PVE/Cluster.pm
@@ -45,6 +45,7 @@ my $dbbackupdir = "/var/lib/pve-cluster/backup";
 # using a computed version and only those can be used by the cfs_*_file methods
 my $observed = {
     'vzdump.cron' => 1,
+    'jobs.cfg' => 1,
     'storage.cfg' => 1,
     'datacenter.cfg' => 1,
     'replication.cfg' => 1,
diff --git a/data/src/status.c b/data/src/status.c
index ff0b1e9..9bceaeb 100644
--- a/data/src/status.c
+++ b/data/src/status.c
@@ -89,6 +89,7 @@ static memdb_change_t memdb_change_array[] = {
 	{ .path = "priv/ipam.db" },
 	{ .path = "datacenter.cfg" },
 	{ .path = "vzdump.cron" },
+	{ .path = "jobs.cfg" },
 	{ .path = "ha/crm_commands" },
 	{ .path = "ha/manager_status" },
 	{ .path = "ha/resources.cfg" },
-- 
2.30.2






* [pve-devel] [PATCH manager v2 1/6] replace systemd timer with pvescheduler daemon
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH cluster v2 1/1] add 'jobs.cfg' to observed files Dominik Csapak
@ 2021-11-08 13:07 ` Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 2/6] add PVE/Jobs to handle VZDump jobs Dominik Csapak
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel; +Cc: Thomas Lamprecht

From: Thomas Lamprecht <t.lamprecht@proxmox.com>

The whole thing is already prepared for this; the systemd timer was
just a fixed periodic timer with a frequency of one minute. And we
only introduced it because the assumption was that this approach
would use less memory, AFAIK.

But logging 4+ lines just to note that the timer was started, even if
it does nothing, and that 24/7, is not too cheap and a bit annoying.

So, as a first step, add a simple daemon which forks off a child for
running jobs once a minute.
This could still be made a bit more intelligent, i.e., check if we
have jobs to run before forking, as forking is not the cheapest
syscall. Further, we could adapt the sleep interval to the next time
we actually need to run a job (and send a SIGUSR to the daemon if a
job's interval changes such that the interval gets narrower).

We try to sync running to minute-change boundaries at start; this
emulates the systemd.timer behaviour we had until now. Also, users can
configure jobs with minute precision, so they probably expect those to
start really close to a minute-change event.
This could be adapted to resync while running, to factor in time drift.
But as long as enough CPU cycles are available, we run in correct
monotonic intervals, so this isn't a must, IMO.

Another improvement could be more fine-grained locking, i.e., not on a
per-all-local-job-runs basis, but on a per-job (per-guest?) basis,
which would reduce temporary starvation of small, high-frequency jobs
by big, less periodic jobs.
We argued that it's the user's fault if such situations arise, but they
can evolve over time without anyone noticing, especially in more
complex setups.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/Service/Makefile          |   2 +-
 PVE/Service/pvescheduler.pm   | 118 ++++++++++++++++++++++++++++++++++
 bin/Makefile                  |   6 +-
 bin/pvescheduler              |  28 ++++++++
 debian/postinst               |   3 +-
 services/Makefile             |   3 +-
 services/pvescheduler.service |  16 +++++
 services/pvesr.service        |   8 ---
 services/pvesr.timer          |  12 ----
 9 files changed, 171 insertions(+), 25 deletions(-)
 create mode 100755 PVE/Service/pvescheduler.pm
 create mode 100755 bin/pvescheduler
 create mode 100644 services/pvescheduler.service
 delete mode 100644 services/pvesr.service
 delete mode 100644 services/pvesr.timer

diff --git a/PVE/Service/Makefile b/PVE/Service/Makefile
index fc1cdb14..f66ceb81 100644
--- a/PVE/Service/Makefile
+++ b/PVE/Service/Makefile
@@ -1,6 +1,6 @@
 include ../../defines.mk
 
-SOURCES=pvestatd.pm pveproxy.pm pvedaemon.pm spiceproxy.pm
+SOURCES=pvestatd.pm pveproxy.pm pvedaemon.pm spiceproxy.pm pvescheduler.pm
 
 all:
 
diff --git a/PVE/Service/pvescheduler.pm b/PVE/Service/pvescheduler.pm
new file mode 100755
index 00000000..96d84e20
--- /dev/null
+++ b/PVE/Service/pvescheduler.pm
@@ -0,0 +1,118 @@
+package PVE::Service::pvescheduler;
+
+use strict;
+use warnings;
+
+use POSIX qw(WNOHANG);
+use PVE::SafeSyslog;
+use PVE::API2::Replication;
+
+use PVE::Daemon;
+use base qw(PVE::Daemon);
+
+my $cmdline = [$0, @ARGV];
+my %daemon_options = (stop_wait_time => 180, max_workers => 0);
+my $daemon = __PACKAGE__->new('pvescheduler', $cmdline, %daemon_options);
+
+my $finish_jobs = sub {
+    my ($self) = @_;
+    foreach my $cpid (keys %{$self->{jobs}}) {
+	my $waitpid = waitpid($cpid, WNOHANG);
+	if (defined($waitpid) && ($waitpid == $cpid)) {
+	    delete ($self->{jobs}->{$cpid});
+	}
+    }
+};
+
+my $get_sleep_time = sub {
+    my ($calculate_offset) = @_;
+    my $time = 60;
+
+    if ($calculate_offset) {
+	# try to run near minute boundaries, makes more sense to the user as he
+	# configures jobs with minute precision
+	my ($current_seconds) = localtime;
+	$time = (60 - $current_seconds) if (60 - $current_seconds >= 5);
+    }
+
+    return $time;
+};
+
+sub run {
+    my ($self) = @_;
+
+    my $jobs= {};
+    $self->{jobs} = $jobs;
+
+    my $old_sig_chld = $SIG{CHLD};
+    local $SIG{CHLD} = sub {
+	local ($@, $!, $?); # do not overwrite error vars
+	$finish_jobs->($self);
+	$old_sig_chld->(@_) if $old_sig_chld;
+    };
+
+    my $run_jobs = sub {
+	my $child = fork();
+	if (!defined($child)) {
+	    die "fork failed: $!\n";
+	} elsif ($child == 0) {
+	    $self->after_fork_cleanup();
+	    PVE::API2::Replication::run_jobs(undef, sub {}, 0, 1);
+	    POSIX::_exit(0);
+	}
+
+	$jobs->{$child} = 1;
+    };
+
+    PVE::Jobs::setup_dirs();
+
+    for (my $count = 1000;;$count++) {
+	last if $self->{shutdown_request};
+
+	$run_jobs->();
+
+	my $sleep_time;
+	if ($count >= 1000) {
+	    $sleep_time = $get_sleep_time->(1);
+	    $count = 0;
+	} else {
+	    $sleep_time = $get_sleep_time->(0);
+	}
+
+	my $slept = 0; # SIGCHLD interrupts sleep, so we need to keep track
+	while ($slept < $sleep_time) {
+	    last if $self->{shutdown_request};
+	    $slept += sleep($sleep_time - $slept);
+	}
+    }
+
+    # jobs have a lock timeout of 60s, wait a bit more for graceful termination
+    my $timeout = 0;
+    while (keys %$jobs > 0 && $timeout < 75) {
+	kill 'TERM', keys %$jobs;
+	$timeout += sleep(5);
+    }
+    # ensure the rest gets stopped
+    kill 'KILL', keys %$jobs if (keys %$jobs > 0);
+}
+
+sub shutdown {
+    my ($self) = @_;
+
+    syslog('info', 'got shutdown request, signal running jobs to stop');
+
+    kill 'TERM', keys %{$self->{jobs}};
+    $self->{shutdown_request} = 1;
+}
+
+$daemon->register_start_command();
+$daemon->register_stop_command();
+$daemon->register_status_command();
+
+our $cmddef = {
+    start => [ __PACKAGE__, 'start', []],
+    stop => [ __PACKAGE__, 'stop', []],
+    status => [ __PACKAGE__, 'status', [], undef, sub { print shift . "\n";} ],
+};
+
+1;
diff --git a/bin/Makefile b/bin/Makefile
index df30b27c..12cb4671 100644
--- a/bin/Makefile
+++ b/bin/Makefile
@@ -6,7 +6,7 @@ export NOVIEW=1
 PERL_DOC_INC_DIRS=..
 include /usr/share/pve-doc-generator/pve-doc-generator.mk
 
-SERVICES = pvestatd pveproxy pvedaemon spiceproxy
+SERVICES = pvestatd pveproxy pvedaemon spiceproxy pvescheduler
 CLITOOLS = vzdump pvesubscription pveceph pveam pvesr pvenode pvesh pve6to7
 
 SCRIPTS =  			\
@@ -52,6 +52,10 @@ pve6to7.1:
 	printf ".TH PVE6TO7 1\n.SH NAME\npve6to7 \- Proxmox VE upgrade checker script for 6.4 to 7.x\n" > $@
 	printf ".SH SYNOPSIS\npve6to7 [--full]\n" >> $@
 
+pvescheduler.8:
+	# FIXME: add to doc-generator
+	echo ".TH pvescheduler 8" > $@
+
 pveversion.1.pod: pveversion
 pveupgrade.1.pod: pveupgrade
 pvereport.1.pod: pvereport
diff --git a/bin/pvescheduler b/bin/pvescheduler
new file mode 100755
index 00000000..5f2bde52
--- /dev/null
+++ b/bin/pvescheduler
@@ -0,0 +1,28 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+
+use PVE::Service::pvescheduler;
+
+use PVE::RPCEnvironment;
+use PVE::SafeSyslog;
+
+$SIG{'__WARN__'} = sub {
+    my $err = $@;
+    my $t = $_[0];
+    chomp $t;
+    print STDERR "$t\n";
+    syslog('warning', "%s", $t);
+    $@ = $err;
+};
+
+my $prepare = sub {
+    my $rpcenv = PVE::RPCEnvironment->init('priv');
+
+    $rpcenv->init_request();
+    $rpcenv->set_language($ENV{LANG});
+    $rpcenv->set_user('root@pam');
+};
+
+PVE::Service::pvescheduler->run_cli_handler(prepare => $prepare);
diff --git a/debian/postinst b/debian/postinst
index c4df64b5..7be1aa3d 100755
--- a/debian/postinst
+++ b/debian/postinst
@@ -79,6 +79,7 @@ case "$1" in
     deb-systemd-invoke reload-or-try-restart pvestatd.service
     deb-systemd-invoke reload-or-try-restart pveproxy.service
     deb-systemd-invoke reload-or-try-restart spiceproxy.service
+    deb-systemd-invoke reload-or-try-restart pvescheduler.service
 
     exit 0;;
 
@@ -102,7 +103,7 @@ case "$1" in
 
     # same as dh_systemd_enable (code copied)
 
-    UNITS="pvedaemon.service pveproxy.service spiceproxy.service pvestatd.service pvebanner.service pvesr.timer pve-daily-update.timer"
+    UNITS="pvedaemon.service pveproxy.service spiceproxy.service pvestatd.service pvebanner.service pvescheduler.service pve-daily-update.timer"
     NO_RESTART_UNITS="pvenetcommit.service pve-guests.service"
 
     for unit in ${UNITS} ${NO_RESTART_UNITS}; do
diff --git a/services/Makefile b/services/Makefile
index a3d04a2f..b46c7119 100644
--- a/services/Makefile
+++ b/services/Makefile
@@ -13,8 +13,7 @@ SERVICES=			\
 	pve-storage.target	\
 	pve-daily-update.service\
 	pve-daily-update.timer	\
-	pvesr.service		\
-	pvesr.timer
+	pvescheduler.service
 
 .PHONY: install
 install: ${SERVICES}
diff --git a/services/pvescheduler.service b/services/pvescheduler.service
new file mode 100644
index 00000000..11769e80
--- /dev/null
+++ b/services/pvescheduler.service
@@ -0,0 +1,16 @@
+[Unit]
+Description=Proxmox VE scheduler
+ConditionPathExists=/usr/bin/pvescheduler
+Wants=pve-cluster.service
+After=pve-cluster.service
+After=pve-storage.target
+
+[Service]
+ExecStart=/usr/bin/pvescheduler start
+ExecStop=/usr/bin/pvescheduler stop
+PIDFile=/var/run/pvescheduler.pid
+KillMode=process
+Type=forking
+
+[Install]
+WantedBy=multi-user.target
diff --git a/services/pvesr.service b/services/pvesr.service
deleted file mode 100644
index cbaed1ca..00000000
--- a/services/pvesr.service
+++ /dev/null
@@ -1,8 +0,0 @@
-[Unit]
-Description=Proxmox VE replication runner
-ConditionPathExists=/usr/bin/pvesr
-After=pve-cluster.service
-
-[Service]
-Type=oneshot
-ExecStart=/usr/bin/pvesr run --mail 1
diff --git a/services/pvesr.timer b/services/pvesr.timer
deleted file mode 100644
index 01d7b9c7..00000000
--- a/services/pvesr.timer
+++ /dev/null
@@ -1,12 +0,0 @@
-[Unit]
-Description=Proxmox VE replication runner
-
-[Timer]
-AccuracySec=1
-RemainAfterElapse=no
-
-[Timer]
-OnCalendar=minutely
-
-[Install]
-WantedBy=timers.target
\ No newline at end of file
-- 
2.30.2






* [pve-devel] [PATCH manager v2 2/6] add PVE/Jobs to handle VZDump jobs
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH cluster v2 1/1] add 'jobs.cfg' to observed files Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 1/6] replace systemd timer with pvescheduler daemon Dominik Csapak
@ 2021-11-08 13:07 ` Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 3/6] pvescheduler: run jobs from jobs.cfg Dominik Csapak
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel

This adds SectionConfig handling for jobs (only 'vzdump' for now); each
section represents a job that will be handled by pvescheduler. It also
adds basic 'job-state' handling for reading/writing state JSON files.

This has some intersections with pvesr's state handling, but instead of
a single state file for all jobs it uses separate ones per job, like we
do it in the backup server.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/Jobs.pm        | 286 +++++++++++++++++++++++++++++++++++++++++++++
 PVE/Jobs/Makefile  |  16 +++
 PVE/Jobs/Plugin.pm |  61 ++++++++++
 PVE/Jobs/VZDump.pm |  54 +++++++++
 PVE/Makefile       |   3 +-
 5 files changed, 419 insertions(+), 1 deletion(-)
 create mode 100644 PVE/Jobs.pm
 create mode 100644 PVE/Jobs/Makefile
 create mode 100644 PVE/Jobs/Plugin.pm
 create mode 100644 PVE/Jobs/VZDump.pm

diff --git a/PVE/Jobs.pm b/PVE/Jobs.pm
new file mode 100644
index 00000000..2fe197d2
--- /dev/null
+++ b/PVE/Jobs.pm
@@ -0,0 +1,286 @@
+package PVE::Jobs;
+
+use strict;
+use warnings;
+use JSON;
+
+use PVE::Cluster qw(cfs_read_file cfs_lock_file);
+use PVE::Jobs::Plugin;
+use PVE::Jobs::VZDump;
+use PVE::Tools;
+
+PVE::Jobs::VZDump->register();
+PVE::Jobs::Plugin->init();
+
+my $state_dir = "/var/lib/pve-manager/jobs";
+my $lock_dir = "/var/lock/pve-manager";
+
+my $get_state_file = sub {
+    my ($jobid, $type) = @_;
+    return "$state_dir/$type-$jobid.json";
+};
+
+my $default_state = {
+    state => 'created',
+    time => 0,
+};
+
+# lockless, since we use file_get_contents, which is atomic
+sub read_job_state {
+    my ($jobid, $type) = @_;
+    my $path = $get_state_file->($jobid, $type);
+    return if ! -e $path;
+
+    my $raw = PVE::Tools::file_get_contents($path);
+
+    return $default_state if $raw eq '';
+
+    # untaint $raw
+    if ($raw =~ m/^(\{.*\})$/) {
+	return decode_json($1);
+    }
+
+    die "invalid json data in '$path'\n";
+}
+
+sub lock_job_state {
+    my ($jobid, $type, $sub) = @_;
+
+    my $filename = "$lock_dir/$type-$jobid.lck";
+
+    my $res = PVE::Tools::lock_file($filename, 10, $sub);
+    die $@ if $@;
+
+    return $res;
+}
+
+my $get_job_status = sub {
+    my ($state) = @_;
+
+    if (!defined($state->{upid})) {
+	return; # not started
+    }
+
+    my ($task, $filename) = PVE::Tools::upid_decode($state->{upid}, 1);
+    die "unable to parse worker upid - $state->{upid}\n" if !$task;
+    die "no such task\n" if ! -f $filename;
+
+    my $pstart = PVE::ProcFSTools::read_proc_starttime($task->{pid});
+    if ($pstart && $pstart == $task->{pstart}) {
+	return; # still running
+    }
+
+    return PVE::Tools::upid_read_status($state->{upid});
+};
+
+# checks if the job is already finished if it was started before and
+# updates the statefile accordingly
+sub update_job_stopped {
+    my ($jobid, $type) = @_;
+
+    # first check unlocked to save time,
+    my $state = read_job_state($jobid, $type);
+    return if !defined($state) || $state->{state} ne 'started'; # removed or not started
+
+    if (defined($get_job_status->($state))) {
+	lock_job_state($jobid, $type, sub {
+	    my $state = read_job_state($jobid, $type);
+	    return if !defined($state) || $state->{state} ne 'started'; # removed or not started
+
+	    my $status = $get_job_status->($state);
+
+	    my $new_state = {
+		state => 'stopped',
+		msg => $status,
+		upid => $state->{upid}
+	    };
+
+	    if ($state->{updated}) { # save updated time stamp
+		$new_state->{updated} = $state->{updated};
+	    }
+
+	    my $path = $get_state_file->($jobid, $type);
+	    PVE::Tools::file_set_contents($path, encode_json($new_state));
+	});
+    }
+}
+
+# must be called when the job is first created
+sub create_job {
+    my ($jobid, $type) = @_;
+
+    lock_job_state($jobid, $type, sub {
+	my $state = read_job_state($jobid, $type) // $default_state;
+
+	if ($state->{state} ne 'created') {
+	    die "job state already exists\n";
+	}
+
+	$state->{time} = time();
+
+	my $path = $get_state_file->($jobid, $type);
+	PVE::Tools::file_set_contents($path, encode_json($state));
+    });
+}
+
+# to be called when the job is removed
+sub remove_job {
+    my ($jobid, $type) = @_;
+    my $path = $get_state_file->($jobid, $type);
+    unlink $path;
+}
+
+# checks if the job can be started and sets the state to 'starting'
+# returns 1 if the job can be started, 0 otherwise
+sub starting_job {
+    my ($jobid, $type) = @_;
+
+    # first check unlocked to save time
+    my $state = read_job_state($jobid, $type);
+    return 0 if !defined($state) || $state->{state} eq 'started'; # removed or already started
+
+    lock_job_state($jobid, $type, sub {
+	my $state = read_job_state($jobid, $type);
+	return 0 if !defined($state) || $state->{state} eq 'started'; # removed or already started
+
+	my $new_state = {
+	    state => 'starting',
+	    time => time(),
+	};
+
+	my $path = $get_state_file->($jobid, $type);
+	PVE::Tools::file_set_contents($path, encode_json($new_state));
+    });
+    return 1;
+}
+
+sub started_job {
+    my ($jobid, $type, $upid, $err) = @_;
+    lock_job_state($jobid, $type, sub {
+	my $state = read_job_state($jobid, $type);
+	return if !defined($state); # job was removed, do not update
+	die "unexpected state '$state->{state}'\n" if $state->{state} ne 'starting';
+
+	my $new_state;
+	if (defined($err)) {
+	    $new_state = {
+		state => 'stopped',
+		msg => $err,
+		time => time(),
+	    };
+	} else {
+	    $new_state = {
+		state => 'started',
+		upid => $upid,
+	    };
+	}
+
+	my $path = $get_state_file->($jobid, $type);
+	PVE::Tools::file_set_contents($path, encode_json($new_state));
+    });
+}
+
+# will be called when the job schedule is updated
+sub updated_job_schedule {
+    my ($jobid, $type) = @_;
+    lock_job_state($jobid, $type, sub {
+	my $old_state = read_job_state($jobid, $type) // $default_state;
+
+	$old_state->{updated} = time();
+
+	my $path = $get_state_file->($jobid, $type);
+	PVE::Tools::file_set_contents($path, encode_json($old_state));
+    });
+}
+
+sub get_last_runtime {
+    my ($jobid, $type) = @_;
+
+    my $state = read_job_state($jobid, $type) // $default_state;
+
+    return $state->{updated} if defined($state->{updated});
+
+    if (my $upid = $state->{upid}) {
+	my ($task) = PVE::Tools::upid_decode($upid, 1);
+	die "unable to parse worker upid\n" if !$task;
+	return $task->{starttime};
+    }
+
+    return $state->{time} // 0;
+}
+
+sub run_jobs {
+    synchronize_job_states_with_config();
+
+    my $jobs_cfg = cfs_read_file('jobs.cfg');
+    my $nodename = PVE::INotify::nodename();
+
+    foreach my $id (sort keys %{$jobs_cfg->{ids}}) {
+	my $cfg = $jobs_cfg->{ids}->{$id};
+	my $type = $cfg->{type};
+	my $schedule = delete $cfg->{schedule};
+
+	# only schedule local jobs
+	next if defined($cfg->{node}) && $cfg->{node} ne $nodename;
+
+	eval {
+	    update_job_stopped($id, $type);
+	};
+	if (my $err = $@) {
+	    warn "could not update job state, skipping - $err\n";
+	    next;
+	}
+
+	# only schedule enabled jobs
+	next if defined($cfg->{enabled}) && !$cfg->{enabled};
+
+	my $last_run = get_last_runtime($id, $type);
+	my $calspec = PVE::CalendarEvent::parse_calendar_event($schedule);
+	my $next_sync = PVE::CalendarEvent::compute_next_event($calspec, $last_run) // 0;
+
+	if (time() >= $next_sync) {
+	    my $plugin = PVE::Jobs::Plugin->lookup($type);
+	    if (starting_job($id, $type)) {
+		my $upid = eval { $plugin->run($cfg) };
+		if (my $err = $@) {
+		    warn $@ if $@;
+		    started_job($id, $type, undef, $err);
+		} elsif ($upid eq 'OK') { # some jobs return OK immediatly
+		    started_job($id, $type, undef, 'OK');
+		} else {
+		    started_job($id, $type, $upid);
+		}
+	    }
+	}
+    }
+}
+
+# creates and removes statefiles for job configs
+sub synchronize_job_states_with_config {
+    cfs_lock_file('jobs.cfg', undef, sub {
+	my $data = cfs_read_file('jobs.cfg');
+
+	for my $id (keys $data->{ids}->%*) {
+	    my $job = $data->{ids}->{$id};
+	    my $type = $job->{type};
+	    my $jobstate = read_job_state($id, $type);
+	    create_job($id, $type) if !defined($jobstate);
+	}
+
+	PVE::Tools::dir_glob_foreach($state_dir, '(.*?)-(.*).json', sub {
+	    my ($path, $type, $id) = @_;
+
+	    if (!defined($data->{ids}->{$id})) {
+		remove_job($id, $type);
+	    }
+	});
+    });
+    die $@ if $@;
+}
+
+sub setup_dirs {
+    mkdir $state_dir;
+    mkdir $lock_dir;
+}
+
+1;
diff --git a/PVE/Jobs/Makefile b/PVE/Jobs/Makefile
new file mode 100644
index 00000000..6023c3ba
--- /dev/null
+++ b/PVE/Jobs/Makefile
@@ -0,0 +1,16 @@
+include ../../defines.mk
+
+PERLSOURCE =   \
+	Plugin.pm\
+	VZDump.pm
+
+all:
+
+.PHONY: clean
+clean:
+	rm -rf *~
+
+.PHONY: install
+install: ${PERLSOURCE}
+	install -d ${PERLLIBDIR}/PVE/Jobs
+	install -m 0644 ${PERLSOURCE} ${PERLLIBDIR}/PVE/Jobs
diff --git a/PVE/Jobs/Plugin.pm b/PVE/Jobs/Plugin.pm
new file mode 100644
index 00000000..69c31cf2
--- /dev/null
+++ b/PVE/Jobs/Plugin.pm
@@ -0,0 +1,61 @@
+package PVE::Jobs::Plugin;
+
+use strict;
+use warnings;
+
+use PVE::Cluster qw(cfs_register_file);
+
+use base qw(PVE::SectionConfig);
+
+cfs_register_file('jobs.cfg',
+		  sub { __PACKAGE__->parse_config(@_); },
+		  sub { __PACKAGE__->write_config(@_); });
+
+my $defaultData = {
+    propertyList => {
+	type => { description => "Section type." },
+	id => {
+	    description => "The ID of the VZDump job.",
+	    type => 'string',
+	    format => 'pve-configid',
+	},
+	enabled => {
+	    description => "Determines if the job is enabled.",
+	    type => 'boolean',
+	    default => 1,
+	    optional => 1,
+	},
+	schedule => {
+	    description => "Backup schedule. The format is a subset of `systemd` calendar events.",
+	    type => 'string', format => 'pve-calendar-event',
+	    maxLength => 128,
+	},
+    },
+};
+
+sub private {
+    return $defaultData;
+}
+
+sub parse_config {
+    my ($class, $filename, $raw) = @_;
+
+    my $cfg = $class->SUPER::parse_config($filename, $raw);
+
+    foreach my $id (sort keys %{$cfg->{ids}}) {
+	my $data = $cfg->{ids}->{$id};
+
+	$data->{id} = $id;
+	$data->{enabled}  //= 1;
+   }
+
+   return $cfg;
+}
+
+sub run {
+    my ($class, $cfg) = @_;
+    # implement in subclass
+    die "not implemented";
+}
+
+1;
diff --git a/PVE/Jobs/VZDump.pm b/PVE/Jobs/VZDump.pm
new file mode 100644
index 00000000..043b7ace
--- /dev/null
+++ b/PVE/Jobs/VZDump.pm
@@ -0,0 +1,54 @@
+package PVE::Jobs::VZDump;
+
+use strict;
+use warnings;
+
+use PVE::INotify;
+use PVE::VZDump::Common;
+use PVE::API2::VZDump;
+use PVE::Cluster;
+
+use base qw(PVE::Jobs::Plugin);
+
+sub type {
+    return 'vzdump';
+}
+
+my $props = PVE::VZDump::Common::json_config_properties();
+
+sub properties {
+    return $props;
+}
+
+sub options {
+    my $options = {
+	enabled => { optional => 1 },
+	schedule => {},
+    };
+    foreach my $opt (keys %$props) {
+	if ($props->{$opt}->{optional}) {
+	    $options->{$opt} = { optional => 1 };
+	} else {
+	    $options->{$opt} = {};
+	}
+    }
+
+    return $options;
+}
+
+sub run {
+    my ($class, $conf) = @_;
+
+    # remove all non vzdump related options
+    foreach my $opt (keys %$conf) {
+	delete $conf->{$opt} if !defined($props->{$opt});
+    }
+
+    $conf->{quiet} = 1; # do not write to stdout/stderr
+
+    PVE::Cluster::cfs_update(); # refresh vmlist
+
+    return PVE::API2::VZDump->vzdump($conf);
+}
+
+1;
diff --git a/PVE/Makefile b/PVE/Makefile
index 0071fab1..48b85d33 100644
--- a/PVE/Makefile
+++ b/PVE/Makefile
@@ -1,6 +1,6 @@
 include ../defines.mk
 
-SUBDIRS=API2 Status CLI Service Ceph
+SUBDIRS=API2 Status CLI Service Ceph Jobs
 
 PERLSOURCE = 			\
 	API2.pm			\
@@ -11,6 +11,7 @@ PERLSOURCE = 			\
 	CertHelpers.pm		\
 	ExtMetric.pm		\
 	HTTPServer.pm		\
+	Jobs.pm			\
 	NodeConfig.pm		\
 	Report.pm		\
 	VZDump.pm
-- 
2.30.2






* [pve-devel] [PATCH manager v2 3/6] pvescheduler: run jobs from jobs.cfg
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (2 preceding siblings ...)
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 2/6] add PVE/Jobs to handle VZDump jobs Dominik Csapak
@ 2021-11-08 13:07 ` Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 4/6] api/backup: refactor string for all days Dominik Csapak
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel

PVE/Jobs is responsible for deciding whether a job must run (e.g.
according to its schedule).

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/Service/pvescheduler.pm | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/PVE/Service/pvescheduler.pm b/PVE/Service/pvescheduler.pm
index 96d84e20..a8d548fb 100755
--- a/PVE/Service/pvescheduler.pm
+++ b/PVE/Service/pvescheduler.pm
@@ -6,6 +6,7 @@ use warnings;
 use POSIX qw(WNOHANG);
 use PVE::SafeSyslog;
 use PVE::API2::Replication;
+use PVE::Jobs;
 
 use PVE::Daemon;
 use base qw(PVE::Daemon);
@@ -51,19 +52,31 @@ sub run {
 	$old_sig_chld->(@_) if $old_sig_chld;
     };
 
-    my $run_jobs = sub {
+    my $fork = sub {
+	my ($sub) = @_;
 	my $child = fork();
 	if (!defined($child)) {
 	    die "fork failed: $!\n";
 	} elsif ($child == 0) {
 	    $self->after_fork_cleanup();
-	    PVE::API2::Replication::run_jobs(undef, sub {}, 0, 1);
+	    $sub->();
 	    POSIX::_exit(0);
 	}
 
 	$jobs->{$child} = 1;
     };
 
+    my $run_jobs = sub {
+
+	$fork->(sub {
+	    PVE::API2::Replication::run_jobs(undef, sub {}, 0, 1);
+	});
+
+	$fork->(sub {
+	    PVE::Jobs::run_jobs();
+	});
+    };
+
     PVE::Jobs::setup_dirs();
 
     for (my $count = 1000;;$count++) {
-- 
2.30.2






* [pve-devel] [PATCH manager v2 4/6] api/backup: refactor string for all days
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (3 preceding siblings ...)
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 3/6] pvescheduler: run jobs from jobs.cfg Dominik Csapak
@ 2021-11-08 13:07 ` Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 5/6] api/backup: handle new vzdump jobs Dominik Csapak
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Backup.pm | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Backup.pm b/PVE/API2/Backup.pm
index 3b343636..9dc3b48e 100644
--- a/PVE/API2/Backup.pm
+++ b/PVE/API2/Backup.pm
@@ -17,6 +17,8 @@ use PVE::VZDump::Common;
 
 use base qw(PVE::RESTHandler);
 
+use constant ALL_DAYS => 'mon,tue,wed,thu,fri,sat,sun';
+
 PVE::JSONSchema::register_format('pve-day-of-week', \&verify_day_of_week);
 sub verify_day_of_week {
     my ($value, $noerr) = @_;
@@ -101,7 +103,7 @@ __PACKAGE__->register_method({
 		type => 'string', format => 'pve-day-of-week-list',
 		optional => 1,
 		description => "Day of week selection.",
-		default => 'mon,tue,wed,thu,fri,sat,sun',
+		default => ALL_DAYS,
 	    },
 	    enabled => {
 		type => 'boolean',
@@ -129,7 +131,7 @@ __PACKAGE__->register_method({
 	my $create_job = sub {
 	    my $data = cfs_read_file('vzdump.cron');
 
-	    $param->{dow} = 'mon,tue,wed,thu,fri,sat,sun' if !defined($param->{dow});
+	    $param->{dow} = ALL_DAYS if !defined($param->{dow});
 	    $param->{enabled} = 1 if !defined($param->{enabled});
 	    PVE::VZDump::verify_vzdump_parameters($param, 1);
 
-- 
2.30.2






* [pve-devel] [PATCH manager v2 5/6] api/backup: handle new vzdump jobs
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (4 preceding siblings ...)
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 4/6] api/backup: refactor string for all days Dominik Csapak
@ 2021-11-08 13:07 ` Dominik Csapak
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 6/6] ui: dc/backup: show id+schedule instead of dow+starttime Dominik Csapak
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel

In addition to listing the vzdump.cron jobs, also list those from the
jobs.cfg file.

Updates/creations now only go into the new jobs.cfg, and on update,
starttime+dow get converted to a schedule. This transformation is
straightforward, since 'dow' is already in a compatible format
(e.g. 'mon,tue') and we simply append the starttime (if any).
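
For example (made-up values), an entry with dow 'mon,wed,fri' and
starttime '02:30' turns into the schedule

    mon,wed,fri 02:30

while an entry that covers all days keeps only the '02:30' part.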

The id is optional on creation for now (for API compatibility), but
will be autogenerated (UUID) if not given. On update, we simply keep
the id from before (the ids of the other entries in vzdump.cron will
change, but they would anyway).

As long as we have the vzdump.cron file, we must lock both
vzdump.cron and jobs.cfg, since we often update both.

We also change the backupinfo API call to read jobs.cfg too.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Backup.pm             | 231 ++++++++++++++++++++++++++-------
 PVE/API2/Cluster/BackupInfo.pm |   9 ++
 2 files changed, 194 insertions(+), 46 deletions(-)

diff --git a/PVE/API2/Backup.pm b/PVE/API2/Backup.pm
index 9dc3b48e..3fa2eba3 100644
--- a/PVE/API2/Backup.pm
+++ b/PVE/API2/Backup.pm
@@ -3,6 +3,7 @@ package PVE::API2::Backup;
 use strict;
 use warnings;
 use Digest::SHA;
+use UUID;
 
 use PVE::SafeSyslog;
 use PVE::Tools qw(extract_param);
@@ -14,6 +15,7 @@ use PVE::Storage;
 use PVE::Exception qw(raise_param_exc);
 use PVE::VZDump;
 use PVE::VZDump::Common;
+use PVE::Jobs; # for VZDump Jobs
 
 use base qw(PVE::RESTHandler);
 
@@ -45,6 +47,35 @@ my $assert_param_permission = sub {
     }
 };
 
+my $convert_to_schedule = sub {
+    my ($job) = @_;
+
+    my $starttime = $job->{starttime};
+    my $dow = $job->{dow};
+
+    if (!$dow || $dow eq ALL_DAYS) {
+	return "$starttime";
+    }
+
+    return "$dow $starttime";
+};
+
+my $schedule_param_check = sub {
+    my ($param) = @_;
+    if (defined($param->{schedule})) {
+	if (defined($param->{starttime})) {
+	    raise_param_exc({ starttime => "'starttime' and 'schedule' cannot both be set" });
+	}
+    } elsif (!defined($param->{starttime})) {
+	raise_param_exc({ schedule => "neither 'starttime' nor 'schedule' were set" });
+    } else {
+	$param->{schedule} = $convert_to_schedule->($param);
+    }
+
+    delete $param->{starttime};
+    delete $param->{dow};
+};
+
 __PACKAGE__->register_method({
     name => 'index',
     path => '',
@@ -74,8 +105,20 @@ __PACKAGE__->register_method({
 	my $user = $rpcenv->get_user();
 
 	my $data = cfs_read_file('vzdump.cron');
+	my $jobs_data = cfs_read_file('jobs.cfg');
+	my $order = $jobs_data->{order};
+	my $jobs = $jobs_data->{ids};
 
 	my $res = $data->{jobs} || [];
+	foreach my $job (@$res) {
+	    $job->{schedule} = $convert_to_schedule->($job);
+	}
+
+	foreach my $jobid (sort { $order->{$a} <=> $order->{$b} } keys %$jobs) {
+	    my $job = $jobs->{$jobid};
+	    next if $job->{type} ne 'vzdump';
+	    push @$res, $job;
+	}
 
 	return $res;
     }});
@@ -93,16 +136,30 @@ __PACKAGE__->register_method({
     parameters => {
     	additionalProperties => 0,
 	properties => PVE::VZDump::Common::json_config_properties({
+	    id => {
+		type => 'string',
+		description => "Job ID (will be autogenerated).",
+		format => 'pve-configid',
+		optional => 1, # FIXME: make required on 8.0
+	    },
+	    schedule => {
+		description => "Backup schedule. The format is a subset of `systemd` calendar events.",
+		type => 'string', format => 'pve-calendar-event',
+		maxLength => 128,
+		optional => 1,
+	    },
 	    starttime => {
 		type => 'string',
 		description => "Job Start time.",
 		pattern => '\d{1,2}:\d{1,2}',
 		typetext => 'HH:MM',
+		optional => 1,
 	    },
 	    dow => {
 		type => 'string', format => 'pve-day-of-week-list',
 		optional => 1,
 		description => "Day of week selection.",
+		requires => 'starttime',
 		default => ALL_DAYS,
 	    },
 	    enabled => {
@@ -127,19 +184,29 @@ __PACKAGE__->register_method({
 	    $rpcenv->check($user, "/pool/$pool", ['VM.Backup']);
 	}
 
+	$schedule_param_check->($param);
 
-	my $create_job = sub {
-	    my $data = cfs_read_file('vzdump.cron');
+	$param->{enabled} = 1 if !defined($param->{enabled});
+
+	# autogenerate id for api compatibility FIXME remove with 8.0
+	my $id = extract_param($param, 'id') // uuid();
+
+	cfs_lock_file('jobs.cfg', undef, sub {
+	    my $data = cfs_read_file('jobs.cfg');
+
+	    die "Job '$id' already exists\n"
+		if $data->{ids}->{$id};
 
-	    $param->{dow} = ALL_DAYS if !defined($param->{dow});
-	    $param->{enabled} = 1 if !defined($param->{enabled});
 	    PVE::VZDump::verify_vzdump_parameters($param, 1);
+	    my $plugin = PVE::Jobs::Plugin->lookup('vzdump');
+	    my $opts = $plugin->check_config($id, $param, 1, 1);
 
-	    push @{$data->{jobs}}, $param;
+	    $data->{ids}->{$id} = $opts;
 
-	    cfs_write_file('vzdump.cron', $data);
-	};
-	cfs_lock_file('vzdump.cron', undef, $create_job);
+	    PVE::Jobs::create_job($id, 'vzdump');
+
+	    cfs_write_file('jobs.cfg', $data);
+	});
 	die "$@" if ($@);
 
 	return undef;
@@ -173,9 +240,16 @@ __PACKAGE__->register_method({
 	my $jobs = $data->{jobs} || [];
 
 	foreach my $job (@$jobs) {
-	    return $job if $job->{id} eq $param->{id};
+	    if ($job->{id} eq $param->{id}) {
+		$job->{schedule} = $convert_to_schedule->($job);
+		return $job;
+	    }
 	}
 
+	my $jobs_data = cfs_read_file('jobs.cfg');
+	my $job = $jobs_data->{ids}->{$param->{id}};
+	return $job if $job && $job->{type} eq 'vzdump';
+
 	raise_param_exc({ id => "No such job '$param->{id}'" });
 
     }});
@@ -202,6 +276,8 @@ __PACKAGE__->register_method({
 	my $rpcenv = PVE::RPCEnvironment::get();
 	my $user = $rpcenv->get_user();
 
+	my $id = $param->{id};
+
 	my $delete_job = sub {
 	    my $data = cfs_read_file('vzdump.cron');
 
@@ -210,18 +286,32 @@ __PACKAGE__->register_method({
 
 	    my $found;
 	    foreach my $job (@$jobs) {
-		if ($job->{id} eq $param->{id}) {
+		if ($job->{id} eq $id) {
 		    $found = 1;
 		} else {
 		    push @$newjobs, $job;
 		}
 	    }
 
-	    raise_param_exc({ id => "No such job '$param->{id}'" }) if !$found;
+	    if (!$found) {
+		cfs_lock_file('jobs.cfg', undef, sub {
+		    my $jobs_data = cfs_read_file('jobs.cfg');
 
-	    $data->{jobs} = $newjobs;
+		    if (!defined($jobs_data->{ids}->{$id})) {
+			raise_param_exc({ id => "No such job '$id'" });
+		    }
+		    delete $jobs_data->{ids}->{$id};
+
+		    PVE::Jobs::remove_job($id, 'vzdump');
 
-	    cfs_write_file('vzdump.cron', $data);
+		    cfs_write_file('jobs.cfg', $jobs_data);
+		});
+		die "$@" if $@;
+	    } else {
+		$data->{jobs} = $newjobs;
+
+		cfs_write_file('vzdump.cron', $data);
+	    }
 	};
 	cfs_lock_file('vzdump.cron', undef, $delete_job);
 	die "$@" if ($@);
@@ -243,15 +333,23 @@ __PACKAGE__->register_method({
     	additionalProperties => 0,
 	properties => PVE::VZDump::Common::json_config_properties({
 	    id => $vzdump_job_id_prop,
+	    schedule => {
+		description => "Backup schedule. The format is a subset of `systemd` calendar events.",
+		type => 'string', format => 'pve-calendar-event',
+		maxLength => 128,
+		optional => 1,
+	    },
 	    starttime => {
 		type => 'string',
 		description => "Job Start time.",
 		pattern => '\d{1,2}:\d{1,2}',
 		typetext => 'HH:MM',
+		optional => 1,
 	    },
 	    dow => {
 		type => 'string', format => 'pve-day-of-week-list',
 		optional => 1,
+		requires => 'starttime',
 		description => "Day of week selection.",
 	    },
 	    delete => {
@@ -281,57 +379,91 @@ __PACKAGE__->register_method({
 	    $rpcenv->check($user, "/pool/$pool", ['VM.Backup']);
 	}
 
+	$schedule_param_check->($param);
+
+	my $id = extract_param($param, 'id');
+	my $delete = extract_param($param, 'delete');
+	if ($delete) {
+	    $delete = [PVE::Tools::split_list($delete)];
+	}
+
 	my $update_job = sub {
 	    my $data = cfs_read_file('vzdump.cron');
+	    my $jobs_data = cfs_read_file('jobs.cfg');
 
 	    my $jobs = $data->{jobs} || [];
 
 	    die "no options specified\n" if !scalar(keys %$param);
 
 	    PVE::VZDump::verify_vzdump_parameters($param);
+	    my $plugin = PVE::Jobs::Plugin->lookup('vzdump');
+	    my $opts = $plugin->check_config($id, $param, 0, 1);
+
+	    # try to find it in old vzdump.cron and convert it to a job
+	    my ($idx) = grep { $jobs->[$_]->{id} eq $id } (0 .. scalar(@$jobs) - 1);
+
+	    my $job;
+	    if (defined($idx)) {
+		$job = splice @$jobs, $idx, 1;
+		$job->{schedule} = $convert_to_schedule->($job);
+		delete $job->{starttime};
+		delete $job->{dow};
+		delete $job->{id};
+		$job->{type} = 'vzdump';
+		$jobs_data->{ids}->{$id} = $job;
+	    } else {
+		$job = $jobs_data->{ids}->{$id};
+		die "no such vzdump job\n" if !$job || $job->{type} ne 'vzdump';
+	    }
 
-	    my @delete = PVE::Tools::split_list(extract_param($param, 'delete'));
-
-	    foreach my $job (@$jobs) {
-		if ($job->{id} eq $param->{id}) {
+	    foreach my $k (@$delete) {
+		if (!PVE::VZDump::option_exists($k)) {
+		    raise_param_exc({ delete => "unknown option '$k'" });
+		}
 
-		    foreach my $k (@delete) {
-			if (!PVE::VZDump::option_exists($k)) {
-			    raise_param_exc({ delete => "unknown option '$k'" });
-			}
+		delete $job->{$k};
+	    }
 
-			delete $job->{$k};
-		    }
+	    my $schedule_updated = 0;
+	    if ($param->{schedule} ne $job->{schedule}) {
+		$schedule_updated = 1;
+	    }
 
-		    foreach my $k (keys %$param) {
-			$job->{$k} = $param->{$k};
-		    }
+	    foreach my $k (keys %$param) {
+		$job->{$k} = $param->{$k};
+	    }
 
-		    $job->{all} = 1 if (defined($job->{exclude}) && !defined($job->{pool}));
-
-		    if (defined($param->{vmid})) {
-			delete $job->{all};
-			delete $job->{exclude};
-			delete $job->{pool};
-		    } elsif ($param->{all}) {
-			delete $job->{vmid};
-			delete $job->{pool};
-		    } elsif ($job->{pool}) {
-			delete $job->{vmid};
-			delete $job->{all};
-			delete $job->{exclude};
-		    }
+	    $job->{all} = 1 if (defined($job->{exclude}) && !defined($job->{pool}));
+
+	    if (defined($param->{vmid})) {
+		delete $job->{all};
+		delete $job->{exclude};
+		delete $job->{pool};
+	    } elsif ($param->{all}) {
+		delete $job->{vmid};
+		delete $job->{pool};
+	    } elsif ($job->{pool}) {
+		delete $job->{vmid};
+		delete $job->{all};
+		delete $job->{exclude};
+	    }
 
-		    PVE::VZDump::verify_vzdump_parameters($job, 1);
+	    PVE::VZDump::verify_vzdump_parameters($job, 1);
 
-		    cfs_write_file('vzdump.cron', $data);
+	    if ($schedule_updated) {
+		PVE::Jobs::updated_job_schedule($id, 'vzdump');
+	    }
 
-		    return undef;
-		}
+	    if (defined($idx)) {
+		cfs_write_file('vzdump.cron', $data);
 	    }
-	    raise_param_exc({ id => "No such job '$param->{id}'" });
+	    cfs_write_file('jobs.cfg', $jobs_data);
+	    return;
 	};
-	cfs_lock_file('vzdump.cron', undef, $update_job);
+	cfs_lock_file('vzdump.cron', undef, sub {
+	    cfs_lock_file('jobs.cfg', undef, $update_job);
+	    die "$@" if ($@);
+	});
 	die "$@" if ($@);
     }});
 
@@ -422,6 +554,13 @@ __PACKAGE__->register_method({
 	       last;
 	    }
 	}
+	if (!$job) {
+	    my $jobs_data = cfs_read_file('jobs.cfg');
+	    my $j = $jobs_data->{ids}->{$param->{id}};
+	    if ($j && $j->{type} eq 'vzdump') {
+		$job = $j;
+	    }
+	}
 	raise_param_exc({ id => "No such job '$param->{id}'" }) if !$job;
 
 	my $vmlist = PVE::Cluster::get_vmlist();
diff --git a/PVE/API2/Cluster/BackupInfo.pm b/PVE/API2/Cluster/BackupInfo.pm
index 236e171d..bea0e460 100644
--- a/PVE/API2/Cluster/BackupInfo.pm
+++ b/PVE/API2/Cluster/BackupInfo.pm
@@ -28,6 +28,15 @@ sub get_included_vmids {
 	push @all_vmids, ( map { @{$_} } values %{$job_included_guests} );
     }
 
+    my $vzjobs = cfs_read_file('jobs.cfg');
+
+    for my $job (values %{$vzjobs->{ids}}) {
+	next if $job->{type} ne 'vzdump';
+
+	my $job_included_guests = PVE::VZDump::get_included_guests($job);
+	push @all_vmids, ( map { @{$_} } values %{$job_included_guests} );
+    }
+
     return { map { $_ => 1 } @all_vmids };
 }
 
-- 
2.30.2






* [pve-devel] [PATCH manager v2 6/6] ui: dc/backup: show id+schedule instead of dow+starttime
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (5 preceding siblings ...)
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 5/6] api/backup: handle new vzdump jobs Dominik Csapak
@ 2021-11-08 13:07 ` Dominik Csapak
  2021-11-09 14:17 ` [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dylan Whyte
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-08 13:07 UTC (permalink / raw)
  To: pve-devel

We can now show the id (since it's not autogenerated anymore), and we
can always show/edit the schedule instead of the dow+starttime.

Also add an 'ID' field to the edit/create window and update the
backup info window as well.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 www/manager6/dc/Backup.js          | 46 ++++++++++++++----------------
 www/manager6/dc/BackupJobDetail.js | 10 ++-----
 2 files changed, 23 insertions(+), 33 deletions(-)

diff --git a/www/manager6/dc/Backup.js b/www/manager6/dc/Backup.js
index adefc5f4..dede8e6e 100644
--- a/www/manager6/dc/Backup.js
+++ b/www/manager6/dc/Backup.js
@@ -176,24 +176,22 @@ Ext.define('PVE.dc.BackupEdit', {
 	});
 
 	let column1 = [
-	    nodesel,
-	    storagesel,
 	    {
-		xtype: 'pveDayOfWeekSelector',
-		name: 'dow',
-		fieldLabel: gettext('Day of week'),
-		multiSelect: true,
-		value: ['sat'],
+		xtype: 'pmxDisplayEditField',
+		name: 'id',
+		fieldLabel: gettext('ID'),
+		renderer: Ext.htmlEncode,
 		allowBlank: false,
+		minLength: 4,
+		editable: me.isCreate,
 	    },
+	    nodesel,
+	    storagesel,
 	    {
-		xtype: 'timefield',
-		fieldLabel: gettext('Start Time'),
-		name: 'starttime',
-		format: 'H:i',
-		formatText: 'HH:MM',
-		value: '00:00',
+		xtype: 'pveCalendarEvent',
+		fieldLabel: gettext('Schedule'),
 		allowBlank: false,
+		name: 'schedule',
 	    },
 	    selModeField,
 	    selPool,
@@ -390,7 +388,7 @@ Ext.define('PVE.dc.BackupEdit', {
 		success: function(response, options) {
 		    let data = response.result.data;
 
-		    data.dow = data.dow.split(',');
+		    data.dow = (data.dow || '').split(',');
 
 		    if (data.all || data.exclude) {
 			if (data.exclude) {
@@ -532,6 +530,8 @@ Ext.define('PVE.dc.BackupView', {
 	    delete job.starttime;
 	    delete job.dow;
 	    delete job.id;
+	    delete job.schedule;
+	    delete job.type;
 	    delete job.node;
 	    job.all = job.all === true ? 1 : 0;
 
@@ -714,6 +714,10 @@ Ext.define('PVE.dc.BackupView', {
 		    disabledCls: 'x-item-enabled',
 		    stopSelection: false,
 		},
+		{
+		    header: gettext('ID'),
+		    dataIndex: 'id',
+		},
 		{
 		    header: gettext('Node'),
 		    width: 100,
@@ -727,17 +731,9 @@ Ext.define('PVE.dc.BackupView', {
 		    },
 		},
 		{
-		    header: gettext('Day of week'),
-		    width: 200,
-		    sortable: false,
-		    dataIndex: 'dow',
-		    renderer: PVE.Utils.render_backup_days_of_week,
-		},
-		{
-		    header: gettext('Start Time'),
-		    width: 60,
-		    sortable: true,
-		    dataIndex: 'starttime',
+		    header: gettext('Schedule'),
+		    width: 150,
+		    dataIndex: 'schedule',
 		},
 		{
 		    header: gettext('Storage'),
diff --git a/www/manager6/dc/BackupJobDetail.js b/www/manager6/dc/BackupJobDetail.js
index b91cb1b7..19b3b1a3 100644
--- a/www/manager6/dc/BackupJobDetail.js
+++ b/www/manager6/dc/BackupJobDetail.js
@@ -184,14 +184,8 @@ Ext.define('PVE.dc.BackupInfo', {
 	},
 	{
 	    xtype: 'displayfield',
-	    name: 'dow',
-	    fieldLabel: gettext('Day of week'),
-	    renderer: PVE.Utils.render_backup_days_of_week,
-	},
-	{
-	    xtype: 'displayfield',
-	    name: 'starttime',
-	    fieldLabel: gettext('Start Time'),
+	    name: 'schedule',
+	    fieldLabel: gettext('Schedule'),
 	},
 	{
 	    xtype: 'displayfield',
-- 
2.30.2






* Re: [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (6 preceding siblings ...)
  2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 6/6] ui: dc/backup: show id+schedule instead of dow+starttime Dominik Csapak
@ 2021-11-09 14:17 ` Dylan Whyte
  2021-11-10  9:38   ` Dominik Csapak
  2021-11-09 16:55 ` Aaron Lauterer
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 16+ messages in thread
From: Dylan Whyte @ 2021-11-09 14:17 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

Hi,

The patch set works as advertised and is very clean in terms of the 
switch from vzdump.cron to jobs.cfg.

One very minor inconvenience that I saw was the fact that for 
vzdump.cron 'Pool based' backups, you can't simply select 'Edit->OK' to 
convert them to the new format; rather, you need to change some detail 
to allow the edit. However, this could barely be considered an issue, as 
it just turns conversion into a two-step process of 'make minor edit, 
revert minor edit'.

The IDs that are automatically generated for the old cron jobs are a bit 
ugly and all get shuffled each time a job is moved from vzdump.cron to 
jobs.cfg (as mentioned in the commit message), although they are at 
least easy to change from the job config file, so I don't imagine anyone 
using these default IDs for very long. On this note however, is having 
an API call to change the ID something that we want to avoid, or can it 
be added? I can see users being annoyed by not being able to change it 
from the GUI, even if only for the initial conversion.

When entering the ID of new jobs, it might also be helpful to show the 
allowed characters in a tool-tip box and to prevent the user from trying 
to validate bad IDs by disabling the "Create" button while the ID is 
invalid, as is done with the 'add storage' window. The "invalid ID" 
message is a bit vague in any case, so I would at least mention what is 
allowed there.

Other than this, I didn't find any problems.

Tested-By: Dylan Whyte <d.whyte@proxmox.com>


On 11/8/21 2:07 PM, Dominik Csapak wrote:
> with this series, we implement a new daemon (pvescheduler) that takes
> over from pvesrs' systemd timer (original patch from thomas[0]) and
> extends it with a generic job handling mechanism
>
> then i convert the vzdump cron jobs to these jobs, the immediate
> gain is that users can use calendarevent schedules instead of
> dow + starttime
>
> for now, this 'jobs.cfg' only handles vzdump jobs, but should be easily
> extendable for other type of recurring jobs (like auth realm sync, etc.)
>
> also, i did not yet convert the replication jobs to this job system,
> but that could probably be done without too much effort (though
> i did not look too deeply into it)
>
> if some version of this gets applied, the further plan would be
> to remove the vzdump.cron part completely with 8.0, but until then
> we must at least list/parse that
>
> whats currently missing but not too hard to add is a calculated
> 'next-run' column in the gui
>
> changes from v1:
> * do not log replication into the syslog
> * readjust the loop to start at the full minute every 1000 loops
> * rework locking state locking/handling:
>    - i introduces a new 'starting' state that is set before we start
>      and we set it to started after the start.
>      we sadly cannot start the job while we hold the lock, since the open
>      file descriptor will be still open in the worker, and then we cannot
>      get the flock again. now it's more modeled after how we do qm/ct
>      long running locks (by writing 'starting' locked into the state)
>    - the stop check is now its own call at the beginning of the job handling
>    - handle created/removed jobs properly:
>      i did not think of state handling on other nodes in my previous
>      iteration. now on every loop, i sync the statefiles with the config
>      (create/remvoe) so that the file gets created/removed on all nodes
> * incorporated fabians feedback for the api (thanks!)
>
> 0: https://lists.proxmox.com/pipermail/pve-devel/2018-April/031357.html
>
> pve-cluster:
>
> Dominik Csapak (1):
>    add 'jobs.cfg' to observed files
>
>   data/PVE/Cluster.pm | 1 +
>   data/src/status.c   | 1 +
>   2 files changed, 2 insertions(+)
>
> pve-manager:
>
> Dominik Csapak (5):
>    add PVE/Jobs to handle VZDump jobs
>    pvescheduler: run jobs from jobs.cfg
>    api/backup: refactor string for all days
>    api/backup: handle new vzdump jobs
>    ui: dc/backup: show id+schedule instead of dow+starttime
>
> Thomas Lamprecht (1):
>    replace systemd timer with pvescheduler daemon
>
>   PVE/API2/Backup.pm                 | 235 +++++++++++++++++++-----
>   PVE/API2/Cluster/BackupInfo.pm     |   9 +
>   PVE/Jobs.pm                        | 286 +++++++++++++++++++++++++++++
>   PVE/Jobs/Makefile                  |  16 ++
>   PVE/Jobs/Plugin.pm                 |  61 ++++++
>   PVE/Jobs/VZDump.pm                 |  54 ++++++
>   PVE/Makefile                       |   3 +-
>   PVE/Service/Makefile               |   2 +-
>   PVE/Service/pvescheduler.pm        | 131 +++++++++++++
>   bin/Makefile                       |   6 +-
>   bin/pvescheduler                   |  28 +++
>   debian/postinst                    |   3 +-
>   services/Makefile                  |   3 +-
>   services/pvescheduler.service      |  16 ++
>   services/pvesr.service             |   8 -
>   services/pvesr.timer               |  12 --
>   www/manager6/dc/Backup.js          |  46 +++--
>   www/manager6/dc/BackupJobDetail.js |  10 +-
>   18 files changed, 823 insertions(+), 106 deletions(-)
>   create mode 100644 PVE/Jobs.pm
>   create mode 100644 PVE/Jobs/Makefile
>   create mode 100644 PVE/Jobs/Plugin.pm
>   create mode 100644 PVE/Jobs/VZDump.pm
>   create mode 100755 PVE/Service/pvescheduler.pm
>   create mode 100755 bin/pvescheduler
>   create mode 100644 services/pvescheduler.service
>   delete mode 100644 services/pvesr.service
>   delete mode 100644 services/pvesr.timer
>





* Re: [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (7 preceding siblings ...)
  2021-11-09 14:17 ` [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dylan Whyte
@ 2021-11-09 16:55 ` Aaron Lauterer
  2021-11-10 20:42 ` [pve-devel] applied-series: " Thomas Lamprecht
  2021-11-11 11:35 ` [pve-devel] " Fabian Ebner
  10 siblings, 0 replies; 16+ messages in thread
From: Aaron Lauterer @ 2021-11-09 16:55 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

Gave it a quick try converting the single backup job that I already had.

Created a new job.

Changed the schedule for the existing job to every two minutes, knowing that it takes longer. It behaves as expected: checking against the start and stop times in the task logs, I see that the job takes about 5 minutes to complete and the next run is scheduled within the same minute as the previous one stops. Though the start/stop times are shown only with minute precision here (seconds are always 00).

A (lightly tested):

Tested-By: Aaron Lauterer <a.lauterer@proxmox.com>

On 11/8/21 14:07, Dominik Csapak wrote:
> [...]




^ permalink raw reply	[flat|nested] 16+ messages in thread

* [pve-devel] applied: [PATCH cluster v2 1/1] add 'jobs.cfg' to observed files
  2021-11-08 13:07 ` [pve-devel] [PATCH cluster v2 1/1] add 'jobs.cfg' to observed files Dominik Csapak
@ 2021-11-09 17:18   ` Thomas Lamprecht
  0 siblings, 0 replies; 16+ messages in thread
From: Thomas Lamprecht @ 2021-11-09 17:18 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 08.11.21 14:07, Dominik Csapak wrote:
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  data/PVE/Cluster.pm | 1 +
>  data/src/status.c   | 1 +
>  2 files changed, 2 insertions(+)
> 
>

applied this one (without looking too much at the rest yet, though) to have it
available with the next cluster bump; it won't really change anyway, thanks!




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
  2021-11-09 14:17 ` [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dylan Whyte
@ 2021-11-10  9:38   ` Dominik Csapak
  2021-11-10 11:05     ` Thomas Lamprecht
  0 siblings, 1 reply; 16+ messages in thread
From: Dominik Csapak @ 2021-11-10  9:38 UTC (permalink / raw)
  To: Dylan Whyte, Proxmox VE development discussion

On 11/9/21 15:17, Dylan Whyte wrote:
> Hi,
> 
> The patch set works as advertised and is very clean in terms of the 
> switch from vzdump.cron to jobs.cfg.

thanks for testing!

> 
> One very minor inconvenience that I saw was the fact that for 
> vzdump.cron 'Pool based' backups, you can't simply select 'Edit->Ok' to 
> convert it to the new format, but rather you need to change some detail 
> to allow the edit. However, this could barely be considered an issue, as 
> it just turns conversion into a two-step process of 'make minor edit. 
> revert minor edit'.
> 
> The IDs that are automatically generated for the old cron jobs are a bit 
> ugly and all get shuffled each time a job is moved from vzdump.cron to 
> jobs.cfg (as mentioned in the commit message), although they are at 
> least easy to change from the job config file, so I don't imagine anyone 
> using these default IDs for very long. On this note however, is having 
> an API call to change the ID something that we want to avoid, or can it 
> be added? I can see users being annoyed by not being able to change it 
> from the GUI, even if only for the initial conversion.

the cron job ids are basically the digest + line number, since we do not
have an id in the cron file
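
roughly like this (illustrative sketch only; 'make_legacy_job_id' is a made-up
name and the real code differs in its details):

    use strict;
    use warnings;

    # sketch: derive a job id from the vzdump.cron digest plus the line number
    # of the entry, since the cron file itself has no id field
    sub make_legacy_job_id {
        my ($digest, $line_nr) = @_;
        return substr($digest, 0, 8) . '-' . $line_nr;
    }

    # e.g. prints 'd41d8cd9-3'
    print make_legacy_job_id('d41d8cd98f00b204e9800998ecf8427e', 3), "\n";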

changing an id is something we generally do not allow, be it for
storages/realms/etc.; normally we tell the users to delete & recreate.
though i think the better solution here is to have an additional
'comment' field and leave the autogeneration of ids in place (and maybe
hide it again from the gui?)

though as you said, if you really want to change an id, editing the
file is not that bad (the statefiles get cleaned up/recreated by
the scheduler anyway)

the comment thing could be done as a followup though

> 
> When entering the ID of new jobs, it might also be helpful to show the 
> allowed characters in a tool-tip box and to prevent the user from trying to 
> validate bad IDs by disabling the "Create" button while invalid, as is 
> done with the 'add storage' window. The "invalid ID" message is a bit 
> vague in any case, so I would at least mention what is allowed there.

yeah, i forgot to add the 'ConfigId' vtype to the field.
i would also send this as a follow-up, since it's only a single line
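
for reference, what the backend accepts is roughly an id made of letters,
digits, '-' and '_' that starts with a letter; as a standalone sketch (the
exact pve-configid pattern may differ slightly):

    use strict;
    use warnings;

    # rough check of what a valid job id looks like; not the exact definition
    # of the pve-configid format, just an approximation for illustration
    sub looks_like_config_id {
        my ($id) = @_;
        return $id =~ /^[a-z][a-z0-9_-]+$/i ? 1 : 0;
    }

    print looks_like_config_id('backup-daily-1') ? "ok\n" : "invalid\n";   # ok
    print looks_like_config_id('1bad id!')       ? "ok\n" : "invalid\n";   # invalid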

> 
> Other than this, I didn't find any problems.
> 
> Tested-By: Dylan Whyte <d.whyte@proxmox.com>
> 
> 
> On 11/8/21 2:07 PM, Dominik Csapak wrote:
>> [...]




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
  2021-11-10  9:38   ` Dominik Csapak
@ 2021-11-10 11:05     ` Thomas Lamprecht
  0 siblings, 0 replies; 16+ messages in thread
From: Thomas Lamprecht @ 2021-11-10 11:05 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak, Dylan Whyte

On 10.11.21 10:38, Dominik Csapak wrote:
> changing an id is something we generally do not allow, being storage/realms/etc. normally we tell the users to delete & recreate,
> though i think the better solution here is to have an additional
> 'comment' field and leave the autogeneration of ids in place (and maybe
> hide it again from the gui?)
> 
> though as you said, if you really want to change an id, editing the
> file is not that bad (the statefiles get cleaned up/recreated by
> the scheduler anyway)

+1 on hiding the ID column by default and adding a comment field (IIRC we talked about that a few weeks ago).
The comment thing even has an enhancement request already: https://bugzilla.proxmox.com/show_bug.cgi?id=3658

Can be sent as a followup, but should IMO get released together

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [pve-devel] applied-series: [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (8 preceding siblings ...)
  2021-11-09 16:55 ` Aaron Lauterer
@ 2021-11-10 20:42 ` Thomas Lamprecht
  2021-11-11 11:35 ` [pve-devel] " Fabian Ebner
  10 siblings, 0 replies; 16+ messages in thread
From: Thomas Lamprecht @ 2021-11-10 20:42 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 08.11.21 14:07, Dominik Csapak wrote:
> pve-cluster:
> 
> Dominik Csapak (1):
>   add 'jobs.cfg' to observed files
> 
>  data/PVE/Cluster.pm | 1 +
>  data/src/status.c   | 1 +
>  2 files changed, 2 insertions(+)
> 
> pve-manager:
> 
> Dominik Csapak (5):
>   add PVE/Jobs to handle VZDump jobs
>   pvescheduler: run jobs from jobs.cfg
>   api/backup: refactor string for all days
>   api/backup: handle new vzdump jobs
>   ui: dc/backup: show id+schedule instead of dow+starttime
> 
> Thomas Lamprecht (1):
>   replace systemd timer with pvescheduler daemon


applied series, thanks! It's obviously time to call it a day now, as I forgot to
apply Dylan's and Aaron's T-b tags, sorry about that, but they certainly reassured
me and allowed me to invest less time here, thanks to both of you testers!

I made a few smaller followups on top:

bd0af783 ui: form/calendar event: add some more complex examples
90d66ebf ui: form/calendar event: increase width of combobox picker
9569c89b ui: backup job: limit ID to 20 chars for now
3fc5bbc4 ui: backup job: ellipsize read-only ID on edit
ae8c5aba ui: backup job: avoid row wrapping due to overly long label

@Dominik, please still check the other reply to the followup series:
https://lists.proxmox.com/pipermail/pve-devel/2021-November/050863.html

And I think now it would be a good time to add that scheduler-simulation API call
to get the next X actual date-times a calendar event would trigger at, which we
could use to provide a small button for users that shows them their next 10 (or so)
events; at least that functionality of `systemd-analyze calendar <event>` always
helps me a lot to reassure myself that my systemd.timer units go off as planned ^^
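
Just as a sketch of what I have in mind (assuming PVE::CalendarEvent's
parse_calendar_event/compute_next_event from pve-common behave as I remember;
the real thing would be a proper API call in the manager):

    use strict;
    use warnings;
    use PVE::CalendarEvent;

    # print the next $count times a calendar event spec would trigger
    sub simulate_schedule {
        my ($spec, $count) = @_;
        my $event = PVE::CalendarEvent::parse_calendar_event($spec);
        my $last = time();
        for (1 .. $count) {
            my $next = PVE::CalendarEvent::compute_next_event($event, $last);
            last if !defined($next);
            print scalar(localtime($next)), "\n";
            $last = $next;
        }
    }

    simulate_schedule('mon..fri 21:00', 10);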




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
  2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
                   ` (9 preceding siblings ...)
  2021-11-10 20:42 ` [pve-devel] applied-series: " Thomas Lamprecht
@ 2021-11-11 11:35 ` Fabian Ebner
  2021-11-11 11:46   ` Dominik Csapak
  10 siblings, 1 reply; 16+ messages in thread
From: Fabian Ebner @ 2021-11-11 11:35 UTC (permalink / raw)
  To: pve-devel, Dominik Csapak

Am 08.11.21 um 14:07 schrieb Dominik Csapak:
> with this series, we implement a new daemon (pvescheduler) that takes
> over from pvesrs' systemd timer (original patch from thomas[0]) and
> extends it with a generic job handling mechanism
> 
> then i convert the vzdump cron jobs to these jobs, the immediate
> gain is that users can use calendarevent schedules instead of
> dow + starttime
> 

There's a problem for mixed-version clusters:

If I edit/create a cluster-wide job (or a job for another node) while
running the new version, a node that doesn't have the pvescheduler yet
won't see that job anymore.

> [...]




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more)
  2021-11-11 11:35 ` [pve-devel] " Fabian Ebner
@ 2021-11-11 11:46   ` Dominik Csapak
  0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2021-11-11 11:46 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel

On 11/11/21 12:35, Fabian Ebner wrote:
> Am 08.11.21 um 14:07 schrieb Dominik Csapak:
>> with this series, we implement a new daemon (pvescheduler) that takes
>> over from pvesrs' systemd timer (original patch from thomas[0]) and
>> extends it with a generic job handling mechanism
>>
>> then i convert the vzdump cron jobs to these jobs, the immediate
>> gain is that users can use calendarevent schedules instead of
>> dow + starttime
>>
> 
> There's a problem for mixed-version clusters:
> 
> If I edit/create a cluster-wide job (or job for another node) while 
> running the new version, a node that doesn't have the pvescheduler yet, 
> won't see that job anymore.
> 

yes, that's a known issue and will be featured in the release notes.
basically, until all nodes are updated, you should not touch the backup
jobs via the web interface on a new node





^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-11-11 11:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08 13:07 [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dominik Csapak
2021-11-08 13:07 ` [pve-devel] [PATCH cluster v2 1/1] add 'jobs.cfg' to observed files Dominik Csapak
2021-11-09 17:18   ` [pve-devel] applied: " Thomas Lamprecht
2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 1/6] replace systemd timer with pvescheduler daemon Dominik Csapak
2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 2/6] add PVE/Jobs to handle VZDump jobs Dominik Csapak
2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 3/6] pvescheduler: run jobs from jobs.cfg Dominik Csapak
2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 4/6] api/backup: refactor string for all days Dominik Csapak
2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 5/6] api/backup: handle new vzdump jobs Dominik Csapak
2021-11-08 13:07 ` [pve-devel] [PATCH manager v2 6/6] ui: dc/backup: show id+schedule instead of dow+starttime Dominik Csapak
2021-11-09 14:17 ` [pve-devel] [PATCH cluster/manager v2] add scheduling daemon for pvesr + vzdump (and more) Dylan Whyte
2021-11-10  9:38   ` Dominik Csapak
2021-11-10 11:05     ` Thomas Lamprecht
2021-11-09 16:55 ` Aaron Lauterer
2021-11-10 20:42 ` [pve-devel] applied-series: " Thomas Lamprecht
2021-11-11 11:35 ` [pve-devel] " Fabian Ebner
2021-11-11 11:46   ` Dominik Csapak
