From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Fabian Ebner <f.ebner@proxmox.com>, Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH zsync] fix #2821: only abort if there really is a waiting/syncing job instance already
Date: Thu, 17 Dec 2020 10:23:33 +0100
Message-ID: <cfb0a229-ad22-349d-cb76-df40fac4c936@proxmox.com>
In-Reply-To: <ac0e0452-9cbd-ae60-fb6e-d688bc2e4481@proxmox.com>

On 17/12/2020 09:40, Fabian Ebner wrote:
> On 14.12.20 at 14:47, Thomas Lamprecht wrote:
>> On 14.12.20 14:00, Fabian Ebner wrote:
>>> @@ -584,6 +586,33 @@ sub destroy_job {
>>>      });
>>>  }
>>> +sub get_process_start_time {
>>> +    my ($pid) = @_;
>>> +
>>> +    return eval { run_cmd(['ps', '-o', 'lstart=', '-p', "$pid"]); };
>>
>> instead of fork+exec do a much cheaper file read?
>>
>> I.e., copying over file_read_firstline from PVE::Tools then:
>>
>> sub get_process_start_time {
>>     my $stat_str = file_read_firstline("/proc/$pid/stat");
>>     my $stat = [ split(/\s+/, $stat_str) ];
>>
>>     return $stat->[21];
>> }
>>
>> plus some error handling (note I did not test above)
>>
>
> Agreed, although we also need to obtain the boot time (from /proc/stat) to have the actual start time, because the value in /proc/$pid/stat is just the number of clock ticks since boot when the process was started. But it's still much cheaper of course.

Hmm, yeah, across boots this alone would not be enough to always tell for sure.
FYI, you could probably also use `/proc/sys/kernel/random/boot_id` there, which can be read once at program startup; see
http://0pointer.de/blog/projects/ids.html ("Software IDs").

>>> @@ -593,11 +622,18 @@ sub sync {
>>>      eval { $job = get_job($param) };
>>>      if ($job) {
>>> -        if (defined($job->{state}) && ($job->{state} eq "syncing" || $job->{state} eq "waiting")) {
>>> +        my $state = $job->{state} // 'ok';
>>> +        $state = 'ok' if !instance_exists($job->{instance_id});
>>> +
>>> +        if ($state eq "syncing" || $state eq "waiting") {
>>>              die "Job --source $param->{source} --name $param->{name} is already scheduled to sync\n";
>>>          }
>>>          $job->{state} = "waiting";
>>> +
>>> +        eval { $job->{instance_id} = get_instance_id($$); };
>>
>> I'd query and cache the local instance ID from the current process on startup, this
>> would have the nice side effect of avoiding error potential here completely
>>
>
> What if querying fails on startup? I'd rather have it be a non-critical failure and continue. Then we'd still need a check here to see if the cached instance_id is defined.

If you make it just reads of /proc and that fails, you can assume critical conditions and abort.
If you really do not want to, you can add a singleton which returns the cached info and, if it is not available yet, retries getting it and warns on failure:

my $id_cache;
sub get_local_instance_id {
    return $id_cache if defined($id_cache);
    $id_cache = eval { get_instance_id($$) };
    warn $@ if $@;
    return $id_cache;
}

That said, I'd feel less strongly about caching if getting the ID neither forks nor involves other rather costly operations.
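
To make the /proc-only variant discussed above a bit more concrete, here is a rough, untested-in-pve-zsync sketch of how the pieces could fit together. Several things in it are assumptions rather than anything from the patch: `read_firstline` is a local stand-in for PVE::Tools' file_read_firstline, the "<boot_id>:<pid>:<start ticks>" ID layout is made up for illustration, and `get_instance_id`/`instance_exists` only borrow the names used in the patch, not its implementation. It also splits /proc/PID/stat after the closing parenthesis of the comm field, since a process name may itself contain spaces.

```perl
use strict;
use warnings;

# Stand-in for PVE::Tools::file_read_firstline (assumption: not the real helper).
# Returns the first line without trailing newline, dies on any failure.
sub read_firstline {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open '$path' - $!\n";
    my $line = <$fh>;
    close($fh);
    die "'$path' is empty\n" if !defined($line);
    chomp($line);
    return $line;
}

# Start time of $pid in clock ticks since boot (field 22 of /proc/PID/stat).
sub get_process_start_time {
    my ($pid) = @_;
    my $stat_str = read_firstline("/proc/$pid/stat");
    # Field 2 (comm) is parenthesised and may contain spaces, so drop
    # everything up to and including the last ')' before splitting.
    $stat_str =~ s/^.*\)\s+//s
        or die "unexpected format in /proc/$pid/stat\n";
    my @fields = split(/\s+/, $stat_str);
    # @fields now starts at field 3 (state), so field 22 is index 19.
    my $start = $fields[19];
    die "no start time in /proc/$pid/stat\n"
        if !defined($start) || $start !~ /^\d+$/;
    return $start;
}

# The boot ID stays the same until the next reboot, so read it only once.
my $boot_id;
sub get_boot_id {
    $boot_id //= read_firstline('/proc/sys/kernel/random/boot_id');
    return $boot_id;
}

# Hypothetical ID layout: "<boot_id>:<pid>:<start ticks>".
sub get_instance_id {
    my ($pid) = @_;
    return join(':', get_boot_id(), $pid, get_process_start_time($pid));
}

# True only if the same process (same boot, PID and start time) still exists.
sub instance_exists {
    my ($id) = @_;
    return 0 if !defined($id);
    my ($boot, $pid, $start) = split(/:/, $id);
    return 0 if !defined($start) || $pid !~ /^\d+$/;
    return 0 if $boot ne get_boot_id();
    my $current = eval { get_process_start_time($pid) };
    return (defined($current) && $current eq $start) ? 1 : 0;
}
```

With something along these lines, getting the ID is two small file reads and no fork, so it matters much less whether `get_instance_id($$)` is called once at startup or lazily through a cached getter.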