From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.gruenbichler@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 2D51E975E1
 for <pbs-devel@lists.proxmox.com>; Wed, 17 Apr 2024 09:46:59 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 0C276379E8
 for <pbs-devel@lists.proxmox.com>; Wed, 17 Apr 2024 09:46:59 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pbs-devel@lists.proxmox.com>; Wed, 17 Apr 2024 09:46:58 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 2E6E54179C
 for <pbs-devel@lists.proxmox.com>; Wed, 17 Apr 2024 09:46:58 +0200 (CEST)
Date: Wed, 17 Apr 2024 09:46:54 +0200
From: Fabian =?iso-8859-1?q?Gr=FCnbichler?= <f.gruenbichler@proxmox.com>
To: Gabriel Goller <g.goller@proxmox.com>,
 Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
References: <20240412100631.94218-1-l.wagner@proxmox.com>
 <20240412100631.94218-15-l.wagner@proxmox.com>
 <D0LG2CV3ONM9.WM2AEAILP5GY@proxmox.com>
 <e8497ce2-d60f-44fb-bc34-8482179f2015@proxmox.com>
In-Reply-To: <e8497ce2-d60f-44fb-bc34-8482179f2015@proxmox.com>
MIME-Version: 1.0
User-Agent: astroid/0.16.0 (https://github.com/astroidmail/astroid)
Message-Id: <1713339434.lmgwlqq4ff.astroid@yuna.none>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.056 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pbs-devel] [PATCH proxmox-backup 14/33] server: notifications:
 send GC notifications via notification system
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 17 Apr 2024 07:46:59 -0000

On April 16, 2024 2:13 pm, Lukas Wagner wrote:
>=20
>=20
> On  2024-04-16 11:37, Gabriel Goller wrote:
>> On Fri Apr 12, 2024 at 12:06 PM CEST, Lukas Wagner wrote:
>>> diff --git a/src/server/gc_job.rs b/src/server/gc_job.rs
>>> index 41375d72..ff5bdccf 100644
>>> --- a/src/server/gc_job.rs
>>> +++ b/src/server/gc_job.rs
>>> @@ -19,8 +19,6 @@ pub fn do_garbage_collection_job(
>>>  ) -> Result<String, Error> {
>>>      let store =3D datastore.name().to_string();
>>> =20
>>> -    let (email, notify) =3D crate::server::lookup_datastore_notify_set=
tings(&store);
>>> -
>>>      let worker_type =3D job.jobtype().to_string();
>>>      let upid_str =3D WorkerTask::new_thread(
>>>          &worker_type,
>>> @@ -43,11 +41,9 @@ pub fn do_garbage_collection_job(
>>>                  eprintln!("could not finish job state for {}: {err}", =
job.jobtype());
>>>              }
>>> =20
>>> -            if let Some(email) =3D email {
>>> -                let gc_status =3D datastore.last_gc_status();
>>> -                if let Err(err) =3D send_gc_status(&email, notify, &st=
ore, &gc_status, &result) {
>>> -                    eprintln!("send gc notification failed: {err}");
>>> -                }
>>> +            let gc_status =3D datastore.last_gc_status();
>>> +            if let Err(err) =3D send_gc_status(&store, &gc_status, &re=
sult) {
>>> +                eprintln!("send gc notification failed: {err}");
>>=20
>> I think we should use 'task_err!()' here. I know eprintln is used above,
>> and technically works because we redirect stderr in the service setup=20
>> but it's still slow and kinda the legacy method of printing task errors.
>=20
> I think the reason why the original code does not use task_log is because=
 the
> job is already marked as finished at that point:
>=20
>             if let Err(err) =3D job.finish(status) {
>                 eprintln!("could not finish job state for {}: {err}", job=
.jobtype());
>             }
>=20
>             let gc_status =3D datastore.last_gc_status();
>             if let Err(err) =3D send_gc_status(&store, &gc_status, &resul=
t) {
>                 eprintln!("send gc notification failed: {err}");
>             }
>=20
> Other jobs seem to follow the same pattern - use task_log! before, and ep=
rintln/log::error!
> after the job is finished.
> A `task_log!` after the job is finished still seems to work, but I'm not =
sure if that
> might lead to problems. What do you think?

I don't think this is a problem per se - job.finish() just marks the job as
finished in the job state, it's not related to the worker task
finish/exit itself.

since task_log (and friends) require the worker to still exist, and the
last thing that happens when the worker fn returns its result is that
result being logged via the same logging mechanism, this should be fine.
but we might still want to switch the order around, so that a warning
for notification failure actually gets counted?

if it's such a common pattern to do job.start at the start, and
job.finish near/at the end of a worker task, I wonder whether we
couldn't handle that in a uniform fashion ;) then we might also not end
up calling worker.create_state twice (once for job.finish, and once for
worker.log_result as part of the worker exiting) with slightly different
states..