From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <w.bumiller@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 4A6E19EB8A
 for <pbs-devel@lists.proxmox.com>; Fri,  3 Nov 2023 09:56:09 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 223FC1A085
 for <pbs-devel@lists.proxmox.com>; Fri,  3 Nov 2023 09:56:09 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pbs-devel@lists.proxmox.com>; Fri,  3 Nov 2023 09:56:08 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id D1F3D4415C
 for <pbs-devel@lists.proxmox.com>; Fri,  3 Nov 2023 09:56:07 +0100 (CET)
Date: Fri, 3 Nov 2023 09:56:06 +0100
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Gabriel Goller <g.goller@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Message-ID: <niwp2y3yaxptlalae27oehzheyyjykz7mp2qfkfqcg2mlxfrpx@v6chhe34y4vx>
References: <20231025135325.198073-1-g.goller@proxmox.com>
 <20231025135325.198073-3-g.goller@proxmox.com>
 <uilsuhdampqxjkl6j4iqdwnsto2ntft2v2774h7xhulcgaq6lh@knus6hopuhyy>
 <f5833873-687a-4480-b8ad-fb9c151a301f@proxmox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <f5833873-687a-4480-b8ad-fb9c151a301f@proxmox.com>
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.103 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
Subject: Re: [pbs-devel] [RFC proxmox v2 2/2] proxmox-log: added tracing
 infra
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Fri, 03 Nov 2023 08:56:09 -0000

On Thu, Nov 02, 2023 at 03:58:02PM +0100, Gabriel Goller wrote:
> Thanks for the review!
> 
> On 11/2/23 14:43, Wolfgang Bumiller wrote:
> > [..]
> > > +
> > > +impl<S: Subscriber> Layer<S> for FilelogLayer {
> > > +    fn on_event(&self, event: &Event<'_>, _ctx: Context<'_, S>) {
> > > +        let mut buf = String::new();
> > > +
> > > +        event.record(&mut EventVisitor::new(&mut buf));
> > I'd argue the above 2 lines should be part of the closure below,
> > otherwise in the error case you just produce a string to throw away.
> I Agree!
> > > +
> > > +        let logger_exists = LOGGER.try_with(|logger| {
> > > +            log_to_file(&mut logger.borrow_mut(), event.metadata().level(), buf);
> > > +        });
> > > +        if let Err(e) = logger_exists {
> > > +            error!(
> > Is it wise to call log functions from within log handlers? ;-)
> Hmm, yes this could be tricky...
> How do we want to handle this anyway? We could panic here, but that seems a
> bit
> harsh. Logging with `eprintln` is also an option, but a quite useless one
> tbh.

Well it's either eprinln!, or, well, do nothing and hope there's another
layer dealing with it. 
But really, what *is* this error anyway? AFAICT it means the task-local
is not set, so we should not even run into this, and panicking might
almost be fine...

Alternatively we could drop the separate layer idea and just have 1
layer where the file logger not existing is the same as the task logger
not being enabled, that is, we fall back to using the syslogger.
I guess that depends on how much additional functionality we really need
from all this tracing infrastructure?

> > > [..]
> > > +pub struct WorkerTaskFilter {
> > > +    in_worker_task: Arc<Mutex<bool>>,
> > AFAICT you only have locks which are shortlived to set/clear/check this
> > value.
> > For such a thing you can use `Arc<AtomicBool>` and get rid of all the
> > error handling.
> Ooh, this makes it a lot easier, thanks for the heads up!I think we should
> be alright by just using `Ordering::Relaxed` everywhere, right?

Yes, this will AFAICT never be accessed in parallel. (Might as well be
a simple Cell unless we run into a `Sync` bound somewhere?)

> > > +}
> > > +
> > > +impl WorkerTaskFilter {
> > > +    pub fn new(in_worker_task: Arc<Mutex<bool>>) -> WorkerTaskFilter {
> > > +        WorkerTaskFilter { in_worker_task }
> > > +    }
> > > +}
> > > +
> > > +impl<S: Subscriber + for<'a> LookupSpan<'a>> Filter<S> for WorkerTaskFilter {
> > > +    fn on_enter(&self, id: &span::Id, ctx: Context<'_, S>) {
> > > +        let metadata = ctx.metadata(id);
> > > +        if let Some(m) = metadata {
> > > +            if m.name() == "worker_task" {
> > I'm not so happy with this.
> > Now each time we poll a worker task we go through this layer system
> > which uses string comparison to know whether we're currently in a worker
> > task, for something that is actually rather static in the code.
> > I'd much prefer a simply custom `Future` wrapping the worker task's
> > future and setting this flag for the duration of the `poll()` method.
> Ok, but the `Future` wrapper would only work for the task, right?

I don't see the difference. In the threaded version you'd just
initialize in_worker_task to `true` - I mean, it does not matter where
you access that `Arc<AtomicBool>` from?

I don't know. I find this whole thing a bit backwards? Maybe it's just
me, but I'm not convinced that's how this part of `tracing` is meant to
be used.

Consider this:
If you already have an Arc<Bool>, all you need is a thread-local
copy of it:
- A Future based worker would have a wrapping Future containing that
  Arc, which, for the duration of `poll()`, sets the thread-local to
  point to its Arc.
- A thread based worker would just set it at the very beginning.
- A guard to enable/disable it (if we even need it, like on_enter/on_exit)
  would contain the old value and restore it on Drop (giving us a
  push/pop mechanics for cheap).

Also note that AFAICT neither the current tracing variant nor what I
wrote above deal with `tokio::spawn()`ed tasks, but that's fine.
If we want the ability to inherit the logging facility, this would need
to be independent from the boolean anyway since tasks can run on
separate threads - unless we restrict it to `LocalSet`s.
So I think we might need to first decide how that bit *should* work
before adding the ability to enter/leave the scope at will.
Unless I'm missing something.

> So we would need to keep the `span` version (or come up with
> something different) for the thread use-case and then again have a
> lot of `if (thread) { do this } else if (task) { do this }` stuff, which I
> don't
> really like.
> What we could do is have another `tokio::task_local!()` thingy, which
> contains a bool 'log_to_tasklog'. Then have another `scope` and `sync_scope`
> around the worker logic (So we would substitute the span stuff with another
> TLS).
> > This just seems like a whole lot of overhead we don't need for such
> > simple functionality. Also, the `on_enter` and `on_exit` methods make it
> > look like you could easily enter and exit this type of span, but that's
> > not the case. `on_exit` always stores `false`, so nested spans
> > temporarily disabling and enabling the worker task log would just end up
> > with a messed up state (this would need to be a counter...).
> Right, I would need a counter here..

(well actually some kind of stack, if we even really need the
functionality)

> > > [..]
> > > +
> > > +impl Visit for EventVisitor<'_> {
> > > +    fn record_debug(&mut self, field: &Field, value: &dyn fmt::Debug) {
> > > +        if field.name() == "message" {
> > > +            self.buf.push_str(&format!("{:?}", value));
> > String implements fmt::Write. You can
> > 
> >      use std::fmt::Write as _;
> >      let _ = write!(state.buf, "{value:?}");
> > 
> > it's possible for this to be more efficient since it does not enforce
> > the creation of a separate allocated string.
> 
> Thanks!
> 
> > [..]