From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Lukas Wagner <l.wagner@proxmox.com>
Cc: pdm-devel@lists.proxmox.com
Subject: Re: [pdm-devel] [PATCH proxmox-datacenter-manager 10/25] metric collection: collect overdue metrics on startup/timer change
Date: Thu, 13 Feb 2025 16:34:29 +0100
Message-ID: <siuxzgzefcm6kye63hkpjmmz4ibqrxhjo233egh545dqddpcxh@ezwzvaqhjihi>
In-Reply-To: <ff8f6b15-b3aa-4229-ae50-3218a919c5d4@proxmox.com>
On Thu, Feb 13, 2025 at 04:21:33PM +0100, Lukas Wagner wrote:
>
>
> On 2025-02-13 15:19, Wolfgang Bumiller wrote:
> > On Thu, Feb 13, 2025 at 02:50:32PM +0100, Lukas Wagner wrote:
> >>
> >>
> >> On 2025-02-13 09:55, Wolfgang Bumiller wrote:
> >>>> loop {
> >>>> let old_settings = self.settings.clone();
> >>>> tokio::select! {
> >>>> @@ -124,7 +132,12 @@ impl MetricCollectionTask {
> >>>> "metric collection interval changed to {} seconds, reloading timer",
> >>>> interval
> >>>> );
> >>>> - timer = Self::setup_timer(interval);
> >>>> + (timer, next_run) = Self::setup_timer(interval);
> >>>> + // If we change (and therefore reset) our timer right before it fires,
> >>>> + // we could potentially miss one collection event.
> >>>
> >>> Couldn't we instead just pass `next_run` through to `setup_timer` and
> >>> call `reset_at(next_run)` on it? (`first_run` would only be used in the
> >>> initial setup, so `next_run` could either be an `Option`, or the setup
> >>> code does the `next_aligned_instant` call...)
> >>>
> >>> This should be much less code by making the new
> >>> `fetch_overdue{,_and_save_state}()` functions unnecessary, or am I
> >>> missing something?
> >>>
> >>
> >> I guess the question is, do we want nicely aligned timer ticks?
> >>
> >> e.g. 14:01:00, 14:02:00, 14:03:00 ... for 60 second interval
> >> or 14:00:00, 14:05:00, 14:10:00 ... for a 5 minute interval?
> >>
> >> Because that was the main intention behind using the 'collection-interval' as
> >> a base for calculating the aligned instant for the first timer reset.
> >> If we reuse the 'old' `next_run` when the interval is changed, we
> >> also reuse the old alignment.
> >>
> >> For instance, when changing from initially 1 minute to 5 minutes, the
> >> timer ticks might come at
> >> 14:01:00, 14:06:00, 14:11:00
> >>
> >> Technically, the naming for the `next_run` variable is not the best,
> >> since it just contains the Instant when the timer *first* fires, but
> >> this is then never updated to the *next* time the timer will fire...
> >> So that means that when changing the interval with your suggested change,
> >> you'd pass an Instant to `reset_at` that is already in the past,
> >> causing the timer to fire immediately.
> >>
> >> If we *don't* care about the aligned ticks as described above, we could
> >> just use a static alignment boundary, e.g. 60 seconds.
> >> In this case we can also get rid of the fetch_overdue stuff, since
> >> in the worst case we have 60 seconds until the next tick on startup or timer change,
> >> which should be good enough to prevent any significant gaps in the data.
> >
> > What about setting a flag - if the current next tick was earlier than
> > the new next tick - to tell tick() to re-align the timer when it is next
> > triggered?
>
> The problem is that tokio::time::Interval doesn't give you a way to query when
> the next expected tick will be. We can only approximate it by recalculating
> a new aligned instant with the same interval, but I guess this might behave
> unpredictably in edge cases.
>
> >
> > So when going from 1 to 5 minutes at 14:01:50, we `.reset_at(14:02:00)`
> > and also set `realign = true`, and at 14:02, tick() should
> > `.reset_at(14:05:00)`.
> >
> > I just feel like the logic in the "fetch_overdue" code should not be
> > necessary to have, but if it's too awkward to handle via the tick timer,
> > it's fine to keep it in v2.
>
> fetch_overdue is also called when the daemon starts up. If we align to the collection interval
> and the daemon was down for a while, we otherwise might end up with gaps in the data.
>
> Remotes keep metric history for 30 minutes. If PDM is down for, say, 29 minutes and
> we are aligning to 15min boundaries, in the worst case we might have to wait for
> another 15min to fetch metrics, resulting in a gap.
>
> Of course we could just unconditionally force collection after startup, but I think the
> fetch_overdue solution solves this and the timer change issue quite okayish.
>
> In v2 I got rid of the fetch_overdue_and_save_state wrapper by putting the state.save()
> that we already had in the main loop at the end of the loop, so it's a bit less code now.
> The remaining code is not really that complex, I think I'd prefer to keep it
> for now.
Okay.
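
For reference, the alignment arithmetic from the quoted discussion can be
sketched roughly as below. This is only an illustration under stated
assumptions, not code from the patch series: `next_aligned_epoch`,
`plan_interval_change` and `is_overdue` are hypothetical helpers that work on
plain timestamps in seconds instead of `tokio::time::Instant`, and the
"overdue" check is a guess at the idea behind fetch_overdue, not its actual
implementation.

```rust
/// Hypothetical helper: the next timestamp (in seconds) strictly after `now`
/// that is a multiple of `interval`, e.g. the next full minute for 60.
fn next_aligned_epoch(now: u64, interval: u64) -> u64 {
    assert!(interval > 0, "interval must be non-zero");
    (now / interval + 1) * interval
}

/// Hypothetical helper for the "realign flag" idea: when the interval
/// changes, keep the tick that was already pending if it comes first, and
/// remember that the timer must be re-aligned once that tick has fired.
/// Returns `(reset_at, realign)`.
fn plan_interval_change(now: u64, old_interval: u64, new_interval: u64) -> (u64, bool) {
    // The old timer's next tick can only be approximated by recomputing it,
    // since the interval timer itself cannot be queried for it.
    let old_next = next_aligned_epoch(now, old_interval);
    let new_next = next_aligned_epoch(now, new_interval);
    if old_next < new_next {
        (old_next, true)
    } else {
        (new_next, false)
    }
}

/// Hypothetical helper: treat a collection as overdue once at least one full
/// interval has passed since the last successful collection.
fn is_overdue(now: u64, last_collection: u64, interval: u64) -> bool {
    now.saturating_sub(last_collection) >= interval
}
```

Using seconds since midnight as a stand-in for real timestamps, 14:01:50 is
50510. Going from 60 to 300 seconds at that point gives
`plan_interval_change(50510, 60, 300) == (50520, true)`, i.e. reset to
14:02:00 and realign, and `next_aligned_epoch(50520, 300) == 50700` then
yields 14:05:00. For the startup case, a daemon that was down for 29 minutes
with a 15 minute interval has `is_overdue(29 * 60, 0, 15 * 60) == true`.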
_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel