From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 15 May 2026 14:56:55 +0200
From: "Lukas Wagner" <l.wagner@proxmox.com>
To: "Thomas Lamprecht", "Lukas Wagner"
CC: pdm-devel@lists.proxmox.com
Subject: Re: [PATCH datacenter-manager 4/4] remote-updates: switch over to new api_cache
References: <20260513135457.573414-1-l.wagner@proxmox.com> <20260513135457.573414-5-l.wagner@proxmox.com> <20260515090637.950992-4-t.lamprecht@proxmox.com>
In-Reply-To: <20260515090637.950992-4-t.lamprecht@proxmox.com>
List-Id: Proxmox Datacenter Manager development discussion
On Fri May 15, 2026 at 11:06 AM CEST, Thomas Lamprecht wrote:
> On Wed, 13 May 2026 15:54:57 +0200, Lukas Wagner wrote:
>> diff --git a/server/src/remote_updates.rs b/server/src/remote_updates.rs
>> --- a/server/src/remote_updates.rs
>> +++ b/server/src/remote_updates.rs
>> @@ -179,10 +167,11 @@ async fn update_cached_summary_for_node(
>>      node: String,
>>      node_data: NodeUpdateSummary,
>>  ) -> Result<(), Error> {
>> -    let mut file = File::open(UPDATE_CACHE)?;
>> -    let mut cache_content: UpdateSummary = serde_json::from_reader(&mut file)?;
>> -    let remote_entry =
>> -        cache_content
>> +    let cache = api_cache::write_global().await?;
>> +    let cache_content = cache.get::<UpdateSummary>(UPDATE_SUMMARY_CACHE_KEY).await?;
>> +
>> +    if let Some(mut entry) = cache_content {
>> +        let remote_entry = entry
>>              .remotes
>>              .entry(remote.id)
>>              .or_insert_with(|| RemoteUpdateSummary {
>> @@ -191,15 +180,9 @@ async fn update_cached_summary_for_node(
>>                  status: RemoteUpdateStatus::Success,
>>              });
>>
>> -        remote_entry.nodes.insert(node, node_data);
> [...]
>> +        remote_entry.nodes.insert(node, node_data);
>> +        cache.set(UPDATE_SUMMARY_CACHE_KEY, entry).await?;
>> +    }
>>
>>      Ok(())
>>  }
>
> Small behaviour change worth a second look: the old code did
> `File::open(UPDATE_CACHE)?`, which returned an error if the cache file
> did not exist.

Yeah, that actually was problematic before. It did not really surface,
since at the moment we trigger the remote update fetching for all
remotes very soon after daemon startup.

> The new code uses `cache.get(..)`, which returns `Ok(None)` for that
> case, and the `if let Some(..)` then silently skips the write.
> So, if `list_available_updates` is called before any refresh has
> populated the cache, its result is now thrown away silently instead of
> surfaced as an error.
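A minimal synchronous sketch of that difference, with a hypothetical
`Summary` type standing in for the real `UpdateSummary` and the cache
lookup reduced to a plain `Option`:

```rust
// Hypothetical stand-in for `UpdateSummary`; only the shape of the
// `Option` handling matters here, not the actual cache types.
#[derive(Default, Debug, PartialEq)]
struct Summary {
    nodes: Vec<String>,
}

// Shape of the new code: `cache.get(..)` yields `Ok(None)` for a missing
// entry, and the `if let Some(..)` silently skips the write-back.
fn update_if_present(cache_content: Option<Summary>, node: String) -> Option<Summary> {
    if let Some(mut entry) = cache_content {
        entry.nodes.push(node);
        Some(entry) // would be written back via `cache.set(..)`
    } else {
        None // node data thrown away without any error
    }
}

// Fallback shape: a missing entry starts from `Summary::default()`, so
// this code path can also create the initial cache entry.
fn update_or_create(cache_content: Option<Summary>, node: String) -> Summary {
    let mut entry = cache_content.unwrap_or_default();
    entry.nodes.push(node);
    entry
}
```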
> If you want this code path to be able to create the initial cache entry
> as well, replacing the `if let Some(..)` with
> `let mut entry = cache_content.unwrap_or_default();` would do it.

Did exactly that, thanks!

>> @@ -212,7 +195,7 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Error>
>>          .do_for_all_remote_nodes(remotes.clone().into_iter(), fetch_available_updates)
>>          .await;
>>
>> -    let mut content = get_cached_summary_or_default()?;
>> +    let mut content = get_cached_summary_or_default().await?;
>>
>>      // Clean out any remotes that might have been removed from the remote config in the meanwhile.
>>      content
>> @@ -275,8 +258,28 @@ pub async fn refresh_update_summary_cache(remotes: Vec<Remote>) -> Result<(), Error>
>>          }
>>      }
>>
>> -    let options = proxmox_product_config::default_create_options();
>> -    proxmox_sys::fs::replace_file(UPDATE_CACHE, &serde_json::to_vec(&content)?, options, true)?;
>> +    cleanup_old_cachefile().await?;
>> +
>> +    let cache = api_cache::write_global().await?;
>> +    cache.set(UPDATE_SUMMARY_CACHE_KEY, content).await?;
>> +
>> +    Ok(())
>> +}
>
> Two things on the final write:
>
> - `get_cached_summary_or_default()` above takes a read lock and drops
>   it again before `write_global()` is called down here. If
>   `update_cached_summary_for_node` runs in that gap, the entry it just
>   wrote will be overwritten by the `cache.set` below. Holding a single
>   write lock for the whole function would prevent that. Not sure if we
>   strictly need that guarantee though.

Moved the write_global() up top to make sure that the lock is held while
updating the values in the UpdateSummary data structure. Thanks!

> - `cleanup_old_cachefile` runs before the new `cache.set`.
> If the `set` ever fails, the old file is already gone, so the next
> refresh starts from an empty cache.

We don't really migrate contents over from the old cachefile, but start
from a clean slate anyway with the new cache, so I don't think this
makes any difference?

> Either move the cleanup after the successful write, or check whether
> the old file exists first so the cleanup only runs once.

The cleanup literally just makes sure that the old, unused cachefile
does not litter /var/cache/... - it's also fine to run
'cleanup_old_cachefile' multiple times. It would re-attempt to delete
the file, but explicitly catches the 'file does not exist' case and does
not log or throw an error then.

I changed the order of operations anyway (so, set before cleanup), but I
don't really think it makes a difference. (But maybe I'm missing
something?)
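For illustration, an idempotent cleanup along those lines could look as
follows. The real `cleanup_old_cachefile` is async and hardcodes its
path; this is only a sketch of the "swallow the missing-file case"
behaviour described above, with the path taken as a parameter:

```rust
use std::io;
use std::path::Path;

// Remove a legacy cache file, treating "already gone" as success so the
// function can safely run on every refresh without logging or erroring.
fn cleanup_old_cachefile(path: &Path) -> io::Result<()> {
    match std::fs::remove_file(path) {
        Ok(()) => Ok(()),
        // The old file not existing is the expected steady state after
        // the first cleanup; neither log nor propagate an error for it.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}
```

Running it twice in a row is fine: the second call hits the `NotFound`
arm and still returns `Ok(())`.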