From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
Shannon Sterz <s.sterz@proxmox.com>,
Nicolas Frey <n.frey@proxmox.com>
Cc: pve-devel <pve-devel-bounces@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] fix #6450: add file-checksum endpoint to storage API
Date: Thu, 2 Oct 2025 14:41:20 +0200
Message-ID: <5a01ab84-2d91-4e64-826a-29ebf6bd4545@proxmox.com>
In-Reply-To: <DD7TUB3A631N.10A53DGKL6JEZ@proxmox.com>

On 02.10.25 at 14:15, Shannon Sterz wrote:
>> warn $@ if $@;
>> }
>>
>> + if (exists $param->{checksum}) {
>> + print "calculating checksum...\n";
>> + $entry->{checksum} = PVE::Tools::get_file_hash($param->{checksum}, $path);
> i've tested this with some not too uncommon disk images such as a 32GB
> volume that is essentially empty and the api endpoint here just times
> out. which is not too surprising. i wonder if we can cache the hashes
> here somehow and calculate them in a worker task. i also wonder how
> this should ideally work for running vm and container images as their
> checksum could change all the time.
>
> maybe we can at least calculate the hashes here for some more static
> assets such as isos etc. ahead of time and only enable this flag for things
> like that (so isos, container templates, images of vm and container
> templates etc.) basically things that don't change that much?

I could not find it, but IIRC there was such a request (or patch?) for
checksums of storage content submitted in the past, where we already
discussed this.

Anyhow, this is really not trivial and would need some system to cache
the hash, plus a heuristic that ensures the cached hash is still
valid – as returning a wrong hash might needlessly wreck the nerves of
any admin who takes their job seriously.

We could use a file that contains the hash(es) plus the inode number,
file size and mtime from the time those hash(es) got created, as a
heuristic to detect legitimate changes. Plus probably the date, to
show the user that the hash was not calculated on the fly.
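
As a rough illustration of that heuristic, here is a minimal standalone
sketch in Python. It is not PVE code: the function names, the JSON
sidecar format and the field names are all assumptions made up for this
example. It stores the digest together with inode number, size, mtime
and the calculation time, and treats the cached entry as stale as soon
as any of the stat values differ.

```python
import hashlib
import json
import os
import time


def file_fingerprint(path):
    """Cheap stat-based fingerprint: inode number, size and mtime."""
    st = os.stat(path)
    return {"inode": st.st_ino, "size": st.st_size, "mtime_ns": st.st_mtime_ns}


def compute_hash(path, algo="sha256", chunk_size=1 << 20):
    """Hash the file in chunks so large images are never fully in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_cache(path, cache_path, algo="sha256"):
    """Compute the hash and store it next to the fingerprint (hypothetical format)."""
    fp = file_fingerprint(path)
    digest = compute_hash(path, algo)
    # Stat again: if the file changed while hashing, the digest is unreliable.
    if file_fingerprint(path) != fp:
        raise RuntimeError("file changed while hashing")
    entry = {"algo": algo, "digest": digest, "computed_at": int(time.time()), **fp}
    tmp = cache_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(entry, f)
    os.replace(tmp, cache_path)  # atomic rename, readers never see a partial file
    return entry


def read_cache(path, cache_path, algo="sha256"):
    """Return the cached entry, or None if it is missing or looks stale."""
    try:
        with open(cache_path) as f:
            entry = json.load(f)
    except (OSError, ValueError):
        return None
    if entry.get("algo") != algo:
        return None
    fp = file_fingerprint(path)
    if any(entry.get(key) != value for key, value in fp.items()):
        return None  # inode, size or mtime differ: assume the file changed
    return entry
```

A caller would try `read_cache()` first and only schedule the expensive
`write_cache()` run when it returns `None`; the stored `computed_at`
value is what would let a UI show that the hash was not calculated on
the fly. Note that this is only a heuristic: a change that preserves
inode, size and mtime would go unnoticed.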

And yes, the actual calculation needs to happen in a task worker, as
this can run for quite a while on big files and/or slow storages.

So this is probably best done in a dedicated API call, I guess, but with
all this in mind I'm questioning a bit whether this is really worth that
much effort...

Thread overview: 4+ messages
2025-09-29 9:35 Nicolas Frey
2025-10-02 12:15 ` Shannon Sterz
2025-10-02 12:41 ` Thomas Lamprecht [this message]
2025-10-02 12:51 ` Fabian Grünbichler