public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH pve-storage] fix #6450: add file-checksum endpoint to storage API
@ 2025-09-29  9:35 Nicolas Frey
  2025-10-02 12:15 ` Shannon Sterz
  0 siblings, 1 reply; 4+ messages in thread
From: Nicolas Frey @ 2025-09-29  9:35 UTC (permalink / raw)
  To: pve-devel

The storage API endpoint now supports an optional 'checksum' parameter.
If set, the checksum is calculated using one of the hashing algorithms
listed in the enum and returned in the 'checksum' return property.

Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=6450

Signed-off-by: Nicolas Frey <n.frey@proxmox.com>
---
 src/PVE/API2/Storage/Content.pm | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/PVE/API2/Storage/Content.pm b/src/PVE/API2/Storage/Content.pm
index 1fe7303..ea60589 100644
--- a/src/PVE/API2/Storage/Content.pm
+++ b/src/PVE/API2/Storage/Content.pm
@@ -306,6 +306,12 @@ __PACKAGE__->register_method({
                 description => "Volume identifier",
                 type => 'string',
             },
+            checksum => {
+                description => 'The algorithm to calculate the checksum of the file.',
+                enum => [qw(sha512 sha384 sha256 sha224 sha1 md5)],
+                type => 'string',
+                optional => 1,
+            }
         },
     },
     returns => {
@@ -340,6 +346,11 @@ __PACKAGE__->register_method({
                 type => 'boolean',
                 optional => 1,
             },
+            checksum => {
+                description => 'The checksum of the file.',
+                type => 'string',
+                optional => 1,
+            }
         },
     },
     code => sub {
@@ -376,6 +387,11 @@ __PACKAGE__->register_method({
             warn $@ if $@;
         }
 
+        if (exists $param->{checksum}) {
+            print "calculating checksum...\n";
+            $entry->{checksum} = PVE::Tools::get_file_hash($param->{checksum}, $path);
+        }
+
         return $entry;
     },
 });
-- 
2.47.3


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* Re: [pve-devel] [PATCH pve-storage] fix #6450: add file-checksum endpoint to storage API
  2025-09-29  9:35 [pve-devel] [PATCH pve-storage] fix #6450: add file-checksum endpoint to storage API Nicolas Frey
@ 2025-10-02 12:15 ` Shannon Sterz
  2025-10-02 12:41   ` Thomas Lamprecht
  0 siblings, 1 reply; 4+ messages in thread
From: Shannon Sterz @ 2025-10-02 12:15 UTC (permalink / raw)
  To: Proxmox VE development discussion; +Cc: pve-devel

one comment in-line

On Mon Sep 29, 2025 at 11:35 AM CEST, Nicolas Frey wrote:
> The storage API endpoint now supports an optional 'checksum' parameter.
> If set, the checksum is calculated using one of the hashing algorithms
> listed in the enum and returned in the 'checksum' return property.
>
> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=6450
>
> Signed-off-by: Nicolas Frey <n.frey@proxmox.com>
> ---
>  src/PVE/API2/Storage/Content.pm | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/src/PVE/API2/Storage/Content.pm b/src/PVE/API2/Storage/Content.pm
> index 1fe7303..ea60589 100644
> --- a/src/PVE/API2/Storage/Content.pm
> +++ b/src/PVE/API2/Storage/Content.pm
> @@ -306,6 +306,12 @@ __PACKAGE__->register_method({
>                  description => "Volume identifier",
>                  type => 'string',
>              },
> +            checksum => {
> +                description => 'The algorithm to calculate the checksum of the file.',
> +                enum => [qw(sha512 sha384 sha256 sha224 sha1 md5)],
> +                type => 'string',
> +                optional => 1,
> +            }
>          },
>      },
>      returns => {
> @@ -340,6 +346,11 @@ __PACKAGE__->register_method({
>                  type => 'boolean',
>                  optional => 1,
>              },
> +            checksum => {
> +                description => 'The checksum of the file.',
> +                type => 'string',
> +                optional => 1,
> +            }
>          },
>      },
>      code => sub {
> @@ -376,6 +387,11 @@ __PACKAGE__->register_method({
>              warn $@ if $@;
>          }
>
> +        if (exists $param->{checksum}) {
> +            print "calculating checksum...\n";
> +            $entry->{checksum} = PVE::Tools::get_file_hash($param->{checksum}, $path);

i've tested this with some not too uncommon disk images, such as a 32GB
volume that is essentially empty, and the api endpoint here just times
out, which is not too surprising. i wonder if we can cache the hashes
here somehow and calculate them in a worker task. i also wonder how
this should ideally work for running vm and container images, as their
checksum could change all the time.

maybe we can at least calculate the hashes ahead of time for some more
static assets, such as isos, and only enable this flag for things like
that (isos, container templates, images of vm and container templates,
etc.), basically things that don't change that much?

> +        }
> +
>          return $entry;
>      },
>  });
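
The restriction suggested above (only offering the checksum for content that rarely changes) could be sketched roughly as follows. This is a minimal illustration, not part of the patch: the helper name `checksum_allowed` is hypothetical, and it assumes the caller already knows the volume's content type (`iso`, `vztmpl`, `images`, `rootdir`, ...) and whether the volume is a base image of a template.

```perl
use strict;
use warnings;

# Content types assumed to be static once they are on the storage.
my %static_vtype = map { $_ => 1 } qw(iso vztmpl);

# Hypothetical helper: decides whether a checksum may be requested.
# $vtype is the content type, $is_base is true for template base images.
sub checksum_allowed {
    my ($vtype, $is_base) = @_;

    return 1 if $static_vtype{$vtype};
    # guest images only qualify if they belong to a template
    return 1 if $is_base && ($vtype eq 'images' || $vtype eq 'rootdir');
    return 0;
}
```

An API handler could then reject the 'checksum' parameter early for any volume where this returns 0, before any file is read.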




* Re: [pve-devel] [PATCH pve-storage] fix #6450: add file-checksum endpoint to storage API
  2025-10-02 12:15 ` Shannon Sterz
@ 2025-10-02 12:41   ` Thomas Lamprecht
  2025-10-02 12:51     ` Fabian Grünbichler
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Lamprecht @ 2025-10-02 12:41 UTC (permalink / raw)
  To: Proxmox VE development discussion, Shannon Sterz, Nicolas Frey; +Cc: pve-devel

On 02.10.25 at 14:15, Shannon Sterz wrote:
>>              warn $@ if $@;
>>          }
>>
>> +        if (exists $param->{checksum}) {
>> +            print "calculating checksum...\n";
>> +            $entry->{checksum} = PVE::Tools::get_file_hash($param->{checksum}, $path);
> i've tested this with some not too uncommon disk images, such as a 32GB
> volume that is essentially empty, and the api endpoint here just times
> out, which is not too surprising. i wonder if we can cache the hashes
> here somehow and calculate them in a worker task. i also wonder how
> this should ideally work for running vm and container images, as their
> checksum could change all the time.
> 
> maybe we can at least calculate the hashes ahead of time for some more
> static assets, such as isos, and only enable this flag for things like
> that (isos, container templates, images of vm and container templates,
> etc.), basically things that don't change that much?


I could not find it, but IIRC there was such a request (or patch?) for
checksums of storage content submitted in the past where we discussed
this already.

Anyhow, this is really not something trivial and would need some system
to cache the hash while also having a heuristic that ensures the cached
hash is still valid – as having a wrong hash returned might needlessly
wreck the nerves of any admin that takes their job seriously.

We could do a file that contains the hash(es) plus the inode number,
file size and mtime value from the time those hash(es) got created, as
a heuristic to detect legitimate change. Plus probably the date, to
show the user that this was not calculated on the fly.
And yes, the actual calculation needs to happen in a task worker, as
this can run for quite a while on big files and/or slow storages.
So it is probably best done in a dedicated API call I guess, but with
all this in mind I'm questioning a bit whether this is really worth
that much effort...
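
A minimal sketch of the cache file idea described above, under stated assumptions: the cache is a JSON sidecar file next to the hashed file, only SHA-256 is stored, and the function names (`cache_file_for`, `update_hash_cache`, `lookup_hash_cache`) are made up for illustration. It uses Digest::SHA directly instead of PVE::Tools so that it runs stand-alone.

```perl
use strict;
use warnings;

use Digest::SHA;
use JSON::PP;

# Assumption: the cache is a sidecar file next to the hashed file.
sub cache_file_for {
    my ($path) = @_;
    return "$path.hash-cache";
}

# Inode number, size and mtime serve as the change-detection heuristic.
sub stat_fingerprint {
    my ($path) = @_;
    my @st = stat($path) or die "cannot stat '$path' - $!\n";
    return { inode => $st[1], size => $st[7], mtime => $st[9] };
}

# Hashes the file and stores the digest together with the fingerprint.
sub update_hash_cache {
    my ($path) = @_;

    # taken before hashing, so a change during hashing invalidates the entry
    my $entry = stat_fingerprint($path);

    open(my $in, '<:raw', $path) or die "cannot open '$path' - $!\n";
    $entry->{sha256} = Digest::SHA->new(256)->addfile($in)->hexdigest();
    close($in);
    $entry->{calculated} = time(); # lets a client show the age of the hash

    my $cache = cache_file_for($path);
    open(my $out, '>', $cache) or die "cannot write '$cache' - $!\n";
    print $out encode_json($entry);
    close($out) or die "cannot close '$cache' - $!\n";

    return $entry;
}

# Returns the cached entry, or undef if it is missing or looks stale.
sub lookup_hash_cache {
    my ($path) = @_;

    open(my $in, '<', cache_file_for($path)) or return undef;
    my $raw = do { local $/; <$in> };
    close($in);

    my $entry = eval { decode_json($raw) };
    return undef if ref($entry) ne 'HASH';

    my $current = stat_fingerprint($path);
    for my $key (qw(inode size mtime)) {
        return undef if !defined($entry->{$key}) || $entry->{$key} != $current->{$key};
    }
    return $entry;
}
```

In this sketch `update_hash_cache` would be the part that runs inside a task worker, while `lookup_hash_cache` is cheap enough for a synchronous API call. The heuristic cannot catch a modification that preserves inode, size and mtime.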




* Re: [pve-devel] [PATCH pve-storage] fix #6450: add file-checksum endpoint to storage API
  2025-10-02 12:41   ` Thomas Lamprecht
@ 2025-10-02 12:51     ` Fabian Grünbichler
  0 siblings, 0 replies; 4+ messages in thread
From: Fabian Grünbichler @ 2025-10-02 12:51 UTC (permalink / raw)
  To: Nicolas Frey, Proxmox VE development discussion, Shannon Sterz,
	Thomas Lamprecht
  Cc: pve-devel

On October 2, 2025 2:41 pm, Thomas Lamprecht wrote:
> On 02.10.25 at 14:15, Shannon Sterz wrote:
>>>              warn $@ if $@;
>>>          }
>>>
>>> +        if (exists $param->{checksum}) {
>>> +            print "calculating checksum...\n";
>>> +            $entry->{checksum} = PVE::Tools::get_file_hash($param->{checksum}, $path);
>> i've tested this with some not too uncommon disk images, such as a 32GB
>> volume that is essentially empty, and the api endpoint here just times
>> out, which is not too surprising. i wonder if we can cache the hashes
>> here somehow and calculate them in a worker task. i also wonder how
>> this should ideally work for running vm and container images, as their
>> checksum could change all the time.
>> 
>> maybe we can at least calculate the hashes ahead of time for some more
>> static assets, such as isos, and only enable this flag for things like
>> that (isos, container templates, images of vm and container templates,
>> etc.), basically things that don't change that much?
> 
> 
> I could not find it, but IIRC there was such a request (or patch?) for
> checksums of storage content submitted in the past where we discussed
> this already.
> 
> Anyhow, this is really not something trivial and would need some system
> to cache the hash while also having a heuristic that ensures the cached
> hash is still valid – as having a wrong hash returned might needlessly
> wreck the nerves of any admin that takes their job seriously.
> 
> We could do a file that contains the hash(es) plus the inode number,
> file size and mtime value from the time those hash(es) got created, as
> a heuristic to detect legitimate change. Plus probably the date, to
> show the user that this was not calculated on the fly.
> And yes, the actual calculation needs to happen in a task worker, as
> this can run for quite a while on big files and/or slow storages.
> So it is probably best done in a dedicated API call I guess, but with
> all this in mind I'm questioning a bit whether this is really worth
> that much effort...

recently discussed this with Dominik in the context of the streaming PBS
content API - we should really finally get around to implementing an
async storage content list API call - then this could easily be enabled
only for the async variant.

the rough sketch was:

- add a task worker variant that is "ephemeral"/"light-weight"/..
- such task workers return a structured result object that is saved to disk
- the API endpoint starting them returns some kind of "token" (similar
  to the UPID for regular tasks, or maybe even use the same format?)
- they are not included in the regular task list
- the result can be queried using the token; once the task has finished,
  either an error or the result is returned and the result is removed
  from disk
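
To make the flow above more concrete, here is a rough stand-alone sketch. Everything in it is an assumption for illustration: the function names, the token format, the use of fork() plus a JSON file per result, and the temporary result directory are not existing PVE interfaces. It also assumes that task results are plain data that can be serialized as JSON.

```perl
use strict;
use warnings;

use File::Temp qw(tempdir);
use JSON::PP;
use POSIX ();

# Assumption: results are kept in a private directory of this process.
my $result_dir = tempdir(CLEANUP => 1);
my %worker_pid; # token => PID, only used to reap finished workers
my $counter = 0;

# Runs $code in a forked worker and immediately returns a token.
sub start_ephemeral_task {
    my ($code) = @_;

    my $token = join('-', 'eph', $$, time(), $counter++);
    my $pid = fork();
    die "fork failed - $!\n" if !defined($pid);

    if ($pid == 0) { # worker process
        my $data = eval { $code->() };
        my $result = $@ ? { error => "$@" } : { data => $data };
        # write under a temporary name, so a query never sees a partial file
        my $tmp = "$result_dir/$token.tmp";
        open(my $fh, '>', $tmp) or POSIX::_exit(1);
        print $fh encode_json($result);
        close($fh) or POSIX::_exit(1);
        rename($tmp, "$result_dir/$token.json") or POSIX::_exit(1);
        POSIX::_exit(0); # skip END blocks, the parent owns the directory
    }

    $worker_pid{$token} = $pid;
    return $token;
}

# Returns { status => 'pending' } until the result exists; a finished
# result is returned exactly once and then removed from disk.
sub query_ephemeral_task {
    my ($token) = @_;

    die "malformed token\n" if $token !~ m/^eph-\d+-\d+-\d+$/;
    my $file = "$result_dir/$token.json";
    return { status => 'pending' } if !-e $file;

    open(my $fh, '<', $file) or die "cannot read result - $!\n";
    my $raw = do { local $/; <$fh> };
    close($fh);
    unlink($file);
    waitpid(delete $worker_pid{$token}, 0) if $worker_pid{$token};

    return { status => 'finished', %{ decode_json($raw) } };
}
```

A client would call `start_ephemeral_task(sub { ... })`, keep the token, and poll `query_ephemeral_task($token)` until the status is 'finished'. Note that this sketch cannot tell a running task apart from an unknown or already fetched token; both report 'pending'.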

the UI could then trigger periodic refreshes of the content view, always
display (slightly outdated) information, etc. other clients could
opt into the async variant as well, if it fits their use case.

besides the storage content view, there are a few more that would benefit
from this kind of mechanism (with or without a client-side cache):

https://bugzilla.proxmox.com/show_bug.cgi?id=4447
https://bugzilla.proxmox.com/show_bug.cgi?id=3045

https://bugzilla.proxmox.com/show_bug.cgi?id=4961

and probably a few more that I failed to find quickly.

