* [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume
@ 2025-06-26 14:40 Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 1/9] plugin: add " Fiona Ebner
` (8 more replies)
0 siblings, 9 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
Changes in v3:
* Make tidy.
* After the upstream discussion [0], do not patch QEMU. Instead, make
sure that the 'keyring' is set in the storage's configuration and
set the rbd_cache_policy on the EFI image itself. For the 'keyring'
option, we also need something in pve8to9 so that users that already
have a custom configuration for an external Ceph storage are
prompted to set it too (an existing configuration is not touched by
the proposed patch). See patch 5/9 for details.
* Mention that volume is activated before qemu_blockdev_options() is
called in docs.
Changes in v2:
* Patch QEMU to support both keyring and rbd_cache_policy settings,
it is much cleaner than generate per-volume (or technically would
even be per-instance, as a volume could be re-used) temporary
ceph.conf files. For keyring, we could also make sure to write the
path to every storage's ceph.conf in pve8to9 and require it going
forward, but supporting the option in QEMU seems to be the simplest.
The rbd_cache_policy is truly a per-volume setting in Proxmox VE,
which is specifically changed for EFI disks, so ceph.conf is just
not the right place for that setting.
* Add a special 'hints' option to indicate when it's an EFI disk, so
the RBD plugin can set the policy.
When using -drive, storage plugins currently give us a path that QEMU
understands, some using special protocols such as 'iscsi://'. We'd
like to switch to using the more modern -blockdev for PVE 9. The
plan is to have the storage plugins return the very basic information
required to access the image, and qemu-server can then add other
settings like cache, aio, etc. on top. In fact, pretty similar to what
we have now for -drive, just with a structured hash rather than a
string.
This is also a prerequisite for qemu-storage-deamon, that would be
useful for TPM-as-qcow2 exported via NBD or FUSE or external backup
provider restore providing an NBD export for the provider to write to.
[0]: https://lore.kernel.org/qemu-devel/20250515112908.383693-1-f.ebner@proxmox.com/
Fiona Ebner (9):
plugin: add method to get qemu blockdevice options for volume
iscsi direct plugin: implement method to get qemu blockdevice options
zfs iscsi plugin: implement new method to get qemu blockdevice options
zfs pool plugin: implement method to get qemu blockdevice options
ceph/rbd: set 'keyring' in ceph configuration for externally managed
RBD storages
rbd plugin: implement new method to get qemu blockdevice options
plugin: qemu block device: add hints option and EFI disk hint
plugin: qemu block device: add support for snapshot option
plugin api: bump api version and age
ApiChangeLog | 13 ++++
src/PVE/CephConfig.pm | 50 +++++++++++++++
src/PVE/Storage.pm | 21 +++++-
src/PVE/Storage/ISCSIDirectPlugin.pm | 17 +++++
src/PVE/Storage/Plugin.pm | 96 ++++++++++++++++++++++++++++
src/PVE/Storage/RBDPlugin.pm | 62 ++++++++++++++++++
src/PVE/Storage/ZFSPlugin.pm | 19 ++++++
src/PVE/Storage/ZFSPoolPlugin.pm | 12 ++++
8 files changed, 288 insertions(+), 2 deletions(-)
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-07-01 9:28 ` Thomas Lamprecht
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 2/9] iscsi direct plugin: implement method to get qemu blockdevice options Fiona Ebner
` (7 subsequent siblings)
8 siblings, 1 reply; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
This is in preparation to switch qemu-server from using '-drive' to
the modern '-blockdev' in the QEMU commandline options as well as for
the qemu-storage-daemon, which only supports '-blockdev'. The plugins
know best what driver and options are needed to access an image, so
a dedicated plugin method returning the necessary parameters for
'-blockdev' is the most straight-forward.
There intentionally is only handling for absolute paths in the default
plugin implementation. Any plugin requiring more needs to implement
the method itself. With PVE 9 being a major release and most popular
plugins not using special protocols like 'rbd://', this seems
acceptable.
For NBD, etc. qemu-server should construct the blockdev object.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
Changes in v3:
* Mention that volume is activated before function is called in docs.
src/PVE/Storage.pm | 17 +++++++++++
src/PVE/Storage/Plugin.pm | 62 +++++++++++++++++++++++++++++++++++++++
2 files changed, 79 insertions(+)
diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index 69eb435..ec8b753 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -719,6 +719,23 @@ sub abs_filesystem_path {
return $path;
}
+# see the documentation for the plugin method
+sub qemu_blockdev_options {
+ my ($cfg, $volid) = @_;
+
+ my ($storeid, $volname) = parse_volume_id($volid);
+
+ my $scfg = storage_config($cfg, $storeid);
+
+ my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
+
+ my ($vtype) = $plugin->parse_volname($volname);
+ die "cannot use volume of type '$vtype' as a QEMU blockdevice\n"
+ if $vtype ne 'images' && $vtype ne 'iso' && $vtype ne 'import';
+
+ return $plugin->qemu_blockdev_options($scfg, $storeid, $volname);
+}
+
# used as last resort to adapt volnames when migrating
my $volname_for_storage = sub {
my ($cfg, $storeid, $name, $vmid, $format) = @_;
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 53b9848..9951272 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -1961,6 +1961,68 @@ sub rename_volume {
return "${storeid}:${base}${target_vmid}/${target_volname}";
}
+=pod
+
+=head3 qemu_blockdev_options
+
+ $blockdev = $plugin->qemu_blockdev_options($scfg, $storeid, $volname)
+
+Returns a hash reference with the basic options needed to open the volume via QEMU's C<-blockdev>
+API. This at least requires a C<< $blockdev->{driver} >> and a reference to the image, e.g.
+C<< $blockdev->{filename} >> for the C<file> driver. For files, the C<file> driver can be used. For
+host block devices, the C<host_device> driver can be used. The plugin must not set options like
+C<cache> or C<aio>. Those are managed by qemu-server and will be overwritten. For other available
+drivers and the exact specification of the options, see
+L<https://qemu.readthedocs.io/en/master/interop/qemu-qmp-ref.html#object-QMP-block-core.BlockdevOptions>
+
+While Perl does not have explicit types, the result will need to be converted to JSON later and
+match the QMP specification (see link above), so implicit types are important. In the return value,
+use C<JSON::true> and C<JSON::false> for booleans, C<"$value"> for strings, and C<int($value)> for
+integers.
+
+The volume is activated before the function is called.
+
+Arguments:
+
+=over
+
+=item C<$scfg>
+
+The hash reference with the storage configuration.
+
+=item C<$storeid>
+
+The storage ID.
+
+=item C<$volume>
+
+The volume name.
+
+=back
+
+=cut
+
+sub qemu_blockdev_options {
+ my ($class, $scfg, $storeid, $volname) = @_;
+
+ my $blockdev = {};
+
+ my ($path) = $class->filesystem_path($scfg, $volname);
+
+ if ($path =~ m|^/|) {
+ # The 'file' driver only works for regular files. The check below is taken from
+ # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
+ # devices here, those are not managed by the storage layer.
+ my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
+ my $driver = (S_ISCHR($st->mode) || S_ISBLK($st->mode)) ? 'host_device' : 'file';
+ $blockdev = { driver => $driver, filename => $path };
+ } else {
+ die "storage plugin doesn't implement qemu_blockdev_options() method\n";
+ }
+
+ return $blockdev;
+}
+
# Used by storage plugins for external backup providers. See PVE::BackupProvider::Plugin for the API
# the provider needs to implement.
#
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [PATCH v3 storage 2/9] iscsi direct plugin: implement method to get qemu blockdevice options
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 1/9] plugin: add " Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 3/9] zfs iscsi plugin: implement new " Fiona Ebner
` (6 subsequent siblings)
8 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
No changes in v3.
src/PVE/Storage/ISCSIDirectPlugin.pm | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/src/PVE/Storage/ISCSIDirectPlugin.pm b/src/PVE/Storage/ISCSIDirectPlugin.pm
index 9b7f77c..8c6b4ab 100644
--- a/src/PVE/Storage/ISCSIDirectPlugin.pm
+++ b/src/PVE/Storage/ISCSIDirectPlugin.pm
@@ -110,6 +110,20 @@ sub path {
return ($path, $vmid, $vtype);
}
+sub qemu_blockdev_options {
+ my ($class, $scfg, $storeid, $volname) = @_;
+
+ my $lun = ($class->parse_volname($volname))[1];
+
+ return {
+ driver => 'iscsi',
+ transport => 'tcp',
+ portal => "$scfg->{portal}",
+ target => "$scfg->{target}",
+ lun => int($lun),
+ };
+}
+
sub create_base {
my ($class, $storeid, $scfg, $volname) = @_;
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [PATCH v3 storage 3/9] zfs iscsi plugin: implement new method to get qemu blockdevice options
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 1/9] plugin: add " Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 2/9] iscsi direct plugin: implement method to get qemu blockdevice options Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement " Fiona Ebner
` (5 subsequent siblings)
8 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
Reported-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
No changes in v3.
src/PVE/Storage/ZFSPlugin.pm | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/PVE/Storage/ZFSPlugin.pm b/src/PVE/Storage/ZFSPlugin.pm
index f0fa522..c03fcca 100644
--- a/src/PVE/Storage/ZFSPlugin.pm
+++ b/src/PVE/Storage/ZFSPlugin.pm
@@ -247,6 +247,22 @@ sub path {
return ($path, $vmid, $vtype);
}
+sub qemu_blockdev_options {
+ my ($class, $scfg, $storeid, $volname) = @_;
+
+ my $name = ($class->parse_volname($volname))[1];
+ my $guid = $class->zfs_get_lu_name($scfg, $name);
+ my $lun = $class->zfs_get_lun_number($scfg, $guid);
+
+ return {
+ driver => 'iscsi',
+ transport => 'tcp',
+ portal => "$scfg->{portal}",
+ target => "$scfg->{target}",
+ lun => int($lun),
+ };
+}
+
sub create_base {
my ($class, $storeid, $scfg, $volname) = @_;
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement method to get qemu blockdevice options
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
` (2 preceding siblings ...)
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 3/9] zfs iscsi plugin: implement new " Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-06-30 11:20 ` Fabian Grünbichler
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 5/9] ceph/rbd: set 'keyring' in ceph configuration for externally managed RBD storages Fiona Ebner
` (4 subsequent siblings)
8 siblings, 1 reply; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
ZFS does not have a filesystem_path() method, so the default
implementation for qemu_blockdev_options() cannot be re-used. This is
most likely, because snapshots are currently not directly accessible
via a filesystem path in the Proxmox VE storage layer.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
No changes in v3.
src/PVE/Storage/ZFSPoolPlugin.pm | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/src/PVE/Storage/ZFSPoolPlugin.pm b/src/PVE/Storage/ZFSPoolPlugin.pm
index 713d26f..4479fe4 100644
--- a/src/PVE/Storage/ZFSPoolPlugin.pm
+++ b/src/PVE/Storage/ZFSPoolPlugin.pm
@@ -162,6 +162,16 @@ sub path {
return ($path, $vmid, $vtype);
}
+sub qemu_blockdev_options {
+ my ($class, $scfg, $storeid, $volname) = @_;
+
+ my ($path) = $class->path($scfg, $volname, $storeid);
+
+ my $blockdev = { driver => 'host_device', filename => $path };
+
+ return $blockdev;
+}
+
sub zfs_request {
my ($class, $scfg, $timeout, $method, @params) = @_;
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [RFC v3 storage 5/9] ceph/rbd: set 'keyring' in ceph configuration for externally managed RBD storages
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
` (3 preceding siblings ...)
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement " Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options Fiona Ebner
` (3 subsequent siblings)
8 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
For QEMU, when using '-blockdev', there is no way to specify the
keyring file like was possible with '-drive', so it has to be set in
the corresponding Ceph configuration file. As it applies to all images
on the storage, it also is the most natural place for the setting.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
New in v3.
This should also be mentioned in the upgrade guide for PVE 9 and
the pve8to9 script should tell the user and/or automatically set it
for existing externally managed RBD storages, that already do have a
custom configuration.
src/PVE/CephConfig.pm | 50 ++++++++++++++++++++++++++++++++++++
src/PVE/Storage/RBDPlugin.pm | 3 +++
2 files changed, 53 insertions(+)
diff --git a/src/PVE/CephConfig.pm b/src/PVE/CephConfig.pm
index 5347781..e5815c4 100644
--- a/src/PVE/CephConfig.pm
+++ b/src/PVE/CephConfig.pm
@@ -3,6 +3,8 @@ package PVE::CephConfig;
use strict;
use warnings;
use Net::IP;
+
+use PVE::RESTEnvironment qw(log_warn);
use PVE::Tools qw(run_command);
use PVE::Cluster qw(cfs_register_file);
@@ -420,6 +422,10 @@ sub ceph_connect_option {
} else {
$cmd_option->{ceph_conf} = "/etc/pve/priv/ceph/${storeid}.conf";
}
+ } elsif (!$pveceph_managed) {
+ # No dedicated config for non-PVE-managed cluster, create new
+ # TODO PVE 10 - remove. All such storages already got a configuration upon creation or here.
+ ceph_create_configuration($scfg->{type}, $storeid);
}
$cmd_option->{keyring} = $keyfile if (-e $keyfile);
@@ -487,6 +493,50 @@ sub ceph_remove_keyfile {
}
}
+sub ceph_create_configuration {
+ my ($type, $storeid) = @_;
+
+ return if $type eq 'cephfs'; # no configuration file needed currently
+
+ my $extension = 'keyring';
+ $extension = 'secret' if $type eq 'cephfs';
+ my $ceph_storage_keyring = "/etc/pve/priv/ceph/${storeid}.$extension";
+
+ return if !-e $ceph_storage_keyring;
+
+ my $ceph_storage_config = "/etc/pve/priv/ceph/${storeid}.conf";
+
+ if (-e $ceph_storage_config) {
+ log_warn(
+ "file $ceph_storage_config already exists, check manually and ensure 'keyring'"
+ . " option is set to '$ceph_storage_keyring'!\n",
+ );
+ return;
+ }
+
+ my $ceph_config = {
+ global => {
+ keyring => $ceph_storage_keyring,
+ },
+ };
+
+ my $contents = PVE::CephConfig::write_ceph_config($ceph_storage_config, $ceph_config);
+ PVE::Tools::file_set_contents($ceph_storage_config, $contents, 0600);
+
+ return;
+}
+
+sub ceph_remove_configuration {
+ my ($storeid) = @_;
+
+ my $ceph_storage_config = "/etc/pve/priv/ceph/${storeid}.conf";
+ if (-f $ceph_storage_config) {
+ unlink $ceph_storage_config or log_warn("removing $ceph_storage_config failed - $!\n");
+ }
+
+ return;
+}
+
my $ceph_version_parser = sub {
my $ceph_version = shift;
# FIXME this is the same as pve-manager PVE::Ceph::Tools get_local_version
diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index c0bbe2c..3f7ca9f 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -448,6 +448,7 @@ sub on_add_hook {
my ($class, $storeid, $scfg, %param) = @_;
PVE::CephConfig::ceph_create_keyfile($scfg->{type}, $storeid, $param{keyring});
+ PVE::CephConfig::ceph_create_configuration($scfg->{type}, $storeid);
return;
}
@@ -469,6 +470,8 @@ sub on_update_hook {
sub on_delete_hook {
my ($class, $storeid, $scfg) = @_;
PVE::CephConfig::ceph_remove_keyfile($scfg->{type}, $storeid);
+ PVE::CephConfig::ceph_remove_configuration($storeid);
+
return;
}
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
` (4 preceding siblings ...)
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 5/9] ceph/rbd: set 'keyring' in ceph configuration for externally managed RBD storages Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-06-30 11:19 ` Fabian Grünbichler
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 7/9] plugin: qemu block device: add hints option and EFI disk hint Fiona Ebner
` (2 subsequent siblings)
8 siblings, 1 reply; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
The mon host parsing is adapted from other places. While there
currently is no support for vector notation in the storage config
(it's a pve-storage-portal-dns-list option), it doesn't hurt to
anticipate it, should the list of mon hosts come from a ceph.conf
instead anytime in the future.
Co-developed-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
No changes in v3.
src/PVE/Storage/RBDPlugin.pm | 39 ++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index 3f7ca9f..229b07d 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -513,6 +513,45 @@ sub path {
return ($path, $vmid, $vtype);
}
+sub qemu_blockdev_options {
+ my ($class, $scfg, $storeid, $volname) = @_;
+
+ my $cmd_option = PVE::CephConfig::ceph_connect_option($scfg, $storeid);
+ my ($name) = ($class->parse_volname($volname))[1];
+
+ if ($scfg->{krbd}) {
+ my $rbd_dev_path = get_rbd_dev_path($scfg, $storeid, $name);
+ return { driver => 'host_device', filename => $rbd_dev_path };
+ }
+
+ my $blockdev = {
+ driver => 'rbd',
+ pool => $scfg->{pool} ? "$scfg->{pool}" : 'rbd',
+ image => "$name",
+ };
+ $blockdev->{namespace} = "$scfg->{namespace}" if defined($scfg->{namespace});
+
+ $blockdev->{conf} = $cmd_option->{ceph_conf} if $cmd_option->{ceph_conf};
+
+ if (my $monhost = $scfg->{'monhost'}) {
+ my $server = [];
+ my @mons = PVE::Tools::split_list($monhost);
+ for my $mon (@mons) {
+ $mon =~ s/^\[?v\d\://; # remove beginning of vector
+ $mon =~ s|/\d+\]?||; # remove end of vector
+ my ($host, $port) = PVE::Tools::parse_host_and_port($mon);
+ $port = '3300' if !$port;
+ push @$server, { host => $host, port => $port };
+ }
+ $blockdev->{server} = $server;
+ $blockdev->{'auth-client-required'} = ["$cmd_option->{auth_supported}"];
+ }
+
+ $blockdev->{user} = "$cmd_option->{userid}" if $cmd_option->{keyring};
+
+ return $blockdev;
+}
+
sub find_free_diskname {
my ($class, $storeid, $scfg, $vmid, $fmt, $add_fmt_suffix) = @_;
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [RFC v3 storage 7/9] plugin: qemu block device: add hints option and EFI disk hint
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
` (5 preceding siblings ...)
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 9/9] plugin api: bump api version and age Fiona Ebner
8 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
For '-drive', qemu-server sets special cache options for EFI disk
using RBD. In preparation to seamlessly switch to the new '-blockdev'
interface, do the same here. Note that the issue from bug #3329, which
is solved by these cache options, still affects current versions.
With -blockdev, the cache options are split up. While cache.direct and
cache.no-flush can be set in the -blockdev options, cache.writeback is
a front-end property and was intentionally removed from the -blockdev
options by QEMU commit aaa436f998 ("block: Remove cache.writeback from
blockdev-add"). It needs to be configured as the 'write-cache'
property for the ide-hd/scsi-hd/virtio-blk device.
The default is already 'writeback' and no cache mode can be set for an
EFI drive configuration in Proxmox VE currently, so there will not be
a clash.
┌─────────────┬─────────────────┬──────────────┬────────────────┐
│ │ cache.writeback │ cache.direct │ cache.no-flush │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│writeback │ on │ off │ off │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│none │ on │ on │ off │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│writethrough │ off │ off │ off │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│directsync │ off │ on │ off │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│unsafe │ on │ off │ on │
└─────────────┴─────────────────┴──────────────┴────────────────┘
Table from 'man kvm'.
Alternatively, the option could only be set once when allocating the
RBD volume. However, then we would need to detect all cases were a
volume could potentially be used as an EFI disk later. Having a custom
disk type would help a lot there. The approach here was chosen as it
is catch-all and should not be too costly either.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
Changes in v3:
* Set the policy via 'rbd config image set'.
src/PVE/Storage.pm | 4 ++--
src/PVE/Storage/ISCSIDirectPlugin.pm | 2 +-
src/PVE/Storage/Plugin.pm | 27 +++++++++++++++++++++++++--
src/PVE/Storage/RBDPlugin.pm | 20 +++++++++++++++++++-
src/PVE/Storage/ZFSPlugin.pm | 2 +-
src/PVE/Storage/ZFSPoolPlugin.pm | 2 +-
6 files changed, 49 insertions(+), 8 deletions(-)
diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index ec8b753..4920bd6 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -721,7 +721,7 @@ sub abs_filesystem_path {
# see the documentation for the plugin method
sub qemu_blockdev_options {
- my ($cfg, $volid) = @_;
+ my ($cfg, $volid, $options) = @_;
my ($storeid, $volname) = parse_volume_id($volid);
@@ -733,7 +733,7 @@ sub qemu_blockdev_options {
die "cannot use volume of type '$vtype' as a QEMU blockdevice\n"
if $vtype ne 'images' && $vtype ne 'iso' && $vtype ne 'import';
- return $plugin->qemu_blockdev_options($scfg, $storeid, $volname);
+ return $plugin->qemu_blockdev_options($scfg, $storeid, $volname, $options);
}
# used as last resort to adapt volnames when migrating
diff --git a/src/PVE/Storage/ISCSIDirectPlugin.pm b/src/PVE/Storage/ISCSIDirectPlugin.pm
index 8c6b4ab..12b894d 100644
--- a/src/PVE/Storage/ISCSIDirectPlugin.pm
+++ b/src/PVE/Storage/ISCSIDirectPlugin.pm
@@ -111,7 +111,7 @@ sub path {
}
sub qemu_blockdev_options {
- my ($class, $scfg, $storeid, $volname) = @_;
+ my ($class, $scfg, $storeid, $volname, $options) = @_;
my $lun = ($class->parse_volname($volname))[1];
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 9951272..7a274a3 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -1965,7 +1965,7 @@ sub rename_volume {
=head3 qemu_blockdev_options
- $blockdev = $plugin->qemu_blockdev_options($scfg, $storeid, $volname)
+ $blockdev = $plugin->qemu_blockdev_options($scfg, $storeid, $volname, $options)
Returns a hash reference with the basic options needed to open the volume via QEMU's C<-blockdev>
API. This at least requires a C<< $blockdev->{driver} >> and a reference to the image, e.g.
@@ -1998,12 +1998,35 @@ The storage ID.
The volume name.
+=item C<$options>
+
+A hash reference with additional options.
+
+=over
+
+=item C<< $options->{hints} >>
+
+A hash reference with hints indicating what the volume will be used for. This can be safely ignored
+if no concrete issues are known with your plugin. For certain use cases, setting additional
+(plugin-specific) options might be very beneficial however. An example is setting the correct cache
+options for an EFI disk on RBD. The list of hints might get expanded in the future.
+
+=over
+
+=item C<< $options->{hints}->{'efi-disk'} >>
+
+(optional) If set, the volume will be used as the EFI disk of a VM, containing its OMVF variables.
+
+=back
+
+=back
+
=back
=cut
sub qemu_blockdev_options {
- my ($class, $scfg, $storeid, $volname) = @_;
+ my ($class, $scfg, $storeid, $volname, $options) = @_;
my $blockdev = {};
diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index 229b07d..6b37ba3 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -374,6 +374,16 @@ my sub rbd_volume_exists {
return 0;
}
+# Needs to be public, so qemu-server can mock it for cfg2cmd.
+sub rbd_volume_config_set {
+ my ($scfg, $storeid, $volname, $key, $value) = @_;
+
+ my $cmd = $rbd_cmd->($scfg, $storeid, 'config', 'image', 'set', $volname, $key, $value);
+ run_rbd_command($cmd, errmsg => "rbd config image set $volname $key $value error");
+
+ return;
+}
+
# Configuration
sub type {
@@ -514,7 +524,7 @@ sub path {
}
sub qemu_blockdev_options {
- my ($class, $scfg, $storeid, $volname) = @_;
+ my ($class, $scfg, $storeid, $volname, $options) = @_;
my $cmd_option = PVE::CephConfig::ceph_connect_option($scfg, $storeid);
my ($name) = ($class->parse_volname($volname))[1];
@@ -549,6 +559,14 @@ sub qemu_blockdev_options {
$blockdev->{user} = "$cmd_option->{userid}" if $cmd_option->{keyring};
+ # SPI flash does lots of read-modify-write OPs, without writeback this gets really slow #3329
+ if ($options->{hints}->{'efi-disk'}) {
+ # Querying the value would just cost more and the 'rbd image config get' command will just
+ # fail if the config has not been set yet, so it's not even straight-forward to do so.
+ # Simply set the value (possibly again).
+ rbd_volume_config_set($scfg, $storeid, $name, 'rbd_cache_policy', 'writeback');
+ }
+
return $blockdev;
}
diff --git a/src/PVE/Storage/ZFSPlugin.pm b/src/PVE/Storage/ZFSPlugin.pm
index c03fcca..0f64898 100644
--- a/src/PVE/Storage/ZFSPlugin.pm
+++ b/src/PVE/Storage/ZFSPlugin.pm
@@ -248,7 +248,7 @@ sub path {
}
sub qemu_blockdev_options {
- my ($class, $scfg, $storeid, $volname) = @_;
+ my ($class, $scfg, $storeid, $volname, $options) = @_;
my $name = ($class->parse_volname($volname))[1];
my $guid = $class->zfs_get_lu_name($scfg, $name);
diff --git a/src/PVE/Storage/ZFSPoolPlugin.pm b/src/PVE/Storage/ZFSPoolPlugin.pm
index 4479fe4..677f88c 100644
--- a/src/PVE/Storage/ZFSPoolPlugin.pm
+++ b/src/PVE/Storage/ZFSPoolPlugin.pm
@@ -163,7 +163,7 @@ sub path {
}
sub qemu_blockdev_options {
- my ($class, $scfg, $storeid, $volname) = @_;
+ my ($class, $scfg, $storeid, $volname, $options) = @_;
my ($path) = $class->path($scfg, $volname, $storeid);
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
` (6 preceding siblings ...)
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 7/9] plugin: qemu block device: add hints option and EFI disk hint Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
2025-06-30 11:40 ` Fabian Grünbichler
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 9/9] plugin api: bump api version and age Fiona Ebner
8 siblings, 1 reply; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
This is mostly in preparation for external qcow2 snapshot support.
For internal qcow2 snapshots, which currently are the only supported
variant, it is not possible to attach the snapshot only. If access to
that is required it will need to be handled differently, e.g. via a
FUSE/NBD export.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
No changes in v3. RFC, because it depends on the previous one.
src/PVE/Storage/ISCSIDirectPlugin.pm | 3 +++
src/PVE/Storage/Plugin.pm | 13 ++++++++++++-
src/PVE/Storage/RBDPlugin.pm | 2 ++
src/PVE/Storage/ZFSPlugin.pm | 3 +++
src/PVE/Storage/ZFSPoolPlugin.pm | 2 ++
5 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/src/PVE/Storage/ISCSIDirectPlugin.pm b/src/PVE/Storage/ISCSIDirectPlugin.pm
index 12b894d..e0f8a62 100644
--- a/src/PVE/Storage/ISCSIDirectPlugin.pm
+++ b/src/PVE/Storage/ISCSIDirectPlugin.pm
@@ -113,6 +113,9 @@ sub path {
sub qemu_blockdev_options {
my ($class, $scfg, $storeid, $volname, $options) = @_;
+ die "volume snapshot is not possible on iscsi device\n"
+ if $options->{'snapshot-name'};
+
my $lun = ($class->parse_volname($volname))[1];
return {
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 7a274a3..e6892ec 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -2004,6 +2004,11 @@ A hash reference with additional options.
=over
+=item C<< $options->{'snapshot-name'} >>
+
+(optional) The snapshot name. Set when the associated snapshot should be opened rather than the
+volume itself.
+
=item C<< $options->{hints} >>
A hash reference with hints indicating what the volume will be used for. This can be safely ignored
@@ -2030,9 +2035,15 @@ sub qemu_blockdev_options {
my $blockdev = {};
- my ($path) = $class->filesystem_path($scfg, $volname);
+ my ($path) = $class->filesystem_path($scfg, $volname, $options->{'snapshot-name'});
if ($path =~ m|^/|) {
+ # For qcow2 and qed the path of a snapshot will be the same, but it's not possible to attach
+ # the snapshot alone.
+ my $format = ($class->parse_volname($volname))[6];
+ die "cannot attach only the snapshot of a '$format' image\n"
+ if $options->{'snapshot-name'} && ($format eq 'qcow2' || $format eq 'qed');
+
# The 'file' driver only works for regular files. The check below is taken from
# block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
# devices here, those are not managed by the storage layer.
diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index 6b37ba3..cf3d354 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -530,6 +530,7 @@ sub qemu_blockdev_options {
my ($name) = ($class->parse_volname($volname))[1];
if ($scfg->{krbd}) {
+ $name .= '@' . $options->{'snapshot-name'} if $options->{'snapshot-name'};
my $rbd_dev_path = get_rbd_dev_path($scfg, $storeid, $name);
return { driver => 'host_device', filename => $rbd_dev_path };
}
@@ -540,6 +541,7 @@ sub qemu_blockdev_options {
image => "$name",
};
$blockdev->{namespace} = "$scfg->{namespace}" if defined($scfg->{namespace});
+ $blockdev->{snapshot} = $options->{'snapshot-name'} if $options->{'snapshot-name'};
$blockdev->{conf} = $cmd_option->{ceph_conf} if $cmd_option->{ceph_conf};
diff --git a/src/PVE/Storage/ZFSPlugin.pm b/src/PVE/Storage/ZFSPlugin.pm
index 0f64898..940d4f0 100644
--- a/src/PVE/Storage/ZFSPlugin.pm
+++ b/src/PVE/Storage/ZFSPlugin.pm
@@ -250,6 +250,9 @@ sub path {
sub qemu_blockdev_options {
my ($class, $scfg, $storeid, $volname, $options) = @_;
+ die "direct access to snapshots not implemented\n"
+ if $options->{'snapshot-name'};
+
my $name = ($class->parse_volname($volname))[1];
my $guid = $class->zfs_get_lu_name($scfg, $name);
my $lun = $class->zfs_get_lun_number($scfg, $guid);
diff --git a/src/PVE/Storage/ZFSPoolPlugin.pm b/src/PVE/Storage/ZFSPoolPlugin.pm
index 677f88c..86f83a2 100644
--- a/src/PVE/Storage/ZFSPoolPlugin.pm
+++ b/src/PVE/Storage/ZFSPoolPlugin.pm
@@ -165,6 +165,8 @@ sub path {
sub qemu_blockdev_options {
my ($class, $scfg, $storeid, $volname, $options) = @_;
+ die "cannot attach only the snapshot of a zvol\n" if $options->{'snapshot-name'};
+
my ($path) = $class->path($scfg, $volname, $storeid);
my $blockdev = { driver => 'host_device', filename => $path };
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* [pve-devel] [PATCH v3 storage 9/9] plugin api: bump api version and age
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
` (7 preceding siblings ...)
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option Fiona Ebner
@ 2025-06-26 14:40 ` Fiona Ebner
8 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-06-26 14:40 UTC (permalink / raw)
To: pve-devel
Introduce qemu_blockdev_options() plugin method.
In terms of the plugin API only, adding the qemu_blockdev_options()
method is a fully backwards-compatible change. When qemu-server will
switch to '-blockdev' however, plugins where the default implemenation
is not sufficient, will not be usable for virtual machines anymore.
Therefore, this is intended for the next major release, Proxmox VE 9.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
No changes in v3.
ApiChangeLog | 13 +++++++++++++
src/PVE/Storage.pm | 4 ++--
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/ApiChangeLog b/ApiChangeLog
index 987da54..062e75b 100644
--- a/ApiChangeLog
+++ b/ApiChangeLog
@@ -6,6 +6,19 @@ without breaking anything unaware of it.)
Future changes should be documented in here.
+## Version 12:
+
+* Introduce `qemu_blockdev_options()` plugin method
+
+ Proxmox VE will switch to the more modern QEMU command line option `-blockdev` replacing `-drive`.
+ With `-drive`, it was enough to specify a path, where special protocol paths like `iscsi://` were
+ also supported. With `-blockdev`, the data is more structured, a driver needs to be specified
+ alongside the path to an image and each driver supports driver-specific options. Most storage
+ plugins should be fine using driver `host_device` in case of a block device and `file` in case of
+ a file and no special options, see the default implemenation of the base plugin. Implement this
+ this method for Proxmox VE 9. For available drivers and driver-specific options, see:
+ https://qemu.readthedocs.io/en/master/interop/qemu-qmp-ref.html#object-QMP-block-core.BlockdevOptions
+
## Version 11:
* Allow declaring storage features via plugin data
diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index 4920bd6..b3b68fa 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -41,11 +41,11 @@ use PVE::Storage::BTRFSPlugin;
use PVE::Storage::ESXiPlugin;
# Storage API version. Increment it on changes in storage API interface.
-use constant APIVER => 11;
+use constant APIVER => 12;
# Age is the number of versions we're backward compatible with.
# This is like having 'current=APIVER' and age='APIAGE' in libtool,
# see https://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html
-use constant APIAGE => 2;
+use constant APIAGE => 3;
our $KNOWN_EXPORT_FORMATS = ['raw+size', 'tar+size', 'qcow2+size', 'vmdk+size', 'zfs', 'btrfs'];
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options Fiona Ebner
@ 2025-06-30 11:19 ` Fabian Grünbichler
2025-07-01 12:15 ` Fiona Ebner
0 siblings, 1 reply; 24+ messages in thread
From: Fabian Grünbichler @ 2025-06-30 11:19 UTC (permalink / raw)
To: Proxmox VE development discussion
On June 26, 2025 4:40 pm, Fiona Ebner wrote:
> The mon host parsing is adapted from other places. While there
> currently is no support for vector notation in the storage config
> (it's a pve-storage-portal-dns-list option), it doesn't hurt to
> anticipate it, should the list of mon hosts come from a ceph.conf
> instead anytime in the future.
but in `path` we don't do this? shouldn't this be consistent?
also, we already have a full helper for getting a mon list from a ceph
config, so if we ever use that as source here, shouldn't we use that
helper?
>
> Co-developed-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> No changes in v3.
>
> src/PVE/Storage/RBDPlugin.pm | 39 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 39 insertions(+)
>
> diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
> index 3f7ca9f..229b07d 100644
> --- a/src/PVE/Storage/RBDPlugin.pm
> +++ b/src/PVE/Storage/RBDPlugin.pm
> @@ -513,6 +513,45 @@ sub path {
> return ($path, $vmid, $vtype);
> }
>
> +sub qemu_blockdev_options {
> + my ($class, $scfg, $storeid, $volname) = @_;
> +
> + my $cmd_option = PVE::CephConfig::ceph_connect_option($scfg, $storeid);
> + my ($name) = ($class->parse_volname($volname))[1];
> +
> + if ($scfg->{krbd}) {
> + my $rbd_dev_path = get_rbd_dev_path($scfg, $storeid, $name);
> + return { driver => 'host_device', filename => $rbd_dev_path };
> + }
> +
> + my $blockdev = {
> + driver => 'rbd',
> + pool => $scfg->{pool} ? "$scfg->{pool}" : 'rbd',
> + image => "$name",
> + };
> + $blockdev->{namespace} = "$scfg->{namespace}" if defined($scfg->{namespace});
> +
> + $blockdev->{conf} = $cmd_option->{ceph_conf} if $cmd_option->{ceph_conf};
> +
> + if (my $monhost = $scfg->{'monhost'}) {
> + my $server = [];
> + my @mons = PVE::Tools::split_list($monhost);
> + for my $mon (@mons) {
> + $mon =~ s/^\[?v\d\://; # remove beginning of vector
> + $mon =~ s|/\d+\]?||; # remove end of vector
> + my ($host, $port) = PVE::Tools::parse_host_and_port($mon);
> + $port = '3300' if !$port;
> + push @$server, { host => $host, port => $port };
> + }
> + $blockdev->{server} = $server;
> + $blockdev->{'auth-client-required'} = ["$cmd_option->{auth_supported}"];
> + }
> +
> + $blockdev->{user} = "$cmd_option->{userid}" if $cmd_option->{keyring};
> +
> + return $blockdev;
> +}
> +
> sub find_free_diskname {
> my ($class, $storeid, $scfg, $vmid, $fmt, $add_fmt_suffix) = @_;
>
> --
> 2.47.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
>
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement method to get qemu blockdevice options
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement " Fiona Ebner
@ 2025-06-30 11:20 ` Fabian Grünbichler
2025-07-01 12:08 ` Fiona Ebner
0 siblings, 1 reply; 24+ messages in thread
From: Fabian Grünbichler @ 2025-06-30 11:20 UTC (permalink / raw)
To: Proxmox VE development discussion
On June 26, 2025 4:40 pm, Fiona Ebner wrote:
> ZFS does not have a filesystem_path() method, so the default
> implementation for qemu_blockdev_options() cannot be re-used. This is
> most likely, because snapshots are currently not directly accessible
> via a filesystem path in the Proxmox VE storage layer.
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> No changes in v3.
>
> src/PVE/Storage/ZFSPoolPlugin.pm | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/src/PVE/Storage/ZFSPoolPlugin.pm b/src/PVE/Storage/ZFSPoolPlugin.pm
> index 713d26f..4479fe4 100644
> --- a/src/PVE/Storage/ZFSPoolPlugin.pm
> +++ b/src/PVE/Storage/ZFSPoolPlugin.pm
> @@ -162,6 +162,16 @@ sub path {
> return ($path, $vmid, $vtype);
> }
>
> +sub qemu_blockdev_options {
> + my ($class, $scfg, $storeid, $volname) = @_;
> +
> + my ($path) = $class->path($scfg, $volname, $storeid);
> +
> + my $blockdev = { driver => 'host_device', filename => $path };
> +
> + return $blockdev;
should we assert that $volname is a zvol?
> +}
> +
> sub zfs_request {
> my ($class, $scfg, $timeout, $method, @params) = @_;
>
> --
> 2.47.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
>
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option Fiona Ebner
@ 2025-06-30 11:40 ` Fabian Grünbichler
2025-07-01 12:23 ` Fiona Ebner
0 siblings, 1 reply; 24+ messages in thread
From: Fabian Grünbichler @ 2025-06-30 11:40 UTC (permalink / raw)
To: Proxmox VE development discussion
On June 26, 2025 4:40 pm, Fiona Ebner wrote:
> This is mostly in preparation for external qcow2 snapshot support.
>
> For internal qcow2 snapshots, which currently are the only supported
> variant, it is not possible to attach the snapshot only. If access to
> that is required it will need to be handled differently, e.g. via a
> FUSE/NBD export.
just for the avoidance of doubt since it's only implied and not spelled
out explicitly - we don't do any such accesses at the moment, right?
linked clones cannot be done for running VMs, full clones of snapshots
don't happen within the running VM but via qemu-img convert, and nothing
else besides replication (which is ZFS specific) and manual `pvesm
export` calls (which are on the storage level by definition) directly
reads from a storage snapshot?
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> No changes in v3. RFC, because it depends on the previous one.
>
> src/PVE/Storage/ISCSIDirectPlugin.pm | 3 +++
> src/PVE/Storage/Plugin.pm | 13 ++++++++++++-
> src/PVE/Storage/RBDPlugin.pm | 2 ++
> src/PVE/Storage/ZFSPlugin.pm | 3 +++
> src/PVE/Storage/ZFSPoolPlugin.pm | 2 ++
> 5 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/src/PVE/Storage/ISCSIDirectPlugin.pm b/src/PVE/Storage/ISCSIDirectPlugin.pm
> index 12b894d..e0f8a62 100644
> --- a/src/PVE/Storage/ISCSIDirectPlugin.pm
> +++ b/src/PVE/Storage/ISCSIDirectPlugin.pm
> @@ -113,6 +113,9 @@ sub path {
> sub qemu_blockdev_options {
> my ($class, $scfg, $storeid, $volname, $options) = @_;
>
> + die "volume snapshot is not possible on iscsi device\n"
> + if $options->{'snapshot-name'};
> +
> my $lun = ($class->parse_volname($volname))[1];
>
> return {
> diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
> index 7a274a3..e6892ec 100644
> --- a/src/PVE/Storage/Plugin.pm
> +++ b/src/PVE/Storage/Plugin.pm
> @@ -2004,6 +2004,11 @@ A hash reference with additional options.
>
> =over
>
> +=item C<< $options->{'snapshot-name'} >>
> +
> +(optional) The snapshot name. Set when the associated snapshot should be opened rather than the
> +volume itself.
> +
> =item C<< $options->{hints} >>
>
> A hash reference with hints indicating what the volume will be used for. This can be safely ignored
> @@ -2030,9 +2035,15 @@ sub qemu_blockdev_options {
>
> my $blockdev = {};
>
> - my ($path) = $class->filesystem_path($scfg, $volname);
> + my ($path) = $class->filesystem_path($scfg, $volname, $options->{'snapshot-name'});
>
> if ($path =~ m|^/|) {
> + # For qcow2 and qed the path of a snapshot will be the same, but it's not possible to attach
> + # the snapshot alone.
> + my $format = ($class->parse_volname($volname))[6];
> + die "cannot attach only the snapshot of a '$format' image\n"
> + if $options->{'snapshot-name'} && ($format eq 'qcow2' || $format eq 'qed');
> +
> # The 'file' driver only works for regular files. The check below is taken from
> # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
> # devices here, those are not managed by the storage layer.
> diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
> index 6b37ba3..cf3d354 100644
> --- a/src/PVE/Storage/RBDPlugin.pm
> +++ b/src/PVE/Storage/RBDPlugin.pm
> @@ -530,6 +530,7 @@ sub qemu_blockdev_options {
> my ($name) = ($class->parse_volname($volname))[1];
>
> if ($scfg->{krbd}) {
> + $name .= '@' . $options->{'snapshot-name'} if $options->{'snapshot-name'};
> my $rbd_dev_path = get_rbd_dev_path($scfg, $storeid, $name);
> return { driver => 'host_device', filename => $rbd_dev_path };
> }
> @@ -540,6 +541,7 @@ sub qemu_blockdev_options {
> image => "$name",
> };
> $blockdev->{namespace} = "$scfg->{namespace}" if defined($scfg->{namespace});
> + $blockdev->{snapshot} = $options->{'snapshot-name'} if $options->{'snapshot-name'};
>
> $blockdev->{conf} = $cmd_option->{ceph_conf} if $cmd_option->{ceph_conf};
>
> diff --git a/src/PVE/Storage/ZFSPlugin.pm b/src/PVE/Storage/ZFSPlugin.pm
> index 0f64898..940d4f0 100644
> --- a/src/PVE/Storage/ZFSPlugin.pm
> +++ b/src/PVE/Storage/ZFSPlugin.pm
> @@ -250,6 +250,9 @@ sub path {
> sub qemu_blockdev_options {
> my ($class, $scfg, $storeid, $volname, $options) = @_;
>
> + die "direct access to snapshots not implemented\n"
> + if $options->{'snapshot-name'};
> +
> my $name = ($class->parse_volname($volname))[1];
> my $guid = $class->zfs_get_lu_name($scfg, $name);
> my $lun = $class->zfs_get_lun_number($scfg, $guid);
> diff --git a/src/PVE/Storage/ZFSPoolPlugin.pm b/src/PVE/Storage/ZFSPoolPlugin.pm
> index 677f88c..86f83a2 100644
> --- a/src/PVE/Storage/ZFSPoolPlugin.pm
> +++ b/src/PVE/Storage/ZFSPoolPlugin.pm
> @@ -165,6 +165,8 @@ sub path {
> sub qemu_blockdev_options {
> my ($class, $scfg, $storeid, $volname, $options) = @_;
>
> + die "cannot attach only the snapshot of a zvol\n" if $options->{'snapshot-name'};
> +
> my ($path) = $class->path($scfg, $volname, $storeid);
>
> my $blockdev = { driver => 'host_device', filename => $path };
> --
> 2.47.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
>
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 1/9] plugin: add " Fiona Ebner
@ 2025-07-01 9:28 ` Thomas Lamprecht
2025-07-01 11:01 ` Fiona Ebner
0 siblings, 1 reply; 24+ messages in thread
From: Thomas Lamprecht @ 2025-07-01 9:28 UTC (permalink / raw)
To: Proxmox VE development discussion, Fiona Ebner
Am 26.06.25 um 16:40 schrieb Fiona Ebner:
> This is in preparation to switch qemu-server from using '-drive' to
> the modern '-blockdev' in the QEMU commandline options as well as for
> the qemu-storage-daemon, which only supports '-blockdev'. The plugins
> know best what driver and options are needed to access an image, so
> a dedicated plugin method returning the necessary parameters for
> '-blockdev' is the most straight-forward.
>
> There intentionally is only handling for absolute paths in the default
> plugin implementation. Any plugin requiring more needs to implement
> the method itself. With PVE 9 being a major release and most popular
> plugins not using special protocols like 'rbd://', this seems
> acceptable.
>
> For NBD, etc. qemu-server should construct the blockdev object.
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> Changes in v3:
> * Mention that volume is activated before function is called in docs.
>
> src/PVE/Storage.pm | 17 +++++++++++
> src/PVE/Storage/Plugin.pm | 62 +++++++++++++++++++++++++++++++++++++++
> 2 files changed, 79 insertions(+)
>
> +=pod
> +
> +=head3 qemu_blockdev_options
> +
> + $blockdev = $plugin->qemu_blockdev_options($scfg, $storeid, $volname)
> +
> +Returns a hash reference with the basic options needed to open the volume via QEMU's C<-blockdev>
> +API. This at least requires a C<< $blockdev->{driver} >> and a reference to the image, e.g.
> +C<< $blockdev->{filename} >> for the C<file> driver. For files, the C<file> driver can be used. For
> +host block devices, the C<host_device> driver can be used. The plugin must not set options like
> +C<cache> or C<aio>. Those are managed by qemu-server and will be overwritten. For other available
> +drivers and the exact specification of the options, see
> +L<https://qemu.readthedocs.io/en/master/interop/qemu-qmp-ref.html#object-QMP-block-core.BlockdevOptions>
> +
> +While Perl does not have explicit types, the result will need to be converted to JSON later and
> +match the QMP specification (see link above), so implicit types are important. In the return value,
> +use C<JSON::true> and C<JSON::false> for booleans, C<"$value"> for strings, and C<int($value)> for
> +integers.
> +
> +The volume is activated before the function is called.
> +
> +Arguments:
> +
> +=over
> +
> +=item C<$scfg>
> +
> +The hash reference with the storage configuration.
> +
> +=item C<$storeid>
> +
> +The storage ID.
> +
> +=item C<$volume>
> +
> +The volume name.
bit much POD noise for my taste, couldn't the item's be at least placed
on a single line? E.g.:
=item C<$volume>: The volume name.
=item ...
> +
> +=back
> +
> +=cut
> +
> +sub qemu_blockdev_options {
> + my ($class, $scfg, $storeid, $volname) = @_;
might make sense to pass some versioning information here, as otherwise this
is a bit to tight coupling for my taste and might cause long term maintenance
headache.
It might be potentially enough to have the QEMU version and machine version
passed here, as then one can generate block dev options that can be understood
by qemu(-server) and are also compatible with the target machine version.
And are we sure that we want to allow passing arbitrary options here?
Could we at least generate some simple schema automatically from the QAPI or
the like? Could be also a specially prepared JSON file just for this purpose
shipped by our qemu package, or tracked separately so that we can track the
version in which a option first appeared or became obsolete.
> +
> + my $blockdev = {};
> +
> + my ($path) = $class->filesystem_path($scfg, $volname);
> +
> + if ($path =~ m|^/|) {
> + # The 'file' driver only works for regular files. The check below is taken from
> + # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
> + # devices here, those are not managed by the storage layer.
> + my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
> + my $driver = (S_ISCHR($st->mode) || S_ISBLK($st->mode)) ? 'host_device' : 'file';
> + $blockdev = { driver => $driver, filename => $path };
> + } else {
> + die "storage plugin doesn't implement qemu_blockdev_options() method\n";
> + }
Should we rather default to an empty set of extra options? At least for external
plugins that would be the safer choice for upgrading, might not always work but
as is users can only loose FWICT?
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-07-01 9:28 ` Thomas Lamprecht
@ 2025-07-01 11:01 ` Fiona Ebner
2025-07-01 11:09 ` Fabian Grünbichler
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-07-01 11:01 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
Am 01.07.25 um 11:28 schrieb Thomas Lamprecht:
> Am 26.06.25 um 16:40 schrieb Fiona Ebner:
>> +=item C<$scfg>
>> +
>> +The hash reference with the storage configuration.
>> +
>> +=item C<$storeid>
>> +
>> +The storage ID.
>> +
>> +=item C<$volume>
>> +
>> +The volume name.
>
> bit much POD noise for my taste, couldn't the item's be at least placed
> on a single line? E.g.:
>
> =item C<$volume>: The volume name.
> =item ...
Sure, fine by me, but blank lines between two =item elements seem to be
necessary.
>> +
>> +=back
>> +
>> +=cut
>> +
>> +sub qemu_blockdev_options {
>> + my ($class, $scfg, $storeid, $volname) = @_;
>
> might make sense to pass some versioning information here, as otherwise this
> is a bit to tight coupling for my taste and might cause long term maintenance
> headache.
>
> It might be potentially enough to have the QEMU version and machine version
> passed here, as then one can generate block dev options that can be understood
> by qemu(-server) and are also compatible with the target machine version.
It would seem pretty strange to me that an option for accessing an image
would depend on the machine version. We already rely on having an image
accessed via different drivers/options to be fully equivalent for
migration (or we couldn't combine block mirror and migration). OTOH, the
machine version is the natural guard and there is less potential for a
plugin to break live migration by accidentally setting or changing an
option it shouldn't have. And if a plugin depends on a specific binary
version for a feature, it can just also guard with the machine version
corresponding to that binary version. So I'd just pass along the machine
version and not the binary version.
> And are we sure that we want to allow passing arbitrary options here?
> Could we at least generate some simple schema automatically from the QAPI or
> the like? Could be also a specially prepared JSON file just for this purpose
> shipped by our qemu package, or tracked separately so that we can track the
> version in which a option first appeared or became obsolete.
Sorry, I don't understand what you mean here.
Verify the block device options in the return value that the plugin
implementation has given us? I don't think verifying would give us much,
as QEMU will already complain the same way.
Plugins should just use the most basic options to access/open the image.
Other options will be set by qemu-server. The POD mentions this, but
maybe it should be emphasized more?
>> +
>> + my $blockdev = {};
>> +
>> + my ($path) = $class->filesystem_path($scfg, $volname);
>> +
>> + if ($path =~ m|^/|) {
>> + # The 'file' driver only works for regular files. The check below is taken from
>> + # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
>> + # devices here, those are not managed by the storage layer.
>> + my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
>> + my $driver = (S_ISCHR($st->mode) || S_ISBLK($st->mode)) ? 'host_device' : 'file';
>> + $blockdev = { driver => $driver, filename => $path };
>> + } else {
>> + die "storage plugin doesn't implement qemu_blockdev_options() method\n";
>> + }
>
> Should we rather default to an empty set of extra options? At least for external
> plugins that would be the safer choice for upgrading, might not always work but
> as is users can only loose FWICT?
What extra options do you mean? The default implementation here only
sets driver and filename which is the most minimal possible.
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-07-01 11:01 ` Fiona Ebner
@ 2025-07-01 11:09 ` Fabian Grünbichler
2025-07-02 8:27 ` Thomas Lamprecht
2025-07-01 11:52 ` Fiona Ebner
2025-07-02 8:12 ` Thomas Lamprecht
2 siblings, 1 reply; 24+ messages in thread
From: Fabian Grünbichler @ 2025-07-01 11:09 UTC (permalink / raw)
To: Proxmox VE development discussion, Fiona Ebner, Thomas Lamprecht
> Fiona Ebner <f.ebner@proxmox.com> hat am 01.07.2025 13:01 CEST geschrieben:
>
>
> Am 01.07.25 um 11:28 schrieb Thomas Lamprecht:
> > Am 26.06.25 um 16:40 schrieb Fiona Ebner:
> >> +
> >> +=back
> >> +
> >> +=cut
> >> +
> >> +sub qemu_blockdev_options {
> >> + my ($class, $scfg, $storeid, $volname) = @_;
> >
> > might make sense to pass some versioning information here, as otherwise this
> > is a bit to tight coupling for my taste and might cause long term maintenance
> > headache.
we discussed this in the past - this coupling is already there (see below), and
either qemu-server needs to hard-code storage things (which excludes third party
plugins), or pve-storage needs to encode some qemu things (the option we chose,
since there was already precedent and QEMU does provide a somewhat stable
interface there).
> > It might be potentially enough to have the QEMU version and machine version
> > passed here, as then one can generate block dev options that can be understood
> > by qemu(-server) and are also compatible with the target machine version.
>
> It would seem pretty strange to me that an option for accessing an image
> would depend on the machine version. We already rely on having an image
> accessed via different drivers/options to be fully equivalent for
> migration (or we couldn't combine block mirror and migration). OTOH, the
> machine version is the natural guard and there is less potential for a
> plugin to break live migration by accidentally setting or changing an
> option it shouldn't have. And if a plugin depends on a specific binary
> version for a feature, it can just also guard with the machine version
> corresponding to that binary version. So I'd just pass along the machine
> version and not the binary version.
>
> > And are we sure that we want to allow passing arbitrary options here?
> > Could we at least generate some simple schema automatically from the QAPI or
> > the like? Could be also a specially prepared JSON file just for this purpose
> > shipped by our qemu package, or tracked separately so that we can track the
> > version in which a option first appeared or became obsolete.
this is also how plugins already work:
- path returns arbitrary drive options (if it's not an actual file path)
- plugins are 100% trusted and executed in root context already, so there isn't
much we can safeguard against..
> Sorry, I don't understand what you mean here.
>
> Verify the block device options in the return value that the plugin
> implementation has given us? I don't think verifying would give us much,
> as QEMU will already complain the same way.
>
> Plugins should just use the most basic options to access/open the image.
> Other options will be set by qemu-server. The POD mentions this, but
> maybe it should be emphasized more?
> >> +
> >> + my $blockdev = {};
> >> +
> >> + my ($path) = $class->filesystem_path($scfg, $volname);
> >> +
> >> + if ($path =~ m|^/|) {
> >> + # The 'file' driver only works for regular files. The check below is taken from
> >> + # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
> >> + # devices here, those are not managed by the storage layer.
> >> + my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
> >> + my $driver = (S_ISCHR($st->mode) || S_ISBLK($st->mode)) ? 'host_device' : 'file';
> >> + $blockdev = { driver => $driver, filename => $path };
> >> + } else {
> >> + die "storage plugin doesn't implement qemu_blockdev_options() method\n";
> >> + }
> >
> > Should we rather default to an empty set of extra options? At least for external
> > plugins that would be the safer choice for upgrading, might not always work but
> > as is users can only loose FWICT?
>
> What extra options do you mean? The default implementation here only
> sets driver and filename which is the most minimal possible.
since this is not 100% obvious - this default implementation here is 100% backwards
compatible for plugins that previously returned a file or block device path. it's
just plugins that already return custom options that require their own implementation.
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-07-01 11:01 ` Fiona Ebner
2025-07-01 11:09 ` Fabian Grünbichler
@ 2025-07-01 11:52 ` Fiona Ebner
2025-07-02 8:12 ` Thomas Lamprecht
2 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-07-01 11:52 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
Am 01.07.25 um 13:01 schrieb Fiona Ebner:
> Am 01.07.25 um 11:28 schrieb Thomas Lamprecht:
>> Am 26.06.25 um 16:40 schrieb Fiona Ebner:
>>> +
>>> + my $blockdev = {};
>>> +
>>> + my ($path) = $class->filesystem_path($scfg, $volname);
>>> +
>>> + if ($path =~ m|^/|) {
>>> + # The 'file' driver only works for regular files. The check below is taken from
>>> + # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
>>> + # devices here, those are not managed by the storage layer.
>>> + my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
>>> + my $driver = (S_ISCHR($st->mode) || S_ISBLK($st->mode)) ? 'host_device' : 'file';
>>> + $blockdev = { driver => $driver, filename => $path };
>>> + } else {
>>> + die "storage plugin doesn't implement qemu_blockdev_options() method\n";
>>> + }
>>
>> Should we rather default to an empty set of extra options? At least for external
>> plugins that would be the safer choice for upgrading, might not always work but
>> as is users can only loose FWICT?
>
> What extra options do you mean? The default implementation here only
> sets driver and filename which is the most minimal possible.
Do you mean restricting what the individual plugins may return and
verify that in the Storage.pm's implementation of qemu_blockdev_options()?
E.g. a hash with entries for allowed drivers and allowed driver-specific
options like
$allowed = {
file => {
filename => 1,
},
host_device => {
filename => 1,
},
rbd => {
...
},
...
};
Then plugin authors that already require a custom implementation would
need to tell us what they need first and we'd need to allow it.
Is this somewhat what you meant?
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement method to get qemu blockdevice options
2025-06-30 11:20 ` Fabian Grünbichler
@ 2025-07-01 12:08 ` Fiona Ebner
0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-07-01 12:08 UTC (permalink / raw)
To: Proxmox VE development discussion, Fabian Grünbichler
Am 30.06.25 um 13:20 schrieb Fabian Grünbichler:
> On June 26, 2025 4:40 pm, Fiona Ebner wrote:
>> ZFS does not have a filesystem_path() method, so the default
>> implementation for qemu_blockdev_options() cannot be re-used. This is
>> most likely, because snapshots are currently not directly accessible
>> via a filesystem path in the Proxmox VE storage layer.
>>
>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> ---
>>
>> No changes in v3.
>>
>> src/PVE/Storage/ZFSPoolPlugin.pm | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/src/PVE/Storage/ZFSPoolPlugin.pm b/src/PVE/Storage/ZFSPoolPlugin.pm
>> index 713d26f..4479fe4 100644
>> --- a/src/PVE/Storage/ZFSPoolPlugin.pm
>> +++ b/src/PVE/Storage/ZFSPoolPlugin.pm
>> @@ -162,6 +162,16 @@ sub path {
>> return ($path, $vmid, $vtype);
>> }
>>
>> +sub qemu_blockdev_options {
>> + my ($class, $scfg, $storeid, $volname) = @_;
>> +
>> + my ($path) = $class->path($scfg, $volname, $storeid);
>> +
>> + my $blockdev = { driver => 'host_device', filename => $path };
>> +
>> + return $blockdev;
>
> should we assert that $volname is a zvol?
Will add it in v2.
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options
2025-06-30 11:19 ` Fabian Grünbichler
@ 2025-07-01 12:15 ` Fiona Ebner
0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-07-01 12:15 UTC (permalink / raw)
To: Proxmox VE development discussion, Fabian Grünbichler
Am 30.06.25 um 13:19 schrieb Fabian Grünbichler:
> On June 26, 2025 4:40 pm, Fiona Ebner wrote:
>> The mon host parsing is adapted from other places. While there
>> currently is no support for vector notation in the storage config
>> (it's a pve-storage-portal-dns-list option), it doesn't hurt to
>> anticipate it, should the list of mon hosts come from a ceph.conf
>> instead anytime in the future.
>
> but in `path` we don't do this? shouldn't this be consistent?
Okay, I'll drop it in v4.
> also, we already have a full helper for getting a mon list from a ceph
> config, so if we ever use that as source here, shouldn't we use that
> helper?
Would need to be adapted a bit (to avoid joining+splitting), but sure.
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option
2025-06-30 11:40 ` Fabian Grünbichler
@ 2025-07-01 12:23 ` Fiona Ebner
0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-07-01 12:23 UTC (permalink / raw)
To: Proxmox VE development discussion, Fabian Grünbichler
Am 30.06.25 um 13:40 schrieb Fabian Grünbichler:
> On June 26, 2025 4:40 pm, Fiona Ebner wrote:
>> This is mostly in preparation for external qcow2 snapshot support.
>>
>> For internal qcow2 snapshots, which currently are the only supported
>> variant, it is not possible to attach the snapshot only. If access to
>> that is required it will need to be handled differently, e.g. via a
>> FUSE/NBD export.
>
> just for the avoidance of doubt since it's only implied and not spelled
> out explicitly - we don't do any such accesses at the moment, right?
>
> linked clones cannot be done for running VMs, full clones of snapshots
> don't happen within the running VM but via qemu-img convert, and nothing
> else besides replication (which is ZFS specific) and manual `pvesm
> export` calls (which are on the storage level by definition) directly
> reads from a storage snapshot?
Yes, we do not currently have such accesses. Neither the commandline
generation (and thus also not drive hotplug) nor drive-mirror in
qemu-server has support for specifying a snapshot. Yes, we only access
internal qcow2 snapshots via qemu-img. This is just in anticipation for
Alexandre's external snapshot series.
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-07-01 11:01 ` Fiona Ebner
2025-07-01 11:09 ` Fabian Grünbichler
2025-07-01 11:52 ` Fiona Ebner
@ 2025-07-02 8:12 ` Thomas Lamprecht
2025-07-02 8:32 ` Fiona Ebner
2 siblings, 1 reply; 24+ messages in thread
From: Thomas Lamprecht @ 2025-07-02 8:12 UTC (permalink / raw)
To: Fiona Ebner, Proxmox VE development discussion
Am 01.07.25 um 13:01 schrieb Fiona Ebner:
> Am 01.07.25 um 11:28 schrieb Thomas Lamprecht:
>> It might be potentially enough to have the QEMU version and machine version
>> passed here, as then one can generate block dev options that can be understood
>> by qemu(-server) and are also compatible with the target machine version.
>
> It would seem pretty strange to me that an option for accessing an image
> would depend on the machine version.
How so? New QEMU learns a new option for accessing images that we want to
adopt but can't do so transparently because it's not backwards compatible?
> We already rely on having an image accessed via different drivers/options
> to be fully equivalent for migration (or we couldn't combine block mirror
> and migration).
Well, that's the current assumption that worked out mostly by luck and
because we kept to standard options, that doesn't have to stay that way
ant this interface would allow.
> OTOH, the machine version is the natural guard and there is less potential
> for a plugin to break live migration by accidentally setting or changing an
> option it shouldn't have. And if a plugin depends on a specific binary
> version for a feature, it can just also guard with the machine version
> corresponding to that binary version. So I'd just pass along the machine
> version and not the binary version.
Yeah, just the machine version would be probably enough.
>> And are we sure that we want to allow passing arbitrary options here?
>> Could we at least generate some simple schema automatically from the QAPI or
>> the like? Could be also a specially prepared JSON file just for this purpose
>> shipped by our qemu package, or tracked separately so that we can track the
>> version in which a option first appeared or became obsolete.
>
> Sorry, I don't understand what you mean here.
You talk about this being QEMU block dev options, so are they or are they
mapped by us in qemu-server? If the former I'd need a more specific question,
with the latter this is still not ideal but doesn't matter much for now.
> Verify the block device options in the return value that the plugin
> implementation has given us? I don't think verifying would give us much,
> as QEMU will already complain the same way.
What QEMU understands and what we want to expose and allow might be very
different things.
> Plugins should just use the most basic options to access/open the image.
> Other options will be set by qemu-server. The POD mentions this, but
> maybe it should be emphasized more?
Currently, plugins cannot set options directly at all, with this series, they
can set anything we do not explicitly overwrite afterward in qemu-server IIUC.
Going back from that would need a major break of the storage API. And given
that we would like to minify the attack surface of QEMU/KVM in the future (not
having all mounts/blockdevs/..) accessible, I could easily imagine that we do
not want to expose all blockdev options just because they are there, starting
out with a smaller set would easily allow this, especially if it's available
for plugins to check if using an option is supported/allowed; that would
have little implementation cost while reducing friction in managing this
API surface thus reducing also maintenance cost for both us and those
maintaining plugins.
If unsure, and I really have not seen any actual justification presented why
such an API should be required, let's please start out with a more limiting API
first, as for APIs allowing more flexibility is always simpler than restricting
it later on.
>>> + my $blockdev = {};
>>> +
>>> + my ($path) = $class->filesystem_path($scfg, $volname);
>>> +
>>> + if ($path =~ m|^/|) {
>>> + # The 'file' driver only works for regular files. The check below is taken from
>>> + # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
>>> + # devices here, those are not managed by the storage layer.
>>> + my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
>>> + my $driver = (S_ISCHR($st->mode) || S_ISBLK($st->mode)) ? 'host_device' : 'file';
>>> + $blockdev = { driver => $driver, filename => $path };
>>> + } else {
>>> + die "storage plugin doesn't implement qemu_blockdev_options() method\n";
>>> + }
>> Should we rather default to an empty set of extra options? At least for external
>> plugins that would be the safer choice for upgrading, might not always work but
>> as is users can only loose FWICT?
> What extra options do you mean? The default implementation here only
> sets driver and filename which is the most minimal possible.
>
The default implementation die's if it's not a path based.. So any existing
external plugin that isn't path based will break with this; I said it a lot
of times already, I do not want unnecessary breaking changes for the storage
API..
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-07-01 11:09 ` Fabian Grünbichler
@ 2025-07-02 8:27 ` Thomas Lamprecht
2025-07-02 8:40 ` Fabian Grünbichler
0 siblings, 1 reply; 24+ messages in thread
From: Thomas Lamprecht @ 2025-07-02 8:27 UTC (permalink / raw)
To: Fabian Grünbichler, Proxmox VE development discussion, Fiona Ebner
Am 01.07.25 um 13:09 schrieb Fabian Grünbichler:
>> Am 01.07.25 um 11:28 schrieb Thomas Lamprecht:
>>> might make sense to pass some versioning information here, as otherwise this
>>> is a bit to tight coupling for my taste and might cause long term maintenance
>>> headache.
>
> we discussed this in the past - this coupling is already there (see below), and
> either qemu-server needs to hard-code storage things (which excludes third party
> plugins), or pve-storage needs to encode some qemu things (the option we chose,
> since there was already precedent and QEMU does provide a somewhat stable
> interface there).
> this is also how plugins already work:
> - path returns arbitrary drive options (if it's not an actual file path)
Which means my assumption that external plugins could not pass options
in my other reply was wrong, so we would at least need to grandfather
limitations in (like checking if api-version > current-version), not ideal
but would be a way forward.
> - plugins are 100% trusted and executed in root context already, so there isn't
> much we can safeguard against..
So, we do lots of work here and got the chance to quite probably make
the future target of having a more contained plugin and QEMU a little
bit easier just to argue moving laterally because the status quo is
like that? I'd care much if this would be internal only, but as it's a
versioned public API that I do not want to break if not necessary I'd like
to have this explored more, not just (to exaggerate, I know you both
are quite thorough) hand waved away.
I mean, OTOH, once actual works towards plugin/VM containment starts we
probably want to have a feature for that storage plugins need to signal
support for, maybe combined with some other mechanisms, and that again
would be grandfathered in for one major version (and/or configurable by
the user) so we probably can still move towards more limitations also here
in the future without a hard break. So, the option limitations don't need
to necessarily happen now, but I'd really add the machine version now already,
as that should at least allows plugin authors to better navigate the coupling
we got here.
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-07-02 8:12 ` Thomas Lamprecht
@ 2025-07-02 8:32 ` Fiona Ebner
0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2025-07-02 8:32 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
Am 02.07.25 um 10:12 schrieb Thomas Lamprecht:
> Am 01.07.25 um 13:01 schrieb Fiona Ebner:
>> Am 01.07.25 um 11:28 schrieb Thomas Lamprecht:
>
>>> It might be potentially enough to have the QEMU version and machine version
>>> passed here, as then one can generate block dev options that can be understood
>>> by qemu(-server) and are also compatible with the target machine version.
>>
>> It would seem pretty strange to me that an option for accessing an image
>> would depend on the machine version.
>
> How so? New QEMU learns a new option for accessing images that we want to
> adopt but can't do so transparently because it's not backwards compatible?
It's much more likely that that depends only on the binary version, not
on the machine version. Again, I'm talking about "accessing an image",
which should not be any concern to how the virtual hardware looks like.
I'm not talking about options for front-end drive devices here.
>> We already rely on having an image accessed via different drivers/options
>> to be fully equivalent for migration (or we couldn't combine block mirror
>> and migration).
>
> Well, that's the current assumption that worked out mostly by luck and
> because we kept to standard options, that doesn't have to stay that way
> ant this interface would allow.
No, that's not luck. Again, you could not use block mirror in
combination with migration otherwise, because that necessarily changes
the command line. But the command line for the back-end block device
should not have any influence on the migration stream. Only the
front-end drive device matters.
>>> And are we sure that we want to allow passing arbitrary options here?
>>> Could we at least generate some simple schema automatically from the QAPI or
>>> the like? Could be also a specially prepared JSON file just for this purpose
>>> shipped by our qemu package, or tracked separately so that we can track the
>>> version in which a option first appeared or became obsolete.
>>
>> Sorry, I don't understand what you mean here.
>
> You talk about this being QEMU block dev options, so are they or are they
> mapped by us in qemu-server? If the former I'd need a more specific question,
> with the latter this is still not ideal but doesn't matter much for now.
No, they are not mapped by us.
>> Verify the block device options in the return value that the plugin
>> implementation has given us? I don't think verifying would give us much,
>> as QEMU will already complain the same way.
>
> What QEMU understands and what we want to expose and allow might be very
> different things.
>
>> Plugins should just use the most basic options to access/open the image.
>> Other options will be set by qemu-server. The POD mentions this, but
>> maybe it should be emphasized more?
>
> Currently, plugins cannot set options directly at all, with this series, they
> can set anything we do not explicitly overwrite afterward in qemu-server IIUC.
> Going back from that would need a major break of the storage API. And given
> that we would like to minify the attack surface of QEMU/KVM in the future (not
> having all mounts/blockdevs/..) accessible, I could easily imagine that we do
> not want to expose all blockdev options just because they are there, starting
> out with a smaller set would easily allow this, especially if it's available
> for plugins to check if using an option is supported/allowed; that would
> have little implementation cost while reducing friction in managing this
> API surface thus reducing also maintenance cost for both us and those
> maintaining plugins.
>
> If unsure, and I really have not seen any actual justification presented why
> such an API should be required, let's please start out with a more limiting API
> first, as for APIs allowing more flexibility is always simpler than restricting
> it later on.
Okay, see also my other reply:
https://lore.proxmox.com/pve-devel/00842c28-b265-434e-9b42-2874c6195399@proxmox.com/
>>>> + my $blockdev = {};
>>>> +
>>>> + my ($path) = $class->filesystem_path($scfg, $volname);
>>>> +
>>>> + if ($path =~ m|^/|) {
>>>> + # The 'file' driver only works for regular files. The check below is taken from
>>>> + # block/file-posix.c:hdev_probe_device() in QEMU. Do not bother with detecting 'host_cdrom'
>>>> + # devices here, those are not managed by the storage layer.
>>>> + my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
>>>> + my $driver = (S_ISCHR($st->mode) || S_ISBLK($st->mode)) ? 'host_device' : 'file';
>>>> + $blockdev = { driver => $driver, filename => $path };
>>>> + } else {
>>>> + die "storage plugin doesn't implement qemu_blockdev_options() method\n";
>>>> + }
>>> Should we rather default to an empty set of extra options? At least for external
>>> plugins that would be the safer choice for upgrading, might not always work but
>>> as is users can only loose FWICT?
>> What extra options do you mean? The default implementation here only
>> sets driver and filename which is the most minimal possible.
>>
>
> The default implementation die's if it's not a path based.. So any existing
> external plugin that isn't path based will break with this; I said it a lot
> of times already, I do not want unnecessary breaking changes for the storage
> API..
Okay, we can add conversion of common protocol paths too. This is
partially at odds with restricting allowed options, because what if a
plugin sets an option in the protocol path, but we don't want to allow
that option?
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [pve-devel] [PATCH v3 storage 1/9] plugin: add method to get qemu blockdevice options for volume
2025-07-02 8:27 ` Thomas Lamprecht
@ 2025-07-02 8:40 ` Fabian Grünbichler
0 siblings, 0 replies; 24+ messages in thread
From: Fabian Grünbichler @ 2025-07-02 8:40 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion, Fiona Ebner
> Thomas Lamprecht <t.lamprecht@proxmox.com> hat am 02.07.2025 10:27 CEST geschrieben:
>
>
> Am 01.07.25 um 13:09 schrieb Fabian Grünbichler:
> >> Am 01.07.25 um 11:28 schrieb Thomas Lamprecht:
> >>> might make sense to pass some versioning information here, as otherwise this
> >>> is a bit to tight coupling for my taste and might cause long term maintenance
> >>> headache.
> >
> > we discussed this in the past - this coupling is already there (see below), and
> > either qemu-server needs to hard-code storage things (which excludes third party
> > plugins), or pve-storage needs to encode some qemu things (the option we chose,
> > since there was already precedent and QEMU does provide a somewhat stable
> > interface there).
>
> > this is also how plugins already work:
> > - path returns arbitrary drive options (if it's not an actual file path)
>
> Which means my assumption that external plugins could not pass options
> in my other reply was wrong, so we would at least need to grandfather
> limitations in (like checking if api-version > current-version), not ideal
> but would be a way forward.
yes. we initially also evaluated trying to avoid all breakage here and
mapping back a returned `path` string to the new block dev options syntax,
but given that we don't/can't know what external plugins are doing, that is
probably also error-prone..
> > - plugins are 100% trusted and executed in root context already, so there isn't
> > much we can safeguard against..
> So, we do lots of work here and got the chance to quite probably make
> the future target of having a more contained plugin and QEMU a little
> bit easier just to argue moving laterally because the status quo is
> like that? I'd care much if this would be internal only, but as it's a
> versioned public API that I do not want to break if not necessary I'd like
> to have this explored more, not just (to exaggerate, I know you both
> are quite thorough) hand waved away.
no, I just wanted to avoid the impression that this opens up some kind of
big hole here. and I think even if we make progress on locking down the
storage layer, we'd still need to delegate part of that work to the plugin
or admin in case of external plugins, so plugins will remain "trusted" in
that sense probably forever, even if their code during execution is put
into some kind of sandbox to reduce the blast radius of bugs.
> I mean, OTOH, once actual works towards plugin/VM containment starts we
> probably want to have a feature for that storage plugins need to signal
> support for, maybe combined with some other mechanisms, and that again
> would be grandfathered in for one major version (and/or configurable by
> the user) so we probably can still move towards more limitations also here
> in the future without a hard break. So, the option limitations don't need
> to necessarily happen now, but I'd really add the machine version now already,
> as that should at least allows plugin authors to better navigate the coupling
> we got here.
yes, adding the machine version so that plugin code can make informed decisions
seems sensible for future-proofing.
and if we can make that work, adding a schema in pve-storage to limit what a
plugin can return can also make sense, but I am not sure how much work that
entails in practice. there is some additional potential for breakage if we don't
account for all things external plugins might use (or if we want to forbid certain
things, but then it would be intentional breakage from our side ;)).
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2025-07-02 8:40 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 1/9] plugin: add " Fiona Ebner
2025-07-01 9:28 ` Thomas Lamprecht
2025-07-01 11:01 ` Fiona Ebner
2025-07-01 11:09 ` Fabian Grünbichler
2025-07-02 8:27 ` Thomas Lamprecht
2025-07-02 8:40 ` Fabian Grünbichler
2025-07-01 11:52 ` Fiona Ebner
2025-07-02 8:12 ` Thomas Lamprecht
2025-07-02 8:32 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 2/9] iscsi direct plugin: implement method to get qemu blockdevice options Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 3/9] zfs iscsi plugin: implement new " Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement " Fiona Ebner
2025-06-30 11:20 ` Fabian Grünbichler
2025-07-01 12:08 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 5/9] ceph/rbd: set 'keyring' in ceph configuration for externally managed RBD storages Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options Fiona Ebner
2025-06-30 11:19 ` Fabian Grünbichler
2025-07-01 12:15 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 7/9] plugin: qemu block device: add hints option and EFI disk hint Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option Fiona Ebner
2025-06-30 11:40 ` Fabian Grünbichler
2025-07-01 12:23 ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 9/9] plugin api: bump api version and age Fiona Ebner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal