all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC v3 storage 7/9] plugin: qemu block device: add hints option and EFI disk hint
Date: Thu, 26 Jun 2025 16:40:21 +0200	[thread overview]
Message-ID: <20250626144644.279679-8-f.ebner@proxmox.com> (raw)
In-Reply-To: <20250626144644.279679-1-f.ebner@proxmox.com>

For '-drive', qemu-server sets special cache options for EFI disk
using RBD. In preparation to seamlessly switch to the new '-blockdev'
interface, do the same here. Note that the issue from bug #3329, which
is solved by these cache options, still affects current versions.

With -blockdev, the cache options are split up. While cache.direct and
cache.no-flush can be set in the -blockdev options, cache.writeback is
a front-end property and was intentionally removed from the -blockdev
options by QEMU commit aaa436f998 ("block: Remove cache.writeback from
blockdev-add"). It needs to be configured as the 'write-cache'
property for the ide-hd/scsi-hd/virtio-blk device.

The default is already 'writeback' and no cache mode can be set for an
EFI drive configuration in Proxmox VE currently, so there will not be
a clash.

┌─────────────┬─────────────────┬──────────────┬────────────────┐
│             │ cache.writeback │ cache.direct │ cache.no-flush │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│writeback    │ on              │ off          │ off            │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│none         │ on              │ on           │ off            │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│writethrough │ off             │ off          │ off            │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│directsync   │ off             │ on           │ off            │
├─────────────┼─────────────────┼──────────────┼────────────────┤
│unsafe       │ on              │ off          │ on             │
└─────────────┴─────────────────┴──────────────┴────────────────┘

Table from 'man kvm'.

Alternatively, the option could only be set once when allocating the
RBD volume. However, then we would need to detect all cases were a
volume could potentially be used as an EFI disk later. Having a custom
disk type would help a lot there. The approach here was chosen as it
is catch-all and should not be too costly either.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* Set the policy via 'rbd config image set'.

 src/PVE/Storage.pm                   |  4 ++--
 src/PVE/Storage/ISCSIDirectPlugin.pm |  2 +-
 src/PVE/Storage/Plugin.pm            | 27 +++++++++++++++++++++++++--
 src/PVE/Storage/RBDPlugin.pm         | 20 +++++++++++++++++++-
 src/PVE/Storage/ZFSPlugin.pm         |  2 +-
 src/PVE/Storage/ZFSPoolPlugin.pm     |  2 +-
 6 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index ec8b753..4920bd6 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -721,7 +721,7 @@ sub abs_filesystem_path {
 
 # see the documentation for the plugin method
 sub qemu_blockdev_options {
-    my ($cfg, $volid) = @_;
+    my ($cfg, $volid, $options) = @_;
 
     my ($storeid, $volname) = parse_volume_id($volid);
 
@@ -733,7 +733,7 @@ sub qemu_blockdev_options {
     die "cannot use volume of type '$vtype' as a QEMU blockdevice\n"
         if $vtype ne 'images' && $vtype ne 'iso' && $vtype ne 'import';
 
-    return $plugin->qemu_blockdev_options($scfg, $storeid, $volname);
+    return $plugin->qemu_blockdev_options($scfg, $storeid, $volname, $options);
 }
 
 # used as last resort to adapt volnames when migrating
diff --git a/src/PVE/Storage/ISCSIDirectPlugin.pm b/src/PVE/Storage/ISCSIDirectPlugin.pm
index 8c6b4ab..12b894d 100644
--- a/src/PVE/Storage/ISCSIDirectPlugin.pm
+++ b/src/PVE/Storage/ISCSIDirectPlugin.pm
@@ -111,7 +111,7 @@ sub path {
 }
 
 sub qemu_blockdev_options {
-    my ($class, $scfg, $storeid, $volname) = @_;
+    my ($class, $scfg, $storeid, $volname, $options) = @_;
 
     my $lun = ($class->parse_volname($volname))[1];
 
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 9951272..7a274a3 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -1965,7 +1965,7 @@ sub rename_volume {
 
 =head3 qemu_blockdev_options
 
-    $blockdev = $plugin->qemu_blockdev_options($scfg, $storeid, $volname)
+    $blockdev = $plugin->qemu_blockdev_options($scfg, $storeid, $volname, $options)
 
 Returns a hash reference with the basic options needed to open the volume via QEMU's C<-blockdev>
 API. This at least requires a C<< $blockdev->{driver} >> and a reference to the image, e.g.
@@ -1998,12 +1998,35 @@ The storage ID.
 
 The volume name.
 
+=item C<$options>
+
+A hash reference with additional options.
+
+=over
+
+=item C<< $options->{hints} >>
+
+A hash reference with hints indicating what the volume will be used for. This can be safely ignored
+if no concrete issues are known with your plugin. For certain use cases, setting additional
+(plugin-specific) options might be very beneficial however. An example is setting the correct cache
+options for an EFI disk on RBD. The list of hints might get expanded in the future.
+
+=over
+
+=item C<< $options->{hints}->{'efi-disk'} >>
+
+(optional) If set, the volume will be used as the EFI disk of a VM, containing its OMVF variables.
+
+=back
+
+=back
+
 =back
 
 =cut
 
 sub qemu_blockdev_options {
-    my ($class, $scfg, $storeid, $volname) = @_;
+    my ($class, $scfg, $storeid, $volname, $options) = @_;
 
     my $blockdev = {};
 
diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index 229b07d..6b37ba3 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -374,6 +374,16 @@ my sub rbd_volume_exists {
     return 0;
 }
 
+# Needs to be public, so qemu-server can mock it for cfg2cmd.
+sub rbd_volume_config_set {
+    my ($scfg, $storeid, $volname, $key, $value) = @_;
+
+    my $cmd = $rbd_cmd->($scfg, $storeid, 'config', 'image', 'set', $volname, $key, $value);
+    run_rbd_command($cmd, errmsg => "rbd config image set $volname $key $value error");
+
+    return;
+}
+
 # Configuration
 
 sub type {
@@ -514,7 +524,7 @@ sub path {
 }
 
 sub qemu_blockdev_options {
-    my ($class, $scfg, $storeid, $volname) = @_;
+    my ($class, $scfg, $storeid, $volname, $options) = @_;
 
     my $cmd_option = PVE::CephConfig::ceph_connect_option($scfg, $storeid);
     my ($name) = ($class->parse_volname($volname))[1];
@@ -549,6 +559,14 @@ sub qemu_blockdev_options {
 
     $blockdev->{user} = "$cmd_option->{userid}" if $cmd_option->{keyring};
 
+    # SPI flash does lots of read-modify-write OPs, without writeback this gets really slow #3329
+    if ($options->{hints}->{'efi-disk'}) {
+        # Querying the value would just cost more and the 'rbd image config get' command will just
+        # fail if the config has not been set yet, so it's not even straight-forward to do so.
+        # Simply set the value (possibly again).
+        rbd_volume_config_set($scfg, $storeid, $name, 'rbd_cache_policy', 'writeback');
+    }
+
     return $blockdev;
 }
 
diff --git a/src/PVE/Storage/ZFSPlugin.pm b/src/PVE/Storage/ZFSPlugin.pm
index c03fcca..0f64898 100644
--- a/src/PVE/Storage/ZFSPlugin.pm
+++ b/src/PVE/Storage/ZFSPlugin.pm
@@ -248,7 +248,7 @@ sub path {
 }
 
 sub qemu_blockdev_options {
-    my ($class, $scfg, $storeid, $volname) = @_;
+    my ($class, $scfg, $storeid, $volname, $options) = @_;
 
     my $name = ($class->parse_volname($volname))[1];
     my $guid = $class->zfs_get_lu_name($scfg, $name);
diff --git a/src/PVE/Storage/ZFSPoolPlugin.pm b/src/PVE/Storage/ZFSPoolPlugin.pm
index 4479fe4..677f88c 100644
--- a/src/PVE/Storage/ZFSPoolPlugin.pm
+++ b/src/PVE/Storage/ZFSPoolPlugin.pm
@@ -163,7 +163,7 @@ sub path {
 }
 
 sub qemu_blockdev_options {
-    my ($class, $scfg, $storeid, $volname) = @_;
+    my ($class, $scfg, $storeid, $volname, $options) = @_;
 
     my ($path) = $class->path($scfg, $volname, $storeid);
 
-- 
2.47.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

  parent reply	other threads:[~2025-06-26 14:47 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-26 14:40 [pve-devel] [PATCH-SERIES v3 storage 0/9] storage plugin method to get qemu blockdevice options for volume Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 1/9] plugin: add " Fiona Ebner
2025-07-01  9:28   ` Thomas Lamprecht
2025-07-01 11:01     ` Fiona Ebner
2025-07-01 11:09       ` Fabian Grünbichler
2025-07-02  8:27         ` Thomas Lamprecht
2025-07-02  8:40           ` Fabian Grünbichler
2025-07-01 11:52       ` Fiona Ebner
2025-07-02  8:12       ` Thomas Lamprecht
2025-07-02  8:32         ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 2/9] iscsi direct plugin: implement method to get qemu blockdevice options Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 3/9] zfs iscsi plugin: implement new " Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 4/9] zfs pool plugin: implement " Fiona Ebner
2025-06-30 11:20   ` Fabian Grünbichler
2025-07-01 12:08     ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 5/9] ceph/rbd: set 'keyring' in ceph configuration for externally managed RBD storages Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 6/9] rbd plugin: implement new method to get qemu blockdevice options Fiona Ebner
2025-06-30 11:19   ` Fabian Grünbichler
2025-07-01 12:15     ` Fiona Ebner
2025-06-26 14:40 ` Fiona Ebner [this message]
2025-06-26 14:40 ` [pve-devel] [RFC v3 storage 8/9] plugin: qemu block device: add support for snapshot option Fiona Ebner
2025-06-30 11:40   ` Fabian Grünbichler
2025-07-01 12:23     ` Fiona Ebner
2025-06-26 14:40 ` [pve-devel] [PATCH v3 storage 9/9] plugin api: bump api version and age Fiona Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250626144644.279679-8-f.ebner@proxmox.com \
    --to=f.ebner@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal