* [pve-devel] [PATCH storage 1/5] plugin: introduce hints for activating and mapping volumes
2025-10-31 10:36 [pve-devel] [PATCH qemu-server/storage 0/6] fix #5779: introduce guest hints to pass rxbounce flag to KRBD Friedrich Weber
@ 2025-10-31 10:36 ` Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH storage 2/5] plugin api: bump api version and age Friedrich Weber
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Friedrich Weber @ 2025-10-31 10:36 UTC (permalink / raw)
To: pve-devel
Currently, when a storage plugin activates or maps a guest volume, it
has no information about the respective guest. This is by design to
reduce coupling between the storage layer and the upper layers.
However, in some cases, storage plugins may need to activate volumes
differently based on certain features of the guest. An example is the
RBD plugin with KRBD enabled, where guest volumes of Windows VMs have
to be mapped with the rxbounce option.
Introduce "hints" as a mechanism that allows the upper layers to pass
down well-defined information to the storage plugins on volume
activation/mapping. The storage plugin can make adjustments to its
volume activation/mapping based on the hints. The supported hints are
specified by a JSON schema and may be extended in the future.
Add a subroutine that checks whether a particular hint is supported
(may be used by the storage plugin as well as upper layers). This
allows to add further hints without having to bump pve-storage, since
upper layers can just check whether the current pve-storage supports a
particular hint.
The Boolean 'guest-is-windows' hint denotes that the
to-be-activated/mapped volume belongs to a Windows VM.
It is not guaranteed that the volume is inactive when
{activate,map}_volume are called, and it is not guaranteed that hints
are passed on every storage activation. Hence, it can happen that a
volume is already active but applying the hint would require unmapping
the volume and mapping it again with the hint applied (this is the
case for rxbounce). To cover such cases, the Boolean hint
'plugin-may-deactivate-volume' denotes whether unmapping the volume is
currently safe. Only if this hint is true, the plugin may deactivate
the volume and map it again with the hint applied.
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
src/PVE/Storage/Plugin.pm | 43 +++++++++++++++++++++++++++++++++++++--
1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 8acd214..4fe903f 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -59,6 +59,25 @@ cfs_register_file(
sub { __PACKAGE__->write_config(@_); },
);
+our $hints_properties = {
+ 'guest-is-windows' => {
+ type => 'boolean',
+ optional => 1,
+ description => "true if the volume belongs to a Windows VM guest",
+ },
+ 'plugin-may-deactivate-volume' => {
+ type => 'boolean',
+ optional => 1,
+ description => "true if the plugin may deactivate the volume in order to apply a hint",
+ },
+};
+
+our $hints_format = {
+ type => 'object',
+ additionalProperties => 0,
+ properties => $hints_properties,
+};
+
my %prune_option = (
optional => 1,
type => 'integer',
@@ -614,6 +633,26 @@ sub preallocation_cmd_opt {
return;
}
+sub is_hint_supported {
+ my ($hint) = @_;
+
+ return defined($hints_properties->{$hint});
+}
+
+sub verify_hints {
+ my ($hints, $noerr) = @_;
+
+ return if !defined($hints);
+
+ eval { PVE::JSONSchema::validate($hints, $hints_format); };
+ my $err = $@;
+
+ return $hints if !$err;
+ return if $noerr;
+
+ die "internal error - hints are not valid: $@";
+}
+
# Storage implementation
=head3 get_formats
@@ -1871,7 +1910,7 @@ sub deactivate_storage {
}
sub map_volume {
- my ($class, $storeid, $scfg, $volname, $snapname) = @_;
+ my ($class, $storeid, $scfg, $volname, $snapname, $hints) = @_;
my ($path) = $class->path($scfg, $volname, $storeid, $snapname);
return $path;
@@ -1884,7 +1923,7 @@ sub unmap_volume {
}
sub activate_volume {
- my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
+ my ($class, $storeid, $scfg, $volname, $snapname, $cache, $hints) = @_;
my $path = $class->filesystem_path($scfg, $volname, $snapname);
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 7+ messages in thread* [pve-devel] [PATCH storage 2/5] plugin api: bump api version and age
2025-10-31 10:36 [pve-devel] [PATCH qemu-server/storage 0/6] fix #5779: introduce guest hints to pass rxbounce flag to KRBD Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH storage 1/5] plugin: introduce hints for activating and mapping volumes Friedrich Weber
@ 2025-10-31 10:36 ` Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH storage 3/5] storage: activate/map volumes: verify and pass hints to plugin Friedrich Weber
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Friedrich Weber @ 2025-10-31 10:36 UTC (permalink / raw)
To: pve-devel
Introduce $hints parameter to activate_volume() and map_volume().
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
ApiChangeLog | 42 ++++++++++++++++++++++++++++++++++++++++++
src/PVE/Storage.pm | 4 ++--
2 files changed, 44 insertions(+), 2 deletions(-)
diff --git a/ApiChangeLog b/ApiChangeLog
index d80bfb3..508f85e 100644
--- a/ApiChangeLog
+++ b/ApiChangeLog
@@ -6,6 +6,48 @@ without breaking anything unaware of it.)
Future changes should be documented in here.
+## Version 13:
+
+* Add new parameter $hints to the `activate_volume()` and `map_volume()` plugin methods
+
+ Old signature of `activate_volume()`:
+
+ sub($plugin, $storeid, $scfg, $volname, $snapname, $cache)
+
+ New signature of `activate_volume()`:
+
+ sub($plugin, $storeid, $scfg, $volname, $snapname, $cache, $hints)
+
+ Old signature of `map_volume()`:
+
+ sub($plugin, $storeid, $scfg, $volname, $snapname)
+
+ New signature of `map_volume()`:
+
+ sub($plugin, $storeid, $scfg, $volname, $snapname, $hints)
+
+ Hints are a mechanism that allows the upper layers to optionally pass down well-defined
+ information to the storage plugins on volume activation/mapping. The storage plugin can make
+ adjustments to its volume activation/mapping based on the hints (we call this "applying" the
+ hint).
+
+ If the new parameter $hints is defined, it is a hash reference that adheres to the schema
+ `PVE::Storage::Plugin::$hints_format` (verified by caller). The supported hints are specified in
+ `PVE::Storage::Plugin::$hints_properties` and may be extended in the future. It is not guaranteed
+ that the volume is inactive when `activate_volume()` or `map_volume()` are called, and it is not
+ guaranteed that hints are passed on every storage activation. Hints may be passed at additional
+ storage activation call sites in the future without an API version bump.
+
+ It can happen a volume is already active but applying the hint would require unmapping the volume
+ and mapping it again with the hint applied. To cover such cases, the Boolean hint
+ 'plugin-may-deactivate-volume' denotes whether unmapping the volume is currently safe. Only if
+ this hint is true, the plugin may deactivate the volume and map it again with the hint applied.
+
+ Hints may be added in the future without an API version bump. Plugins may call
+ `PVE::Storage::Plugin::is_hint_supported()` to check whether a specific hint is supported.
+
+ Plugins that do not require hints can safely ignore the additional parameter.
+
## Version 12:
* Introduce `qemu_blockdev_options()` plugin method
diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index 1dde2b7..ca0bf0e 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -41,11 +41,11 @@ use PVE::Storage::BTRFSPlugin;
use PVE::Storage::ESXiPlugin;
# Storage API version. Increment it on changes in storage API interface.
-use constant APIVER => 12;
+use constant APIVER => 13;
# Age is the number of versions we're backward compatible with.
# This is like having 'current=APIVER' and age='APIAGE' in libtool,
# see https://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html
-use constant APIAGE => 3;
+use constant APIAGE => 4;
our $KNOWN_EXPORT_FORMATS = ['raw+size', 'tar+size', 'qcow2+size', 'vmdk+size', 'zfs', 'btrfs'];
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 7+ messages in thread* [pve-devel] [PATCH storage 3/5] storage: activate/map volumes: verify and pass hints to plugin
2025-10-31 10:36 [pve-devel] [PATCH qemu-server/storage 0/6] fix #5779: introduce guest hints to pass rxbounce flag to KRBD Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH storage 1/5] plugin: introduce hints for activating and mapping volumes Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH storage 2/5] plugin api: bump api version and age Friedrich Weber
@ 2025-10-31 10:36 ` Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH storage 4/5] plugin: rbd: factor out subroutine to obtain RBD ID Friedrich Weber
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Friedrich Weber @ 2025-10-31 10:36 UTC (permalink / raw)
To: pve-devel
Plugin methods {activate,map}_volume accept an optional hints
parameter. Make PVE::Storage::{activate,map}_volumes also accept
hints, verify they are consistent with the schema, and pass them down
to the plugin when activating the volumes.
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
src/PVE/Storage.pm | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index ca0bf0e..2371443 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -1122,7 +1122,9 @@ sub vdisk_create_base {
}
sub map_volume {
- my ($cfg, $volid, $snapname) = @_;
+ my ($cfg, $volid, $snapname, $hints) = @_;
+
+ PVE::Storage::Plugin::verify_hints($hints);
my ($storeid, $volname) = parse_volume_id($volid);
@@ -1130,7 +1132,7 @@ sub map_volume {
my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
- return $plugin->map_volume($storeid, $scfg, $volname, $snapname);
+ return $plugin->map_volume($storeid, $scfg, $volname, $snapname, $hints);
}
sub unmap_volume {
@@ -1386,10 +1388,12 @@ sub deactivate_storage {
}
sub activate_volumes {
- my ($cfg, $vollist, $snapname) = @_;
+ my ($cfg, $vollist, $snapname, $hints) = @_;
return if !($vollist && scalar(@$vollist));
+ PVE::Storage::Plugin::verify_hints($hints);
+
my $storagehash = {};
foreach my $volid (@$vollist) {
my ($storeid, undef) = parse_volume_id($volid);
@@ -1404,7 +1408,7 @@ sub activate_volumes {
my ($storeid, $volname) = parse_volume_id($volid);
my $scfg = storage_config($cfg, $storeid);
my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
- $plugin->activate_volume($storeid, $scfg, $volname, $snapname, $cache);
+ $plugin->activate_volume($storeid, $scfg, $volname, $snapname, $cache, $hints);
}
}
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 7+ messages in thread* [pve-devel] [PATCH storage 4/5] plugin: rbd: factor out subroutine to obtain RBD ID
2025-10-31 10:36 [pve-devel] [PATCH qemu-server/storage 0/6] fix #5779: introduce guest hints to pass rxbounce flag to KRBD Friedrich Weber
` (2 preceding siblings ...)
2025-10-31 10:36 ` [pve-devel] [PATCH storage 3/5] storage: activate/map volumes: verify and pass hints to plugin Friedrich Weber
@ 2025-10-31 10:36 ` Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH storage 5/5] plugin: rbd: pass rxbounce when mapping Windows VM guest volumes Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH qemu-server 1/1] fix #5779: vm start: pass storage hints when activating " Friedrich Weber
5 siblings, 0 replies; 7+ messages in thread
From: Friedrich Weber @ 2025-10-31 10:36 UTC (permalink / raw)
To: pve-devel
This allows the subroutine to be reused.
No functional change intended.
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
src/PVE/Storage/RBDPlugin.pm | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index cf371c7..69957d2 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -49,6 +49,13 @@ my sub get_rbd_path {
return $path;
}
+my sub get_rbd_id {
+ my ($path) = @_;
+ my $real_dev = abs_path($path);
+ my ($rbd_id) = ($real_dev =~ m|/dev/rbd([0-9]+)$|);
+ return $rbd_id;
+}
+
my sub get_rbd_dev_path {
my ($scfg, $storeid, $volume) = @_;
@@ -76,8 +83,7 @@ my sub get_rbd_dev_path {
if (!-e $pve_path && -e $path) {
# possibly mapped before rbd-pve rule existed
- my $real_dev = abs_path($path);
- my ($rbd_id) = ($real_dev =~ m|/dev/rbd([0-9]+)$|);
+ my $rbd_id = get_rbd_id($path);
my $dev_cluster_id = file_read_firstline("/sys/devices/rbd/${rbd_id}/cluster_fsid");
return $path if $cluster_id eq $dev_cluster_id;
}
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 7+ messages in thread* [pve-devel] [PATCH storage 5/5] plugin: rbd: pass rxbounce when mapping Windows VM guest volumes
2025-10-31 10:36 [pve-devel] [PATCH qemu-server/storage 0/6] fix #5779: introduce guest hints to pass rxbounce flag to KRBD Friedrich Weber
` (3 preceding siblings ...)
2025-10-31 10:36 ` [pve-devel] [PATCH storage 4/5] plugin: rbd: factor out subroutine to obtain RBD ID Friedrich Weber
@ 2025-10-31 10:36 ` Friedrich Weber
2025-10-31 10:36 ` [pve-devel] [PATCH qemu-server 1/1] fix #5779: vm start: pass storage hints when activating " Friedrich Weber
5 siblings, 0 replies; 7+ messages in thread
From: Friedrich Weber @ 2025-10-31 10:36 UTC (permalink / raw)
To: pve-devel
When mapping a volume (e.g., because KRBD is enabled) and the hint
'guest-is-windows' is given and true, pass the rxbounce option. This
is to avoid "bad crc/signature" warnings in the journal, retransmits
and degraded performance, see [1]. If the volume is already mapped
without rxbounce (this can be determined from the map options exposed
in sysfs), and it should be mapped with rxbounce, and the
'plugin-may-deactivate-volume' hint denotes it is currently safe to
deactivate the volume, unmap the volume and re-map it with rxbounce.
If 'guest-is-windows' is not given or not true, and the volume is
already mapped, take no action. This also means that guest volumes
that are mapped with rxbounce, but do not have to be (because they do
not belong to a Windows guest), are not deactivated. This can be the
case if a user applied the workaround of adding rxbounce to
'rbd_default_map_options', since this applies to all volumes.
[1] https://bugzilla.proxmox.com/show_bug.cgi?id=5779
[2] https://forum.proxmox.com/threads/155741/post-710845
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
src/PVE/Storage/RBDPlugin.pm | 42 ++++++++++++++++++++++++++++++++----
1 file changed, 38 insertions(+), 4 deletions(-)
diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index 69957d2..7d3e7ab 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -90,6 +90,19 @@ my sub get_rbd_dev_path {
return $pve_path;
}
+my sub read_rbd_map_options {
+ my ($rbd_id) = @_;
+
+ my $sysfs_config_info = file_read_firstline("/sys/devices/rbd/${rbd_id}/config_info");
+ return if !defined($sysfs_config_info);
+
+ my $config_info = [split(/\s+/, $sysfs_config_info)];
+ my $options = @$config_info[1];
+ return if !defined($options);
+
+ return [split(/,/, $options)];
+}
+
my $rbd_cmd = sub {
my ($scfg, $storeid, $op, @options) = @_;
@@ -799,7 +812,7 @@ sub deactivate_storage {
}
sub map_volume {
- my ($class, $storeid, $scfg, $volname, $snapname) = @_;
+ my ($class, $storeid, $scfg, $volname, $snapname, $hints) = @_;
my ($vtype, $img_name, $vmid) = $class->parse_volname($volname);
@@ -808,12 +821,33 @@ sub map_volume {
my $kerneldev = get_rbd_dev_path($scfg, $storeid, $name);
+ my @extra_options = ();
+
+ if (defined($hints) && $hints->{'guest-is-windows'}) {
+ # make sure to pass rxbounce for Windows guest volumes to avoid degraded performance
+ @extra_options = ('--options', 'rxbounce');
+
+ # if already mapped without rxbounce and deactivation is safe, try to unmap
+ if (-b $kerneldev) {
+ my $mapped_options = read_rbd_map_options(get_rbd_id($kerneldev));
+
+ if ($mapped_options && scalar(grep { /rxbounce/ } @$mapped_options) == 0) {
+ if ($hints->{'plugin-may-deactivate-volume'}) {
+ eval { $class->unmap_volume($storeid, $scfg, $volname, $snapname); };
+ warn "could not unmap to apply rxbounce - $@\n" if $@;
+ } else {
+ warn "not unmapping volume $volname to apply rxbounce since it is not safe\n";
+ }
+ }
+ }
+ }
+
return $kerneldev if -b $kerneldev; # already mapped
# features can only be enabled/disabled for image, not for snapshot!
$krbd_feature_update->($scfg, $storeid, $img_name);
- my $cmd = $rbd_cmd->($scfg, $storeid, 'map', $name);
+ my $cmd = $rbd_cmd->($scfg, $storeid, 'map', $name, @extra_options);
run_rbd_command($cmd, errmsg => "can't map rbd volume $name");
return $kerneldev;
@@ -836,9 +870,9 @@ sub unmap_volume {
}
sub activate_volume {
- my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
+ my ($class, $storeid, $scfg, $volname, $snapname, $cache, $hints) = @_;
- $class->map_volume($storeid, $scfg, $volname, $snapname) if $scfg->{krbd};
+ $class->map_volume($storeid, $scfg, $volname, $snapname, $hints) if $scfg->{krbd};
return 1;
}
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 7+ messages in thread* [pve-devel] [PATCH qemu-server 1/1] fix #5779: vm start: pass storage hints when activating guest volumes
2025-10-31 10:36 [pve-devel] [PATCH qemu-server/storage 0/6] fix #5779: introduce guest hints to pass rxbounce flag to KRBD Friedrich Weber
` (4 preceding siblings ...)
2025-10-31 10:36 ` [pve-devel] [PATCH storage 5/5] plugin: rbd: pass rxbounce when mapping Windows VM guest volumes Friedrich Weber
@ 2025-10-31 10:36 ` Friedrich Weber
5 siblings, 0 replies; 7+ messages in thread
From: Friedrich Weber @ 2025-10-31 10:36 UTC (permalink / raw)
To: pve-devel
Currently supported hints are
- 'guest-is-windows' to denote a Windows guest,
- 'plugin-may-deactivate-volume' which denotes that the plugin may
deactivate and re-activate a volume to apply a hint.
Currently the only consumer of hints is the RBD plugin. In case a
Windows VM guest volume is mapped using KRBD, certain VM workloads may
cause degraded performance and e.g. "libceph: [...] bad crc/signature"
messages in the host journal. A workaround is to pass the rxbounce
option when mapping the volume [1]. If 'guest-is-windows' is set to 1,
the RBD plugin will map the guest volume with the rxbounce option.
[1] https://docs.ceph.com/en/squid/man/8/rbd/#kernel-rbd-krbd-options
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
src/PVE/QemuServer.pm | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index cf195ccc..17ddd39b 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -5363,6 +5363,21 @@ my sub remove_left_over_vmstate_opts {
PVE::QemuConfig->write_config($vmid, $conf) if $found;
}
+sub generate_storage_hints {
+ my ($conf, $plugin_may_deactivate_volume) = @_;
+
+ my $hints = {};
+
+ if (PVE::Storage::Plugin::is_hint_supported('guest-is-windows')) {
+ $hints->{'guest-is-windows'} = int(!!windows_version($conf->{ostype}));
+ }
+ if (PVE::Storage::Plugin::is_hint_supported('plugin-may-deactivate-volume')) {
+ $hints->{'plugin-may-deactivate-volume'} = $plugin_may_deactivate_volume || 0;
+ }
+
+ return $hints;
+}
+
# see vm_start_nolock for parameters, additionally:
# migrate_opts:
# storagemap = parsed storage map for allocating NBD disks
@@ -5530,7 +5545,9 @@ sub vm_start_nolock {
my $pci_reserve_list = [];
eval {
# With -blockdev, it is necessary to activate the volumes before generating the command line
- PVE::Storage::activate_volumes($storecfg, $vollist);
+ # Plugins can safely deactivate already-active volumes here if needed
+ my $storage_hints = generate_storage_hints($conf, 1);
+ PVE::Storage::activate_volumes($storecfg, $vollist, undef, $storage_hints);
# Note that for certain cases like templates, the configuration is minimized, so need to ensure
# the rest of the function here uses the same configuration that was used to build the command
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 7+ messages in thread