From mboxrd@z Thu Jan 1 00:00:00 1970
To: pve-devel@lists.proxmox.com
Date: Wed, 16 Jul 2025 18:34:15 +0200
In-Reply-To: <20250716163415.1837210-1-alexandre.derumier@groupe-cyllene.com>
References: <20250716163415.1837210-1-alexandre.derumier@groupe-cyllene.com>
MIME-Version: 1.0
List-Id: Proxmox VE development discussion
From: Alexandre Derumier via pve-devel
Precedence: list
Cc: Alexandre Derumier
Reply-To: Proxmox VE development discussion
Subject: [pve-devel] [PATCH v2 qemu-server 10/10] RFC: add multiple iothreads support
Content-Type: multipart/mixed; boundary="===============2470806514452719422=="
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel"

--===============2470806514452719422==
Content-Type: message/rfc822
Content-Disposition: inline

From: Alexandre Derumier
To: pve-devel@lists.proxmox.com
Subject: [PATCH v2 qemu-server 10/10] RFC: add multiple iothreads support
Date: Wed, 16 Jul 2025 18:34:15 +0200
Message-Id: <20250716163415.1837210-11-alexandre.derumier@groupe-cyllene.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20250716163415.1837210-1-alexandre.derumier@groupe-cyllene.com>
References: <20250716163415.1837210-1-alexandre.derumier@groupe-cyllene.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

This adds support for multiple iothreads per disk.

The iothreads are defined globally at VM start and are shared across
disks.

Iothreads are separate threads from the VM's vCPUs and can optionally be
pinned to host CPU cores (not yet implemented).

The number of iothreads cannot exceed the number of host cores.
(Otherwise, performance degrades because of context switches.)

Not implemented: an option to specify an explicit iothread mapping
per disk.

Signed-off-by: Alexandre Derumier
---
 src/PVE/QemuServer.pm               | 23 ++++++++++-
 src/PVE/QemuServer/DriveDevice.pm   | 59 ++++++++++++++++++++++++++++-
 src/test/cfg2cmd/iothreads.conf     |  7 ++++
 src/test/cfg2cmd/iothreads.conf.cmd | 45 ++++++++++++++++++++++
 4 files changed, 131 insertions(+), 3 deletions(-)
 create mode 100644 src/test/cfg2cmd/iothreads.conf
 create mode 100644 src/test/cfg2cmd/iothreads.conf.cmd

diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index 29fccffb..b9500f1f 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -73,7 +73,7 @@ use PVE::QemuServer::Drive qw(
     storage_allows_io_uring_default
 );
 use PVE::QemuServer::DriveDevice
-    qw(get_drivedevice_controller get_drivedevice get_drivedevice_iothread scsihw_infos);
+    qw(get_drivedevice_controller get_drivedevice get_drivedevice_iothread parse_iothreads scsihw_infos);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::MetaInfo;
@@ -314,6 +314,16 @@ my $confdesc = {
         maximum => 262144,
         default => 'cgroup v1: 1024, cgroup v2: 100',
     },
+    iothreads => {
+        optional => 1,
+        type => 'string',
+        verbose_description =>
+            "Create multiple iothreads and share them between disks where iothread is enabled. "
+            . "Iothreads can optionally be pinned to specific host CPUs (ideally on different "
+            . "host cores than the VM vCPUs).",
+        description => "Multiple iothreads configuration.",
+        format => $PVE::QemuServer::DriveDevice::iothreads_fmt,
+    },
     memory => {
         optional => 1,
         type => 'string',
@@ -3453,6 +3463,17 @@ sub config_to_command {
         push @$devices, '-iscsi', "initiator-name=$initiator";
     }
 
+    if ($conf->{iothreads}) {
+        my $iothreads = parse_iothreads($conf->{iothreads});
+        die "MAX $allowed_vcpus iothreads allowed per VM on this node\n"
+            if ($allowed_vcpus < $iothreads->{iothreads});
+
+        for (my $i = 0; $i < $iothreads->{iothreads}; $i++) {
+            my $iothread = { 'qom-type' => 'iothread', id => "iothread$i" };
+            push @$cmd, '-object', to_json($iothread, { canonical => 1 });
+        }
+    }
+
     PVE::QemuConfig->foreach_volume(
         $conf,
         sub {
diff --git a/src/PVE/QemuServer/DriveDevice.pm b/src/PVE/QemuServer/DriveDevice.pm
index 65555a58..e79bf128 100644
--- a/src/PVE/QemuServer/DriveDevice.pm
+++ b/src/PVE/QemuServer/DriveDevice.pm
@@ -5,6 +5,7 @@ use warnings;
 
 use URI::Escape;
 
+use PVE::JSONSchema qw(parse_property_string);
 use PVE::QemuServer::Drive qw (drive_is_cdrom);
 use PVE::QemuServer::Helpers qw(kvm_user_version min_version);
 use PVE::QemuServer::Machine;
@@ -16,9 +17,43 @@ our @EXPORT_OK = qw(
     get_drivedevice
     get_drivedevice_controller
     get_drivedevice_iothread
+    parse_iothreads
     scsihw_infos
 );
 
+our $iothreads_fmt = {
+    iothreads => {
+        optional => 1,
+        type => 'integer',
+        default_key => 1,
+        description => "Number of iothreads.",
+        minimum => 1,
+        default => 0,
+    },
+    affinity => {
+        type => 'string',
+        format => 'pve-cpuset',
+        format_description => "id[-id];...",
+        description => "List of host cores used to execute iothreads, for example: 0,5,8-11",
+        optional => 1,
+    },
+};
+
+sub parse_iothreads {
+    my ($data) = @_;
+
+    my $res = eval { parse_property_string($iothreads_fmt, $data) };
+    warn $@ if $@;
+    return $res;
+}
+
+sub print_iothreads {
+    my ($iothreads) = @_;
+    return PVE::JSONSchema::print_property_string($iothreads, $iothreads_fmt);
+}
+
+PVE::JSONSchema::register_format("pve-qm-iothreads", $iothreads_fmt);
+
 sub scsihw_infos {
     my ($conf, $drive) = @_;
 
@@ -41,6 +76,23 @@ sub scsihw_infos {
     return ($maxdev, $controller, $controller_prefix);
 }
 
+my sub add_iothread_mapping {
+    my ($conf, $device, $iothread_id) = @_;
+
+    if ($conf->{iothreads}) {
+        my $iothreads_mapping = [];
+        # if multiple iothreads are defined, share all of them by default
+        # TODO: add a manual mapping option for advanced users?
+        my $iothreads = parse_iothreads($conf->{iothreads});
+        for (my $i = 0; $i < $iothreads->{iothreads}; $i++) {
+            push @$iothreads_mapping, { iothread => "iothread$i" };
+        }
+        $device->{'iothread-vq-mapping'} = $iothreads_mapping;
+    } else {
+        $device->{iothread} = $iothread_id;
+    }
+}
+
 sub get_drivedevice {
     my ($storecfg, $conf, $vmid, $drive, $bridges, $arch, $machine_type) = @_;
 
@@ -65,7 +117,7 @@ sub get_drivedevice {
         if (!min_version($machine_version, 10, 0) || $drive->{file} ne 'none') {
             $device->{drive} = "drive-$drive_id";
         }
-        $device->{iothread} = "iothread-$drive_id" if $drive->{iothread};
+        add_iothread_mapping($conf, $device, "iothread-$drive_id") if $drive->{iothread};
     } elsif ($drive->{interface} eq 'scsi') {
 
         my ($maxdev, $controller, $controller_prefix) = scsihw_infos($conf, $drive);
@@ -195,7 +247,7 @@ sub get_drivedevice_controller {
             && $conf->{scsihw} eq "virtio-scsi-single"
             && $drive->{iothread}
         ) {
-            $device->{iothread} = "iothread-$controllerid";
+            add_iothread_mapping($conf, $device, "iothread-$controllerid");
         }
 
         my $queues = '';
@@ -232,6 +284,9 @@ sub get_drivedevice_controller {
 sub get_drivedevice_iothread {
     my ($conf, $drive) = @_;
 
+    # don't generate a per-disk iothread if multiple iothreads are defined
+    return if $conf->{iothreads};
+
     my $drive_id = PVE::QemuServer::Drive::get_drive_id($drive);
 
     if ($drive->{interface} eq 'virtio') {
diff --git a/src/test/cfg2cmd/iothreads.conf b/src/test/cfg2cmd/iothreads.conf
new file mode 100644
index 00000000..f803499e
--- /dev/null
+++ b/src/test/cfg2cmd/iothreads.conf
@@ -0,0 +1,7 @@
+# TEST: global multiple iothreads with disk with or without iothread enabled
+iothreads: 4,affinity=2-6
+scsihw: virtio-scsi-single
+scsi0: local:8006/vm-8006-disk-0.raw,size=104858K,iothread=1
+scsi1: local:8006/vm-8006-disk-1.raw,size=104858K
+virtio0: local:8006/vm-8006-disk-3.raw,size=104858K,iothread=1
+virtio1: local:8006/vm-8006-disk-4.raw,size=104858K
diff --git a/src/test/cfg2cmd/iothreads.conf.cmd b/src/test/cfg2cmd/iothreads.conf.cmd
new file mode 100644
index 00000000..39bec041
--- /dev/null
+++ b/src/test/cfg2cmd/iothreads.conf.cmd
@@ -0,0 +1,45 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'vm8006,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect-ms=5000' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smp '1,sockets=1,cores=1,maxcpus=1' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 512 \
+  -object '{"id":"iothread0","qom-type":"iothread"}' \
+  -object '{"id":"iothread1","qom-type":"iothread"}' \
+  -object '{"id":"iothread2","qom-type":"iothread"}' \
+  -object '{"id":"iothread3","qom-type":"iothread"}' \
+  -object '{"id":"throttle-drive-scsi0","limits":{},"qom-type":"throttle-group"}' \
+  -object '{"id":"throttle-drive-scsi1","limits":{},"qom-type":"throttle-group"}' \
+  -object '{"id":"throttle-drive-virtio0","limits":{},"qom-type":"throttle-group"}' \
+  -object '{"id":"throttle-drive-virtio1","limits":{},"qom-type":"throttle-group"}' \
+  -global 'PIIX4_PM.disable_s3=1' \
+  -global 'PIIX4_PM.disable_s4=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -device '{"addr":"0x1","bus":"pci.3","driver":"virtio-scsi-pci","id":"virtioscsi0","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},{"iothread":"iothread2"},{"iothread":"iothread3"}]}' \
+  -blockdev '{"driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"driver":"raw","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-0.raw","node-name":"e3b2553803d55d43b9986a0aac3e9a7","read-only":false},"node-name":"f3b2553803d55d43b9986a0aac3e9a7","read-only":false},"node-name":"drive-scsi0","throttle-group":"throttle-drive-scsi0"}' \
+  -device '{"bus":"virtioscsi0.0","channel":0,"drive":"drive-scsi0","driver":"scsi-hd","id":"scsi0","lun":0,"scsi-id":0,"write-cache":"on"}' \
+  -device '{"addr":"0x2","bus":"pci.3","driver":"virtio-scsi-pci","id":"virtioscsi1"}' \
+  -blockdev '{"driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"driver":"raw","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-1.raw","node-name":"e08707d013893852b3d4d42301a4298","read-only":false},"node-name":"f08707d013893852b3d4d42301a4298","read-only":false},"node-name":"drive-scsi1","throttle-group":"throttle-drive-scsi1"}' \
+  -device '{"bus":"virtioscsi1.0","channel":0,"drive":"drive-scsi1","driver":"scsi-hd","id":"scsi1","lun":1,"scsi-id":0,"write-cache":"on"}' \
+  -blockdev '{"driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"driver":"raw","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-3.raw","node-name":"ee025d619f851351566850c50231c52","read-only":false},"node-name":"fe025d619f851351566850c50231c52","read-only":false},"node-name":"drive-virtio0","throttle-group":"throttle-drive-virtio0"}' \
+  -device '{"addr":"0xa","bus":"pci.0","drive":"drive-virtio0","driver":"virtio-blk-pci","id":"virtio0","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},{"iothread":"iothread2"},{"iothread":"iothread3"}],"write-cache":"on"}' \
+  -blockdev '{"driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"driver":"raw","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-4.raw","node-name":"e3cbbe5a2921aa63c4a946b7317e87c","read-only":false},"node-name":"f3cbbe5a2921aa63c4a946b7317e87c","read-only":false},"node-name":"drive-virtio1","throttle-group":"throttle-drive-virtio1"}' \
+  -device '{"addr":"0xb","bus":"pci.0","drive":"drive-virtio1","driver":"virtio-blk-pci","id":"virtio1","write-cache":"on"}' \
+  -machine 'type=pc+pve0'
-- 
2.39.5

--===============2470806514452719422==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

--===============2470806514452719422==--
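For readers who want to try the mapping logic from the patch in isolation, here is a minimal stand-alone sketch. It is an assumption-laden illustration and not the code from qemu-server: the helper name build_iothread_args is hypothetical, and it uses the core JSON::PP module instead of the to_json helper the patch calls. It only shows how a given iothread count could be turned into the "-object" iothread definitions and the shared "iothread-vq-mapping" list that the patch attaches to each iothread-enabled device.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use JSON::PP;

# Hypothetical helper (not part of qemu-server): for a given iothread
# count, build the JSON strings for the '-object' arguments and the
# list used as the 'iothread-vq-mapping' device property.
sub build_iothread_args {
    my ($count) = @_;

    die "iothread count must be a positive integer\n"
        if !defined($count) || $count !~ /^\d+$/ || $count < 1;

    # canonical => sorted keys, so the output is stable across runs
    my $json = JSON::PP->new->canonical(1);

    my (@objects, @mapping);
    for my $i (0 .. $count - 1) {
        my $id = "iothread$i";
        push @objects, $json->encode({ 'qom-type' => 'iothread', id => $id });
        # every iothread is shared by every iothread-enabled device
        push @mapping, { iothread => $id };
    }

    return (\@objects, \@mapping);
}

# Usage: print the arguments for four iothreads, as in the test config.
my ($objects, $mapping) = build_iothread_args(4);
print "-object '$_'\n" for @$objects;
print JSON::PP->new->canonical(1)->encode({ 'iothread-vq-mapping' => $mapping }), "\n";
```

Sharing all iothreads with every device keeps the configuration to a single number; an explicit per-disk mapping, which the commit message lists as not implemented, would need an extra option on top of this.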