From: Dominik Csapak
To: pve-devel@lists.proxmox.com
Date: Thu, 6 Jun 2024 11:21:55 +0200
Message-Id: <20240606092220.1190913-1-d.csapak@proxmox.com>
Subject: [pve-devel] [PATCH guest-common/qemu-server/manager/docs v4] implement experimental vgpu live migration

and some useful cleanups

This is implemented for mapped resources. This requires driver and
hardware support, but aside from nvidia vgpus there don't seem to be
many drivers (if any) that do support that.

qemu already supports that for vfio-pci devices, so nothing to be
done there besides actively enabling it.

Since we currently can't properly test it here and it very much depends
on hardware/driver support, mark it as experimental everywhere
(docs/api/gui).
(though i tested the live-migration part manually here by using
"exec:cat > /tmp/test" for the migration target, and "exec: cat
/tmp/test" as the 'incoming' parameter for a new vm start, which
worked ;) )

i opted for marking them migratable at the mapping level, but we could
theoretically also put it in the hostpciX config instead.
(though imho it fits better in the cluster-wide resource mapping config)

also the naming/texts could probably be improved, but i think
'live-migration-capable' is very descriptive and i didn't want to
use an overly short name for it (which can be confusing, see the
'shared' flag for storages)

guest-common 6/6 is optional and breaks qemu-server versions without
qemu-server patches 1&2

guest-common 1-4; qemu-server 1-6; pve-manager 1,2
are preparations/cleanups mostly and could be applied independently

changes from v3:
* rebased on master
* split first guest-common patch into 3
* instead of merging keys, just write all expected keys into
  expected_props
* made $cfg optional so it does not break callers that don't call it
* added patch to fix the cfg2cmd tests for mdev check
* added patch to show vfio state transferred for migration
* incorporated Fiona's feedback (mostly minor stuff)

for more details see the individual patches

changes from v2:
* rebased on master
* rework the rework of the properties check (pve-guest-common 1/4)
* properly check mdev in the gui (pve-manager 1/5)

pve-guest-common:

Dominik Csapak (6):
  mapping: pci: assert_valid: rename cfg to mapping
  mapping: pci: assert_valid: reword error messages
  mapping: pci: make sure all desired properties are checked
  mapping: pci: check the mdev configuration on the device too
  mapping: pci: add 'live-migration-capable' flag to mappings
  mapping: remove find_on_current_node

 src/PVE/Mapping/PCI.pm | 60 ++++++++++++++++++++++++------------------
 src/PVE/Mapping/USB.pm | 10 -------
 2 files changed, 34 insertions(+), 36 deletions(-)

qemu-server:

Dominik Csapak (12):
  usb: mapping: move implementation of find_on_current_node here
  pci: mapping: move implementation of find_on_current_node here
  pci: mapping: check mdev config against hardware
  stop cleanup: remove unnecessary tpmstate cleanup
  vm_stop_cleanup: add noerr parameter
  migrate: call vm_stop_cleanup after stopping in phase3_cleanup
  pci: set 'enable-migration' to on for live-migration marked mapped
    devices
  check_local_resources: add more info per mapped device and return as
    hash
  api: enable live migration for marked mapped pci devices
  api: include not mapped resources for running vms in migrate
    preconditions
  tests: cfg2cmd: fix mdev tests
  migration: show vfio state transferred too

 PVE/API2/Qemu.pm                 | 55 ++++++++++++++++++++------------
 PVE/CLI/qm.pm                    |  2 +-
 PVE/QemuMigrate.pm               | 44 +++++++++++++++++--------
 PVE/QemuServer.pm                | 38 +++++++++++-----------
 PVE/QemuServer/PCI.pm            | 14 ++++++--
 PVE/QemuServer/USB.pm            |  5 ++-
 test/MigrationTest/Shared.pm     |  3 ++
 test/run_config2command_tests.pl |  2 +-
 8 files changed, 104 insertions(+), 59 deletions(-)

pve-manager:

Dominik Csapak (5):
  mapping: pci: include mdev in config checks
  bulk migrate: improve precondition checks
  bulk migrate: include checks for live-migratable local resources
  ui: adapt migration window to precondition api change
  fix #5175: ui: allow configuring and live migration of mapped pci
    resources

 PVE/API2/Cluster/Mapping/PCI.pm   |  2 +-
 PVE/API2/Nodes.pm                 | 27 ++++++++++++++--
 www/manager6/dc/PCIMapView.js     |  6 ++++
 www/manager6/window/Migrate.js    | 51 ++++++++++++++++++++-----------
 www/manager6/window/PCIMapEdit.js | 12 ++++++++
 5 files changed, 76 insertions(+), 22 deletions(-)

pve-docs:

Dominik Csapak (2):
  qm: resource mapping: add description for `mdev` option
  qm: resource mapping: document `live-migration-capable` setting

 qm.adoc | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

-- 
2.39.2
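
For readers who want the intended behaviour at a glance, here is a rough
Python sketch of the decision this series describes. It is not the code
from the patches (those are Perl/JS). `LocalResource`,
`blocking_resources` and `vfio_device_arg` are hypothetical names made up
for illustration, and the rule that unmapped local resources always block
a migration is an assumption based on the patch titles above. Only the
'live-migration-capable' flag and the 'enable-migration' vfio-pci
property come from the series itself.

```python
from dataclasses import dataclass


@dataclass
class LocalResource:
    """Hypothetical model of one local resource of a VM (illustration only)."""
    key: str                              # config key, e.g. "hostpci0"
    mapped: bool                          # uses a cluster-wide resource mapping
    live_migration_capable: bool = False  # flag set on the mapping


def blocking_resources(resources, vm_running):
    """Return the keys of the resources that would prevent a migration.

    Assumed rules: unmapped resources always block; mapped resources
    block only a live migration, and only if the mapping is not marked
    'live-migration-capable'.
    """
    blockers = []
    for res in resources:
        if not res.mapped:
            blockers.append(res.key)
        elif vm_running and not res.live_migration_capable:
            blockers.append(res.key)
    return blockers


def vfio_device_arg(host_addr, live_migration_capable):
    """Build a simplified vfio-pci device string (real ones carry more options)."""
    arg = f"vfio-pci,host={host_addr}"
    if live_migration_capable:
        # Migration has to be enabled actively for marked devices.
        arg += ",enable-migration=on"
    return arg


if __name__ == "__main__":
    vm = [
        LocalResource("hostpci0", mapped=True, live_migration_capable=True),
        LocalResource("hostpci1", mapped=True),
        LocalResource("usb0", mapped=False),
    ]
    print("live migration blocked by:", blocking_resources(vm, vm_running=True))
    print("offline migration blocked by:", blocking_resources(vm, vm_running=False))
    print(vfio_device_arg("0000:01:00.0", True))
```

In this sketch a running VM can only be migrated once every mapped
device is marked, which mirrors the choice of keeping the flag at the
mapping level rather than in the hostpciX config.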