From mboxrd@z Thu Jan  1 00:00:00 1970
From: Aaron Lauterer
To: pve-devel@lists.proxmox.com
Date: Wed, 23 Aug 2023 11:44:27 +0200
Message-Id: <20230823094427.2683024-1-a.lauterer@proxmox.com>
X-Mailer: git-send-email 2.39.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [PATCH manager v3] fix #4631: ceph: osd: create: add osds-per-device
List-Id: Proxmox VE development discussion

Allows automatically creating multiple OSDs per physical device. The
main use case is fast NVMe drives that would be bottlenecked by a
single OSD service.

By using the 'ceph-volume lvm batch' command instead of 'ceph-volume
lvm create' for multiple OSDs per device, we don't have to deal with
splitting up the drive ourselves. But this means that the parameters
to specify a DB or WAL device won't work, as the 'batch' command
doesn't use them. Dedicated DB and WAL devices don't make much sense
anyway if we place the OSDs on fast NVMe drives.

Some other changes to how the command is built were needed as well,
as the 'batch' command needs the path to the disk as a positional
argument, not as '--data /dev/sdX'. We drop the '--cluster-fsid'
parameter because the 'batch' command doesn't accept it. The 'create'
command will fall back to reading it from the ceph.conf file.

Removal of OSDs works as expected without any code changes. As long
as there are other OSDs on a disk, the VG & PV won't be removed, even
if 'cleanup' is enabled.
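The two resulting command shapes can be sketched as follows (a Python
illustration with a hypothetical helper name, not code from this patch;
the flags mirror the ones the handler passes to ceph-volume):

```python
# Sketch (hypothetical helper): the two ceph-volume invocations the API
# handler ends up building for a given device path.
def build_ceph_volume_cmd(devpath, osds_per_device=None):
    # 'create' allows detailed DB/WAL configuration; 'batch' splits one
    # device into several OSDs by itself.
    mode = 'batch' if osds_per_device else 'create'
    cmd = ['ceph-volume', 'lvm', mode]
    if mode == 'create':
        cmd += ['--data', devpath]
    else:
        # '--no-auto' avoids the legacy disk-sorting deprecation warning
        cmd += ['--osds-per-device', str(osds_per_device),
                '--yes', '--no-auto', '--', devpath]
    return cmd
```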
The '--no-auto' parameter is used to avoid the following deprecation
warning:

```
--> DEPRECATION NOTICE
--> You are using the legacy automatic disk sorting behavior
--> The Pacific release will change the default to --no-auto
--> passed data devices: 1 physical, 0 LVM
--> relative data size: 0.3333333333333333
```

Signed-off-by: Aaron Lauterer
---
changes since v2:
* removed check for fsid
* rework ceph-volume call to place the positional devpath parameter
  after '--'

 PVE/API2/Ceph/OSD.pm | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
index ded35990..a1d92ca7 100644
--- a/PVE/API2/Ceph/OSD.pm
+++ b/PVE/API2/Ceph/OSD.pm
@@ -275,6 +275,13 @@ __PACKAGE__->register_method ({
 		type => 'string',
 		description => "Set the device class of the OSD in crush."
 	    },
+	    'osds-per-device' => {
+		optional => 1,
+		type => 'integer',
+		minimum => '1',
+		description => "OSD services per physical device. Only useful for fast ".
+		    "NVME devices to utilize their performance better.",
+	    },
 	},
     },
     returns => { type => 'string' },
@@ -294,6 +301,15 @@ __PACKAGE__->register_method ({
 
 	# extract parameter info and fail if a device is set more than once
 	my $devs = {};
+	# allow 'osds-per-device' only without dedicated db and/or wal devs. We cannot specify them
+	# with 'ceph-volume lvm batch' and they don't make a lot of sense on fast NVMEs anyway.
+	if ($param->{'osds-per-device'}) {
+	    for my $type ( qw(db_dev wal_dev) ) {
+		raise_param_exc({ $type => "cannot use 'osds-per-device' parameter with '${type}'" })
+		    if $param->{$type};
+	    }
+	}
+
 	my $ceph_conf = cfs_read_file('ceph.conf');
 
 	my $osd_network = $ceph_conf->{global}->{cluster_network};
@@ -363,10 +379,6 @@ __PACKAGE__->register_method ({
 	my $rados = PVE::RADOS->new();
 	my $monstat = $rados->mon_command({ prefix => 'quorum_status' });
 
-	die "unable to get fsid\n" if !$monstat->{monmap} || !$monstat->{monmap}->{fsid};
-	my $fsid = $monstat->{monmap}->{fsid};
-	$fsid = $1 if $fsid =~ m/^([0-9a-f\-]+)$/;
-
 	my $ceph_bootstrap_osd_keyring = PVE::Ceph::Tools::get_config('ceph_bootstrap_osd_keyring');
 
 	if (! -f $ceph_bootstrap_osd_keyring && $ceph_conf->{global}->{auth_client_required} eq 'cephx') {
@@ -470,7 +482,10 @@ __PACKAGE__->register_method ({
 	    $test_disk_requirements->($disklist);
 
 	    my $dev_class = $param->{'crush-device-class'};
-	    my $cmd = ['ceph-volume', 'lvm', 'create', '--cluster-fsid', $fsid ];
+	    # create allows for detailed configuration of DB and WAL devices
+	    # batch for easy creation of multiple OSDs (per device)
+	    my $create_mode = $param->{'osds-per-device'} ? 'batch' : 'create';
+	    my $cmd = ['ceph-volume', 'lvm', $create_mode ];
 	    push @$cmd, '--crush-device-class', $dev_class if $dev_class;
 
 	    my $devname = $devs->{dev}->{name};
@@ -504,9 +519,17 @@ __PACKAGE__->register_method ({
 		push @$cmd, "--block.$type", $part_or_lv;
 	    }
 
-	    push @$cmd, '--data', $devpath;
+	    push @$cmd, '--data', $devpath if $create_mode eq 'create';
 	    push @$cmd, '--dmcrypt' if $param->{encrypted};
 
+	    if ($create_mode eq 'batch') {
+		push @$cmd,
+		    '--osds-per-device', $param->{'osds-per-device'},
+		    '--yes',
+		    '--no-auto',
+		    '--',
+		    $devpath;
+	    }
+
 	    PVE::Diskmanage::wipe_blockdev($devpath);
 
 	    if (PVE::Diskmanage::is_partition($devpath)) {
-- 
2.39.2
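For illustration, the parameter guard added in the second hunk behaves
like this Python sketch (hypothetical names; the real code raises a
parameter exception via raise_param_exc rather than ValueError):

```python
# Sketch of the mutual-exclusion check: dedicated DB/WAL devices are
# rejected when 'osds-per-device' is set, since 'ceph-volume lvm batch'
# cannot take them.
def validate_osd_params(param):
    if param.get('osds-per-device'):
        for dev_type in ('db_dev', 'wal_dev'):
            if param.get(dev_type):
                raise ValueError(
                    f"cannot use 'osds-per-device' parameter with '{dev_type}'")
```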