From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <0747b1b3-cee1-424b-935e-8a08078eec23@proxmox.com>
Date: Thu, 28 Sep 2023 15:16:05 +0200
From: Aaron Lauterer
To: pve-devel@lists.proxmox.com
References: <20230823094427.2683024-1-a.lauterer@proxmox.com>
In-Reply-To: <20230823094427.2683024-1-a.lauterer@proxmox.com>
Subject: Re: [pve-devel] [PATCH manager v3] fix #4631: ceph: osd: create: add osds-per-device

ping? Patch still applies.

Previous patch versions with discussion are:
https://lists.proxmox.com/pipermail/pve-devel/2023-August/058794.html
https://lists.proxmox.com/pipermail/pve-devel/2023-August/058803.html

On 8/23/23 11:44, Aaron Lauterer wrote:
> Allows to automatically create multiple OSDs per physical device. The
> main use case is fast NVMe drives that would be bottlenecked by a
> single OSD service.
>
> By using the 'ceph-volume lvm batch' command instead of 'ceph-volume
> lvm create' for multiple OSDs per device, we don't have to deal with
> splitting the drive ourselves.
>
> But this means that the parameters to specify a DB or WAL device won't
> work, as the 'batch' command doesn't use them. Dedicated DB and WAL
> devices don't make much sense anyway if we place the OSDs on fast NVMe
> drives.
>
> Some other changes to how the command is built were needed as well, as
> the 'batch' command needs the path to the disk as a positional argument,
> not as '--data /dev/sdX'.
> We drop the '--cluster-fsid' parameter because the 'batch' command
> doesn't accept it. The 'create' command will fall back to reading it
> from the ceph.conf file.
>
> Removal of OSDs works as expected without any code changes. As long as
> there are other OSDs on a disk, the VG & PV won't be removed, even if
> 'cleanup' is enabled.
>
> The '--no-auto' parameter is used to avoid the following deprecation
> warning:
> ```
> --> DEPRECATION NOTICE
> --> You are using the legacy automatic disk sorting behavior
> --> The Pacific release will change the default to --no-auto
> --> passed data devices: 1 physical, 0 LVM
> --> relative data size: 0.3333333333333333
> ```
>
> Signed-off-by: Aaron Lauterer
> ---
>
> changes since v2:
> * removed check for fsid
> * rework ceph-volume call to place the positional devpath parameter
>   after '--'
>
>  PVE/API2/Ceph/OSD.pm | 35 +++++++++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
> index ded35990..a1d92ca7 100644
> --- a/PVE/API2/Ceph/OSD.pm
> +++ b/PVE/API2/Ceph/OSD.pm
> @@ -275,6 +275,13 @@ __PACKAGE__->register_method ({
> 		type => 'string',
> 		description => "Set the device class of the OSD in crush."
> 	    },
> +	    'osds-per-device' => {
> +		optional => 1,
> +		type => 'integer',
> +		minimum => '1',
> +		description => "OSD services per physical device. Only useful for fast ".
> +		    "NVME devices to utilize their performance better.",
> +	    },
> 	},
>     },
>     returns => { type => 'string' },
> @@ -294,6 +301,15 @@ __PACKAGE__->register_method ({
> 	# extract parameter info and fail if a device is set more than once
> 	my $devs = {};
>
> +	# allow 'osds-per-device' only without dedicated db and/or wal devs. We cannot specify them with
> +	# 'ceph-volume lvm batch' and they don't make a lot of sense on fast NVMEs anyway.
> +	if ($param->{'osds-per-device'}) {
> +	    for my $type ( qw(db_dev wal_dev) ) {
> +		raise_param_exc({ $type => "cannot use 'osds-per-device' parameter with '${type}'" })
> +		    if $param->{$type};
> +	    }
> +	}
> +
> 	my $ceph_conf = cfs_read_file('ceph.conf');
>
> 	my $osd_network = $ceph_conf->{global}->{cluster_network};
> @@ -363,10 +379,6 @@ __PACKAGE__->register_method ({
> 	    my $rados = PVE::RADOS->new();
> 	    my $monstat = $rados->mon_command({ prefix => 'quorum_status' });
>
> -	    die "unable to get fsid\n" if !$monstat->{monmap} || !$monstat->{monmap}->{fsid};
> -	    my $fsid = $monstat->{monmap}->{fsid};
> -	    $fsid = $1 if $fsid =~ m/^([0-9a-f\-]+)$/;
> -
> 	    my $ceph_bootstrap_osd_keyring = PVE::Ceph::Tools::get_config('ceph_bootstrap_osd_keyring');
>
> 	    if (! -f $ceph_bootstrap_osd_keyring && $ceph_conf->{global}->{auth_client_required} eq 'cephx') {
> @@ -470,7 +482,10 @@ __PACKAGE__->register_method ({
> 	    $test_disk_requirements->($disklist);
>
> 	    my $dev_class = $param->{'crush-device-class'};
> -	    my $cmd = ['ceph-volume', 'lvm', 'create', '--cluster-fsid', $fsid ];
> +	    # create allows for detailed configuration of DB and WAL devices
> +	    # batch for easy creation of multiple OSDs (per device)
> +	    my $create_mode = $param->{'osds-per-device'} ? 'batch' : 'create';
> +	    my $cmd = ['ceph-volume', 'lvm', $create_mode ];
> 	    push @$cmd, '--crush-device-class', $dev_class if $dev_class;
>
> 	    my $devname = $devs->{dev}->{name};
> @@ -504,9 +519,17 @@ __PACKAGE__->register_method ({
> 		push @$cmd, "--block.$type", $part_or_lv;
> 	    }
>
> -	    push @$cmd, '--data', $devpath;
> +	    push @$cmd, '--data', $devpath if $create_mode eq 'create';
> 	    push @$cmd, '--dmcrypt' if $param->{encrypted};
>
> +	    if ($create_mode eq 'batch') {
> +		push @$cmd,
> +		    '--osds-per-device', $param->{'osds-per-device'},
> +		    '--yes',
> +		    '--no-auto',
> +		    '--',
> +		    $devpath;
> +	    }
> +
> 	    PVE::Diskmanage::wipe_blockdev($devpath);
>
> 	    if (PVE::Diskmanage::is_partition($devpath)) {
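
For reference, a minimal sketch of the two command shapes the patched code ends up building. The device path /dev/nvme0n1, the 'nvme' device class, and osds-per-device=4 are assumed example values, not taken from the patch itself:

```shell
# Without 'osds-per-device' the code keeps using 'lvm create' with the
# device passed via '--data' (one OSD on the whole device):
create_cmd="ceph-volume lvm create --crush-device-class nvme --data /dev/nvme0n1"

# With 'osds-per-device' set, 'lvm batch' is used instead; the device is
# a positional argument after '--', and '--no-auto' avoids the legacy
# disk-sorting deprecation notice quoted in the commit message:
batch_cmd="ceph-volume lvm batch --crush-device-class nvme --osds-per-device 4 --yes --no-auto -- /dev/nvme0n1"

echo "$create_cmd"
echo "$batch_cmd"
```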