From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <a.lauterer@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id B18BBD7C0
 for <pve-devel@lists.proxmox.com>; Mon, 21 Aug 2023 12:52:20 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 7FAAA1629E
 for <pve-devel@lists.proxmox.com>; Mon, 21 Aug 2023 12:51:50 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Mon, 21 Aug 2023 12:51:49 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id C8BA342CD2
 for <pve-devel@lists.proxmox.com>; Mon, 21 Aug 2023 12:51:48 +0200 (CEST)
Message-ID: <cad5f917-49df-4855-bc55-420f694f9186@proxmox.com>
Date: Mon, 21 Aug 2023 12:51:47 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: Fiona Ebner <f.ebner@proxmox.com>,
 Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <20230418122646.3079833-1-a.lauterer@proxmox.com>
 <5d60e2f0-7d45-75a5-8fd9-506f950c5d2f@proxmox.com>
From: Aaron Lauterer <a.lauterer@proxmox.com>
In-Reply-To: <5d60e2f0-7d45-75a5-8fd9-506f950c5d2f@proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.084 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pve-devel] [PATCH manager] fix #4631: ceph: osd: create: add
 osds-per-device
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Mon, 21 Aug 2023 10:52:20 -0000

Responses inline.

On 8/21/23 10:20, Fiona Ebner wrote:
> Am 18.04.23 um 14:26 schrieb Aaron Lauterer:
>> Allow automatically creating multiple OSDs per physical device. The
>> main use case is fast NVMe drives that would be bottlenecked by a
>> single OSD service.
>>
>> By using the 'ceph-volume lvm batch' command instead of the 'ceph-volume
>> lvm create' for multiple OSDs / device, we don't have to deal with the
>> split of the drive ourselves.
>>
>> But this means that the parameters to specify a DB or WAL device won't
>> work as the 'batch' command doesn't use them. Dedicated DB and WAL
>> devices don't make much sense anyway if we place the OSDs on fast NVME
>> drives.
>>
>> Some other changes to how the command is built were needed as well, as
>> the 'batch' command needs the path to the disk as a positional argument,
>> not as '--data /dev/sdX'.
>> We drop the '--cluster-fsid' parameter because the 'batch' command
>> doesn't accept it. The 'create' will fall back to reading it from the
>> ceph.conf file.
>>
>> Removal of OSDs works as expected without any code changes. As long as
>> there are other OSDs on a disk, the VG & PV won't be removed, even if
>> 'cleanup' is enabled.
>>
>> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
>> ---
> 
> I noticed a warning while testing
> 
> --> DEPRECATION NOTICE
> --> You are using the legacy automatic disk sorting behavior
> --> The Pacific release will change the default to --no-auto
> --> passed data devices: 1 physical, 0 LVM
> --> relative data size: 0.3333333333333333
> 
> Note that I'm on Quincy, so maybe they still didn't change it :P

The warning also shows up when running `ceph-volume lvm batch …` directly. After 
consulting the man page, I guess there is not much we can do about it.
> 
>> @@ -275,6 +275,12 @@ __PACKAGE__->register_method ({
>>   		type => 'string',
>>   		description => "Set the device class of the OSD in crush."
>>   	    },
>> +	    'osds-per-device' => {
>> +		optional => 1,
>> +		type => 'number',
> 
> should be integer

Will change it to 'integer'.
> 
>> +		minimum => '1',
>> +		description => 'OSD services per physical device. Can improve fast NVME utilization.',
> 
> Can we add an explicit recommendation against doing it for other disk
> types? I imagine it's not beneficial for those, or?

What about something like:
"Only useful for fast NVME devices to utilize their performance better."?
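
With the type changed to 'integer', the schema entry could end up roughly like 
this. It is only a sketch: I am assuming the suggested sentence is appended to 
the existing description, and the `$osd_create_properties` variable exists 
only to make the snippet stand on its own.

```perl
use strict;
use warnings;

# Hypothetical container; in the patch this entry sits in the parameter
# properties of the OSD create API method.
my $osd_create_properties = {
    'osds-per-device' => {
        optional => 1,
        type => 'integer', # was 'number' in the first version of the patch
        minimum => '1',
        # Assumption: the suggested wording is appended, not a replacement.
        description => 'OSD services per physical device. Only useful for fast'
            . ' NVME devices to utilize their performance better.',
    },
};
```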

> 
>> +	    },
>>   	},
>>       },
>>       returns => { type => 'string' },
>> @@ -294,6 +300,15 @@ __PACKAGE__->register_method ({
>>   	# extract parameter info and fail if a device is set more than once
>>   	my $devs = {};
>>   
>> +	# allow 'osds-per-device' only without dedicated db and/or wal devs. We cannot specify them with
>> +	# 'ceph-volume lvm batch' and they don't make a lot of sense on fast NVMEs anyway.
>> +	if ($param->{'osds-per-device'}) {
>> +	    for my $type ( qw(db_dev wal_dev) ) {
>> +		die "Cannot use 'osds-per-device' parameter with '${type}'"
> 
> Missing newline after error message.
> Could also use raise_param_exc().

Ah, thanks. I will switch it to `raise_param_exc()`, where we don't need the 
newline AFAICT?
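
Roughly what I have in mind, as an untested sketch. The `raise_param_exc()` 
below is a local stand-in so the snippet runs without the PVE libraries; I am 
assuming the real helper takes a hash reference that maps parameter names to 
error messages. `check_osds_per_device()` is a hypothetical wrapper used only 
for illustration; in the patch the check stays inline.

```perl
use strict;
use warnings;

# Stand-in for the real helper (assumption: it takes a hash reference of
# parameter name => error message). Here it simply dies with the messages.
sub raise_param_exc {
    my ($errors) = @_;
    die join('', map { "$_: $errors->{$_}\n" } sort keys %$errors);
}

# Hypothetical wrapper: reject 'osds-per-device' together with a dedicated
# DB or WAL device, since 'ceph-volume lvm batch' cannot take those here.
sub check_osds_per_device {
    my ($param) = @_;

    return if !$param->{'osds-per-device'};

    for my $type (qw(db_dev wal_dev)) {
        raise_param_exc({ $type => "cannot be used together with 'osds-per-device'" })
            if $param->{$type};
    }

    return;
}

# Passes: no dedicated DB/WAL device given.
check_osds_per_device({ 'osds-per-device' => 2 });

# Fails: a DB device is set together with 'osds-per-device'.
eval { check_osds_per_device({ 'osds-per-device' => 2, db_dev => '/dev/sdb' }) };
print "rejected: $@" if $@;
```

The error would then be attached to the offending parameter (db_dev or 
wal_dev) instead of being a plain die string.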
> 
>> +		    if $param->{$type};
>> +	    }
>> +	}
>> +
>>   	my $ceph_conf = cfs_read_file('ceph.conf');
>>   
>>   	my $osd_network = $ceph_conf->{global}->{cluster_network};