From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 28 Apr 2022 16:32:41 +0200
From: Dominik Csapak
To: Proxmox VE development discussion, Aaron Lauterer
References: <20220428115809.1661165-1-a.lauterer@proxmox.com> <20220428115809.1661165-6-a.lauterer@proxmox.com>
In-Reply-To: <20220428115809.1661165-6-a.lauterer@proxmox.com>
Subject: Re: [pve-devel] [PATCH manager 5/5] ceph pools: allow to create erasure code pools

is there a specific reason why you still force the autocreation of the
storage? i don't mind really, but giving a reason would be nice.
if there is no reason, i'd drop that, since the user can specify it
anyway and adding one manually should also not be too hard..

some nits/comments inline (mostly minor issues imho that could be fixed up)

On 4/28/22 13:58, Aaron Lauterer wrote:
> To use erasure coded (EC) pools for RBD storages, we need two pools. One
> regular replicated pool that will hold the RBD omap and other metadata
> and the EC pool which will hold the image data.
> 
> The coupling happens when an RBD image is created by adding the
> --data-pool parameter. This is why we have the 'data-pool' parameter in
> the storage configuration.
> 
> To follow already established semantics, we will create a 'X-metadata'
> and 'X-data' pool. The storage configuration is always added as it is
> the only thing that links the two together (besides naming schemes).
> 
> Different pg_num defaults are chosen for the replicated metadata pool as
> it will not hold a lot of data.
> 
> Signed-off-by: Aaron Lauterer
> ---
> we need some more follow ups as setting pool parameters on ec pools will
> currently fail when trying to set the size. In that case, we most likely
> should adapt the gui as well and make non-changeable settings, such as
> size, read only.
> 
> changes since rfc:
> 
> ec profile parameters need to be configured during pool creation now.
> or alternatively the name of an ec profile can be provided.
> 
> cleanup of ec profile & crush rule on pool destruction
> 
>  PVE/API2/Ceph/Pools.pm | 120 ++++++++++++++++++++++++++++++++++++++---
>  PVE/Ceph/Tools.pm      |  11 ++--
>  2 files changed, 122 insertions(+), 9 deletions(-)
> 
> diff --git a/PVE/API2/Ceph/Pools.pm b/PVE/API2/Ceph/Pools.pm
> index 05855e15..69cf0d1a 100644
> --- a/PVE/API2/Ceph/Pools.pm
> +++ b/PVE/API2/Ceph/Pools.pm
> @@ -280,7 +280,7 @@ my $ceph_pool_common_options = sub {
> 
> 
>  my $add_storage = sub {
> -    my ($pool, $storeid) = @_;
> +    my ($pool, $storeid, $data_pool) = @_;
> 
>      my $storage_params = {
> 	type => 'rbd',
> @@ -290,6 +290,8 @@ my $add_storage = sub {
> 	content => 'rootdir,images',
>      };
> 
> +    $storage_params->{'data-pool'} = $data_pool if $data_pool;
> +
>      PVE::API2::Storage::Config->create($storage_params);
>  };
> 
> @@ -330,10 +332,41 @@ __PACKAGE__->register_method ({
> 	properties => {
> 	    node => get_standard_option('pve-node'),
> 	    add_storages => {
> -		description => "Configure VM and CT storage using the new pool.",
> +		description => "Configure VM and CT storage using the new pool. ".
> +		    "Always enabled for erasure coded pools.",
> 		type => 'boolean',
> 		optional => 1,
> 	    },
> +	    k => {
> +		type => 'integer',
> +		description => "Number of data chunks. Will create an erasure coded pool plus a ".
> +		    "replicated pool for metadata.",
> +		optional => 1,
> +	    },
> +	    m => {
> +		type => 'integer',
> +		description => "Number of coding chunks. Will create an erasure coded pool plus a ".
> +		    "replicated pool for metadata.",
> +		optional => 1,
> +	    },
> +	    'failure-domain' => {
> +		type => 'string',
> +		description => "CRUSH failure domain. Default is 'host'. Will create an erasure ".
> +		    "coded pool plus a replicated pool for metadata.",
> +		optional => 1,
> +	    },
> +	    'device-class' => {
> +		type => 'string',
> +		description => "CRUSH device class. Will create an erasure coded pool plus a ".
> +		    "replicated pool for metadata.",
> +		optional => 1,
> +	    },
> +	    ecprofile => {
> +		description => "Override the erasure code (EC) profile to use. Will create an ".
> +		    "erasure coded pool plus a replicated pool for metadata.",
> +		type => 'string',
> +		optional => 1,
> +	    },

i'm still missing some limitations here for the parameters.
at least let's limit the length of the names and set k/m to a minimum of 1?

> 	    %{ $ceph_pool_common_options->() },
> 	},
>     },
> @@ -344,10 +377,31 @@ __PACKAGE__->register_method ({
> 	PVE::Cluster::check_cfs_quorum();
> 	PVE::Ceph::Tools::check_ceph_configured();
> 
> -	my $pool = extract_param($param, 'name');
> +	my $pool = my $name = extract_param($param, 'name');
> 	my $node = extract_param($param, 'node');
> 	my $add_storages = extract_param($param, 'add_storages');
> 
> +	my $ec_k = extract_param($param, 'k');
> +	my $ec_m = extract_param($param, 'm');
> +	my $ec_failure_domain = extract_param($param, 'failure-domain');
> +	my $ec_device_class = extract_param($param, 'device-class');
> +
> +	my $is_ec = 0;
> +
> +	my $ecprofile = extract_param($param, 'ecprofile');
> +	die "Erasure code profile '$ecprofile' does not exist.\n"
> +	    if $ecprofile && !PVE::Ceph::Tools::ecprofile_exists($ecprofile);
> +
> +	if ($ec_k || $ec_m || $ec_failure_domain || $ec_device_class) {
> +	    die "'k' and 'm' parameters are needed for an erasure coded pool\n"
> +		if !$ec_k || !$ec_m;

nit: we could raise a param_exc here (technically only relevant for the
gui, so not necessary)

> +
> +	    $is_ec = 1;
> +	}
> +
> +	$is_ec = 1 if $ecprofile;
> +	$add_storages = 1 if $is_ec;
> +
> 	my $rpcenv = PVE::RPCEnvironment::get();
> 	my $user = $rpcenv->get_user();
> 
> @@ -370,13 +424,47 @@ __PACKAGE__->register_method ({
> 	$param->{application} //= 'rbd';
> 	$param->{pg_autoscale_mode} //= 'warn';
> 
> -	my $worker = sub {
> +	my $data_param = {};
> +	my $data_pool = '';
> +	if (!$ecprofile) {
> +	    $ecprofile = PVE::Ceph::Tools::get_ecprofile_name($pool);
> +	    eval {
> +		PVE::Ceph::Tools::create_ecprofile(
> +		    $ecprofile,
> +		    $ec_k,
> +		    $ec_m,
> +		    $ec_failure_domain,
> +		    $ec_device_class,
> +		);
> +	    };

is creating an ec profile so fast that we can do it outside the worker?
(iow. can it hang? i guess so if the ceph cluster is not healthy?)

> +	    die "could not create erasure code profile '$ecprofile': $@\n" if $@;
> +	}
> +
> +	if ($is_ec) {
> +	    # copy all params, should be a flat hash
> +	    $data_param = { map { $_ => $param->{$_} } keys %$param };
> 
> +	    $data_param->{pool_type} = 'erasure';
> +	    $data_param->{allow_ec_overwrites} = 'true';
> +	    $data_param->{erasure_code_profile} = $ecprofile;
> +	    delete $data_param->{size};
> +	    delete $data_param->{min_size};
> +
> +	    # metadata pool should be ok with 32 PGs
> +	    $param->{pg_num} = 32;
> +
> +	    $pool = "${name}-metadata";
> +	    $data_pool = "${name}-data";
> +	}
> +
> +	my $worker = sub {
> 	    PVE::Ceph::Tools::create_pool($pool, $param);
> 
> +	    PVE::Ceph::Tools::create_pool($data_pool, $data_param) if $is_ec;

nit: both of these take an optional rados object, so we could create one
beforehand and pass it to both.

also, ideally we would clean up if there was an error here, so if e.g.
data pool creation fails, clean up the auto-created metadata pool and ec
profile.

> +
> 	    if ($add_storages) {
> -		eval { $add_storage->($pool, "${pool}") };
> -		die "adding PVE storage for ceph pool '$pool' failed: $@\n" if $@;
> +		eval { $add_storage->($pool, "${name}", $data_pool) };
> +		die "adding PVE storage for ceph pool '$name' failed: $@\n" if $@;
> 	    }
> 	};
> 
> @@ -414,6 +502,12 @@ __PACKAGE__->register_method ({
> 	    optional => 1,
> 	    default => 0,
> 	},
> +	remove_ecprofile => {
> +	    description => "Remove the erasure code profile. Used for erasure code pools.
Default is true",
> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 1,
> +	},
>     },
> },
> returns => { type => 'string' },
> @@ -428,6 +522,7 @@ __PACKAGE__->register_method ({
> 	    if $param->{remove_storages};
> 
> 	my $pool = $param->{name};
> +	my $remove_ecprofile = $param->{remove_ecprofile} // 1;
> 
> 	my $worker = sub {
> 	    my $storages = $get_storages->($pool);
> @@ -447,8 +542,21 @@ __PACKAGE__->register_method ({
> 		}
> 	    }
> 
> +	    my $pool_properties = PVE::Ceph::Tools::get_pool_properties($pool);
> +
> 	    PVE::Ceph::Tools::destroy_pool($pool);
> 
> +	    if (my $ecprofile = $pool_properties->{erasure_code_profile}) {
> +		my $crush_rule = $pool_properties->{crush_rule};
> +		eval { PVE::Ceph::Tools::destroy_crush_rule($crush_rule); };
> +		warn "removing crush rule '${crush_rule}' failed: $@\n" if $@;
> +
> +		if ($remove_ecprofile) {
> +		    eval { PVE::Ceph::Tools::destroy_ecprofile($ecprofile) };
> +		    warn "removing EC profile '${ecprofile}' failed: $@\n" if $@;
> +		}

same here for the rados objects (if the subs accept one),
we do create 4 Rados objects here:
get_pool_properties
destroy_pool
destroy_crush_rule
destroy_ecprofile

> +	    }
> +
> 	    if ($param->{remove_storages}) {
> 		my $err;
> 		foreach my $storeid (keys %$storages) {
> diff --git a/PVE/Ceph/Tools.pm b/PVE/Ceph/Tools.pm
> index 7e94ee58..f50cec67 100644
> --- a/PVE/Ceph/Tools.pm
> +++ b/PVE/Ceph/Tools.pm
> @@ -8,7 +8,7 @@ use File::Basename;
>  use IO::File;
>  use JSON;
> 
> -use PVE::Tools qw(run_command dir_glob_foreach);
> +use PVE::Tools qw(run_command dir_glob_foreach extract_param);
>  use PVE::Cluster qw(cfs_read_file);
>  use PVE::RADOS;
>  use PVE::Ceph::Services;
> @@ -277,12 +277,17 @@ sub create_pool {
> 
>      my $pg_num = $param->{pg_num} || 128;
> 
> -    $rados->mon_command({
> +    my $mon_params = {
> 	prefix => "osd pool create",
> 	pool => $pool,
> 	pg_num => int($pg_num),
> 	format => 'plain',
> -    });
> +    };
> +    $mon_params->{pool_type} = extract_param($param, 'pool_type') if $param->{pool_type};
> +    $mon_params->{erasure_code_profile} = extract_param($param, 'erasure_code_profile')
> +	if $param->{erasure_code_profile};
> +
> +    $rados->mon_command($mon_params);
> 
>      set_pool($pool, $param);
>
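for the parameter limits mentioned above, this is roughly the kind of
check i have in mind (untested sketch; the exact limits and message
wording are of course up for debate, and the helper name is made up):

```perl
use PVE::Exception qw(raise_param_exc);

# hypothetical helper, called before $is_ec is computed
my $check_ec_params = sub {
    my ($ec_k, $ec_m, $ecprofile) = @_;

    # k/m below 1 make no sense; k=1 is arguably degenerate too,
    # since it would not spread data chunks at all
    raise_param_exc({ k => "must be at least 1" })
	if defined($ec_k) && $ec_k < 1;
    raise_param_exc({ m => "must be at least 1" })
	if defined($ec_m) && $ec_m < 1;

    # keep profile names to something sane length-wise
    raise_param_exc({ ecprofile => "name too long (max 128 characters)" })
	if defined($ecprofile) && length($ecprofile) > 128;
};
```

raise_param_exc has the advantage that the gui can attach the error to
the right form field, which a plain die cannot.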
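and regarding the shared rados object plus cleanup-on-error comments,
roughly what i mean inside the worker (untested sketch; it assumes
create_pool/destroy_pool/destroy_ecprofile all take an optional $rados
as last argument, which should be double-checked against the actual
signatures):

```perl
my $rados = PVE::RADOS->new(); # one connection for all mon commands

PVE::Ceph::Tools::create_pool($pool, $param, $rados);

if ($is_ec) {
    eval { PVE::Ceph::Tools::create_pool($data_pool, $data_param, $rados) };
    if (my $err = $@) {
	# roll back the auto-created metadata pool and ec profile
	eval { PVE::Ceph::Tools::destroy_pool($pool, $rados) };
	warn "cleanup of pool '$pool' failed: $@\n" if $@;
	eval { PVE::Ceph::Tools::destroy_ecprofile($ecprofile, $rados) };
	warn "cleanup of ec profile '$ecprofile' failed: $@\n" if $@;
	die "could not create data pool '$data_pool': $err\n";
    }
}
```

that way a half-created ec setup does not leave a stray metadata pool
behind, and we avoid opening a new rados connection per call.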