all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Fabian Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC manager 5/5] fix #2422: allow multiple Ceph public networks
Date: Fri, 30 Apr 2021 15:54:37 +0200	[thread overview]
Message-ID: <20210430135437.4816-6-f.ebner@proxmox.com> (raw)
In-Reply-To: <20210430135437.4816-1-f.ebner@proxmox.com>

From: Alwin Antreich <a.antreich@proxmox.com>

Multiple public networks can be defined in the ceph.conf. The networks need to
be routed to each other.

Support handling multiple IPs for a single monitor. By default, one address from
each public network is selected for monitor creation, but, as before, it can be
overwritten with the mon-address parameter, now taking a list of addresses.

On removal, make sure the all addresses are removed from the mon_host entry in
the ceph configuration.

Originally-by: Alwin Antreich <a.antreich@proxmox.com>
[handling of multiple addresses]
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Sorry if the patch is a bit messy to review, what it essentially is, is creating
loops over nets/addresses around the old existing logic.

Viewing with a word-based diff might help in some places.

Changes from Alwin's patch as suggested by Fabian G.:
    * split list after truth check for $pubnet
    * avoid long post-if statement
    * style/indentation fixes
    * verify cidr

Other changes:
    * test/select IPs from each public network separately
    * return and handle a list of IPs, so that it's not necessary to create a
      monitor for each public network
    * make mon-address a list of IPs
    * also handle multiple addresses upon removal

What could still be improved (not new in this patch):
    * match addresses semantically everywhere, which is mostly relevant for IPv6

 PVE/API2/Ceph/MON.pm | 180 ++++++++++++++++++++++++++++---------------
 1 file changed, 119 insertions(+), 61 deletions(-)

diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm
index 6f232e34..63d4d93e 100644
--- a/PVE/API2/Ceph/MON.pm
+++ b/PVE/API2/Ceph/MON.pm
@@ -20,8 +20,10 @@ use PVE::API2::Ceph::MGR;
 
 use base qw(PVE::RESTHandler);
 
-my $find_mon_ip = sub {
-    my ($cfg, $rados, $node, $overwrite_ip) = @_;
+my $find_mon_ips = sub {
+    my ($cfg, $rados, $node, $mon_address) = @_;
+
+    my $overwrite_ips = [ PVE::Tools::split_list($mon_address) ];
 
     my $pubnet;
     if ($rados) {
@@ -34,40 +36,70 @@ my $find_mon_ip = sub {
     $pubnet //= $cfg->{global}->{public_network};
 
     if (!$pubnet) {
-	return $overwrite_ip // PVE::Cluster::remote_node_ip($node);
+	if (scalar(@{$overwrite_ips})) {
+	    return $overwrite_ips;
+	} else {
+	   # don't refactor into '[ PVE::Cluster::remote... ]' as it uses wantarray
+	   my $ip = PVE::Cluster::remote_node_ip($node);
+	   return [ $ip ];
+	}
     }
 
-    my $allowed_ips = PVE::Network::get_local_ip_from_cidr($pubnet);
-    die "No active IP found for the requested ceph public network '$pubnet' on node '$node'\n"
-	if scalar(@$allowed_ips) < 1;
+    my $public_nets = [ PVE::Tools::split_list($pubnet) ];
+    if (scalar(@{$public_nets}) > 1) {
+	warn "Multiple Ceph public networks detected on $node: $pubnet\n";
+	warn "Networks must be capable of routing to each other.\n";
+    }
+
+    my $res = [];
+
+    if (!scalar(@{$overwrite_ips})) { # auto-select one address for each public network
+	for my $net (@{$public_nets}) {
+	    $net = PVE::JSONSchema::pve_verify_cidr($net);
 
-    if (!$overwrite_ip) {
-	if (scalar(@$allowed_ips) == 1 || !grep { $_ ne $allowed_ips->[0] } @$allowed_ips) {
-	    return $allowed_ips->[0];
+	    my $allowed_ips = PVE::Network::get_local_ip_from_cidr($net);
+	    die "No active IP found for the requested ceph public network '$net' on node '$node'\n"
+		if scalar(@$allowed_ips) < 1;
+
+	    if (scalar(@$allowed_ips) == 1 || !grep { $_ ne $allowed_ips->[0] } @$allowed_ips) {
+		push @{$res}, $allowed_ips->[0];
+	    } else {
+		die "Multiple IPs for ceph public network '$net' detected on $node:\n".
+		    join("\n", @$allowed_ips) ."\nuse 'mon-address' to specify one of them.\n";
+	    }
+	}
+    } else { # check if overwrite IPs are active and in any of the public networks
+	my $allowed_ips = [];
+
+	for my $net (@{$public_nets}) {
+	    $net = PVE::JSONSchema::pve_verify_cidr($net);
+
+	    push @{$allowed_ips}, @{PVE::Network::get_local_ip_from_cidr($net)};
 	}
-	die "Multiple IPs for ceph public network '$pubnet' detected on $node:\n".
-	    join("\n", @$allowed_ips) ."\nuse 'mon-address' to specify one of them.\n";
-    } else {
-	if (grep { $_ eq $overwrite_ip } @$allowed_ips) {
-	    return $overwrite_ip;
+
+	for my $overwrite_ip (@{$overwrite_ips}) {
+	    die "Specified monitor IP '$overwrite_ip' not configured or up on $node!\n"
+		if !grep { $_ eq $overwrite_ip } @{$allowed_ips};
 	}
-	die "Monitor IP '$overwrite_ip' not in ceph public network '$pubnet'\n"
-	    if !PVE::Network::is_ip_in_cidr($overwrite_ip, $pubnet);
 
-	die "Specified monitor IP '$overwrite_ip' not configured or up on $node!\n";
+	$res = $overwrite_ips;
     }
+
+    return $res;
 };
 
 my $assert_mon_prerequisites = sub {
-    my ($cfg, $monhash, $monid, $monip) = @_;
+    my ($cfg, $monhash, $monid, $monips) = @_;
 
-    my $ip_regex = '(?:^|[^\d])' . # start or nondigit
-	       "\Q$monip\E" .  # the ip not interpreted as regex
-	       '(?:[^\d]|$)';  # end or nondigit
+    for my $monip (@{$monips}) {
+	my $ip_regex = '(?:^|[^\d])' . # start or nondigit
+		   "\Q$monip\E" .  # the ip not interpreted as regex
+		   '(?:[^\d]|$)';  # end or nondigit
 
-    if (($cfg->{global}->{mon_host} && $cfg->{global}->{mon_host} =~ m/$ip_regex/) ||
-	grep { $_->{addr} && $_->{addr}  =~ m/$ip_regex/ } values %$monhash) {
-	die "monitor address '$monip' already in use\n";
+	if (($cfg->{global}->{mon_host} && $cfg->{global}->{mon_host} =~ m/$ip_regex/) ||
+	    grep { $_->{addr} && $_->{addr}  =~ m/$ip_regex/ } values %$monhash) {
+	    die "monitor address '$monip' already in use\n";
+	}
     }
 
     if (defined($monhash->{$monid})) {
@@ -177,9 +209,9 @@ __PACKAGE__->register_method ({
 		description => "The ID for the monitor, when omitted the same as the nodename",
 	    },
 	    'mon-address' => {
-		description => 'Overwrites autodetected monitor IP address. ' .
-		               'Must be in the public network of ceph.',
-		type => 'string', format => 'ip',
+		description => 'Overwrites autodetected monitor IP address(es). ' .
+		               'Must be in the public network(s) of Ceph.',
+		type => 'string', format => 'ip-list',
 		optional => 1,
 	    },
 	},
@@ -207,9 +239,9 @@ __PACKAGE__->register_method ({
 
 	my $monid = $param->{monid} // $param->{node};
 	my $monsection = "mon.$monid";
-	my $ip = $find_mon_ip->($cfg, $rados, $param->{node}, $param->{'mon-address'});
+	my $ips = $find_mon_ips->($cfg, $rados, $param->{node}, $param->{'mon-address'});
 
-	$assert_mon_prerequisites->($cfg, $monhash, $monid, $ip);
+	$assert_mon_prerequisites->($cfg, $monhash, $monid, $ips);
 
 	my $worker = sub  {
 	    my $upid = shift;
@@ -222,7 +254,7 @@ __PACKAGE__->register_method ({
 		    $rados = PVE::RADOS->new(timeout => PVE::Ceph::Tools::get_config('long_rados_timeout'));
 		}
 		$monhash = PVE::Ceph::Services::get_services_info('mon', $cfg, $rados);
-		$assert_mon_prerequisites->($cfg, $monhash, $monid, $ip);
+		$assert_mon_prerequisites->($cfg, $monhash, $monid, $ips);
 
 		my $client_keyring = PVE::Ceph::Tools::get_or_create_admin_keyring();
 		my $mon_keyring = PVE::Ceph::Tools::get_config('pve_mon_key_path');
@@ -260,22 +292,31 @@ __PACKAGE__->register_method ({
 		    run_command(['chown', 'ceph:ceph', $mondir]);
 
 		    my $is_first_address = !defined($rados);
-		    if (Net::IP::ip_is_ipv6($ip)) {
-			$cfg->{global}->{ms_bind_ipv6} = 'true';
-			$cfg->{global}->{ms_bind_ipv4} = 'false' if $is_first_address;
-		    } else {
-			$cfg->{global}->{ms_bind_ipv4} = 'true';
-			$cfg->{global}->{ms_bind_ipv6} = 'false' if $is_first_address;
-		    }
 
-		    my $monaddr = Net::IP::ip_is_ipv6($ip) ? "[$ip]" : $ip;
+		    my $monaddrs = [];
+
+		    for my $ip (@{$ips}) {
+			if (Net::IP::ip_is_ipv6($ip)) {
+			    $cfg->{global}->{ms_bind_ipv6} = 'true';
+			    $cfg->{global}->{ms_bind_ipv4} = 'false' if $is_first_address;
+			} else {
+			    $cfg->{global}->{ms_bind_ipv4} = 'true';
+			    $cfg->{global}->{ms_bind_ipv6} = 'false' if $is_first_address;
+			}
+
+			my $monaddr = Net::IP::ip_is_ipv6($ip) ? "[$ip]" : $ip;
+			push @{$monaddrs}, "v2:$monaddr:3300";
+			push @{$monaddrs}, "v1:$monaddr:6789";
+
+			$is_first_address = 0;
+		    }
 
 		    my $monmaptool_cmd = [
 			'monmaptool',
 			'--clobber',
 			'--addv',
 			$monid,
-			"[v2:$monaddr:3300,v1:$monaddr:6789]",
+			"[" . join(',', @{$monaddrs}) . "]",
 			'--print',
 			$monmap,
 		    ];
@@ -316,10 +357,10 @@ __PACKAGE__->register_method ({
 			$monhost .= " " . $monhash->{$mon}->{addr};
 		    }
 		}
-		$monhost .= " $ip";
+		$monhost .= " " . join(' ', @{$ips});
 		$cfg->{global}->{mon_host} = $monhost;
 		# The IP is needed in the ceph.conf for the first boot
-		$cfg->{$monsection}->{public_addr} = $ip;
+		$cfg->{$monsection}->{public_addr} = $ips->[0];
 
 		cfs_write_file('ceph.conf', $cfg);
 
@@ -414,11 +455,26 @@ __PACKAGE__->register_method ({
 		$monstat = $rados->mon_command({ prefix => 'quorum_status' });
 		$monlist = $monstat->{monmap}->{mons};
 
-		my $addr;
+		my $addrs = [];
+
+		my $add_addr = sub {
+		    my ($addr) = @_;
+
+		    # extract the ip without port and nonce (if present)
+		    ($addr) = $addr =~ m|^(.*):\d+(/\d+)?$|;
+		    push @{$addrs}, $addr;
+		};
+
 		for my $mon (@$monlist) {
 		    if ($mon->{name} eq $monid) {
-			$addr = $mon->{public_addr} // $mon->{addr};
-			($addr) = $addr =~ m|^(.*):\d+/\d+$|; # extract the ip without port/nonce
+			if ($mon->{public_addrs} && $mon->{public_addrs}->{addrvec}) {
+			    my $addrvec = $mon->{public_addrs}->{addrvec};
+			    for my $addr (@{$addrvec}) {
+				$add_addr->($addr->{addr});
+			    }
+			} else {
+			    $add_addr->($mon->{public_addr} // $mon->{addr});
+			}
 			last;
 		    }
 		}
@@ -433,23 +489,25 @@ __PACKAGE__->register_method ({
 
 		# delete from mon_host
 		if (my $monhost = $cfg->{global}->{mon_host}) {
-		    # various replaces to remove the ip
-		    # we always match the beginning or a separator (also at the end)
-		    # so we do not accidentally remove a wrong ip
-		    # e.g. removing 10.0.0.1 should not remove 10.0.0.101 or 110.0.0.1
-
-		    # remove vector containing this ip
-		    # format is [vX:ip:port/nonce,vY:ip:port/nonce]
-		    my $vectorpart_re = "v\\d+:\Q$addr\E:\\d+\\/\\d+";
-		    $monhost =~ s/(^|[ ,;]*)\[$vectorpart_re(?:,$vectorpart_re)*\](?:[ ,;]+|$)/$1/;
-
-		    # ip (+ port)
-		    $monhost =~ s/(^|[ ,;]+)\Q$addr\E(?::\d+)?(?:[ ,;]+|$)/$1/;
-
-		    # ipv6 only without brackets
-		    if ($addr =~ m/^\[?(.*?:.*?)\]?$/) {
-			$addr = $1;
-			$monhost =~ s/(^|[ ,;]+)\Q$addr\E(?:[ ,;]+|$)/$1/;
+		    for my $addr (@{$addrs}) {
+			# various replaces to remove the ip
+			# we always match the beginning or a separator (also at the end)
+			# so we do not accidentally remove a wrong ip
+			# e.g. removing 10.0.0.1 should not remove 10.0.0.101 or 110.0.0.1
+
+			# remove vector containing this ip
+			# format is [vX:ip:port/nonce,vY:ip:port/nonce]
+			my $vectorpart_re = "v\\d+:\Q$addr\E:\\d+\\/\\d+";
+			$monhost =~ s/(^|[ ,;]*)\[$vectorpart_re(?:,$vectorpart_re)*\](?:[ ,;]+|$)/$1/;
+
+			# ip (+ port)
+			$monhost =~ s/(^|[ ,;]+)\Q$addr\E(?::\d+)?(?:[ ,;]+|$)/$1/;
+
+			# ipv6 only without brackets
+			if ($addr =~ m/^\[?(.*?:.*?)\]?$/) {
+			    $addr = $1;
+			    $monhost =~ s/(^|[ ,;]+)\Q$addr\E(?:[ ,;]+|$)/$1/;
+			}
 		    }
 
 		    # remove trailing separators
-- 
2.20.1





  parent reply	other threads:[~2021-04-30 13:54 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-30 13:54 [pve-devel] [RFC manager] " Fabian Ebner
2021-04-30 13:54 ` [pve-devel] [RFC manager 1/5] api: ceph: mon: split up arguments for run_command Fabian Ebner
2021-04-30 13:54 ` [pve-devel] [RFC manager 2/5] api: ceph: create mon: handle ms_bind_ipv* options more generally Fabian Ebner
2021-04-30 13:54 ` [pve-devel] [RFC manager 3/5] api: ceph: create mon: factor out monmaptool command Fabian Ebner
2021-04-30 13:54 ` [pve-devel] [RFC manager 4/5] api: ceph: create mon: explicitly add subsequent monitors to the monmap Fabian Ebner
2021-04-30 13:54 ` Fabian Ebner [this message]
2021-05-04  8:24 ` [pve-devel] [RFC manager] fix #2422: allow multiple Ceph public networks Dominik Csapak
2021-05-07  8:04   ` Fabian Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210430135437.4816-6-f.ebner@proxmox.com \
    --to=f.ebner@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal