Date: Wed, 27 Apr 2022 14:20:46 +0200
From: Fabian Grünbichler
To: Proxmox VE development discussion
Cc: Thomas Lamprecht
References: <20220413094322.1736862-1-a.lauterer@proxmox.com>
In-Reply-To: <20220413094322.1736862-1-a.lauterer@proxmox.com>
Message-Id: <1651061960.64j6eezibf.astroid@nora.none>
Subject: [pve-devel] 
applied: [PATCH v2 storage] rbd: fix #3969: add rbd dev paths with cluster info

with two small follow-ups:
- re-order the conditions for the fallback to the old path (ideally avoiding one stat())
- drop the aliased private sub

please don't forget to submit the udev change upstream as well ;)

On April 13, 2022 11:43 am, Aaron Lauterer wrote:
> By adding our own customized rbd udev rules and ceph-rbdnamer we can
> create device paths that include the cluster fsid and avoid any
> ambiguity if the same pool and namespace combination is used in
> different clusters we connect to.
>
> In addition to the '/dev/rbd/<pool>/...' paths we now have
> '/dev/rbd-pve/<cluster-fsid>/<pool>/...' paths.
>
> The other half of the patch makes use of the new device paths in the RBD
> plugin.
>
> The new 'get_rbd_dev_path' method returns the full device path. In case
> the image has been mapped before the rbd-pve udev rule was installed,
> it returns the old path.
>
> The cluster fsid is read from the 'ceph.conf' file in the case of a
> hyperconverged setup. In the case of an external Ceph cluster we need to
> fetch it via a RADOS API call.
>
> Co-authored-by: Thomas Lamprecht
> Signed-off-by: Aaron Lauterer
> ---
>
> Testing the migration phase can be done by starting a VM first, then
> installing the new package version. Don't forget to have the disk image
> on an rbd storage with krbd enabled.
>
> `qm showcmd VMID --pretty` should still show the old rbd path for that
> disk.
> Add a new disk to the running VM and run showcmd again. It should show
> the new rbd path for the added disk.
>
> The udev rules seem to get applied automatically after installing the package.
> So we should be fine to assume that once we run this new code,
> newly mapped rbd images will also be exposed under the new
> /dev/rbd-pve path.
>
> I also realized that the rescan logic (pct rescan, qm rescan) gets
> confused by having two pools with the same name on different clusters.
> Something we need to take a look at as well.
>
> changes since v1:
> * reverted changes to get_rbd_path
> * added an additional get_rbd_dev_path function that returns the full path
>
>  Makefile                   |  1 +
>  PVE/Storage/RBDPlugin.pm   | 60 ++++++++++++++++++++++++++++----------
>  udev-rbd/50-rbd-pve.rules  |  2 ++
>  udev-rbd/Makefile          | 21 +++++++++++++
>  udev-rbd/ceph-rbdnamer-pve | 24 +++++++++++++++
>  5 files changed, 92 insertions(+), 16 deletions(-)
>  create mode 100644 udev-rbd/50-rbd-pve.rules
>  create mode 100644 udev-rbd/Makefile
>  create mode 100755 udev-rbd/ceph-rbdnamer-pve
>
> diff --git a/Makefile b/Makefile
> index 431db16..029b586 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -41,6 +41,7 @@ install: PVE pvesm.1 pvesm.bash-completion pvesm.zsh-completion
>  	install -d ${DESTDIR}${SBINDIR}
>  	install -m 0755 pvesm ${DESTDIR}${SBINDIR}
>  	make -C PVE install
> +	make -C udev-rbd install
>  	install -d ${DESTDIR}/usr/share/man/man1
>  	install -m 0644 pvesm.1 ${DESTDIR}/usr/share/man/man1/
>  	gzip -9 -n ${DESTDIR}/usr/share/man/man1/pvesm.1
> diff --git a/PVE/Storage/RBDPlugin.pm b/PVE/Storage/RBDPlugin.pm
> index e287e28..3293565 100644
> --- a/PVE/Storage/RBDPlugin.pm
> +++ b/PVE/Storage/RBDPlugin.pm
> @@ -8,6 +8,7 @@ use JSON;
>  use Net::IP;
>  
>  use PVE::CephConfig;
> +use PVE::Cluster qw(cfs_read_file);
>  use PVE::JSONSchema qw(get_standard_option);
>  use PVE::ProcFSTools;
>  use PVE::RADOS;
> @@ -22,6 +23,16 @@ my $get_parent_image_name = sub {
>      return $parent->{image} . "@" . $parent->{snapshot};
>  };
>  
> +my $librados_connect = sub {
> +    my ($scfg, $storeid, $options) = @_;
> +
> +    my $librados_config = PVE::CephConfig::ceph_connect_option($scfg, $storeid);
> +
> +    my $rados = PVE::RADOS->new(%$librados_config);
> +
> +    return $rados;
> +};
> +
>  my sub get_rbd_path {
>      my ($scfg, $volume) = @_;
>      my $path = $scfg->{pool} ? $scfg->{pool} : 'rbd';
> @@ -30,6 +41,32 @@ my sub get_rbd_path {
>      return $path;
>  };
>  
> +my sub get_rbd_dev_path {
> +    my ($scfg, $storeid, $volume) = @_;
> +
> +    my $cluster_id = '';
> +    if ($scfg->{monhost}) {
> +	my $rados = $librados_connect->($scfg, $storeid);
> +	$cluster_id = $rados->mon_command({ prefix => 'fsid', format => 'json' })->{fsid};
> +    } else {
> +	$cluster_id = cfs_read_file('ceph.conf')->{global}->{fsid};
> +    }
> +
> +    my $uuid_pattern = "([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})";
> +    if ($cluster_id =~ qr/^${uuid_pattern}$/is) {
> +	$cluster_id = $1; # use untainted value
> +    } else {
> +	die "cluster fsid has invalid format\n";
> +    }
> +
> +    my $rbd_path = get_rbd_path($scfg, $volume);
> +    my $pve_path = "/dev/rbd-pve/${cluster_id}/${rbd_path}";
> +    my $path = "/dev/rbd/${rbd_path}";
> +
> +    return $path if -e $path && !-e $pve_path; # mapped before rbd-pve udev rule existed
> +    return $pve_path;
> +}
> +
>  my $build_cmd = sub {
>      my ($binary, $scfg, $storeid, $op, @options) = @_;
>  
> @@ -70,16 +107,6 @@ my $rados_cmd = sub {
>      return $build_cmd->('/usr/bin/rados', $scfg, $storeid, $op, @options);
>  };
>  
> -my $librados_connect = sub {
> -    my ($scfg, $storeid, $options) = @_;
> -
> -    my $librados_config = PVE::CephConfig::ceph_connect_option($scfg, $storeid);
> -
> -    my $rados = PVE::RADOS->new(%$librados_config);
> -
> -    return $rados;
> -};
> -
>  # needed for volumes created using ceph jewel (or higher)
>  my $krbd_feature_update = sub {
>      my ($scfg, $storeid, $name) = @_;
> @@ -380,9 +407,10 @@ sub path {
>      my ($vtype, $name, $vmid) = $class->parse_volname($volname);
>      $name .= '@'.$snapname if $snapname;
>  
> -    my $rbd_path = get_rbd_path($scfg, $name);
> -    return ("/dev/rbd/${rbd_path}", $vmid, $vtype) if $scfg->{krbd};
> +    my $rbd_dev_path = get_rbd_dev_path($scfg, $storeid, $name);
> +    return ($rbd_dev_path, $vmid, $vtype) if $scfg->{krbd};
>  
> +    my $rbd_path = get_rbd_path($scfg, $name);
>      my $path = "rbd:${rbd_path}";
>  
>      $path .= ":conf=$cmd_option->{ceph_conf}" if $cmd_option->{ceph_conf};
> @@ -619,8 +647,8 @@ sub deactivate_storage {
>  }
>  
>  my sub get_kernel_device_path {
> -    my ($scfg, $name) = @_;
> -    return "/dev/rbd/" . get_rbd_path($scfg, $name);
> +    my ($scfg, $storeid, $name) = @_;
> +    return get_rbd_dev_path($scfg, $storeid, $name);
>  };
>  
>  sub map_volume {
> @@ -631,7 +659,7 @@ sub map_volume {
>      my $name = $img_name;
>      $name .= '@'.$snapname if $snapname;
>  
> -    my $kerneldev = get_kernel_device_path($scfg, $name);
> +    my $kerneldev = get_kernel_device_path($scfg, $storeid, $name);
>  
>      return $kerneldev if -b $kerneldev; # already mapped
>  
> @@ -650,7 +678,7 @@ sub unmap_volume {
>      my ($vtype, $name, $vmid) = $class->parse_volname($volname);
>      $name .= '@'.$snapname if $snapname;
>  
> -    my $kerneldev = get_kernel_device_path($scfg, $name);
> +    my $kerneldev = get_kernel_device_path($scfg, $storeid, $name);
>  
>      if (-b $kerneldev) {
>  	my $cmd = $rbd_cmd->($scfg, $storeid, 'unmap', $kerneldev);
> diff --git a/udev-rbd/50-rbd-pve.rules b/udev-rbd/50-rbd-pve.rules
> new file mode 100644
> index 0000000..79432df
> --- /dev/null
> +++ b/udev-rbd/50-rbd-pve.rules
> @@ -0,0 +1,2 @@
> +KERNEL=="rbd[0-9]*", ENV{DEVTYPE}=="disk", PROGRAM="/usr/libexec/ceph-rbdnamer-pve %k", SYMLINK+="rbd-pve/%c"
> +KERNEL=="rbd[0-9]*", ENV{DEVTYPE}=="partition", PROGRAM="/usr/libexec/ceph-rbdnamer-pve %k", SYMLINK+="rbd-pve/%c-part%n"
> diff --git a/udev-rbd/Makefile b/udev-rbd/Makefile
> new file mode 100644
> index 0000000..065933b
> --- /dev/null
> +++ b/udev-rbd/Makefile
> @@ -0,0 +1,21 @@
> +PACKAGE=libpve-storage-perl
> +
> +DESTDIR=
> +PREFIX=/usr
> +LIBEXECDIR=${PREFIX}/libexec
> +LIBDIR=${PREFIX}/lib
> +
> +all:
> +
> +.PHONY: install
> +install: 50-rbd-pve.rules ceph-rbdnamer-pve
> +	install -d ${DESTDIR}${LIBEXECDIR}
> +	install -m 0755 ceph-rbdnamer-pve ${DESTDIR}${LIBEXECDIR}
> +	install -d ${DESTDIR}${LIBDIR}/udev/rules.d
> +	install -m 0644 50-rbd-pve.rules ${DESTDIR}${LIBDIR}/udev/rules.d
> +
> +.PHONY: clean
> +clean:
> +
> +.PHONY: distclean
> +distclean: clean
> diff --git a/udev-rbd/ceph-rbdnamer-pve b/udev-rbd/ceph-rbdnamer-pve
> new file mode 100755
> index 0000000..23dd626
> --- /dev/null
> +++ b/udev-rbd/ceph-rbdnamer-pve
> @@ -0,0 +1,24 @@
> +#!/bin/sh
> +
> +DEV=$1
> +NUM=`echo $DEV | sed 's#p.*##g; s#[a-z]##g'`
> +POOL=`cat /sys/devices/rbd/$NUM/pool`
> +CLUSTER_FSID=`cat /sys/devices/rbd/$NUM/cluster_fsid`
> +
> +if [ -f /sys/devices/rbd/$NUM/pool_ns ]; then
> +	NAMESPACE=`cat /sys/devices/rbd/$NUM/pool_ns`
> +else
> +	NAMESPACE=""
> +fi
> +IMAGE=`cat /sys/devices/rbd/$NUM/name`
> +SNAP=`cat /sys/devices/rbd/$NUM/current_snap`
> +
> +echo -n "/$CLUSTER_FSID/$POOL"
> +
> +if [ -n "$NAMESPACE" ]; then
> +	echo -n "/$NAMESPACE"
> +fi
> +echo -n "/$IMAGE"
> +if [ "$SNAP" != "-" ]; then
> +	echo -n "@$SNAP"
> +fi
> -- 
> 2.30.2
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel