From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH storage] rbd: alloc image: fix #3970 avoid ambiguous rbd path
Date: Wed, 06 Apr 2022 09:36:35 +0200
Message-ID: <1649229686.hkn6936l0d.astroid@nora.none>
In-Reply-To: <20220405124040.2996487-1-a.lauterer@proxmox.com>
On April 5, 2022 2:40 pm, Aaron Lauterer wrote:
> If two RBD storages use the same pool, but connect to different
> clusters, we cannot say to which cluster the mapped RBD image belongs to
> if krbd is used. To avoid potential data loss, we need to verify that no
> other storage is configured that could have a volume mapped under the
> same path before we allocate the image.
>
> The ambiguous mapping is in
> /dev/rbd/<pool>/<ns>/<image> where the namespace <ns> is optional.
>
> Once we can tell the clusters apart in the mapping, we can remove these
> checks again.
>
> See bug #3969 for more information on the root cause.
>
> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
> ---
> changes since RFC:
>
> * moved check to pve-storage since containers and VMs both have issues
> not just on a move or clone of the image, but also when creating a new
> volume
> * reworked the checks, instead of large if conditions, we use
> PVE::Tools::safe_compare with comparison functions
> * normalize monhost list to match correctly if the list is in different
> order
> * add storage name to error message that triggered the checks
> * ignore disabled storages
>
> PVE/Storage/RBDPlugin.pm | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/PVE/Storage/RBDPlugin.pm b/PVE/Storage/RBDPlugin.pm
> index e287e28..a9dbf5e 100644
> --- a/PVE/Storage/RBDPlugin.pm
> +++ b/PVE/Storage/RBDPlugin.pm
> @@ -516,6 +516,40 @@ sub alloc_image {
> die "illegal name '$name' - should be 'vm-$vmid-*'\n"
> if $name && $name !~ m/^vm-$vmid-/;
>
> + # check if another rbd storage with the same pool name but different
> + # cluster exists. If so, allocating a new volume can potentially be
> + # dangerous because the RBD mapping, exposes it in an ambiguous way under
> + # /dev/rbd/<pool>/<ns>/<image>. Without any information to which cluster it
> + # belongs, we cannot clearly determine which image we access and
> + # potentially use the wrong one. See
> + # https://bugzilla.proxmox.com/show_bug.cgi?id=3969 and
> + # https://bugzilla.proxmox.com/show_bug.cgi?id=3970
> + # TODO: remove these checks once #3969 is fixed and we can clearly tell to
> + # which cluster an image belongs to
> + my $storecfg = PVE::Storage::config();
> + foreach my $store (keys %{$storecfg->{ids}}) {
I think this needs to go somewhere else - probably into a new private
helper that gets called in alloc_image, clone_image and rename_image (at
least those are the ones that currently call find_free_diskname).

basically, all existing volids stay as they are (they should be fine, else
the user would probably already have noticed data loss/corruption), but
anything that takes a new slot should be blocked before causing mayhem.
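
A minimal sketch of what such a helper could look like, assuming the storage
config is passed in as a plain hash so the example runs without
PVE::Storage::config(). The name find_ambiguous_rbd_storage, the inline
monhost normalization and the undef handling are illustrative assumptions,
not code from the patch or from pve-storage:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the ID of the first other enabled RBD storage
# whose mapped paths could collide with those of $storeid, or undef if none.
sub find_ambiguous_rbd_storage {
    my ($storecfg, $storeid) = @_;
    my $scfg = $storecfg->{ids}->{$storeid};

    # split -> sort -> join, so the order of the monitors does not matter
    my $norm = sub {
        my ($mons) = @_;
        return undef if !defined($mons);
        my @list = grep { length } split(/[,;\s]+/, $mons);
        return join(';', sort @list);
    };
    # undef-safe string equality
    my $same = sub {
        my ($l, $r) = @_;
        return 1 if !defined($l) && !defined($r);
        return 0 if !defined($l) || !defined($r);
        return $l eq $r ? 1 : 0;
    };

    for my $store (sort keys %{$storecfg->{ids}}) {
        next if $store eq $storeid;
        my $other = $storecfg->{ids}->{$store};

        next if $other->{type} ne 'rbd';
        next if $other->{disable};
        next if !$same->($scfg->{pool}, $other->{pool});
        # identical (normalized) monitors => same cluster
        next if $same->($norm->($scfg->{monhost}), $norm->($other->{monhost}));
        # different namespaces => no clash possible
        next if !$same->($scfg->{namespace}, $other->{namespace});

        return $store;
    }
    return undef;
}

# Example: 'ceph-a' and 'ceph-b' share pool 'vms' but point at different monitors.
my $example_cfg = { ids => {
    'ceph-a' => { type => 'rbd', pool => 'vms', monhost => '10.0.0.1 10.0.0.2' },
    'ceph-b' => { type => 'rbd', pool => 'vms', monhost => '10.1.0.1' },
} };
my $clash = find_ambiguous_rbd_storage($example_cfg, 'ceph-a');
print "clashing storage: $clash\n" if defined($clash);
```

alloc_image, clone_image and rename_image could then each call such a helper
and die when it returns a storage ID, before a new disk name is picked.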
> + next if $store eq $storeid;
> +
> + my $checked_scfg = $storecfg->{ids}->{$store};
> +
> + next if $checked_scfg->{type} ne 'rbd';
> + next if $checked_scfg->{disable};
> + next if $scfg->{pool} ne $checked_scfg->{pool};
> +
> + my $normalize_mons = sub { return join('/', sort( PVE::Tools::split_list(' ', shift))) };
this doesn't do what you think it does ;) split_list takes a single
argument (the string to be split). I think joining with ';' might be
more natural (it's basically a 'split->sort->join-as-string-list' then),
and semicolons don't make any sense inside a monhost anyway.
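
To illustrate the point, here is a small sketch with a local stand-in for
PVE::Tools::split_list. The stand-in assumes that split_list reads only its
first argument and splits it on commas, semicolons and whitespace; the names
split_list_standin and normalize_mons are made up for this example:

```perl
use strict;
use warnings;

# Local stand-in for PVE::Tools::split_list (assumption: single string
# argument, split on commas, semicolons and whitespace).
sub split_list_standin {
    my ($text) = @_;
    $text //= '';
    return grep { length } split(/[,;\s]+/, $text);
}

# split -> sort -> join-as-string-list, using ';' as the separator
sub normalize_mons {
    my ($monhost) = @_;
    my @mons = split_list_standin($monhost);
    return join(';', sort @mons);
}

# With a leading ' ' argument only the ' ' is split, so the monhost string
# is never looked at and the result is always empty.
my @ignored = split_list_standin(' ', '10.0.0.2,10.0.0.1');
print "entries when called with two arguments: ", scalar(@ignored), "\n";

# With a single argument, differently ordered lists normalize to the same string.
print normalize_mons('10.0.0.2,10.0.0.1'), "\n";
print normalize_mons('10.0.0.1 10.0.0.2'), "\n";
```

Under that assumption, the two-argument call normalizes every monhost to the
empty string, so all storages would compare as belonging to the same cluster.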
> + my $cmp_mons = sub { $normalize_mons->($_[0]) cmp $normalize_mons->($_[1]) };
> + my $cmp = sub { $_[0] cmp $_[1] };
that might be a nice addition to safe_compare (no $cmp -> use `cmp`),
but alas.
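
A rough sketch of that idea, written as a standalone function rather than a
change to PVE::Tools. The name safe_compare_default is hypothetical, and the
undef ordering is an assumption about how safe_compare behaves:

```perl
use strict;
use warnings;

# Hypothetical variant: falls back to string comparison when no $cmp is given.
# Assumed ordering: undef equals undef and sorts before any defined value.
sub safe_compare_default {
    my ($left, $right, $cmp) = @_;
    $cmp //= sub { $_[0] cmp $_[1] };

    return 0 if !defined($left) && !defined($right);
    return -1 if !defined($left);
    return 1 if !defined($right);
    return $cmp->($left, $right);
}

print safe_compare_default('ns1', 'ns1'), "\n";                      # string compare by default
print safe_compare_default(10, 9, sub { $_[0] <=> $_[1] }), "\n";    # explicit numeric compare
```

With such a default, the separate $cmp closure in the patch would not be
needed for the namespace comparison.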
> + # internal and internal, or external and external with identical monitors
> + # => same cluster
> + next if PVE::Tools::safe_compare($scfg->{monhost}, $checked_scfg->{monhost}, $cmp_mons) == 0;
> +
> + # different namespaces => no clash possible
> + next if !PVE::Tools::safe_compare($scfg->{namespace}, $checked_scfg->{namespace}, $cmp) == 0;
!= 0 please!
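
For reference, a short sketch of why the explicit form is easier to read. In
Perl, unary `!` binds tighter than `==`, so `!X == 0` is evaluated as
`(!X) == 0`. That happens to give the same truth value as `X != 0` for
-1, 0 and 1, but it is easy to misread as `!(X == 0)`:

```perl
use strict;
use warnings;

my @rows;
for my $rc (-1, 0, 1) {
    # what Perl evaluates for `!$rc == 0`, with the implicit grouping spelled out
    my $negated = ((!$rc) == 0) ? 1 : 0;
    # the explicit spelling asked for in the review
    my $explicit = ($rc != 0) ? 1 : 0;
    push @rows, [$rc, $negated, $explicit];
    print "rc=$rc negated=$negated explicit=$explicit\n";
}
```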
> +
> + die "Other storage found which would lead to ambiguous mappings: '$store'\n";
it might make sense to include both storages here? e.g.:
"Cannot create volume on '$storeid' - RBD blockdev paths shared with
storage '$store'\n";
or even a reference to the bug that explains it all? could post a
comment with workarounds as well then (although I do hope that not many
people will run into this, and most of those are hopefully false
positives of the check and not actually problematic setups).
> + }
> +
> $name = $class->find_free_diskname($storeid, $scfg, $vmid) if !$name;
>
> my @options = (
> --
> 2.30.2