Date: Thu, 16 Mar 2023 14:59:36 +0100
From: Wolfgang Bumiller
To: Friedrich Weber
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH container] lxc start: warn in case of conflicting lxc.idmap entries
Message-ID: <20230316135936.hhr7jmw5nnkhdqa5@casey.proxmox.com>
In-Reply-To: <20230223170302.3014798-1-f.weber@proxmox.com>

On Thu, Feb 23, 2023 at 06:03:02PM +0100, Friedrich Weber wrote:
> Users can customize the mapping between host and container uids/gids
> by providing `lxc.idmap` entries in the container config. The syntax
> is described in lxc.container.conf(5). One source of errors is
> conflicting entries for one or more uids/gids. An example:
>
> ...
> lxc.idmap: u 0 100000 65536
> lxc.idmap: u 1000 1000 10
> ...
>
> Assuming `root:1000:10` is correctly added to /etc/subuid, starting
> the container fails with an error that is hard to interpret:
>
>   lxc_map_ids: 3701 newuidmap failed to write mapping
>   "newuidmap: write to uid_map failed: Invalid argument":
>   newuidmap 67993 0 100000 65536 1000 1000 10
>
> In order to simplify troubleshooting, validate the mapping before
> starting the container and print a warning if conflicting uid/gid
> mappings are detected. For the above mapping:
>
>   lxc.idmap: invalid map entry 'u 1000 1000 10': container uid 1000
>   is already mapped by entry 'u 0 100000 65536'
>
> The warning appears in the task log and in the output of `pct start`.
>
> A failed validation check only prints a warning instead of erroring
> out, to make sure potentially buggy (or outdated) validation code does
> not prevent containers from starting.
>
> The validation algorithm is quite straightforward and has quadratic
> runtime in the worst case.
> The kernel allows at most 340 lines in
> uid_map (see user_namespaces(7)), which the algorithm should be able
> to handle in well under 0.5s, but to be safe, skip validation for more
> than 100 entries.
>
> Note that validation does not take /etc/sub{uid,gid} into account,
> which, if misconfigured, could still prevent the container from
> starting with an error like
>
>   "newuidmap: uid range [1000-1010) -> [1000-1010) not allowed"
>
> If needed, validating /etc/sub{uid,gid} could be added in the future.
>
> Signed-off-by: Friedrich Weber
> ---
> The warning could of course be even more detailed, e.g., "container uid range
> [1000...1009] is already mapped to [101000...101009] by entry 'u 0 100000
> 65536'". But this would require a more sophisticated algorithm, and I'm not
> sure if the added complexity is worth it -- happy to add it if wanted, though.
>
>  src/PVE/LXC.pm              |  97 ++++++++++++++++++
>  src/test/Makefile           |   5 +-
>  src/test/idmap-test.pm      | 197 ++++++++++++++++++++++++++++++++++++
>  src/test/run_idmap_tests.pl |  10 ++
>  4 files changed, 308 insertions(+), 1 deletion(-)
>  create mode 100644 src/test/idmap-test.pm
>  create mode 100755 src/test/run_idmap_tests.pl
>
> diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
> index 54ee0d9..141f195 100644
> --- a/src/PVE/LXC.pm
> +++ b/src/PVE/LXC.pm
> @@ -2313,6 +2313,93 @@ sub parse_id_maps {
>      return ($id_map, $rootuid, $rootgid);
>  }
>  
> +# parse custom id mapping and throw a human-readable error if any
> +# host/container uid/gid is mapped twice. note that the algorithm has quadratic
> +# worst-case runtime, which may be problematic with >1000 mapping entries.
> +# we're unlikely to exceed this because the kernel only allows 340 entries (see
> +# user_namespaces(7)), but if we do, this could be implemented more efficiently
> +# (see e.g. "interval trees").

Both seem a bit excessive to me. Let's look at the data:

We have a set of ranges consisting of a type, 2 starts and a count.
The types are uids and gids, so we can view those as 2 separate
instances of sets of [ct_start, host_start, count].

Since the ranges must not overlap on either the container or the host
side, we can - again - view these as separate sets of container-side
[start, count] and host-side [start, count].

In other words, we can see the entire id map as just 4 sets of
[start, count] ranges which must not overlap.

So I think all we need to do is sort these by the 'start' value and,
for each element, make sure that

    previous_start + previous_count <= current_start

And yes, that means we need to sort $id_maps twice, once by ct id,
once by host id, and then iterate and do the above check.

Should be much shorter (and faster).
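Completely untested sketch of what I mean - I'm reusing the patch's
sub name and error message format, the rest (the %ranges helper
structure and variable names) is just for illustration:

sub validate_id_maps {
    my ($id_map) = @_;

    # view the id map as 4 independent lists of [start, count, entry]:
    # uids and gids, each once for the container side and once for the
    # host side
    my %ranges = (
        u => { container => [], host => [] },
        g => { container => [], host => [] },
    );
    for my $entry (@$id_map) {
        my ($type, $ct_start, $host_start, $count) = @$entry;
        push @{$ranges{$type}->{container}}, [int($ct_start), int($count), $entry];
        push @{$ranges{$type}->{host}}, [int($host_start), int($count), $entry];
    }

    for my $type (qw(u g)) {
        for my $side (qw(container host)) {
            # sort by start, then every range must begin at or after
            # the end of its predecessor, otherwise two entries overlap
            my @sorted = sort { $a->[0] <=> $b->[0] } @{$ranges{$type}->{$side}};
            my $prev;
            for my $range (@sorted) {
                die "invalid map entry '@{$range->[2]}': $side ${type}id"
                    . " $range->[0] is already mapped by entry '@{$prev->[2]}'\n"
                    if defined($prev) && $prev->[0] + $prev->[1] > $range->[0];
                $prev = $range;
            }
        }
    }
}

That's two sorts of at most 340 elements plus a linear scan, so the
100-entry cutoff from the commit message shouldn't be needed anymore
either.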
> +sub validate_id_maps {
> +    my ($id_map) = @_;
> +
> +    # keep track of already-mapped uid/gid ranges on the container and host
> +    # sides. each range is an array [mapped_entry, first_id, last_id], where
> +    # mapped_entry is the original config entry (stored for generating error
> +    # messages), and [first_id, last_id] is an (inclusive) interval of mapped
> +    # ids. Ranges are sorted ascendingly by first_id for more efficient
> +    # traversal, and they must not overlap (this would be an invalid mapping).
> +    my $sides_ranges = {
> +        host => { u => [], g => [] },
> +        container => { u => [], g => [] },
> +    };
> +
> +    # try to update the mapping with the requested range.
> +    # return (1, undef, undef) on success and (0, $mapped_entry, $id) on failure,
> +    # where $mapped_entry is the conflicting map entry that already maps $id.
> +    my $map_range = sub {
> +        my ($requested_entry, $mapped_ranges, $requested_first, $requested_count) = @_;
> +        my $requested_last = $requested_first + $requested_count - 1;
> +
> +        # fallback case: insert the requested range at the end
> +        my $mapped_length = scalar(@$mapped_ranges);
> +        my $insert_before = $mapped_length;
> +
> +        foreach my $i (0 .. $mapped_length - 1) {
> +            my ($mapped_entry, $mapped_first, $mapped_last) = $mapped_ranges->[$i]->@*;
> +
> +            if ($requested_last < $mapped_first) {
> +                # mapped:          [---]
> +                # requested: [---]
> +                # as the ranges are sorted, the requested range cannot overlap
> +                # with any subsequent mapped range, so we can stop iterating
> +                $insert_before = $i;
> +                last;
> +            }
> +
> +            # two shapes of overlapping ranges are possible.
> +            # mapped:    [---]
> +            # requested:   [...
> +            #              ^- conflicting id
> +            return (0, $mapped_entry, $requested_first)
> +                if $mapped_first <= $requested_first && $requested_first <= $mapped_last;
> +
> +            # mapped:      [...
> +            # requested: [---]
> +            #              ^- conflicting id
> +            return (0, $mapped_entry, $mapped_first)
> +                if $requested_first <= $mapped_first && $mapped_first <= $requested_last;
> +        }
> +
> +        my $new_entry = [$requested_entry, $requested_first, $requested_last];
> +        splice @$mapped_ranges, $insert_before, 0, $new_entry;
> +        return (1, undef, undef);
> +    };
> +
> +    my $validate_and_map_range = sub {
> +        my ($map_entry, $side, $type, $first_id, $count) = @_;
> +        my $mapped_ranges = $sides_ranges->{$side}->{$type};
> +
> +        my ($ok, $already_mapped_entry, $already_mapped) = $map_range->(
> +            $map_entry,
> +            $mapped_ranges,
> +            $first_id,
> +            $count,
> +        );
> +        die "invalid map entry '@$map_entry': "
> +            . "$side ${type}id $already_mapped is already mapped by entry '@$already_mapped_entry'\n"
> +            if !$ok;
> +    };
> +
> +    foreach my $map_entry (@$id_map) {
> +        my ($type, $first_ct_id, $first_host_id, $count) = @$map_entry;
> +
> +        $validate_and_map_range->($map_entry, 'container', $type, int($first_ct_id), int($count));
> +        $validate_and_map_range->($map_entry, 'host', $type, int($first_host_id), int($count));
> +    }
> +
> +    return $sides_ranges;
> +}
> +
>  sub userns_command {
>      my ($id_map) = @_;
>      if (@$id_map) {
>...
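
For reference, the sketch above should produce the same message the
patch proposes for the example from the commit message (again,
untested):

my $id_map = [
    ['u', 0, 100000, 65536],    # lxc.idmap: u 0 100000 65536
    ['u', 1000, 1000, 10],      # lxc.idmap: u 1000 1000 10
];
eval { validate_id_maps($id_map) };
print $@;
# invalid map entry 'u 1000 1000 10': container uid 1000 is already
# mapped by entry 'u 0 100000 65536'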