From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <s.sterz@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 217FB9ADC0
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 10:56:27 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 0BA4E8E92
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 10:56:27 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 10:56:26 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id D828B46A5A
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 10:56:25 +0200 (CEST)
Message-ID: <02972932-cf42-cbc1-870d-301ed9ff43b9@proxmox.com>
Date: Tue, 23 May 2023 10:56:24 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.11.0
To: Christoph Heiss <c.heiss@proxmox.com>
Cc: pve-devel@lists.proxmox.com
References: <20230517133931.148634-1-s.sterz@proxmox.com>
 <20230523065850.3hhphl2e4p7awvaf@maui.proxmox.com>
Content-Language: en-US
From: Stefan Sterz <s.sterz@proxmox.com>
In-Reply-To: <20230523065850.3hhphl2e4p7awvaf@maui.proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.054 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.091 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [proxmox.com, metacpan.org, dn-regex-test.pl, ldap.pm]
Subject: Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished
 names regex
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 23 May 2023 08:56:27 -0000

On 23.05.23 08:58, Christoph Heiss wrote:
> On Wed, May 17, 2023 at 03:39:31PM +0200, Stefan Sterz wrote:
>> [..]
>>
>> this commit also adds a test file that tests the regex against a
>> number of common pitfalls. including distinguished names that are
>> structurally similar to those reported as erroneously forbidden
>> earlier. these tests should help avoid regressions in the future.
> That's a really good idea :^)
> 
>>
>> Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
>> ---
>> would really appreciate another pair of eyes here. given the recent
>> churn related to this regex. it's very likely i missed something too.
>>
>>  src/PVE/Auth/LDAP.pm      |  9 +++++--
>>  src/test/Makefile         |  1 +
>>  src/test/dn-regex-test.pl | 54 +++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 62 insertions(+), 2 deletions(-)
>>  create mode 100755 src/test/dn-regex-test.pl
>>
>> diff --git a/src/PVE/Auth/LDAP.pm b/src/PVE/Auth/LDAP.pm
>> index fc82a17..ccc6864 100755
>> --- a/src/PVE/Auth/LDAP.pm
>> +++ b/src/PVE/Auth/LDAP.pm
>> @@ -10,8 +10,13 @@ use PVE::Tools;
>>
>>  use base qw(PVE::Auth::Plugin);
>>
>> -my  $dn_part_regex = qr!("[^"]+"|[^ ,+"/<>;=#][^,+"/<>;=]*[^ ,+"/<>;=]|[^ ,+"/<>;=#])!;
>> -our $dn_regex = qr!\w+=${dn_part_regex}(,\s*\w+=${dn_part_regex})*!;
>> +my $escaped  = qr!\\(?:[ "#+,;<=>\\]|[0-9a-fA-F]{2})!;
>> +my $start    = qr!(?:${escaped}|[^"+,;<>\\\0 #])!;
>> +my $middle   = qr!(?:${escaped}|[^"+,;<>\\\0])!;
>> +my $end      = qr!(?:${escaped}|[^"+,;<>\\\0 ])!;
>> +my $attr_val = qr!("[^"]+"|${start}(?:${middle}*${end})?)!;
>> +
>> +our $dn_regex = qr!\w+=${attr_val}([,\+]\s*\w+=${attr_val})*!;
> While reviewing that, I had a look at the `Net::LDAP` perl library
> again, if it provides a way to _somehow_ validate DNs properly without
> having to resort to very fragile regexes.
> 
> => `Net::LDAP` provides a canonical_dn() function in `Net::LDAP::Util`
>    https://metacpan.org/pod/Net::LDAP::Util#canonical_dn
> 

handling this via Net::LDAP is probably the better way to go. however,
just to through this suggestion out there: do we maybe want to use a
more lax regex in general instead of actually checking whether a dn
*could* be valid? maybe this check should just be a sanity check instead
of making the dn conform to the spec.

> I quickly hacked up your test script to use canonical_dn() instead of
> the regex. Works pretty much the same, except that it considers empty
> strings and attributes with empty values as valid DNs and allows
> whitespaces at both the beginning and the end of attribute values. But
> these edge-cases can be much more easily checked than the whole thing
> (esp. with escaping and such), and should be more robust.

yes, but imo that might end up being somewhat confusing. especially if
we end up using the canonical version of the dn returned by
`canonical_dn()`. in my testing it also escaped already validly escaped
values such as `CN=James \"Jim\" Smith\, III` and turned them into
`CN=James \22Jim\22 Smith\2c III`. this may be confusing to some users.

it's handling of spaces is imo especially strange. according to the docs:

> Converts all leading and trailing spaces in values to be \20.

but `ou= test ,dc=net` yields `OU=test,DC=net` for me. so it doesn't
escape them, it just drops them (also note that capitalization)? if you
quote the value it does what it says though. so `ou=" test ",dc=net`
becomes `OU=\20test\20,DC=net` as expected, although somewhat
pointlessly so in this case as spaces in quoted values are in-spec.

imo if we go this route we either need additional logic to handle spaces
and empty values/dns before passing the dn to `canonical_dn()` for
validation only, or we need to, at least, document the transformations
that `canonical_dn()` does. otherwise, these substitutions may end up
confusing users.

> [ I've attached the diff for the test script below for reference. ]
> 
> So IHMO that should be the way forward.
>> @Stefan: I'd be willing to properly rewrite the DN checking using
> canonical_dn() - or do you want to?
> 
>> [..]
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
> 
> diff --git a/src/test/dn-regex-test.pl b/src/test/dn-regex-test.pl
> index a2410af..9a48483 100755
> --- a/src/test/dn-regex-test.pl
> +++ b/src/test/dn-regex-test.pl
> @@ -3,6 +3,7 @@ use strict;
>  use warnings;
> 
>  use PVE::Auth::LDAP;
> +use Net::LDAP::Util qw(canonical_dn);
> 
>  # this test file attempts to check whether the ldap distinguished name regex
>  # defined in `PVE::Auth::LDAP` complies with RFC 4514.
> @@ -33,10 +34,10 @@ my @fail = (
>      "ou=a,",			# no empty AttributeTypeAndValue
>      "ou=a+",			# no multi-valued RelativeDistinguishedName with empty second part
>      "ou",			# missing separator and AttributeValue
> -    "ou=",			# no 'empty' AttributeValue
> +    # "ou=",			# no 'empty' AttributeValue
>      "ou=+",			# forbidden character '+' in AttributeValue
> -    "ou= or",			# no space at the beginning of an AttributeValue
> -    "ou=orgs ",			# no space at the end of an AttributeValue> +    # "ou= or",			# no space at the beginning of an AttributeValue
> +    # "ou=orgs ",			# no space at the end of an AttributeValue
>      "ou=#value",		# no '#' at the beginning an AttributeValue
>      "ou=\"+,;<>\\\0",		# no un-escaped forbidden characters in unquoted AttributeValues
>      "ou=name\0",		# no null value in AttributeValue
> @@ -44,11 +45,12 @@ my @fail = (
>  );
> 
>  for my $test_case (@pass) {
> -    die "\'" . $test_case . "\' unexpectedly did not match the ldap dn_regex\n"
> -	if !($test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/);
> +    die "'$test_case' unexpectedly failed to parse as valid DN\n"
> +	if !canonical_dn($test_case);
>  }
> 
>  for my $test_case (@fail) {
> -    die "\'" . $test_case . "\' unexpectedly did match the the ldap dn_regex\n"
> -	if $test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/;
> +    my $dn = canonical_dn($test_case);
> +    die "'$test_case' unexpectedly parsed as valid DN: '$dn'\n"
> +	if $dn;
>  }