public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
@ 2023-05-17 13:39 Stefan Sterz
  2023-05-17 13:42 ` Stefan Sterz
  2023-05-23  6:58 ` Christoph Heiss
  0 siblings, 2 replies; 8+ messages in thread
From: Stefan Sterz @ 2023-05-17 13:39 UTC (permalink / raw)
  To: pve-devel

according to the current specification of the string representation of
ldap distinguished names (DN) presented by RFC 4514 [1] the current
regex checking ldap DNs still prevents users from entering valid
DNs.

for example we do not allow multi-valued RelativeDistinguishedNames as
these are separated by a '+'. this was already reported as an issue in
the subscription support.

escaped sequences in unquoted AttributeValues are also not allowed.
this includes commas (',') and quotes ('"'). for example, the
following does not work even though the RFC explicitly mentions it as
a valid example ([1], section 4):

CN=James \"Jim\" Smith\, III,DC=example,DC=net

while this may not be usable as a valid username, as argued
previously [2], it seems arbitrary to allow this for quoted
AttributeValues but not for unquoted ones.

this commit also adds a test file that tests the regex against a
number of common pitfalls. including distinguished names that are
structurally similar to those reported as erroneously forbidden
earlier. these tests should help avoid regressions in the future.

[1]: https://datatracker.ietf.org/doc/html/rfc4514
[2]: https://git.proxmox.com/?p=pve-access-control.git;h=1aa2355a

Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
---
would really appreciate another pair of eyes here. given the recent
churn related to this regex. it's very likely i missed something too.

 src/PVE/Auth/LDAP.pm      |  9 +++++--
 src/test/Makefile         |  1 +
 src/test/dn-regex-test.pl | 54 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 2 deletions(-)
 create mode 100755 src/test/dn-regex-test.pl

diff --git a/src/PVE/Auth/LDAP.pm b/src/PVE/Auth/LDAP.pm
index fc82a17..ccc6864 100755
--- a/src/PVE/Auth/LDAP.pm
+++ b/src/PVE/Auth/LDAP.pm
@@ -10,8 +10,13 @@ use PVE::Tools;
 
 use base qw(PVE::Auth::Plugin);
 
-my  $dn_part_regex = qr!("[^"]+"|[^ ,+"/<>;=#][^,+"/<>;=]*[^ ,+"/<>;=]|[^ ,+"/<>;=#])!;
-our $dn_regex = qr!\w+=${dn_part_regex}(,\s*\w+=${dn_part_regex})*!;
+my $escaped  = qr!\\(?:[ "#+,;<=>\\]|[0-9a-fA-F]{2})!;
+my $start    = qr!(?:${escaped}|[^"+,;<>\\\0 #])!;
+my $middle   = qr!(?:${escaped}|[^"+,;<>\\\0])!;
+my $end      = qr!(?:${escaped}|[^"+,;<>\\\0 ])!;
+my $attr_val = qr!("[^"]+"|${start}(?:${middle}*${end})?)!;
+
+our $dn_regex = qr!\w+=${attr_val}([,\+]\s*\w+=${attr_val})*!;
 
 sub type {
     return 'ldap';
diff --git a/src/test/Makefile b/src/test/Makefile
index 859a84b..b4c05eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -13,3 +13,4 @@ check:
 	perl -I.. perm-test7.pl
 	perl -I.. perm-test8.pl
 	perl -I.. realm_sync_test.pl
+	perl -I.. dn-regex-test.pl
diff --git a/src/test/dn-regex-test.pl b/src/test/dn-regex-test.pl
new file mode 100755
index 0000000..a2410af
--- /dev/null
+++ b/src/test/dn-regex-test.pl
@@ -0,0 +1,54 @@
+
+use strict;
+use warnings;
+
+use PVE::Auth::LDAP;
+
+# this test file attempts to check whether the ldap distinguished name regex
+# defined in `PVE::Auth::LDAP` complies with RFC 4514.
+# 
+# see: https://datatracker.ietf.org/doc/html/rfc4514
+
+my @pass = (
+    "ou=a",			# single AttributeTypeValue 
+    "ou=orga,dc=com,cn=name",	# multiple RelativeDistinguishedNames
+    "STREET=a,cn=a,C=c",	# single character AttributeValues
+    "UID=tt,cn=\"#+,;<>\\ \"",	# forbidden characters are allowed when quoted
+    "c=\\\"\\#\\+\\;\\<\\=\\>",	# specific characters allowed when escaped
+    "a=\\\\",			# escaped backslashes are allowed
+    "ST=a,cn=\"Test, User\"",	# allow un-escaped commas in quoted AttributeValues
+    "o2u=bc,cn=Test\\, User",	# allow escaped commas
+    "T2=a #b",			# spaces (' ') and '#' are allowed in the middle of AttributeValues
+    "word4word=ab#",		# allow '#' at the end of an AttributeValue
+    "ou=orga+sub=ab",		# allow '+' as separators for multi-valued RelativeDistinguishedName
+    "dc=\\f0\\Ac\\93",		# allow escaping hex values in unquoted AttributeValues
+
+    # regression tests
+    "ou=adf-bd,dc=abcd+efOuId=BL:BL:sldkf:704004,dc=or,dc=com",
+    "gvGid=DE:8A:wordCaps,ou=Service,dc=alsdkj+abOuId=UK:A8:137100,dc=edu,dc=de",
+);
+
+my @fail = (
+    "",				# no empty distinguished name
+    "ou=a,",			# no empty AttributeTypeAndValue
+    "ou=a+",			# no multi-valued RelativeDistinguishedName with empty second part
+    "ou",			# missing separator and AttributeValue
+    "ou=",			# no 'empty' AttributeValue
+    "ou=+",			# forbidden character '+' in AttributeValue
+    "ou= or",			# no space at the beginning of an AttributeValue
+    "ou=orgs ",			# no space at the end of an AttributeValue
+    "ou=#value",		# no '#' at the beginning an AttributeValue
+    "ou=\"+,;<>\\\0",		# no un-escaped forbidden characters in unquoted AttributeValues
+    "ou=name\0",		# no null value in AttributeValue
+    "ou=zy\\xw\\v"		# no unescaped backslashes that are not escaping specific characters
+);
+
+for my $test_case (@pass) {
+    die "\'" . $test_case . "\' unexpectedly did not match the ldap dn_regex\n"
+	if !($test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/);
+}
+
+for my $test_case (@fail) {
+    die "\'" . $test_case . "\' unexpectedly did match the the ldap dn_regex\n"
+	if $test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/;
+}
-- 
2.30.2





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
  2023-05-17 13:39 [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex Stefan Sterz
@ 2023-05-17 13:42 ` Stefan Sterz
  2023-05-28 17:50   ` Thomas Lamprecht
  2023-05-23  6:58 ` Christoph Heiss
  1 sibling, 1 reply; 8+ messages in thread
From: Stefan Sterz @ 2023-05-17 13:42 UTC (permalink / raw)
  To: pve-devel

sorry just noticed i forgot: adding tests for this was

Suggested-by: Dominik Csapak <d.csapak@proxmox.com>

On 17.05.23 15:39, Stefan Sterz wrote:
> according to the current specification of the string representation of
> ldap distinguished names (DN) presented by RFC 4514 [1] the current
> regex checking ldap DNs still prevents users from entering valid
> DNs.
> 
> for example we do not allow multi-valued RelativeDistinguishedNames as
> these are separated by a '+'. this was already reported as an issue in
> the subscription support.
> 
> escaped sequences in unquoted AttributeValues are also not allowed.
> this includes commas (',') and quotes ('"'). for example, the
> following does not work even though the RFC explicitly mentions it as
> a valid example ([1], section 4):
> 
> CN=James \"Jim\" Smith\, III,DC=example,DC=net
> 
> while this may not be usable as a valid username, as argued
> previously [2], it seems arbitrary to allow this for quoted
> AttributeValues but not for unquoted ones.
> 
> this commit also adds a test file that tests the regex against a
> number of common pitfalls. including distinguished names that are
> structurally similar to those reported as erroneously forbidden
> earlier. these tests should help avoid regressions in the future.
> 
> [1]: https://datatracker.ietf.org/doc/html/rfc4514
> [2]: https://git.proxmox.com/?p=pve-access-control.git;h=1aa2355a
> 
> Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
> ---
> would really appreciate another pair of eyes here. given the recent
> churn related to this regex. it's very likely i missed something too.
> 
>  src/PVE/Auth/LDAP.pm      |  9 +++++--
>  src/test/Makefile         |  1 +
>  src/test/dn-regex-test.pl | 54 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 62 insertions(+), 2 deletions(-)
>  create mode 100755 src/test/dn-regex-test.pl
> 
> diff --git a/src/PVE/Auth/LDAP.pm b/src/PVE/Auth/LDAP.pm
> index fc82a17..ccc6864 100755
> --- a/src/PVE/Auth/LDAP.pm
> +++ b/src/PVE/Auth/LDAP.pm
> @@ -10,8 +10,13 @@ use PVE::Tools;
>  
>  use base qw(PVE::Auth::Plugin);
>  
> -my  $dn_part_regex = qr!("[^"]+"|[^ ,+"/<>;=#][^,+"/<>;=]*[^ ,+"/<>;=]|[^ ,+"/<>;=#])!;
> -our $dn_regex = qr!\w+=${dn_part_regex}(,\s*\w+=${dn_part_regex})*!;
> +my $escaped  = qr!\\(?:[ "#+,;<=>\\]|[0-9a-fA-F]{2})!;
> +my $start    = qr!(?:${escaped}|[^"+,;<>\\\0 #])!;
> +my $middle   = qr!(?:${escaped}|[^"+,;<>\\\0])!;
> +my $end      = qr!(?:${escaped}|[^"+,;<>\\\0 ])!;
> +my $attr_val = qr!("[^"]+"|${start}(?:${middle}*${end})?)!;
> +
> +our $dn_regex = qr!\w+=${attr_val}([,\+]\s*\w+=${attr_val})*!;
>  
>  sub type {
>      return 'ldap';
> diff --git a/src/test/Makefile b/src/test/Makefile
> index 859a84b..b4c05eb 100644
> --- a/src/test/Makefile
> +++ b/src/test/Makefile
> @@ -13,3 +13,4 @@ check:
>  	perl -I.. perm-test7.pl
>  	perl -I.. perm-test8.pl
>  	perl -I.. realm_sync_test.pl
> +	perl -I.. dn-regex-test.pl
> diff --git a/src/test/dn-regex-test.pl b/src/test/dn-regex-test.pl
> new file mode 100755
> index 0000000..a2410af
> --- /dev/null
> +++ b/src/test/dn-regex-test.pl
> @@ -0,0 +1,54 @@
> +
> +use strict;
> +use warnings;
> +
> +use PVE::Auth::LDAP;
> +
> +# this test file attempts to check whether the ldap distinguished name regex
> +# defined in `PVE::Auth::LDAP` complies with RFC 4514.
> +# 
> +# see: https://datatracker.ietf.org/doc/html/rfc4514
> +
> +my @pass = (
> +    "ou=a",			# single AttributeTypeValue 
> +    "ou=orga,dc=com,cn=name",	# multiple RelativeDistinguishedNames
> +    "STREET=a,cn=a,C=c",	# single character AttributeValues
> +    "UID=tt,cn=\"#+,;<>\\ \"",	# forbidden characters are allowed when quoted
> +    "c=\\\"\\#\\+\\;\\<\\=\\>",	# specific characters allowed when escaped
> +    "a=\\\\",			# escaped backslashes are allowed
> +    "ST=a,cn=\"Test, User\"",	# allow un-escaped commas in quoted AttributeValues
> +    "o2u=bc,cn=Test\\, User",	# allow escaped commas
> +    "T2=a #b",			# spaces (' ') and '#' are allowed in the middle of AttributeValues
> +    "word4word=ab#",		# allow '#' at the end of an AttributeValue
> +    "ou=orga+sub=ab",		# allow '+' as separators for multi-valued RelativeDistinguishedName
> +    "dc=\\f0\\Ac\\93",		# allow escaping hex values in unquoted AttributeValues
> +
> +    # regression tests
> +    "ou=adf-bd,dc=abcd+efOuId=BL:BL:sldkf:704004,dc=or,dc=com",
> +    "gvGid=DE:8A:wordCaps,ou=Service,dc=alsdkj+abOuId=UK:A8:137100,dc=edu,dc=de",
> +);
> +
> +my @fail = (
> +    "",				# no empty distinguished name
> +    "ou=a,",			# no empty AttributeTypeAndValue
> +    "ou=a+",			# no multi-valued RelativeDistinguishedName with empty second part
> +    "ou",			# missing separator and AttributeValue
> +    "ou=",			# no 'empty' AttributeValue
> +    "ou=+",			# forbidden character '+' in AttributeValue
> +    "ou= or",			# no space at the beginning of an AttributeValue
> +    "ou=orgs ",			# no space at the end of an AttributeValue
> +    "ou=#value",		# no '#' at the beginning an AttributeValue
> +    "ou=\"+,;<>\\\0",		# no un-escaped forbidden characters in unquoted AttributeValues
> +    "ou=name\0",		# no null value in AttributeValue
> +    "ou=zy\\xw\\v"		# no unescaped backslashes that are not escaping specific characters
> +);
> +
> +for my $test_case (@pass) {
> +    die "\'" . $test_case . "\' unexpectedly did not match the ldap dn_regex\n"
> +	if !($test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/);
> +}
> +
> +for my $test_case (@fail) {
> +    die "\'" . $test_case . "\' unexpectedly did match the the ldap dn_regex\n"
> +	if $test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/;
> +}





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
  2023-05-17 13:39 [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex Stefan Sterz
  2023-05-17 13:42 ` Stefan Sterz
@ 2023-05-23  6:58 ` Christoph Heiss
  2023-05-23  8:56   ` Stefan Sterz
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Heiss @ 2023-05-23  6:58 UTC (permalink / raw)
  To: Stefan Sterz; +Cc: pve-devel

On Wed, May 17, 2023 at 03:39:31PM +0200, Stefan Sterz wrote:
> [..]
>
> this commit also adds a test file that tests the regex against a
> number of common pitfalls. including distinguished names that are
> structurally similar to those reported as erroneously forbidden
> earlier. these tests should help avoid regressions in the future.
That's a really good idea :^)

>
> Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
> ---
> would really appreciate another pair of eyes here. given the recent
> churn related to this regex. it's very likely i missed something too.
>
>  src/PVE/Auth/LDAP.pm      |  9 +++++--
>  src/test/Makefile         |  1 +
>  src/test/dn-regex-test.pl | 54 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 62 insertions(+), 2 deletions(-)
>  create mode 100755 src/test/dn-regex-test.pl
>
> diff --git a/src/PVE/Auth/LDAP.pm b/src/PVE/Auth/LDAP.pm
> index fc82a17..ccc6864 100755
> --- a/src/PVE/Auth/LDAP.pm
> +++ b/src/PVE/Auth/LDAP.pm
> @@ -10,8 +10,13 @@ use PVE::Tools;
>
>  use base qw(PVE::Auth::Plugin);
>
> -my  $dn_part_regex = qr!("[^"]+"|[^ ,+"/<>;=#][^,+"/<>;=]*[^ ,+"/<>;=]|[^ ,+"/<>;=#])!;
> -our $dn_regex = qr!\w+=${dn_part_regex}(,\s*\w+=${dn_part_regex})*!;
> +my $escaped  = qr!\\(?:[ "#+,;<=>\\]|[0-9a-fA-F]{2})!;
> +my $start    = qr!(?:${escaped}|[^"+,;<>\\\0 #])!;
> +my $middle   = qr!(?:${escaped}|[^"+,;<>\\\0])!;
> +my $end      = qr!(?:${escaped}|[^"+,;<>\\\0 ])!;
> +my $attr_val = qr!("[^"]+"|${start}(?:${middle}*${end})?)!;
> +
> +our $dn_regex = qr!\w+=${attr_val}([,\+]\s*\w+=${attr_val})*!;
While reviewing that, I had a look at the `Net::LDAP` perl library
again, if it provides a way to _somehow_ validate DNs properly without
having to resort to very fragile regexes.

=> `Net::LDAP` provides a canonical_dn() function in `Net::LDAP::Util`
   https://metacpan.org/pod/Net::LDAP::Util#canonical_dn

I quickly hacked up your test script to use canonical_dn() instead of
the regex. Works pretty much the same, except that it considers empty
strings and attributes with empty values as valid DNs and allows
whitespaces at both the beginning and the end of attribute values. But
these edge-cases can be much more easily checked than the whole thing
(esp. with escaping and such), and should be more robust.
[ I've attached the diff for the test script below for reference. ]

So IHMO that should be the way forward.

@Stefan: I'd be willing to properly rewrite the DN checking using
canonical_dn() - or do you want to?

> [..]
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>

diff --git a/src/test/dn-regex-test.pl b/src/test/dn-regex-test.pl
index a2410af..9a48483 100755
--- a/src/test/dn-regex-test.pl
+++ b/src/test/dn-regex-test.pl
@@ -3,6 +3,7 @@ use strict;
 use warnings;

 use PVE::Auth::LDAP;
+use Net::LDAP::Util qw(canonical_dn);

 # this test file attempts to check whether the ldap distinguished name regex
 # defined in `PVE::Auth::LDAP` complies with RFC 4514.
@@ -33,10 +34,10 @@ my @fail = (
     "ou=a,",			# no empty AttributeTypeAndValue
     "ou=a+",			# no multi-valued RelativeDistinguishedName with empty second part
     "ou",			# missing separator and AttributeValue
-    "ou=",			# no 'empty' AttributeValue
+    # "ou=",			# no 'empty' AttributeValue
     "ou=+",			# forbidden character '+' in AttributeValue
-    "ou= or",			# no space at the beginning of an AttributeValue
-    "ou=orgs ",			# no space at the end of an AttributeValue
+    # "ou= or",			# no space at the beginning of an AttributeValue
+    # "ou=orgs ",			# no space at the end of an AttributeValue
     "ou=#value",		# no '#' at the beginning an AttributeValue
     "ou=\"+,;<>\\\0",		# no un-escaped forbidden characters in unquoted AttributeValues
     "ou=name\0",		# no null value in AttributeValue
@@ -44,11 +45,12 @@ my @fail = (
 );

 for my $test_case (@pass) {
-    die "\'" . $test_case . "\' unexpectedly did not match the ldap dn_regex\n"
-	if !($test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/);
+    die "'$test_case' unexpectedly failed to parse as valid DN\n"
+	if !canonical_dn($test_case);
 }

 for my $test_case (@fail) {
-    die "\'" . $test_case . "\' unexpectedly did match the the ldap dn_regex\n"
-	if $test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/;
+    my $dn = canonical_dn($test_case);
+    die "'$test_case' unexpectedly parsed as valid DN: '$dn'\n"
+	if $dn;
 }




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
  2023-05-23  6:58 ` Christoph Heiss
@ 2023-05-23  8:56   ` Stefan Sterz
  2023-05-23 10:12     ` Christoph Heiss
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Sterz @ 2023-05-23  8:56 UTC (permalink / raw)
  To: Christoph Heiss; +Cc: pve-devel

On 23.05.23 08:58, Christoph Heiss wrote:
> On Wed, May 17, 2023 at 03:39:31PM +0200, Stefan Sterz wrote:
>> [..]
>>
>> this commit also adds a test file that tests the regex against a
>> number of common pitfalls. including distinguished names that are
>> structurally similar to those reported as erroneously forbidden
>> earlier. these tests should help avoid regressions in the future.
> That's a really good idea :^)
> 
>>
>> Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
>> ---
>> would really appreciate another pair of eyes here. given the recent
>> churn related to this regex. it's very likely i missed something too.
>>
>>  src/PVE/Auth/LDAP.pm      |  9 +++++--
>>  src/test/Makefile         |  1 +
>>  src/test/dn-regex-test.pl | 54 +++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 62 insertions(+), 2 deletions(-)
>>  create mode 100755 src/test/dn-regex-test.pl
>>
>> diff --git a/src/PVE/Auth/LDAP.pm b/src/PVE/Auth/LDAP.pm
>> index fc82a17..ccc6864 100755
>> --- a/src/PVE/Auth/LDAP.pm
>> +++ b/src/PVE/Auth/LDAP.pm
>> @@ -10,8 +10,13 @@ use PVE::Tools;
>>
>>  use base qw(PVE::Auth::Plugin);
>>
>> -my  $dn_part_regex = qr!("[^"]+"|[^ ,+"/<>;=#][^,+"/<>;=]*[^ ,+"/<>;=]|[^ ,+"/<>;=#])!;
>> -our $dn_regex = qr!\w+=${dn_part_regex}(,\s*\w+=${dn_part_regex})*!;
>> +my $escaped  = qr!\\(?:[ "#+,;<=>\\]|[0-9a-fA-F]{2})!;
>> +my $start    = qr!(?:${escaped}|[^"+,;<>\\\0 #])!;
>> +my $middle   = qr!(?:${escaped}|[^"+,;<>\\\0])!;
>> +my $end      = qr!(?:${escaped}|[^"+,;<>\\\0 ])!;
>> +my $attr_val = qr!("[^"]+"|${start}(?:${middle}*${end})?)!;
>> +
>> +our $dn_regex = qr!\w+=${attr_val}([,\+]\s*\w+=${attr_val})*!;
> While reviewing that, I had a look at the `Net::LDAP` perl library
> again, if it provides a way to _somehow_ validate DNs properly without
> having to resort to very fragile regexes.
> 
> => `Net::LDAP` provides a canonical_dn() function in `Net::LDAP::Util`
>    https://metacpan.org/pod/Net::LDAP::Util#canonical_dn
> 

handling this via Net::LDAP is probably the better way to go. however,
just to through this suggestion out there: do we maybe want to use a
more lax regex in general instead of actually checking whether a dn
*could* be valid? maybe this check should just be a sanity check instead
of making the dn conform to the spec.

> I quickly hacked up your test script to use canonical_dn() instead of
> the regex. Works pretty much the same, except that it considers empty
> strings and attributes with empty values as valid DNs and allows
> whitespaces at both the beginning and the end of attribute values. But
> these edge-cases can be much more easily checked than the whole thing
> (esp. with escaping and such), and should be more robust.

yes, but imo that might end up being somewhat confusing. especially if
we end up using the canonical version of the dn returned by
`canonical_dn()`. in my testing it also escaped already validly escaped
values such as `CN=James \"Jim\" Smith\, III` and turned them into
`CN=James \22Jim\22 Smith\2c III`. this may be confusing to some users.

it's handling of spaces is imo especially strange. according to the docs:

> Converts all leading and trailing spaces in values to be \20.

but `ou= test ,dc=net` yields `OU=test,DC=net` for me. so it doesn't
escape them, it just drops them (also note that capitalization)? if you
quote the value it does what it says though. so `ou=" test ",dc=net`
becomes `OU=\20test\20,DC=net` as expected, although somewhat
pointlessly so in this case as spaces in quoted values are in-spec.

imo if we go this route we either need additional logic to handle spaces
and empty values/dns before passing the dn to `canonical_dn()` for
validation only, or we need to, at least, document the transformations
that `canonical_dn()` does. otherwise, these substitutions may end up
confusing users.

> [ I've attached the diff for the test script below for reference. ]
> 
> So IHMO that should be the way forward.
>> @Stefan: I'd be willing to properly rewrite the DN checking using
> canonical_dn() - or do you want to?
> 
>> [..]
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
> 
> diff --git a/src/test/dn-regex-test.pl b/src/test/dn-regex-test.pl
> index a2410af..9a48483 100755
> --- a/src/test/dn-regex-test.pl
> +++ b/src/test/dn-regex-test.pl
> @@ -3,6 +3,7 @@ use strict;
>  use warnings;
> 
>  use PVE::Auth::LDAP;
> +use Net::LDAP::Util qw(canonical_dn);
> 
>  # this test file attempts to check whether the ldap distinguished name regex
>  # defined in `PVE::Auth::LDAP` complies with RFC 4514.
> @@ -33,10 +34,10 @@ my @fail = (
>      "ou=a,",			# no empty AttributeTypeAndValue
>      "ou=a+",			# no multi-valued RelativeDistinguishedName with empty second part
>      "ou",			# missing separator and AttributeValue
> -    "ou=",			# no 'empty' AttributeValue
> +    # "ou=",			# no 'empty' AttributeValue
>      "ou=+",			# forbidden character '+' in AttributeValue
> -    "ou= or",			# no space at the beginning of an AttributeValue
> -    "ou=orgs ",			# no space at the end of an AttributeValue> +    # "ou= or",			# no space at the beginning of an AttributeValue
> +    # "ou=orgs ",			# no space at the end of an AttributeValue
>      "ou=#value",		# no '#' at the beginning an AttributeValue
>      "ou=\"+,;<>\\\0",		# no un-escaped forbidden characters in unquoted AttributeValues
>      "ou=name\0",		# no null value in AttributeValue
> @@ -44,11 +45,12 @@ my @fail = (
>  );
> 
>  for my $test_case (@pass) {
> -    die "\'" . $test_case . "\' unexpectedly did not match the ldap dn_regex\n"
> -	if !($test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/);
> +    die "'$test_case' unexpectedly failed to parse as valid DN\n"
> +	if !canonical_dn($test_case);
>  }
> 
>  for my $test_case (@fail) {
> -    die "\'" . $test_case . "\' unexpectedly did match the the ldap dn_regex\n"
> -	if $test_case =~ m/^$PVE::Auth::LDAP::dn_regex$/;
> +    my $dn = canonical_dn($test_case);
> +    die "'$test_case' unexpectedly parsed as valid DN: '$dn'\n"
> +	if $dn;
>  }





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
  2023-05-23  8:56   ` Stefan Sterz
@ 2023-05-23 10:12     ` Christoph Heiss
  2023-05-23 12:17       ` Stefan Sterz
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Heiss @ 2023-05-23 10:12 UTC (permalink / raw)
  To: Stefan Sterz; +Cc: pve-devel

On Tue, May 23, 2023 at 10:56:24AM +0200, Stefan Sterz wrote:
> On 23.05.23 08:58, Christoph Heiss wrote:
> > On Wed, May 17, 2023 at 03:39:31PM +0200, Stefan Sterz wrote:
> >> [..]
> > While reviewing that, I had a look at the `Net::LDAP` perl library
> > again, if it provides a way to _somehow_ validate DNs properly without
> > having to resort to very fragile regexes.
> >
> > => `Net::LDAP` provides a canonical_dn() function in `Net::LDAP::Util`
> >    https://metacpan.org/pod/Net::LDAP::Util#canonical_dn
> >
>
> handling this via Net::LDAP is probably the better way to go. however,
> just to through this suggestion out there: do we maybe want to use a
> more lax regex in general instead of actually checking whether a dn
> *could* be valid? maybe this check should just be a sanity check instead
> of making the dn conform to the spec.
I guess so. In the end, it will get rejected by the LDAP server anyway
if it isn't valid. Offloading some of that logic might be indeed a good
idea.

When saving, we also don't currently check whether the server actually
accepts it, is allowed to login, etc. So a sync might fail for other
reason too, so it could fail due to invalid DN syntax as well at this
stage IMO.

Or - just a quick idea - maybe directly check the DN with the server on
changes via the API (given that there is a valid server configuration as
well)?

>
> > I quickly hacked up your test script to use canonical_dn() instead of
> > the regex. Works pretty much the same, except that it considers empty
> > strings and attributes with empty values as valid DNs and allows
> > whitespaces at both the beginning and the end of attribute values. But
> > these edge-cases can be much more easily checked than the whole thing
> > (esp. with escaping and such), and should be more robust.
>
> yes, but imo that might end up being somewhat confusing. especially if
> we end up using the canonical version of the dn returned by
> `canonical_dn()`. in my testing it also escaped already validly escaped
> values such as `CN=James \"Jim\" Smith\, III` and turned them into
> `CN=James \22Jim\22 Smith\2c III`. this may be confusing to some users.
If the entered DN is suddenly replaced with the canonicalized version,
definitely. But we could make that clear in the UI and documentation.
Although I would favor keeping it as-is, after thinking about it more.

>
> it's handling of spaces is imo especially strange. according to the docs:
>
> > Converts all leading and trailing spaces in values to be \20.
>
> but `ou= test ,dc=net` yields `OU=test,DC=net` for me. so it doesn't
> escape them, it just drops them
That's weird. Guess a bug on their side?

> (also note that capitalization)?
Yes, in it's canonical form the attribute names are always uppercase,
but any sane server accepts lowercase just fine too. (Maybe the spec
says something about that, but not digging in there right now.)

> if you quote the value it does what it says though. so `ou=" test
> ",dc=net` becomes `OU=\20test\20,DC=net` as expected, although
> somewhat pointlessly so in this case as spaces in quoted values are
> in-spec.
>
>
> imo if we go this route we either need additional logic to handle spaces
> and empty values/dns before passing the dn to `canonical_dn()` for
> validation only, or we need to, at least, document the transformations
> that `canonical_dn()` does. otherwise, these substitutions may end up
> confusing users.

I don't have a strong opinion on what overall approach to take here.

To quickly summarize, I guess
- keep storing the DN as-is from user
- only do a lax sanity check ourselves
- let it fail later on sync if it is invalid
- maybe contact the server on changes via the API and check DN/login?

would be a pretty good way and save us the trouble from tweaking the
regex every few weeks for all eternity.




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
  2023-05-23 10:12     ` Christoph Heiss
@ 2023-05-23 12:17       ` Stefan Sterz
  2023-05-25  9:52         ` Christoph Heiss
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Sterz @ 2023-05-23 12:17 UTC (permalink / raw)
  To: Christoph Heiss; +Cc: pve-devel

On 23.05.23 12:12, Christoph Heiss wrote:
> On Tue, May 23, 2023 at 10:56:24AM +0200, Stefan Sterz wrote:
>> On 23.05.23 08:58, Christoph Heiss wrote:
>>> On Wed, May 17, 2023 at 03:39:31PM +0200, Stefan Sterz wrote:
>>>> [..]
>>> While reviewing that, I had a look at the `Net::LDAP` perl library
>>> again, if it provides a way to _somehow_ validate DNs properly without
>>> having to resort to very fragile regexes.
>>>
>>> => `Net::LDAP` provides a canonical_dn() function in `Net::LDAP::Util`
>>>    https://metacpan.org/pod/Net::LDAP::Util#canonical_dn
>>>
>>
>> handling this via Net::LDAP is probably the better way to go. however,
>> just to through this suggestion out there: do we maybe want to use a
>> more lax regex in general instead of actually checking whether a dn
>> *could* be valid? maybe this check should just be a sanity check instead
>> of making the dn conform to the spec.
> I guess so. In the end, it will get rejected by the LDAP server anyway
> if it isn't valid. Offloading some of that logic might be indeed a good
> idea.
> 
> When saving, we also don't currently check whether the server actually
> accepts it, is allowed to login, etc. So a sync might fail for other
> reason too, so it could fail due to invalid DN syntax as well at this
> stage IMO.
> 
> Or - just a quick idea - maybe directly check the DN with the server on
> changes via the API (given that there is a valid server configuration as
> well)?
> 

yeah that would probably be best, as it's also closer to what the user
wants (a working ldap setup) than either what the regex or `Net::LDAP`
can do (making sure that the dn conforms to spec). since, my knowledge
about ldap is fairly shallow, im not sure how this would work in terms
of timeouts etc.

another point that comes to mind is that lukas reminded me that the same
regex is used in pbs. i haven't yet looked at that, but we probably want
to make sure that both implementations work as similarly as possible.

>>
>>> I quickly hacked up your test script to use canonical_dn() instead of
>>> the regex. Works pretty much the same, except that it considers empty
>>> strings and attributes with empty values as valid DNs and allows
>>> whitespaces at both the beginning and the end of attribute values. But
>>> these edge-cases can be much more easily checked than the whole thing
>>> (esp. with escaping and such), and should be more robust.
>>
>> yes, but imo that might end up being somewhat confusing. especially if
>> we end up using the canonical version of the dn returned by
>> `canonical_dn()`. in my testing it also escaped already validly escaped
>> values such as `CN=James \"Jim\" Smith\, III` and turned them into
>> `CN=James \22Jim\22 Smith\2c III`. this may be confusing to some users.
> If the entered DN is suddenly replaced with the canonicalized version,
> definitely. But we could make that clear in the UI and documentation.
> Although I would favor keeping it as-is, after thinking about it more.
> 

me too. entering one value just to come back to this setting when
something breaks to see another sounds extremely frustrating. especially
considering most users probably can't tell that semantically the dns are
the same. even if well documented.

8< --- snip --- >8

>> imo if we go this route we either need additional logic to handle spaces
>> and empty values/dns before passing the dn to `canonical_dn()` for
>> validation only, or we need to, at least, document the transformations
>> that `canonical_dn()` does. otherwise, these substitutions may end up
>> confusing users.
> 
> I don't have a strong opinion on what overall approach to take here.
> 
> To quickly summarize, I guess
> - keep storing the DN as-is from user
> - only do a lax sanity check ourselves
> - let it fail later on sync if it is invalid
> - maybe contact the server on changes via the API and check DN/login?
> 
> would be a pretty good way and save us the trouble from tweaking the
> regex every few weeks for all eternity.

yeah i agree, we should probably still keep the tests for the lax sanity
check, just in case. i'll take a look at the pbs side. if you want to
take this over, feel free to, just give me a heads-up.




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
  2023-05-23 12:17       ` Stefan Sterz
@ 2023-05-25  9:52         ` Christoph Heiss
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Heiss @ 2023-05-25  9:52 UTC (permalink / raw)
  To: Stefan Sterz; +Cc: pve-devel

On Tue, May 23, 2023 at 02:17:18PM +0200, Stefan Sterz wrote:
> On 23.05.23 12:12, Christoph Heiss wrote:
> > On Tue, May 23, 2023 at 10:56:24AM +0200, Stefan Sterz wrote:
> > [..]
>
> yeah that would probably be best, as it's also closer to what the user
> wants (a working ldap setup) than either what the regex or `Net::LDAP`
> can do (making sure that the dn conforms to spec). since, my knowledge
> about ldap is fairly shallow, im not sure how this would work in terms
> of timeouts etc.
Looking at the docs, the timeout when connecting to the server can be
set. Other than that, trying a bind using the given DN/password should
pose no problems.

>
> another point that comes to mind is that lukas reminded me that the same
> regex is used in pbs. i haven't yet looked at that, but we probably want
> to make sure that both implementations work as similarly as possible.
Good point. Haven't looked at the PBS side at all yet, but I guess we
probably don't have something similar to canonical_dn() there? But in
any case, as long as we keep the same overall approach (lax sanity
check, then try connecting to the server and bind), it should be pretty
feel the same to the user.

> [..]
> yeah i agree, we should probably still keep the tests for the lax sanity
> check, just in case.
Definitvely, more tests are always good.

> i'll take a look at the pbs side. if you want to take this over, feel
> free to, just give me a heads-up.
Great! And sure, I'll take over the PVE side of things.




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex
  2023-05-17 13:42 ` Stefan Sterz
@ 2023-05-28 17:50   ` Thomas Lamprecht
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Lamprecht @ 2023-05-28 17:50 UTC (permalink / raw)
  To: Proxmox VE development discussion, Stefan Sterz

Am 17/05/2023 um 15:42 schrieb Stefan Sterz:
> sorry just noticed i forgot: adding tests for this was
> 
> Suggested-by: Dominik Csapak <d.csapak@proxmox.com>

I mean, it's really great that we finally get some tests here and if only
Dominik statement made you do it, then chapeau to him! But, Suggested-by's
should be mostly for suggestion on implementation (details), as you all
definitively have a blanket statement from me, and I'd think it's safe to say
also from all other maintainers, to add tests!

I mean, great if that was all needed to solve the motivation to write tests
in developers ;-P






^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-05-28 17:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-17 13:39 [pve-devel] [PATCH access-control] ldap: fix ldap distinguished names regex Stefan Sterz
2023-05-17 13:42 ` Stefan Sterz
2023-05-28 17:50   ` Thomas Lamprecht
2023-05-23  6:58 ` Christoph Heiss
2023-05-23  8:56   ` Stefan Sterz
2023-05-23 10:12     ` Christoph Heiss
2023-05-23 12:17       ` Stefan Sterz
2023-05-25  9:52         ` Christoph Heiss

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal