From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <s.sterz@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 11CA29AE05
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 14:17:20 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id E8B32ADD8
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 14:17:19 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 14:17:19 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id E60BA46968
 for <pve-devel@lists.proxmox.com>; Tue, 23 May 2023 14:17:18 +0200 (CEST)
Message-ID: <b9bea15e-cb99-389a-b797-759fda99701b@proxmox.com>
Date: Tue, 23 May 2023 14:17:18 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.11.0
To: Christoph Heiss <c.heiss@proxmox.com>
Cc: pve-devel@lists.proxmox.com
References: <20230517133931.148634-1-s.sterz@proxmox.com>
 <20230523065850.3hhphl2e4p7awvaf@maui.proxmox.com>
 <02972932-cf42-cbc1-870d-301ed9ff43b9@proxmox.com>
 <20230523101209.z6jby3tiv7n4r7vt@maui.proxmox.com>
Content-Language: en-US
From: Stefan Sterz <s.sterz@proxmox.com>
In-Reply-To: <20230523101209.z6jby3tiv7n4r7vt@maui.proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.054 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.091 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [metacpan.org]
Subject: Re: [pve-devel] [PATCH access-control] ldap: fix ldap distinguished
 names regex
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 23 May 2023 12:17:20 -0000

On 23.05.23 12:12, Christoph Heiss wrote:
> On Tue, May 23, 2023 at 10:56:24AM +0200, Stefan Sterz wrote:
>> On 23.05.23 08:58, Christoph Heiss wrote:
>>> On Wed, May 17, 2023 at 03:39:31PM +0200, Stefan Sterz wrote:
>>>> [..]
>>> While reviewing that, I had a look at the `Net::LDAP` perl library
>>> again, if it provides a way to _somehow_ validate DNs properly without
>>> having to resort to very fragile regexes.
>>>
>>> => `Net::LDAP` provides a canonical_dn() function in `Net::LDAP::Util`
>>>    https://metacpan.org/pod/Net::LDAP::Util#canonical_dn
>>>
>>
>> handling this via Net::LDAP is probably the better way to go. however,
>> just to through this suggestion out there: do we maybe want to use a
>> more lax regex in general instead of actually checking whether a dn
>> *could* be valid? maybe this check should just be a sanity check instead
>> of making the dn conform to the spec.
> I guess so. In the end, it will get rejected by the LDAP server anyway
> if it isn't valid. Offloading some of that logic might be indeed a good
> idea.
> 
> When saving, we also don't currently check whether the server actually
> accepts it, is allowed to login, etc. So a sync might fail for other
> reason too, so it could fail due to invalid DN syntax as well at this
> stage IMO.
> 
> Or - just a quick idea - maybe directly check the DN with the server on
> changes via the API (given that there is a valid server configuration as
> well)?
> 

yeah that would probably be best, as it's also closer to what the user
wants (a working ldap setup) than either what the regex or `Net::LDAP`
can do (making sure that the dn conforms to spec). since, my knowledge
about ldap is fairly shallow, im not sure how this would work in terms
of timeouts etc.

another point that comes to mind is that lukas reminded me that the same
regex is used in pbs. i haven't yet looked at that, but we probably want
to make sure that both implementations work as similarly as possible.

>>
>>> I quickly hacked up your test script to use canonical_dn() instead of
>>> the regex. Works pretty much the same, except that it considers empty
>>> strings and attributes with empty values as valid DNs and allows
>>> whitespaces at both the beginning and the end of attribute values. But
>>> these edge-cases can be much more easily checked than the whole thing
>>> (esp. with escaping and such), and should be more robust.
>>
>> yes, but imo that might end up being somewhat confusing. especially if
>> we end up using the canonical version of the dn returned by
>> `canonical_dn()`. in my testing it also escaped already validly escaped
>> values such as `CN=James \"Jim\" Smith\, III` and turned them into
>> `CN=James \22Jim\22 Smith\2c III`. this may be confusing to some users.
> If the entered DN is suddenly replaced with the canonicalized version,
> definitely. But we could make that clear in the UI and documentation.
> Although I would favor keeping it as-is, after thinking about it more.
> 

me too. entering one value just to come back to this setting when
something breaks to see another sounds extremely frustrating. especially
considering most users probably can't tell that semantically the dns are
the same. even if well documented.

8< --- snip --- >8

>> imo if we go this route we either need additional logic to handle spaces
>> and empty values/dns before passing the dn to `canonical_dn()` for
>> validation only, or we need to, at least, document the transformations
>> that `canonical_dn()` does. otherwise, these substitutions may end up
>> confusing users.
> 
> I don't have a strong opinion on what overall approach to take here.
> 
> To quickly summarize, I guess
> - keep storing the DN as-is from user
> - only do a lax sanity check ourselves
> - let it fail later on sync if it is invalid
> - maybe contact the server on changes via the API and check DN/login?
> 
> would be a pretty good way and save us the trouble from tweaking the
> regex every few weeks for all eternity.

yeah i agree, we should probably still keep the tests for the lax sanity
check, just in case. i'll take a look at the pbs side. if you want to
take this over, feel free to, just give me a heads-up.