From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 0BE751FF13F for ; Thu, 07 May 2026 15:24:22 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id E18501D087; Thu, 7 May 2026 15:24:21 +0200 (CEST) Content-Type: text/plain; charset=UTF-8 Date: Thu, 07 May 2026 15:23:46 +0200 Message-Id: Subject: Re: [PATCH datacenter-manager v2 0/8] subscription: add central key pool registry with reissue support From: "Lukas Wagner" To: "Thomas Lamprecht" , Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Mailer: aerc 0.21.0-0-g5549850facc2-dirty References: <20260507082943.2749725-1-t.lamprecht@proxmox.com> In-Reply-To: <20260507082943.2749725-1-t.lamprecht@proxmox.com> X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1778160118968 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.054 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Message-ID-Hash: 3OHSOGU6LDBGMFQJ5MVVMLYF75SZRMTS X-Message-ID-Hash: 3OHSOGU6LDBGMFQJ5MVVMLYF75SZRMTS X-MailFrom: l.wagner@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox Datacenter Manager development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Thu May 7, 2026 at 10:26 AM CEST, Thomas Lamprecht wrote: > Add a Subscription Registry to PDM: a central pool of PVE and PBS > subscription keys that an operator can assign to remote nodes from one > place, with an explicit Apply/Clear lifecycle for staged changes plus a > Reissue Key action for freeing a key bound to a node so it can be > reassigned elsewhere. > > Motivation: managing subscriptions across many remotes today means > doing this for each node individually. PDM already has the remote > inventory; with a key pool plus per-remote query data we can show "which > nodes need a subscription" and "which keys are unused" together, and let > an admin batch-assign and tear down from one place. > In the near/mid-term we can also make polling keys from customers more > integrated, but that needs a bit adaption in our shop infa and does not > block the base work here in anyway. Actually, the implementation here > was split out from a more complete work, so most parts of it are already > prepared to adopt this relatively easily. > > Design points worth flagging for review: > > * Storage layout. subscriptions.cfg holds key entries via the typed > section-config layer, with `product-type` as the section type so PVE > and PBS sections live side-by-side. We've been using subdirectories quite a bit more liberally in PDM, so maybe this could also be one? So maybe `subscriptions/keys.cfg`, `subscriptions/keys.shadow`, etc.? Not sure if we ever need more than these files, so might be overkill. > The subscriptions.shadow file is reserved for a future shop-bundle > import flow (signed info blobs) and stays empty for manually-added > keys. I can drop that part for now too, but figured it might be nice > and potentially relevant for review to see the direction this probably > goes now already. > > * Endpoints take PRIV_SYS_AUDIT/MODIFY at the macro > level for the pool itself, with per-remote PRIV_RESOURCE_* enforced > inside the handlers when a specific remote is touched. > A dedicated subscription privilege seemed not like a necessity and > also not fit that well into our general priv approach in PDM. > > * The pending lifecycle goes like: Pool entries with a (remote, node) > * binding whose > live state does not match are "pending push"; entries with the new > pending-reissue flag are "pending removal". Apply Pending walks both > queues; Clear Pending drops the queue without touching any remote > (binding-clear for push, flag-only for reissue so the operator can > retry without re-importing the key). > The per-remote subscription cache is invalidated after each successful > apply step so the next panel load reflects the change rather than a > 5-minute-stale snapshot, which is highly confusing UI/UX wise. This > might warrant a closer look though, might be currently done in a > rather heavier handed fashion as potentially needed (had no time to > recheck). > > * Locking is best-effort, but here that should be fine in practice given > that another entity can always alter the state on a remote node in > parallel anyway. > > The lib/pdm-api-types/tests/test_import.rs test should provide basic > coverage for section-config roundtrip for both subscription.cfg and the > shadow file (which is why I'd be fine with keeping it, but not _that_ > hard feelings), schema acceptance and rejection (for now only accept > PVE/PBS; everything else rejected), ProductType classification, the > SubscriptionLevel display/from-str backward-compat (single-letter and > full-name forms both parse), and pick_best_pve_socket_key edge cases. > > Open follow-ups deliberately out of scope here: > * Auto-import existing remote-side keys into the pool on first > observation (the reissue path already adopts; an explicit import for > legacy onboarding would be cleaner). > * Make reissue a full reissue, if it goes in like this it should be > rather called "Clear Key", but that can be handled on applying too, if > really nothing else comes up (which I doubt) > * A shop-bundle import path (the shadow file plumbing is already in), > either manual copy+paste or through an api token. > * Some polishing code and ui/ux wise (e.g., a reload button), but wante > to finally get this out now. > * ... > Haven't really done a (deep) review of the code yet, I focused mostly on the API and UX for now. Gave this a try on the latest master. Worked nicely, except a minor glitch with key-pool refreshes in the UI: - The key pool list misses a couple of 'refresh' events, mainly noticeable when the cells in the 'Assignment' column change, e.g. when using - Auto-Assign - Clear Assignment I always had to press F5 to get the actual refreshed assignments displayed here, the "Reload" button in the toolbar did not refresh it properly. Some more thoughts, mostly UX related: - In the 'Auto-Assign Proposal' window, the 'Sockets (node/key)' column showed '-/4' for all my PVE nodes (with a key that allows four sockets). It is probably supposed to show the actual socket count for the node there, I assume? - In the 'Auto-Assign Proposal' window, I'd rather use the term 'Assign' on the button that confirms the action, in order to avoid confusion with the "Apply Pending" action. - When adding keys to the key pool and the "Add Subscription Keys" dialog is shown, I noticed a couple of warnings in the browser console: =20 "WARN /usr/share/cargo/registry/pwt-0.8.3/src/widget/input_panel.rs:234 c= ould not extract key from custom child, generating one" I've seen this before somewhere else, so might be not related to these changes. - In general, the UI could more visibly show that there are actions pending, I think. When trying out this feature for the first time, I really wasn't sure whether the keys had been fully applied yet or not; I actually had to double-check in the PBS/PVE web interface to be sure. The icons in the node lists were rather subtle to me. Maybe there could be some status banner in the toolbar iff there are any pending actions? - The "Auto-Assign", "Apply Pending" and "Clear Pending" buttons could maybe use a tool tip with some (brief) explanation. - Should we, maybe later, allow to edit the 'Auto-Assign' proposal (e.g. excluding nodes, allow to to override the key assignment)? I guess this would require a different approach for the API, as explained in one of my other messages - The 'Status' column in the "Node Subscription Panel" could maybe use (translated) pretty printing, e.g. "No subscription" instead of 'notfound' - The "Assign Key to Remote" dialog allowed me select nodes which already had a subscription, I guess it would be good to filter them out (or at least grey them out). - Maybe the "Assign" action should rather be something that is done in the "Node Subscription Status" panel, where the user can first select a node in the tree and then press the button? Then the "Assign" button would simply be disabled when selecting a node that already has a subscription. When pressing "Assign", we could pre-select a matching key (product, socket count) in the dialog that is then shown. Having the "Assign" action also makes sense to me since the "Clear Assignment" is also right there. A "Assign" dialog could also then allow to add a (single) key right there, which is then automatically added to the pool. Could be nice for UX streamlininig. - It seems to me that the "Clear Assignment" button only works for pending assignments - so maybe the button text itself could indicate that, e.g "Clear Pending Assignment"? Could be a bit long though, so might be fine as is. - Most panels in the PDM UI have the remote/node tree on the left side, maybe we should do the same here? On the other hand, having the key pool on the left dictates the correct order of operations (first, add keys), so maybe keeping it there is also good.