From: Thomas Lamprecht <t.lamprecht@proxmox.com>
Date: Thu, 4 May 2023 12:21:53 +0200
To: Dominik Csapak, Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Cc: Fabian Grünbichler, Markus Frank, Wolfgang Bumiller
Subject: Re: [pve-devel] [PATCH manager v4 3/6] added Config for Shared Filesystem Directories

On 04/05/2023 at 10:57, Dominik Csapak wrote:
> On 5/4/23 10:42, Thomas Lamprecht wrote:
>> On 04/05/2023 at 10:31, Dominik Csapak wrote:
>>> On 5/4/23 10:13, Thomas Lamprecht wrote:
>>>> On 03/05/2023 at 13:26, Dominik Csapak wrote:
>>>>> just a short comment since this series overlaps a bit with my
>>>>> cluster resource mapping series (i plan on sending a v4 soon).
>>>>>
>>>>> i'd prefer to have the configuration endpoints for mapping bundled in a subdirectory,
>>>>> so instead of /nodes/<node>/dirs/ i'd put it in /nodes/<node>/mapping/dirs/
>>>>> (or /nodes/<node>/map/dirs)
>>>>>
>>>>> @thomas, @fabian, any other input on that?
>>>>>
>>>>
>>>> huh? aren't mappings per definition cluster wide, i.e. /cluster/resource-map/,
>>>> which then allows adding/updating the mapping of a resource on a specific node?
>>>> A node-specific path makes no sense to me, at most it would if adding/removing a mapping
>>>> were completely decoupled from adding/removing/updating entries to it – but that seems
>>>> convoluted from a usage POV and easy to get out of sync with the actual mapping list.
>>>
>>> in markus series the mappings are only ever per node, so each node has its
>>> own dir mapping
>>
>> Every resource mapping is always per node, so that's not really changing anything.
>>
> 
> i meant there is no cluster aspect in the current version of markus series at all

only if you lock down the VM hard to be kept at the node its mapping was configured on.

> 
>> Rather what about migrations? Would be simpler from migration and ACL POV to have
>> it cluster wide,
> 
> sure, but how we get the info during migration is just an implementation detail and
> for that shouldn't matter where the configs/api endpoints are

no, it really isn't an implementation detail how ACLs govern access to mappings, which in
turn dictates where the ACL object paths should be, which then again has implications
for where the API endpoints and configs should be – node level isn't the one.

>>>
>>> in my series, the actual config was cluster-wide, but the api endpoints to configure
>>> them were sitting in the node path (e.g. /node/<node>/hardware-map/pci/*)
>>
>> Please no.
> 
> was that way since the very first version, would have been nice to get that feedback
> earlier
> 

IIRC I always talked about this being on cluster level, especially w.r.t. making it a
more general resource mapping; maybe I didn't look too closely and was thrown off by
the series name, i.e., "add *cluster-wide* hardware device mapping", and by the ACL path
discussion (which IIRC never had any node-specific param in there, and thus only works
if it's either a cluster-wide config or if ID uniqueness is guaranteed by the pmxcfs
like we do for VMIDs).

Anyhow, sorry, I'll try to check for such things more closely earlier in the future.

>>>
>>> we *could* put them into the cluster path api, but we'd need to send a node parameter
>>> along and forward it there anyway, so that wouldn't really make a difference
>>
>> no need for that, see above.
> 
> there is a need for the node parameter, because you always need to know for which node the
> mapping is anyway ;) or did you mean the 'forward' part?

yes, I certainly meant the forward part – otherwise my "Every resource mapping is always
per node" sentence and the node selector mentioned above would be really odd. A
node:local-id map always needs to exist, I figured that's the core thing this series
wants to solve, but as it's not a hard requirement to forward every request (and a huge
PITA to keep in sync if done that way, or it would at least require using a
cfs_domain_lock), there's no (hard) need for proxying.

>>>
>>> for reading the mappings, that could be done there, but in my series in the gui at least,
>>> i have to make a call to each node to get the current state of the mapping
>>> (e.g. if the pci device is still there)
>>
>> For now not ideal but ok, in the future I'd rather go in the direction of broadcasting
>> some types of HW resources via pmxcfs KV and then this isn't an issue anymore.
> 
> yeah can be done, but the question is if we want to broadcast the whole pci/usb list
> from all nodes? (i don't believe that scales well?)

how is one or two dozen PCI IDs + some meta info, broadcast normally once per boot
(or statd start) per node, a scaling problem? Especially compared to the amount of data
we broadcast otherwise, it's rather negligible.
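
A rough sketch of what I mean by that, assuming the existing node key/value helpers in
PVE::Cluster (broadcast_node_kv/get_node_kv); the device scan and the KV key name here
are just placeholders (in practice that info would come from something like
PVE::SysFSTools):

    use JSON qw(encode_json);
    use PVE::Cluster;

    # e.g. called periodically from pvestatd: broadcast a trimmed-down PCI device list
    # via the pmxcfs node KV store, so the GUI doesn't need one API call per node.
    sub broadcast_pci_devices {
        my $devices = [];
        for my $dev (@{ scan_pci_devices() }) {    # scan_pci_devices() is a placeholder
            push @$devices, {
                id         => $dev->{id},          # e.g. '0000:3e:00.0'
                vendor     => $dev->{vendor},
                device     => $dev->{device},
                iommugroup => $dev->{iommugroup},
            };
        }
        PVE::Cluster::broadcast_node_kv('pci-devices', encode_json($devices));
    }

    # any node can then read the last broadcast state of all nodes (or a single one):
    # my $kv = PVE::Cluster::get_node_kv('pci-devices');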
> 
>>
>>>
>>> if a mapping exists (globally) is not interesting most of the time, we only need to know
>>> if it exists at a specific node
>>
>> that's looking at it backwards, the user and ACL only care for global mappings, how
>> the code implements that is then, well, an implementation detail.
> 
> ACL yes, the user must configure the mapping on a vm (which sits on a specific node)
> and there the mapping must exist on that node
> 

Where they also don't care for local-nodeness, they just select a mapping (no need to
hard error on the mapping not being usable on the current node, the VM can be migrated
before the start – albeit it would certainly improve UX if that were visually hinted);
otherwise they could've just used a local ID directly.

>>
>>>
>>> also, after seeing markus' patches, i also leaned more in the direction of splitting
>>> my global mapping config into a config per type+node (so node1/usb.conf, node1/pci.conf,
>>
>> no, please no forest^W jungle of config trees :/
>>
>> A /etc/pve/resource-map/<type>.conf must be enough, even a single /etc/pve/resource-map.conf
>> should be tbh., but I could imagine that splitting per resource type makes some (schema)
>> things a bit easier and reduces some bug potential, so no hard feelings on having one
>> cluster-wide config per type; but really not more.
>>
> 
> sure a single config is ok for me (only per type would be weird, since reusing the

Not sure I'd find that flat-out weird; it depends on the whole design and planned future
direction, and on whether it's OK that an ID used for a PCI mapping can never be reused
for a directory or some other, relatively unrelated, mapping. FWIW, this could also be
the higher-level resource type (group), i.e., hardware.cfg for PCI/USB, and one for local
file system stuff like directories.

FWIW, section configs have a disadvantage if very different config types are mixed, as
the schema is then also shown as the sum of all possible types in the API/man page (even
storage.cfg, while handling similar things, is already very confusing); that is possibly
solvable at api/manpage doc generation though. Just mentioning it as I had it in mind
already a few times and we should improve that sometime, and if the use expands to wildly
different types it might be better to do so sooner rather than later (but that's not
directly related to this).

> sectionconfig would only ever have a single type and we'd have to encode the
> nodename somehow)

Ideally we wouldn't have to encode the nodename, as doing so would make especially using
a single config weird (i.e., it would make the ID plus the node the section ID from a
config POV, but not from a usage POV, which would then be weird for e.g. a "pci: foo:nodeA"
and a "usb: foo:nodeB").

It should be tried to have a single config entry per ID, i.e. very handwavy:

    pci: id
        comment foo
        allow-multi-function 1
        other-map-wide-e.g-enforcement-setting-a 1
        ...
        map nodeA:3e:00.0,nodeB:0a:00.0

Or better, with the long contemplated array support (which you would not necessarily
need to spearhead):

    pci: id
        ...
        map node=A,id=3e:00.0
        map node=B,id=3e:01.0

I read (but not in extended detail) "[PATCH common v3 2/3] add PVE/HardwareMap" and the
above is probably not expressive enough on its own, but OTOH the formats used in the
patch seem a bit too flexible (i.e., it makes not much sense to allow mdev on one map
entry of one node but not on the one from another node – in practice that can hardly
work for VMs I guess, and would limit any future live-migration capabilities?). Also,
why is the ID split up like that there?
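
To make that array idea a tiny bit more concrete, very handwavy how the schema side could
look (format and property names are made up, and this naturally depends on the
contemplated array support for section configs actually landing):

    use PVE::JSONSchema qw(get_standard_option);

    # one map entry per node, describing the local device behind the cluster-wide ID
    PVE::JSONSchema::register_format('pve-pci-map-entry', {
        node => get_standard_option('pve-node'),
        id => {
            description => "PCI address of the device on that node, e.g. 0000:3e:00.0.",
            type => 'string',
        },
    });

    # ...and in the section-config properties of the 'pci' type, as an array:
    my $map_property = {
        description => "Per-node device mapping, one entry per node.",
        type => 'array',
        items => {
            type => 'string',
            format => 'pve-pci-map-entry',
        },
    };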
I'd rather try to keep the map to the minimal ID info and some general restrictions
(e.g., whether some special use is allowed), while other details that only apply when
using the mapped device in a guest would still be handled there (as long as the mapping
allows it).

Sorry for the somewhat higher-level and slightly hand-wavy description; if you list
specific problems and required things I can try to formalize the config format I'd
envision, so that it is also a bit more useful for other resource mappings.

And just to bring it to the "table": in theory we _maybe_ could split naming a node
device and the cluster-wide mapping, i.e., the mapping would be, well, cluster-wide, and
naming devices plus some extra "how can/should it be used" info would then be
node-specific (possibly not even in /etc/pve) – not saying I'd prefer that, but it's
something that could be explored before finalizing a design; I don't want to rush this,
as we'll have to support something as big and certainly as popular as this basically
forever, and while config migrations can be done somehow, they are basically always a
PITA.
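
Purely to illustrate that last thought (file names, locations and syntax are all made up,
not a proposal): the node-specific part would only name a device and state how it may be
used, while the cluster-wide map then just references those names, e.g.:

    # node-specific, e.g. /etc/pve/nodes/nodeA/device-names.cfg (location made up)
    pci: gpu0
        id 0000:3e:00.0
        allow-mdev 1

    # cluster-wide, e.g. /etc/pve/resource-map.cfg
    pci: foo
        comment GPU mapping used by the CAD VMs
        map node=nodeA,name=gpu0
        map node=nodeB,name=gpu0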