https://bugzilla.proxmox.com/show_bug.cgi?id=6049 has been created for this. Thanks!

—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / mark@tuxis.nl

------ Original Message ------
From "Shannon Sterz"
To "Mark Schouten"
Cc "Proxmox Backup Server development discussion"
Date 20/12/2024 14:22:18
Subject Re: Re[4]: [pbs-devel] Authentication performance

>On Thu Dec 19, 2024 at 10:56 AM CET, Mark Schouten wrote:
>> Hi,
>>
>> We upgraded to 3.3 yesterday; not much gain to notice with regards to
>> the new version or the change in keying. It’s still (obviously) pretty
>> busy.
>
>just be aware that the patch i linked to in my last mail has not been
>packaged yet, so you wouldn't see the impact of that patch yet.
>
>> However, I also tried to remove some datastores, which failed with
>> timeouts. PBS even stopped authenticating (so probably just stopped
>> working altogether) for about 10 seconds, which was an unpleasant
>> surprise.
>>
>> So looking into that further, I noticed the following logging:
>>
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff]:42104] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff]:42144] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff]:47286] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff:]:45994] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)
>>
>> Which surprised me, since this is a ’status’ call, which should not need
>> locking of the datastore config.
>>
>> https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=c611f593624977defc49d6e4de2ab8185cfe09e9;hb=HEAD#l687
>> does not lock the config, but
>> https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-datastore/src/datastore.rs;h=0801b4bf6b25eaa6f306c7b39ae2cfe81b4782e1;hb=HEAD#l204
>> does.
>>
>> So if I understand this correctly, every ’status’ call (30 per second in
>> our case) locks the datastore config exclusively. And also, every time
>> ’status’ gets called, the whole datastore config gets loaded?
>
>probably, there are some comments about that there already, it might
>make sense to open a bugzilla issue to discuss this further [1].
>
>[1]: https://bugzilla.proxmox.com/
>
>> Is that something that could use some performance tuning?
>>
>> —
>> Mark Schouten
>> CTO, Tuxis B.V.
>> +31 318 200208 / mark@tuxis.nl
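
(A side note on the error itself: "os error 4" is EINTR, i.e. the process
caught a signal while blocked waiting for the lock, which is how alarm-based
lock timeouts typically surface. Assuming an flock(2)-style lock, which the
error message suggests, one defensive pattern is to poll non-blockingly with
an explicit deadline instead of failing the request outright. A minimal Rust
sketch using the libc crate; the helper name and timeout handling are
hypothetical, not the actual PBS locking code:

use std::fs::File;
use std::io;
use std::os::unix::io::AsRawFd;
use std::thread::sleep;
use std::time::{Duration, Instant};

// Hypothetical helper: take a shared flock(2) on an already-open file,
// polling non-blockingly until a deadline instead of blocking forever.
// Read-only callers, like a status handler, only need LOCK_SH; writers
// would pass libc::LOCK_EX instead.
fn lock_shared_with_deadline(file: &File, timeout: Duration) -> io::Result<()> {
    let deadline = Instant::now() + timeout;
    loop {
        let rc = unsafe { libc::flock(file.as_raw_fd(), libc::LOCK_SH | libc::LOCK_NB) };
        if rc == 0 {
            return Ok(());
        }
        let err = io::Error::last_os_error();
        match err.raw_os_error() {
            // EWOULDBLOCK: someone else holds the lock; EINTR: a signal
            // arrived. Neither is fatal, so retry until the deadline.
            Some(libc::EWOULDBLOCK) | Some(libc::EINTR) => {
                if Instant::now() >= deadline {
                    return Err(io::Error::new(io::ErrorKind::TimedOut, "lock timeout"));
                }
                sleep(Duration::from_millis(10));
            }
            _ => return Err(err),
        }
    }
}

Taking only a shared lock for readers would also address the other half of
the complaint: a read-only status call has no reason to hold the config
exclusively.)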
>>
>>
>> ------ Original Message ------
>> From "Shannon Sterz"
>> To "Mark Schouten"
>> Cc "Proxmox Backup Server development discussion"
>>
>> Date 16/12/2024 12:51:47
>> Subject Re: Re[2]: [pbs-devel] Authentication performance
>>
>> >On Mon Dec 16, 2024 at 12:23 PM CET, Mark Schouten wrote:
>> >> Hi,
>> >>
>> >> >
>> >> >would you mind sharing either `authkey.pub` or the output of the
>> >> >following commands:
>> >> >
>> >> >head --lines=1 /etc/proxmox-backup/authkey.key
>> >> >cat /etc/proxmox-backup/authkey.key | wc -l
>> >>
>> >> -----BEGIN RSA PRIVATE KEY-----
>> >> 51
>> >>
>> >> So that is indeed the legacy method. We are going to upgrade our PBS
>> >> instances on Wednesday.
>> >>
>> >> >
>> >> >The first should give the PEM header of the authkey whereas the second
>> >> >provides the number of lines that the key takes up in the file. Both
>> >> >give an indication of whether you are using the legacy RSA keys or the
>> >> >newer Ed25519 keys. The latter should provide more performance; security
>> >> >should not be affected much by this change. If the output of the
>> >> >commands looks like this:
>> >> >
>> >> >-----BEGIN PRIVATE KEY-----
>> >> >3
>> >> >
>> >> >Then you are using the newer keys. There currently isn't a recommended
>> >> >way to upgrade the keys. However, in theory you should be able to remove
>> >> >the old keys, restart PBS, and it should just generate keys in the new
>> >> >format. Note that this will log out anyone that is currently
>> >> >authenticated and they'll have to re-authenticate.
>> >>
>> >> Seems like a good moment to update those keys as well.
>> >
>> >Sure, just be aware that you have to manually delete the key before
>> >restarting the PBS. Upgrading alone won't affect the key. Ideally you'd
>> >test this before rolling it out, if you can.
>> >
>> >> >In general, tokens should still be faster to authenticate, so we'd
>> >> >recommend that you try to get your users to switch to token-based
>> >> >authentication where possible. Improving performance there is a bit
>> >> >trickier though, as it often comes with a security trade-off (in the
>> >> >background we use yescrypt for the authentication there, which
>> >> >deliberately adds a work factor). However, we may be able to improve
>> >> >performance a bit via caching methods or similar.
>> >>
>> >> Yes, that might help. I’m also not sure if it actually is
>> >> authentication, or if it is the datastore call that the PVE environments
>> >> make. As you can see in your support issue 3153557, it looks like some
>> >> requests loop through all datastores before responding with a limited
>> >> set of datastores.
>> >
>> >I looked at that ticket and yes, that is probably unrelated to
>> >authentication.
>> >
>> >> For instance (and I’m a complete noob wrt Rust): if I understand
>> >> https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=11d2641b9ca2d2c92da1a85e4cb16d780368abd3;hb=HEAD#l1315
>> >> correctly, PBS loops through all the datastores, checks mount status and
>> >> config, and only starts filtering at line 1347. If I understand that
>> >> correctly, in our case with over 1100 datastores, that might cause quite
>> >> some load?
>> >
>> >Possible, yes, that would depend on your configuration. Are all of these
>> >datastores defined with a backing device? Because if not, then this
>> >should be fairly fast (as in, this should not actually touch the disks).
>> >If they are, then yes, this could be slow as each store would trigger at
>> >least 2 stat calls afaict.
>> >
>> >In any case, it should be fine to move the `mount_status` check after
>> >the `if allowed || allow_id` check from what i can tell. Not sure why
>> >we'd need to check the mount_status for a datastore we won't include in
>> >the results anyway. Same goes for parsing the store config imo. Sent a
>> >patch for that [1].
>> >
>> >[1]: https://lore.proxmox.com/pbs-devel/20241216115044.208595-1-s.sterz@proxmox.com/T/#u
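
(To make that reordering concrete, here is a simplified model of the listing
loop in Rust. All names are stand-ins, not the real PBS types or helpers;
the point is only that the cheap, in-memory ACL check happens before any
per-store filesystem work:

use std::path::Path;

// Stand-in for the real per-store configuration.
struct StoreConfig {
    name: String,
    backing_device: Option<String>,
}

// Stand-in for the real ACL check: cheap, purely in-memory.
fn is_allowed(_auth_id: &str, _store: &str) -> bool {
    true
}

// Stand-in for the real mount check; this is where the stat(2) calls
// happen when a backing device is configured.
fn mount_status(config: &StoreConfig) -> bool {
    config
        .backing_device
        .as_deref()
        .map_or(true, |dev| Path::new(dev).exists())
}

// Returns (store name, mounted) for every store the caller may see.
fn list_stores(auth_id: &str, stores: &[StoreConfig]) -> Vec<(String, bool)> {
    let mut result = Vec::new();
    for config in stores {
        // Permission check first: with over 1100 stores and only a
        // handful visible to a given caller, this skips almost all of
        // the filesystem work the unfiltered loop would otherwise do.
        if !is_allowed(auth_id, &config.name) {
            continue;
        }
        // Only now pay for the expensive part.
        result.push((config.name.clone(), mount_status(config)));
    }
    result
}

With 30 status/list requests per second against 1100 stores, deferring the
stat calls to only the stores that survive the permission filter is the
entire win.)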
>> >
>> >>
>> >> Thanks,
>> >>
>> >> —
>> >> Mark Schouten
>> >> CTO, Tuxis B.V.
>> >> +31 318 200208 / mark@tuxis.nl
>> >
>
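
(On the caching idea Shannon mentions further up for token authentication:
a rough sketch of what such a cache could look like, hypothetical names
throughout. The idea is to remember a successful yescrypt verification for
a short TTL so repeated requests skip the work factor; a real implementation
would key the cache on a fast digest of the secret rather than the secret
itself, and would need an eviction strategy:

use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical cache: (token id, secret) -> time of last successful
// verification. Keyed on the plaintext secret only for brevity here.
struct AuthCache {
    ttl: Duration,
    entries: HashMap<(String, String), Instant>,
}

impl AuthCache {
    fn verify(&mut self, token_id: &str, secret: &str) -> bool {
        let key = (token_id.to_owned(), secret.to_owned());
        if let Some(seen) = self.entries.get(&key) {
            if seen.elapsed() < self.ttl {
                return true; // verified recently, skip the slow hash
            }
        }
        // The deliberately slow path, paid at most once per TTL window.
        if verify_with_yescrypt(token_id, secret) {
            self.entries.insert(key, Instant::now());
            return true;
        }
        false
    }
}

// Stand-in for the real yescrypt check with its work factor.
fn verify_with_yescrypt(_token_id: &str, _secret: &str) -> bool {
    true
}

The security trade-off is the usual one for authentication caches: a revoked
token stays valid for up to one TTL.)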