* [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Christian Ebner @ 2025-11-20 14:31 UTC
To: pbs-devel

Since commit 86d5d073 ("GC: fix race with chunk upload/insert on s3
backends"), per-chunk file locks are acquired during phase 2 of
garbage collection for datastores backed by s3 object stores. This
however means that up to 1000 file locks might be held at once, which
can result in the limit of open file handles being reached.

Therefore, bump the NOFILE soft limit for the proxmox-backup-proxy in
the systemd service unit, while keeping the hard limit as defined in
/etc/systemd/system.conf.

This is acceptable since PBS does not directly depend on problematic
select() calls as verified via `nm` and does not use it in linked
libraries to the best of my knowledge.

Occurrences of the symbol according to `nm -D <shared-object>` are:

/lib/x86_64-linux-gnu/libapt-pkg.so.7.0
U select@GLIBC_2.2.5
/lib/x86_64-linux-gnu/libpam.so.0
U select@GLIBC_2.2.5
/lib/x86_64-linux-gnu/libc.so.6
000000000010e140 W select@@GLIBC_2.2.5
/lib/x86_64-linux-gnu/libcrypto.so.3
U select@GLIBC_2.2.5

[0] https://github.com/systemd/systemd/blob/main/NEWS#L12044

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
Changes since version 2:
- Bump soft limit to hard limit
- Extend commit message with respect to select()
etc/proxmox-backup-proxy.service.in | 1 +
1 file changed, 1 insertion(+)
diff --git a/etc/proxmox-backup-proxy.service.in b/etc/proxmox-backup-proxy.service.in
index 7ca806aa4..8e4bbc197 100644
--- a/etc/proxmox-backup-proxy.service.in
+++ b/etc/proxmox-backup-proxy.service.in
@@ -10,6 +10,7 @@ Type=notify
 ExecStart=%LIBEXECDIR%/proxmox-backup/proxmox-backup-proxy
 ExecReload=/bin/kill -HUP $MAINPID
 PIDFile=/run/proxmox-backup/proxy.pid
+LimitNOFILE=524288
 Restart=on-failure
 User=%PROXY_USER%
 Group=%PROXY_USER%
--
2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
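
To check which limits a running process actually ended up with, one can read the "Max open files" row of /proc/<pid>/limits on Linux. The following is only a minimal sketch: `NoFileLimit`, `parse_nofile_limit` and `current_nofile_limit` are hypothetical helper names and not part of proxmox-backup. Since a single value given to LimitNOFILE= sets both the soft and the hard limit, one would expect both to read 524288 with the unit change above applied.

```rust
/// Soft and hard value of the open-files limit; `None` means "unlimited".
/// Hypothetical helper type, not taken from the PBS code base.
#[derive(Debug, PartialEq, Eq)]
pub struct NoFileLimit {
    pub soft: Option<u64>,
    pub hard: Option<u64>,
}

/// Parse the "Max open files" row from text in the /proc/<pid>/limits format.
pub fn parse_nofile_limit(limits: &str) -> Option<NoFileLimit> {
    const LABEL: &str = "Max open files";
    let line = limits.lines().find(|l| l.starts_with(LABEL))?;
    // The remaining columns are: soft limit, hard limit, units.
    let mut fields = line[LABEL.len()..].split_whitespace();
    let parse = |field: &str| -> Option<Option<u64>> {
        if field == "unlimited" {
            Some(None)
        } else {
            field.parse::<u64>().ok().map(Some)
        }
    };
    let soft = parse(fields.next()?)?;
    let hard = parse(fields.next()?)?;
    Some(NoFileLimit { soft, hard })
}

/// Read the limits of the current process (Linux only).
pub fn current_nofile_limit() -> Option<NoFileLimit> {
    let text = std::fs::read_to_string("/proc/self/limits").ok()?;
    parse_nofile_limit(&text)
}
```

Calling `current_nofile_limit()` from within the service (or feeding `parse_nofile_limit` the content of /proc/<MAINPID>/limits) shows whether the soft limit is still at the usual default of 1024.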
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Thomas Lamprecht @ 2025-11-20 15:05 UTC
To: Proxmox Backup Server development discussion, Christian Ebner

On 20.11.25 at 15:32, Christian Ebner wrote:
> This is acceptable since PBS does not directly depend on problematic
> select() calls as verified via `nm` and does not use it in linked
> libraries to the best of my knowledge.
>

Isn't above and

> Occurrences of the symbol according to `nm -D <shared-object>` are:
>
> /lib/x86_64-linux-gnu/libapt-pkg.so.7.0
> U select@GLIBC_2.2.5
> /lib/x86_64-linux-gnu/libpam.so.0
> U select@GLIBC_2.2.5
> /lib/x86_64-linux-gnu/libc.so.6
> 000000000010e140 W select@@GLIBC_2.2.5
> /lib/x86_64-linux-gnu/libcrypto.so.3
> U select@GLIBC_2.2.5

above a contradiction? Or do I just misinterpret this?
It would seem to me that the usage of select symbols would in fact
show that this might not be safe, no?

If the API calls into any function of those libs, that might then create
a FD >= 1024 inside, which then could get passed down to any of their select
calls?
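
The concern about descriptors >= 1024 comes from the fixed size of `fd_set`: with glibc on Linux, FD_SETSIZE is 1024, so the bitmap that select() operates on has no slot for a higher descriptor, and POSIX leaves the behaviour of FD_SET() undefined for such values. The following is a simplified model of that data structure, not glibc's implementation; `FdSet` and its methods are illustrative names, and the bounds check is exactly what the real macro does not do.

```rust
/// Capacity of an `fd_set` in bits (FD_SETSIZE with glibc on Linux).
pub const FD_SETSIZE: usize = 1024;

/// Simplified stand-in for `fd_set`: one bit per file descriptor number.
pub struct FdSet {
    bits: [u64; FD_SETSIZE / 64],
}

impl FdSet {
    pub fn new() -> Self {
        FdSet { bits: [0; FD_SETSIZE / 64] }
    }

    /// Checked counterpart to FD_SET(): refuses descriptors the bitmap cannot
    /// represent instead of writing past the end of the array.
    pub fn insert(&mut self, fd: usize) -> Result<(), String> {
        if fd >= FD_SETSIZE {
            return Err(format!(
                "fd {} does not fit into an fd_set of {} bits",
                fd, FD_SETSIZE
            ));
        }
        self.bits[fd / 64] |= 1u64 << (fd % 64);
        Ok(())
    }

    /// Counterpart to FD_ISSET(), again with a bounds check.
    pub fn contains(&self, fd: usize) -> bool {
        fd < FD_SETSIZE && (self.bits[fd / 64] & (1u64 << (fd % 64))) != 0
    }
}
```

With a soft limit of 1024, the kernel never hands out a descriptor number that fails this check; once the soft limit is raised, any library code that still passes its descriptors to select() can receive one that does.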
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Christian Ebner @ 2025-11-20 15:12 UTC
To: Thomas Lamprecht, Proxmox Backup Server development discussion

On 11/20/25 4:05 PM, Thomas Lamprecht wrote:
> On 20.11.25 at 15:32, Christian Ebner wrote:
>> This is acceptable since PBS does not directly depend on problematic
>> select() calls as verified via `nm` and does not use it in linked
>> libraries to the best of my knowledge.
>>
>
> Isn't above and

With above I intended to state that the PBS code itself does not call
into select(), while below are dependencies on shared objects which
might call into select() according to their symbols.

>
>> Occurrences of the symbol according to `nm -D <shared-object>` are:
>>
>> /lib/x86_64-linux-gnu/libapt-pkg.so.7.0
>> U select@GLIBC_2.2.5
>> /lib/x86_64-linux-gnu/libpam.so.0
>> U select@GLIBC_2.2.5
>> /lib/x86_64-linux-gnu/libc.so.6
>> 000000000010e140 W select@@GLIBC_2.2.5
>> /lib/x86_64-linux-gnu/libcrypto.so.3
>> U select@GLIBC_2.2.5
>
> above a contradiction? Or do I just misinterpret this?
> As it would seem to me that the usage of select symbols would in fact
> show that this might not be safe, or?
>
> If the API calls into any function of those libs, that might then create
> a FD >= 1024 inside which then could get passed down to any of their select
> calls?
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Thomas Lamprecht @ 2025-11-20 17:23 UTC
To: Proxmox Backup Server development discussion, Christian Ebner

On 20.11.25 at 16:12, Christian Ebner wrote:
> On 11/20/25 4:05 PM, Thomas Lamprecht wrote:
>> On 20.11.25 at 15:32, Christian Ebner wrote:
>>> This is acceptable since PBS does not directly depend on problematic
>>> select() calls as verified via `nm` and does not use it in linked
>>> libraries to the best of my knowledge.
>>>
>>
>> Isn't above and
>
> With above I intended to state that the PBS code itself does not call
> into select(), while below are dependencies on shared objects which
> might call into select() according to their symbols.
>

And the systemd news entry you link to in the commit message clearly states:

----8<----
Programs that want to take benefit of the increased limit have to "opt-in" into
high file descriptors explicitly by raising their soft limit. Of course, when
they do that they must acknowledge that they cannot use select() anymore (and
**neither can any shared library they use — or any shared library used by any
shared library they use and so on**).
---->8----

I just checked the apt repo, and it includes various select calls. Most seem
to center around downloading packages and such, but I'd not bet on it that
no such select is anywhere in the code paths we use.

PAM uses select in pam_loginuid, which might be part of the login call,
albeit it uses it only if require_auditd is enabled (which I don't think it is).
I have not checked out the others yet.

I mean, one option might be to provide our own preloaded select wrapper
overriding the glibc one and keep some FDs below 1024 reserved for that, but
I really really dislike doing such things. Similar in spirit would be providing
a select-compatible implementation using poll and LD_PRELOAD-ing that, but that
is also far from great.

Moving either GC, or all the things that might call select as per your list,
into a dedicated process might be the nicer thing to do. But as mentioned offlist
I'll try to walk through the problem and code again tomorrow and see if I can
find some other viable options (or you/fabian got some ideas). As of my current
knowledge I cannot really accept doing this bump.
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Christian Ebner @ 2025-11-21 7:02 UTC
To: Thomas Lamprecht, Proxmox Backup Server development discussion

On 11/20/25 6:22 PM, Thomas Lamprecht wrote:
> [...]
>
> Moving either GC, or all the things that might call select as per your list,
> into a dedicated process might be the nicer thing to do. But as mentioned offlist
> I'll try to walk through the problem and code again tomorrow and see if I can
> find some other viable options (or you/fabian got some ideas), as of my current
> knowledge I cannot really accept doing this bump.

I think I came up with a solution overnight which does not require us
to bump the file limits: after all, one can better control and work around
the maximum number of flocks needed during phase 2 of garbage collection for
the s3-backed datastores, without sacrificing too much performance.
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Fabian Grünbichler @ 2025-11-21 7:43 UTC
To: Christian Ebner, Proxmox Backup Server development discussion, Thomas Lamprecht

On November 20, 2025 6:23 pm, Thomas Lamprecht wrote:
> [...]
>
> Moving either GC, or all the things that might call select as per your list,
> into a dedicated process might be the nicer thing to do. But as mentioned offlist
> I'll try to walk through the problem and code again tomorrow and see if I can
> find some other viable options (or you/fabian got some ideas), as of my current
> knowledge I cannot really accept doing this bump.

if we move something, we should move the things (potentially) calling
select, as we can then benefit from higher FD limits for all the regular
operations. 1k open FDs is not much even without the newly added locks,
and we already had users running into issues before that, who fixed them
by raising the limit with a systemd override or other means (or not at
all):

https://forum.proxmox.com/threads/too-many-open-files-os-error-24.73094/
https://forum.proxmox.com/threads/garbage-collect-job-fails-with-emfile-too-many-open-files.152687/
https://forum.proxmox.com/threads/tasks-fail-with-too-many-open-files-os-error-24.126770/
https://forum.proxmox.com/threads/sync-from-pbs-to-pbs-failed-too-many-open-files.113036/
https://forum.proxmox.com/threads/another-sync-error.73417/

the only alternative I see at the moment would be to either
- reduce the lock granularity of the newly introduced lock (e.g.,
  lock-per-chunk-prefix)
- reduce the batch size (which determines the number of concurrently
  held locks in GC) for S3 deletion

the latter would be a fairly simple patch, but make GC potentially a bit
more expensive (more delete requests to S3):

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 0a5179230..20372190c 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1716,6 +1716,24 @@ impl DataStore {
                     }
 
                     chunk_count += 1;
+
+                    drop(_guard);
+
+                    if delete_list.len() > 100 {
+                        let delete_objects_result = proxmox_async::runtime::block_on(
+                            s3_client.delete_objects(
+                                &delete_list
+                                    .iter()
+                                    .map(|(key, _)| key.clone())
+                                    .collect::<Vec<S3ObjectKey>>(),
+                            ),
+                        )?;
+                        if let Some(_err) = delete_objects_result.error {
+                            bail!("failed to delete some objects");
+                        }
+                        // release all chunk guards
+                        delete_list.clear();
+                    }
                 }
 
                 if !delete_list.is_empty() {
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Christian Ebner @ 2025-11-21 8:00 UTC
To: Fabian Grünbichler, Proxmox Backup Server development discussion, Thomas Lamprecht

On 11/21/25 8:42 AM, Fabian Grünbichler wrote:
> [...]
>
> the only alternative I see at the moment would be to either
> - reduce the lock granularity of the newly introduced lock (e.g.,
>   lock-per-chunk-prefix)

This however does not necessarily solve the issue at hand? Many of these
chunks will have different prefixes... So worst case one ends up in the
exact same spot we are in now.

> - reduce the batch size (which determines the number of concurrently
>   held locks in GC) for S3 deletion

exactly what came to my mind as well :)

>
> the latter would be a fairly simple patch, but make GC potentially a bit
> more expensive (more delete requests to S3):
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 0a5179230..20372190c 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1716,6 +1716,24 @@ impl DataStore {
>                      }
>
>                      chunk_count += 1;
> +
> +                    drop(_guard);
> +
> +                    if delete_list.len() > 100 {
> +                        let delete_objects_result = proxmox_async::runtime::block_on(
> +                            s3_client.delete_objects(
> +                                &delete_list
> +                                    .iter()
> +                                    .map(|(key, _)| key.clone())
> +                                    .collect::<Vec<S3ObjectKey>>(),
> +                            ),
> +                        )?;
> +                        if let Some(_err) = delete_objects_result.error {
> +                            bail!("failed to delete some objects");
> +                        }
> +                        // release all chunk guards
> +                        delete_list.clear();
> +                    }
>                  }
>
>                  if !delete_list.is_empty() {

Since you already have it in place, do you want to send this patch?

My initial draft was still a bit less efficient than this as I did
already batch the list object response.

Only thing I still see missing in your patch is to make the 100 a
constant and set the capacity for the delete list using that on
instantiation as well.
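
The batching idea, together with the two follow-up suggestions (a named constant and a pre-sized delete list), can be sketched in isolation as follows. This is not the code that was eventually sent: `S3_DELETE_BATCH_LIMIT`, `ChunkGuard`, `delete_in_batches` and `flush` are hypothetical names, the delete call is abstracted into a closure instead of the real S3 client, and the sketch flushes at `>=` the limit, whereas the draft above uses `> 100` and therefore holds up to 101 guards.

```rust
/// Hypothetical name for the batch limit; the draft patch hard-codes 100.
pub const S3_DELETE_BATCH_LIMIT: usize = 100;

/// Stand-in for a per-chunk file lock guard. In the real code every live
/// guard corresponds to one open file descriptor.
pub struct ChunkGuard;

/// Walk over `keys`, collect them together with their guards and flush as
/// soon as the batch limit is reached.
/// Returns (number of delete requests issued, peak number of guards held).
pub fn delete_in_batches<F>(
    keys: &[String],
    mut delete_objects: F,
) -> Result<(usize, usize), String>
where
    F: FnMut(&[String]) -> Result<(), String>,
{
    // Sized once up front, the list never has to grow beyond the limit.
    let mut delete_list: Vec<(String, ChunkGuard)> =
        Vec::with_capacity(S3_DELETE_BATCH_LIMIT);
    let mut requests: usize = 0;
    let mut peak: usize = 0;

    for key in keys {
        delete_list.push((key.clone(), ChunkGuard));
        peak = peak.max(delete_list.len());

        if delete_list.len() >= S3_DELETE_BATCH_LIMIT {
            flush(&mut delete_list, &mut delete_objects)?;
            requests += 1;
        }
    }

    // Remaining partial batch.
    if !delete_list.is_empty() {
        flush(&mut delete_list, &mut delete_objects)?;
        requests += 1;
    }

    Ok((requests, peak))
}

fn flush<F>(
    delete_list: &mut Vec<(String, ChunkGuard)>,
    delete_objects: &mut F,
) -> Result<(), String>
where
    F: FnMut(&[String]) -> Result<(), String>,
{
    let batch: Vec<String> = delete_list.iter().map(|(key, _)| key.clone()).collect();
    delete_objects(&batch[..])?;
    // Clearing the list drops the guards, releasing the locks of this batch.
    delete_list.clear();
    Ok(())
}
```

The trade-off mentioned in the thread is visible in the return value: the peak number of held guards is bounded by the constant, while the number of delete requests grows to roughly the chunk count divided by the limit, rounded up.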
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Fabian Grünbichler @ 2025-11-21 9:06 UTC
To: Christian Ebner, Proxmox Backup Server development discussion, Thomas Lamprecht

On November 21, 2025 9:00 am, Christian Ebner wrote:
> On 11/21/25 8:42 AM, Fabian Grünbichler wrote:
>> [...]
>>
>> the only alternative I see at the moment would be to either
>> - reduce the lock granularity of the newly introduced lock (e.g.,
>>   lock-per-chunk-prefix)
>
> This however does not necessarily solve the issue at hand? Many of these
> chunks will have different prefixes... So worst case one ends up in the
> exact same spot we are in now.

yes, that's true. it makes things more complicated, and might reduce the
number of open locks by virtue of lock contention, but doesn't ensure we
don't run into the issue..
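
Why a lock per chunk prefix gives no hard bound can be seen by counting the distinct prefixes in one batch. The sketch below assumes hex-encoded chunk digests and a prefix of four hex digits; `PREFIX_LEN` and `locks_needed_per_prefix` are hypothetical names, and the prefix length is an assumption for illustration only, not something stated in this thread.

```rust
use std::collections::HashSet;

/// Assumed prefix length in hex digits (hypothetical value for illustration).
pub const PREFIX_LEN: usize = 4;

/// Number of locks a batch would need if one lock were taken per distinct
/// digest prefix instead of per chunk. Expects ASCII hex digests.
pub fn locks_needed_per_prefix(digests: &[String]) -> usize {
    digests
        .iter()
        .map(|digest| &digest[..PREFIX_LEN.min(digest.len())])
        .collect::<HashSet<&str>>()
        .len()
}
```

With digests that are spread evenly over the prefix space, a batch of 1000 chunks mostly hits different prefixes, so the number of locks stays close to the number of chunks; only a batch clustered under a few prefixes benefits.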
* Re: [pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy
From: Fabian Grünbichler @ 2025-11-21 9:07 UTC
To: Christian Ebner, Proxmox Backup Server development discussion, Thomas Lamprecht

On November 21, 2025 9:00 am, Christian Ebner wrote:
> On 11/21/25 8:42 AM, Fabian Grünbichler wrote:
>> - reduce the batch size (which determines the number of concurrently
>>   held locks in GC) for S3 deletion
>
> exactly what came to my mind as well :)
>
>>
>> the latter would be a fairly simple patch, but make GC potentially a bit
>> more expensive (more delete requests to S3):
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 0a5179230..20372190c 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -1716,6 +1716,24 @@ impl DataStore {
>>                      }
>>
>>                      chunk_count += 1;
>> +
>> +                    drop(_guard);
>> +
>> +                    if delete_list.len() > 100 {
>> +                        let delete_objects_result = proxmox_async::runtime::block_on(
>> +                            s3_client.delete_objects(
>> +                                &delete_list
>> +                                    .iter()
>> +                                    .map(|(key, _)| key.clone())
>> +                                    .collect::<Vec<S3ObjectKey>>(),
>> +                            ),
>> +                        )?;
>> +                        if let Some(_err) = delete_objects_result.error {
>> +                            bail!("failed to delete some objects");
>> +                        }
>> +                        // release all chunk guards
>> +                        delete_list.clear();
>> +                    }
>>                  }
>>
>>                  if !delete_list.is_empty() {
>
> Since you already have it in place, do you want to send this patch?
>
> My initial draft was still a bit less efficient than this as I did
> already batch the list object response.
>
> Only thing I still see missing in your patch is to make the 100 a
> constant and set the capacity for the delete list using that on
> instantiation as well.

sent as three-patch series:

https://lore.proxmox.com/pbs-devel/20251121090605.262675-1-f.gruenbichler@proxmox.com/T/#t