* [pbs-devel] [PATCH proxmox-backup] docs: add note for not using remote storages

From: Dominik Csapak @ 2024-06-11 9:30 UTC
To: pbs-devel

such as NFS or SMB. They will not provide the expected performance
and it's better to recommend against them.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
if we want to discourage users even more, we could also detect it on
datastore creation and put a warning into the task log

also if we ever come around to implementing the 'health' page thomas
wished for, we can put a warning/error there too

 docs/system-requirements.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/system-requirements.rst b/docs/system-requirements.rst
index fb920865..17756b7b 100644
--- a/docs/system-requirements.rst
+++ b/docs/system-requirements.rst
@@ -41,6 +41,9 @@ Recommended Server System Requirements
 * Use only SSDs, for best results
 * If HDDs are used: Using a metadata cache is highly recommended, for example,
   add a ZFS :ref:`special device mirror <local_zfs_special_device>`.
+* While it's technically possible to use remote storages such as NFS or SMB,
+  the additional latency and overhead drastically reduces performance and it's
+  not recommended to use such a setup.

 * Redundant Multi-GBit/s network interface cards (NICs)
--
2.39.2

_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
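The cover letter's idea of warning about remote storage at datastore creation could be sketched roughly as below; note this is a hypothetical illustration, not actual proxmox-backup behavior — the default path and the list of matched filesystem type names are assumptions:

```shell
#!/bin/sh
# Hypothetical sketch of the "detect it on datastore creation" idea from
# the notes above. GNU stat with --file-system (-f) and the %T format
# prints a human-readable filesystem type name (e.g. "nfs", "ext2/ext3",
# "zfs"), which can be matched against known network filesystems.
path="${1:-.}"
fstype=$(stat -f -c %T "$path")
case "$fstype" in
    nfs|nfs4|cifs|smb*)
        echo "warning: '$path' is backed by remote storage ($fstype)" ;;
    *)
        echo "'$path' is backed by local storage ($fstype)" ;;
esac
```

Such a check would only catch the filesystem type, not whether the storage is actually slow — which is exactly the objection raised later in this thread.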
* [pbs-devel] applied: [PATCH proxmox-backup] docs: add note for not using remote storages

From: Dietmar Maurer @ 2024-06-11 9:42 UTC
To: Proxmox Backup Server development discussion, Dominik Csapak

> On 11.6.2024 11:30 CEST Dominik Csapak <d.csapak@proxmox.com> wrote:
>
> such as NFS or SMB. They will not provide the expected performance
> and it's better to recommend against them.
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> if we want to discourage users even more, we could also detect it on
> datastore creation and put a warning into the task log
>
> also if we ever come around to implementing the 'health' page thomas
> wished for, we can put a warning/error there too
>
> docs/system-requirements.rst | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/docs/system-requirements.rst b/docs/system-requirements.rst
> index fb920865..17756b7b 100644
> --- a/docs/system-requirements.rst
> +++ b/docs/system-requirements.rst
> @@ -41,6 +41,9 @@ Recommended Server System Requirements
>  * Use only SSDs, for best results
>  * If HDDs are used: Using a metadata cache is highly recommended, for example,
>    add a ZFS :ref:`special device mirror <local_zfs_special_device>`.
> +* While it's technically possible to use remote storages such as NFS or SMB,
> +  the additional latency and overhead drastically reduces performance and it's
> +  not recommended to use such a setup.
>
>  * Redundant Multi-GBit/s network interface cards (NICs)
>
> --
> 2.39.2
* Re: [pbs-devel] [PATCH proxmox-backup] docs: add note for not using remote storages

From: Thomas Lamprecht @ 2024-06-11 18:05 UTC
To: Proxmox Backup Server development discussion, Dominik Csapak

This section is a quite central and important one, so I'm being a bit
more nitpicking with it than other content. NFS boxes are still quite
popular, a blanket recommendation against them quite probably won't
help our cause of reducing noise in our getting-help channels.

Dietmar already applied this, so would need a follow-up please.

On 11/06/2024 at 11:30, Dominik Csapak wrote:
> such as NFS or SMB. They will not provide the expected performance
> and it's better to recommend against them.

Not so sure about recommending against them as a blanket statement;
the "remote" adjective might be a bit subtle and, e.g., using a
full-flash NVMe storage attached over a 100G link with latency in the
µs surely beats basically any local spinner-only storage and probably
even a lot of SATA-attached SSD ones.

Also, it can be totally fine to use as a second datastore, i.e. in a setup
with a (smaller) datastore backed by (e.g. local) fast storage that is
then periodically synced to a slower remote.

> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> if we want to discourage users even more, we could also detect it on
> datastore creation and put a warning into the task log

I would avoid that, at least not without actually measuring how the
storage performs (which is probably quite prone to errors, or would
require periodic measurements).
>
> also if we ever come around to implementing the 'health' page thomas
> wished for, we can put a warning/error there too
>
> docs/system-requirements.rst | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/docs/system-requirements.rst b/docs/system-requirements.rst
> index fb920865..17756b7b 100644
> --- a/docs/system-requirements.rst
> +++ b/docs/system-requirements.rst
> @@ -41,6 +41,9 @@ Recommended Server System Requirements
>  * Use only SSDs, for best results
>  * If HDDs are used: Using a metadata cache is highly recommended, for example,
>    add a ZFS :ref:`special device mirror <local_zfs_special_device>`.
> +* While it's technically possible to use remote storages such as NFS or SMB,

Up front: I first wrote some possible smaller improvements, but then a
full replacement (see below); I kept the smaller ones anyway.

Would do s/remote storages/remote storage/

(We use "storages" quite a few times already, but if possible keeping it
singular sounds nicer IMO)

> +  the additional latency and overhead drastically reduces performance and it's

s/additional latency and overhead/additional latency overhead/ ?

or "network overhead"

If it'd stay as is, the "reduces" should be changed to "reduce" ("latency and
overhead" is plural).

> +  not recommended to use such a setup.

The last part would be better off with just:

"... and is not recommended"

But I'd rather reword the whole thing to focus more on what the actual issue is,
i.e., not NFS or SMB/CIFS per se, but whether the network accessing them is slow.
Maybe something like:

* Avoid using remote storage, like NFS or SMB/CIFS, connected over a slow
  (< 10 Gbps) and/or high latency (> 1 ms) link. Such a storage can
  dramatically reduce performance and may even negatively impact the
  backup source, e.g. by causing IO hangs.

I pulled the numbers in parentheses out of thin air, but IMO they shouldn't be too far
off from 2024 Slow™, no hard feelings on adapting them though.
>
>  * Redundant Multi-GBit/s network interface cards (NICs)
>
> --
> 2.39.2
* Re: [pbs-devel] [PATCH proxmox-backup] docs: add note for not using remote storages

From: Dominik Csapak @ 2024-06-12 6:39 UTC
To: Thomas Lamprecht, Proxmox Backup Server development discussion

On 6/11/24 8:05 PM, Thomas Lamprecht wrote:
> This section is a quite central and important one, so I'm being a bit
> more nitpicking with it than other content. NFS boxes are still quite
> popular, a blanket recommendation against them quite probably won't
> help our cause of reducing noise in our getting-help channels.
>
> Dietmar already applied this, so would need a follow-up please.

sure

> On 11/06/2024 at 11:30, Dominik Csapak wrote:
>> such as NFS or SMB. They will not provide the expected performance
>> and it's better to recommend against them.
>
> Not so sure about recommending against them as a blanket statement;
> the "remote" adjective might be a bit subtle and, e.g., using a
> full-flash NVMe storage attached over a 100G link with latency in the
> µs surely beats basically any local spinner-only storage and probably
> even a lot of SATA-attached SSD ones.

well, alone the fact of using nfs makes some operations a few magnitudes
slower. e.g. here, creating a datastore locally takes a few seconds
(probably fast due to the page cache), but a locally mounted nfs (so no
network involved) on the same disk takes a few minutes.
so at least some file creation/deletion operations are some magnitudes
slower just by using nfs (though i guess there are some
options/implementations that can influence that, such as async/sync
export options)

also a remote SMB share from windows (same physical host though, so
again, no real network) takes about a minute for the same operation

so yes, while I generally agree that using remote storage can be fast
enough, using any of them increases some file operations by a
significant amount, even when using fast storage and fast network

(i know that datastore creation is not the best benchmark for this,
but it shows that there is significant overhead on some operations)

> Also, it can be totally fine to use as a second datastore, i.e. in a setup
> with a (smaller) datastore backed by (e.g. local) fast storage that is
> then periodically synced to a slower remote.
>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> if we want to discourage users even more, we could also detect it on
>> datastore creation and put a warning into the task log
>
> I would avoid that, at least not without actually measuring how the
> storage performs (which is probably quite prone to errors, or would
> require periodic measurements).

fine with me

>> also if we ever come around to implementing the 'health' page thomas
>> wished for, we can put a warning/error there too
>>
>> docs/system-requirements.rst | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/docs/system-requirements.rst b/docs/system-requirements.rst
>> index fb920865..17756b7b 100644
>> --- a/docs/system-requirements.rst
>> +++ b/docs/system-requirements.rst
>> @@ -41,6 +41,9 @@ Recommended Server System Requirements
>>  * Use only SSDs, for best results
>>  * If HDDs are used: Using a metadata cache is highly recommended, for example,
>>    add a ZFS :ref:`special device mirror <local_zfs_special_device>`.
>> +* While it's technically possible to use remote storages such as NFS or SMB,
>
> Up front: I first wrote some possible smaller improvements, but then a
> full replacement (see below); I kept the smaller ones anyway.
>
> Would do s/remote storages/remote storage/
>
> (We use "storages" quite a few times already, but if possible keeping it
> singular sounds nicer IMO)

ok

>> +  the additional latency and overhead drastically reduces performance and it's
>
> s/additional latency and overhead/additional latency overhead/ ?
>
> or "network overhead"
>
> If it'd stay as is, the "reduces" should be changed to "reduce" ("latency and
> overhead" is plural).

i actually meant two things here: the network latency and the additional
overhead of the second filesystem layer

>> +  not recommended to use such a setup.
>
> The last part would be better off with just:
>
> "... and is not recommended"

agreed, i was on the edge a bit with that wording anyway, but just
leaving it off sounds better.

> But I'd rather reword the whole thing to focus more on what the actual issue is,
> i.e., not NFS or SMB/CIFS per se, but whether the network accessing them is slow.
> Maybe something like:
>
> * Avoid using remote storage, like NFS or SMB/CIFS, connected over a slow
>   (< 10 Gbps) and/or high latency (> 1 ms) link. Such a storage can
>   dramatically reduce performance and may even negatively impact the
>   backup source, e.g. by causing IO hangs.
>
> I pulled the numbers in parentheses out of thin air, but IMO they shouldn't be too far
> off from 2024 Slow™, no hard feelings on adapting them though.

IMHO i'd not mention any specific numbers at all, unless we actually
benchmarked such a setup. so what about:

* Avoid using remote storage, like NFS or SMB/CIFS, connected over a
  slow and/or high latency link. Such a storage can dramatically reduce
  performance and may even negatively impact the backup source, e.g. by
  causing IO hangs.
  If you want to use such a storage, make sure it performs as expected by
  testing it before using it in production.

By adding that additional sentence we hopefully nudge some users into
actually testing before deploying it, instead of then complaining that
it's slow.

>
>> * Redundant Multi-GBit/s network interface cards (NICs)
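The datastore-creation comparison described earlier in this message can be approximated with a small, self-contained script that times many small directory creations, similar to the ``.chunks`` subdirectory layout a datastore needs (a real datastore creates 65536 of them; the count below is lowered so the sketch finishes quickly). Run it once on a local path and once on an NFS/SMB mount to see the per-operation metadata overhead:

```shell
#!/bin/sh
# Micro-benchmark sketch: time the creation of many small directories,
# as datastore creation does for its .chunks/ layout. 4096 instead of
# the real 65536 keeps the runtime short; the ratio between a local
# path and a network mount is what matters, not the absolute numbers.
count=4096
dir=$(mktemp -d)
start=$(date +%s)
i=0
while [ "$i" -lt "$count" ]; do
    mkdir "$dir/$(printf '%04x' "$i")"
    i=$((i + 1))
done
elapsed=$(( $(date +%s) - start ))
made=$(ls "$dir" | wc -l)
echo "created $made directories in ${elapsed}s under $dir"
rm -rf "$dir"
```

On a page-cache-backed local filesystem this typically finishes in seconds, while a sync-exported NFS mount has to round-trip every directory creation, which is the "few magnitudes slower" effect discussed above.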
* Re: [pbs-devel] [PATCH proxmox-backup] docs: add note for not using remote storages

From: Thomas Lamprecht @ 2024-06-12 15:40 UTC
To: Dominik Csapak, Proxmox Backup Server development discussion

On 12/06/2024 at 08:39, Dominik Csapak wrote:
>
> On 6/11/24 8:05 PM, Thomas Lamprecht wrote:
>> This section is a quite central and important one, so I'm being a bit
>> more nitpicking with it than other content. NFS boxes are still quite
>> popular, a blanket recommendation against them quite probably won't
>> help our cause of reducing noise in our getting-help channels.
>>
>> Dietmar already applied this, so would need a follow-up please.
>
> sure
>
>> On 11/06/2024 at 11:30, Dominik Csapak wrote:
>>> such as NFS or SMB. They will not provide the expected performance
>>> and it's better to recommend against them.
>>
>> Not so sure about recommending against them as a blanket statement;
>> the "remote" adjective might be a bit subtle and, e.g., using a
>> full-flash NVMe storage attached over a 100G link with latency in the
>> µs surely beats basically any local spinner-only storage and probably
>> even a lot of SATA-attached SSD ones.
>
> well, alone the fact of using nfs makes some operations a few magnitudes
> slower. e.g. here, creating a datastore locally takes a few seconds
> (probably fast due to the page cache), but a locally mounted nfs (so no
> network involved) on the same disk takes a few minutes.
> so at least some file creation/deletion operations
> are some magnitudes slower just by using nfs (though i guess
> there are some options/implementations that can influence that,
> such as async/sync export options)
>
> also a remote SMB share from windows (same physical host though, so
> again, no real network) takes about a minute for the same operation
>
> so yes, while I generally agree that using remote storage can be fast
> enough, using any of them increases some file operations by a
> significant amount, even when using fast storage and fast network

Just because there is some overhead (that is the result of a trade-off
to get a parallel/simultaneous accessible FS) doesn't mean that we
should recommend against an FS, which is IMO a bit strange to do
in a system requirement recommendation list anyway (there's a huge
list of things that'd need to get added then here, from not using
USB 1.0 pen drives as backing storage to not sliding strong magnets
over the server).

>
> (i know that datastore creation is not the best benchmark for this,
> but it shows that there is significant overhead on some operations)

Yeah, one creates a datastore only once, and on actual backup there
are at max a few mkdirs, not 65k, so not really relevant here.
Also, just because there's some overhead (allowing simultaneous mounts
doesn't come for free), it doesn't mean that it's actually a problem for
actual backup. As said, a blanket recommendation against a setup that
is already rather frequent is IMO just deterring (future) users.

>>
>> Also, it can be totally fine to use as a second datastore, i.e. in a setup
>> with a (smaller) datastore backed by (e.g. local) fast storage that is
>> then periodically synced to a slower remote.
>>
>>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>>> ---
>>> if we want to discourage users even more, we could also detect it on
>>> datastore creation and put a warning into the task log
>>
>> I would avoid that, at least not without actually measuring how the
>> storage performs (which is probably quite prone to errors, or would
>> require periodic measurements).
>
> fine with me
>
>>> also if we ever come around to implementing the 'health' page thomas
>>> wished for, we can put a warning/error there too
>>>
>>> docs/system-requirements.rst | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/docs/system-requirements.rst b/docs/system-requirements.rst
>>> index fb920865..17756b7b 100644
>>> --- a/docs/system-requirements.rst
>>> +++ b/docs/system-requirements.rst
>>> @@ -41,6 +41,9 @@ Recommended Server System Requirements
>>>  * Use only SSDs, for best results
>>>  * If HDDs are used: Using a metadata cache is highly recommended, for example,
>>>    add a ZFS :ref:`special device mirror <local_zfs_special_device>`.
>>> +* While it's technically possible to use remote storages such as NFS or SMB,
>>
>> Up front: I first wrote some possible smaller improvements, but then a
>> full replacement (see below); I kept the smaller ones anyway.
>>
>> Would do s/remote storages/remote storage/
>>
>> (We use "storages" quite a few times already, but if possible keeping it
>> singular sounds nicer IMO)
>
> ok
>
>>> +  the additional latency and overhead drastically reduces performance and it's
>>
>> s/additional latency and overhead/additional latency overhead/ ?
>>
>> or "network overhead"
>>
>> If it'd stay as is, the "reduces" should be changed to "reduce" ("latency and
>> overhead" is plural).
>
> i actually meant two things here: the network latency and the additional
> overhead of the second filesystem layer

Then it'd have helped me if you had avoided mixing a specific overhead
(latency) with a generic mention of the word overhead, like:

"...
the added overhead of networking and providing concurrent file system
access drastically reduces performance ..."

But that sounds a bit convoluted, so the best option here might be to just
use "added overhead".

>> But I'd rather reword the whole thing to focus more on what the actual issue is,
>> i.e., not NFS or SMB/CIFS per se, but whether the network accessing them is slow.
>> Maybe something like:
>>
>> * Avoid using remote storage, like NFS or SMB/CIFS, connected over a slow
>>   (< 10 Gbps) and/or high latency (> 1 ms) link. Such a storage can
>>   dramatically reduce performance and may even negatively impact the
>>   backup source, e.g. by causing IO hangs.
>>
>> I pulled the numbers in parentheses out of thin air, but IMO they shouldn't be too far
>> off from 2024 Slow™, no hard feelings on adapting them though.
>
> IMHO i'd not mention any specific numbers at all, unless we actually
> benchmarked such a setup. so what about:

Not sure what numbers from a benchmark would be of use here? One knows
what fast storage can do latency-wise and how much bandwidth is a good
baseline – granted, the numbers are not helping for every specific setup,
but doing some benchmark won't change that either. Anyway, won't matter,
see below.

> * Avoid using remote storage, like NFS or SMB/CIFS, connected over a
>   slow and/or high latency link. Such a storage can dramatically reduce
>   performance and may even negatively impact the backup source, e.g. by
>   causing IO hangs. If you want to use such a storage, make sure it
>   performs as expected by testing it before using it in production.

That starts to get rather convoluted, tbh.; the more I think about this,
the more I prefer just reverting the whole thing, I see no gain in
"bashing" NFS/SMB just because they have some overhead.

If anything, we could simply adapt the "Use only SSDs, for best results"
point to:

"Prefer fast local storage that delivers high IOPS for random IO
workloads; use only enterprise SSDs for best results."
Would be a better fit to convey that fast local storage should be preferred,
especially in a "recommended" (not "recommended against") list.

> By adding that additional sentence we hopefully nudge some users
> into actually testing before deploying it, instead of then
> complaining that it's slow.

If only; from forum and office requests it's quite sensible to assume
that a good amount of users already have their storage box, and they'd
need to do so to be able to test it in any way, so already too late.

It might be better to describe a setup that still allows using their
existing, NFS/SMB/... attached storage in the best way possible. E.g., by
using a fast, small local storage for incoming backups and using the bigger
remote storage only through syncing to it. This has a few benefits beside
getting good performance with existing, slower storage (of any type), like
having already an extra copy of the most recent data.
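The staged setup suggested here — a small, fast local datastore for incoming backups, with the existing remote storage filled via sync — could be configured along these lines. This is a sketch from memory: the datastore names, mount paths, and schedule are made-up, and the exact `proxmox-backup-manager` options may differ between versions:

```shell
# fast local datastore for incoming backups (hypothetical names/paths)
proxmox-backup-manager datastore create fast /mnt/local-nvme/pbs

# second datastore on the existing NFS/SMB mount, used as archive only
proxmox-backup-manager datastore create archive /mnt/remote-nas/pbs

# local sync job (no --remote): periodically pull 'fast' into 'archive'
proxmox-backup-manager sync-job create fast-to-archive \
    --store archive --remote-store fast --schedule daily
```

Backups then always hit the fast local datastore, while the slower remote storage only sees the batched sync traffic and holds the longer backup history.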
* Re: [pbs-devel] [PATCH proxmox-backup] docs: add note for not using remote storages

From: Dominik Csapak @ 2024-06-13 8:02 UTC
To: Thomas Lamprecht, Proxmox Backup Server development discussion

On 6/12/24 17:40, Thomas Lamprecht wrote:
> On 12/06/2024 at 08:39, Dominik Csapak wrote:
>>
>> On 6/11/24 8:05 PM, Thomas Lamprecht wrote:
>>> This section is a quite central and important one, so I'm being a bit
>>> more nitpicking with it than other content. NFS boxes are still quite
>>> popular, a blanket recommendation against them quite probably won't
>>> help our cause of reducing noise in our getting-help channels.
>>>
>>> Dietmar already applied this, so would need a follow-up please.
>>
>> sure
>>
>>> On 11/06/2024 at 11:30, Dominik Csapak wrote:
>>>> such as NFS or SMB. They will not provide the expected performance
>>>> and it's better to recommend against them.
>>>
>>> Not so sure about recommending against them as a blanket statement;
>>> the "remote" adjective might be a bit subtle and, e.g., using a
>>> full-flash NVMe storage attached over a 100G link with latency in the
>>> µs surely beats basically any local spinner-only storage and probably
>>> even a lot of SATA-attached SSD ones.
>>
>> well, alone the fact of using nfs makes some operations a few magnitudes
>> slower. e.g. here, creating a datastore locally takes a few seconds
>> (probably fast due to the page cache), but a locally mounted nfs (so no
>> network involved) on the same disk takes a few minutes.
>> so at least some file creation/deletion operations
>> are some magnitudes slower just by using nfs (though i guess
>> there are some options/implementations that can influence that,
>> such as async/sync export options)
>>
>> also a remote SMB share from windows (same physical host though, so
>> again, no real network) takes about a minute for the same operation
>>
>> so yes, while I generally agree that using remote storage can be fast
>> enough, using any of them increases some file operations by a
>> significant amount, even when using fast storage and fast network
>
> Just because there is some overhead (that is the result of a trade-off
> to get a parallel/simultaneous accessible FS) doesn't mean that we
> should recommend against an FS, which is IMO a bit strange to do
> in a system requirement recommendation list anyway (there's a huge
> list of things that'd need to get added then here, from not using
> USB 1.0 pen drives as backing storage to not sliding strong magnets
> over the server).

but we already do recommend against using remote storage regularly,
just not in the docs but in the forum. (so do many of our users)

we also recommend against slow storage, but that can also work
depending on the use case/workload/exact setup

>>
>> (i know that datastore creation is not the best benchmark for this,
>> but it shows that there is significant overhead on some operations)
>
> Yeah, one creates a datastore only once, and on actual backup there
> are at max a few mkdirs, not 65k, so not really relevant here.
> Also, just because there's some overhead (allowing simultaneous mounts
> doesn't come for free), it doesn't mean that it's actually a problem for
> actual backup. As said, a blanket recommendation against a setup that
> is already rather frequent is IMO just deterring (future) users.

it's not only datastore creation; also garbage collection and
all operations that have to access many files in succession suffer
from the overhead here.
my point is that using a remote fs (regardless of which) adds so much
overhead that it often turns what would be 'reasonable' performance
locally into 'unreasonably slow', so you'd have to massively
overcompensate for that in hardware. This is possible ofc, but highly
unlikely for the vast majority of users.

>>
>>> Also, it can be totally fine to use as a second datastore, i.e. in a setup
>>> with a (smaller) datastore backed by (e.g. local) fast storage that is
>>> then periodically synced to a slower remote.
>>>
>>>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>>>> ---
>>>> if we want to discourage users even more, we could also detect it on
>>>> datastore creation and put a warning into the task log
>>>
>>> I would avoid that, at least not without actually measuring how the
>>> storage performs (which is probably quite prone to errors, or would
>>> require periodic measurements).
>>
>> fine with me
>>
>>>> also if we ever come around to implementing the 'health' page thomas
>>>> wished for, we can put a warning/error there too
>>>>
>>>> docs/system-requirements.rst | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/docs/system-requirements.rst b/docs/system-requirements.rst
>>>> index fb920865..17756b7b 100644
>>>> --- a/docs/system-requirements.rst
>>>> +++ b/docs/system-requirements.rst
>>>> @@ -41,6 +41,9 @@ Recommended Server System Requirements
>>>>  * Use only SSDs, for best results
>>>>  * If HDDs are used: Using a metadata cache is highly recommended, for example,
>>>>    add a ZFS :ref:`special device mirror <local_zfs_special_device>`.
>>>> +* While it's technically possible to use remote storages such as NFS or SMB,
>>>
>>> Up front: I first wrote some possible smaller improvements, but then a
>>> full replacement (see below); I kept the smaller ones anyway.
>>>
>>> Would do s/remote storages/remote storage/
>>>
>>> (We use "storages" quite a few times already, but if possible keeping it
>>> singular sounds nicer IMO)
>>
>> ok
>>
>>>> +  the additional latency and overhead drastically reduces performance and it's
>>>
>>> s/additional latency and overhead/additional latency overhead/ ?
>>>
>>> or "network overhead"
>>>
>>> If it'd stay as is, the "reduces" should be changed to "reduce" ("latency and
>>> overhead" is plural).
>>
>> i actually meant two things here: the network latency and the additional
>> overhead of the second filesystem layer
>
> Then it'd have helped me if you had avoided mixing a specific overhead
> (latency) with a generic mention of the word overhead, like:
>
> "... the added overhead of networking and providing concurrent file system
> access drastically reduces performance ..."
>
> But that sounds a bit convoluted, so the best option here might be to just
> use "added overhead".
>
>>> But I'd rather reword the whole thing to focus more on what the actual issue is,
>>> i.e., not NFS or SMB/CIFS per se, but whether the network accessing them is slow.
>>> Maybe something like:
>>>
>>> * Avoid using remote storage, like NFS or SMB/CIFS, connected over a slow
>>>   (< 10 Gbps) and/or high latency (> 1 ms) link. Such a storage can
>>>   dramatically reduce performance and may even negatively impact the
>>>   backup source, e.g. by causing IO hangs.
>>>
>>> I pulled the numbers in parentheses out of thin air, but IMO they shouldn't be too far
>>> off from 2024 Slow™, no hard feelings on adapting them though.
>>
>> IMHO i'd not mention any specific numbers at all, unless we actually
>> benchmarked such a setup. so what about:
>
> Not sure what numbers from a benchmark would be of use here?
> One knows what fast storage can do latency-wise and how much bandwidth
> is a good baseline – granted, the numbers are not helping for every
> specific setup, but doing some benchmark won't change that either.
> Anyway, won't matter, see below.
>
>> * Avoid using remote storage, like NFS or SMB/CIFS, connected over a
>>   slow and/or high latency link. Such a storage can dramatically reduce
>>   performance and may even negatively impact the backup source, e.g. by
>>   causing IO hangs. If you want to use such a storage, make sure it
>>   performs as expected by testing it before using it in production.
>
> That starts to get rather convoluted, tbh.; the more I think about this,
> the more I prefer just reverting the whole thing, I see no gain in
> "bashing" NFS/SMB just because they have some overhead.
>
> If anything, we could simply adapt the "Use only SSDs, for best results"
> point to:
>
> "Prefer fast local storage that delivers high IOPS for random IO
> workloads; use only enterprise SSDs for best results."
>
> Would be a better fit to convey that fast local storage should be preferred,
> especially in a "recommended" (not "recommended against") list.
>
>> By adding that additional sentence we hopefully nudge some users
>> into actually testing before deploying it, instead of then
>> complaining that it's slow.
>
> If only; from forum and office requests it's quite sensible to assume
> that a good amount of users already have their storage box, and they'd
> need to do so to be able to test it in any way, so already too late.
>
> It might be better to describe a setup that still allows using their
> existing, NFS/SMB/... attached storage in the best way possible. E.g., by
> using a fast, small local storage for incoming backups and using the bigger
> remote storage only through syncing to it. This has a few benefits beside
> getting good performance with existing, slower storage (of any type), like
> having already an extra copy of the most recent data.
ultimately it's your call, but personally i'd prefer a broad statement
that deters users from using a suboptimal setup in the first place over
not mentioning it at all in the official docs and explaining every week
in the forums that it's a bad idea.

this is the same as recommending fast disks: one can use slow disks in
some (small) setups successfully without problems, but it does not scale
properly, so we recommend against it. for remote storage, the vast
majority of users probably won't invest in a super high performance
nas/san box, so recommending against using those is worth mentioning in
the docs IMHO.

it does not have to be in the system requirements though, we could also
put a longer explanation in e.g. the FAQ or datastore section. i just
put it in the system requirements because we call out slow disks there
too and i guessed this is one of the more read sections.
* Re: [pbs-devel] [PATCH proxmox-backup] docs: add note for not using remote storages
  2024-06-13  8:02           ` Dominik Csapak
@ 2024-06-17 15:58             ` Thomas Lamprecht
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Lamprecht @ 2024-06-17 15:58 UTC (permalink / raw)
  To: Dominik Csapak, Proxmox Backup Server development discussion

Am 13/06/2024 um 10:02 schrieb Dominik Csapak:
> On 6/12/24 17:40, Thomas Lamprecht wrote:
> but we already do recommend against using remote storage regularly,
> just not in the docs but in the forum. (so do many of our users)
>
> we also recommend against slow storage, but that can also work
> depending on the use case/workload/exact setup

If a user complains, it's safe to assume that it's too slow for their
use case, otherwise they would not be in the forum. It's also OK to
tell users that their storage is too slow and that a local storage with
some SSDs might be a (relatively) cheap alternative to address that,
especially in the previously mentioned combination where a small and
fast local storage is used for incoming backups while still using the
remote storage to sync a longer history of backups to.

Both have nothing to do with a blanket recommendation against remote
storage, i.e. one made without looking at the actual setup closely, and
I hope such blanket statements are not currently made frequently
without context.

>>>
>>> (i know that datastore creation is not the best benchmark for this,
>>> but shows that there is significant overhead on some operations)
>>
>> Yeah, one creates a datastore only once, and on actual backup there
>> are at max a few mkdirs, not 65k, so not really relevant here.
>> Also, just because there's some overhead (allowing simultaneous
>> mounts doesn't come for free), it doesn't mean that it's actually a
>> problem for actual backup. As said, a blanket recommendation against
>> a setup that is already rather frequent is IMO just deterring
>> (future) users.
>
> it's not only datastore creation, also garbage collection and
> all operations that have to access many files in succession suffer
> from the overhead here.
>
> my point is that the overhead of using a remote fs (regardless which)
> adds so much overhead that it often turns what would be 'reasonable'
> performance locally into 'unreasonably slow', so you'd have to
> massively overcompensate for that in hardware. This is possible ofc,
> but highly unlikely for the vast majority of users.

That a storage being remote makes it unusably slow for PBS by
definition is just not true (see the next paragraph of my reply for
expanding on that).

>>
>> If only; from forum and office requests it's quite sensible to assume
>> that a good amount of users already have their storage box, and they'd
>> need to do so to be able to test it in any way, so already too late.
>>
>> It might be better to describe a setup how to still be able to use their
>> existing, NFS/SMB/... attached storage in the best way possible. E.g., by
>> doing a fast small local storage for incoming backups and use the bigger
>> remote storage only through syncing to it. This has a few benefits beside
>> getting good performance with existing, slower storage (of any type), like
>> having already an extra copy of most recent data.
>
> ultimately it's your call, but personally i'd prefer a broad statement
> that deters users from using a sub-optimal setup in the first place
> than not mentioning it at all in the official docs and explaining
> every week in the forums that it's a bad idea

Again, just because a storage is remote does *not* mean that it has to
be too slow to be used. I.e., just because there is _some_ overhead it
does *not* mean that the storage becomes unusable. Ceph, e.g., is a
remote storage that can be made plenty fast, as our own benchmark
papers show, and some users in huge environments even have to use it
for backups, as nothing else can scale to their amount of data and
performance.
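The back-and-forth about the ~65k mkdirs and per-operation overhead comes down to simple arithmetic: serial metadata operations each pay the full per-op latency, so a small per-op difference multiplies across many files. A rough sketch, where both latency figures are assumptions for illustration only:

```python
def total_metadata_time(ops: int, per_op_latency_ms: float) -> float:
    """Total seconds spent on `ops` serial metadata operations, each
    paying `per_op_latency_ms` of latency."""
    return ops * per_op_latency_ms / 1000.0

# Datastore creation makes ~65536 chunk-directory mkdirs (one per
# 16-bit chunk prefix), the "65k" figure mentioned in the thread.
ops = 65536
local = total_metadata_time(ops, 0.05)  # assumed ~0.05 ms per local SSD mkdir
remote = total_metadata_time(ops, 1.0)  # assumed ~1 ms network round trip
print(f"local: ~{local:.0f}s, remote: ~{remote:.0f}s")  # local: ~3s, remote: ~66s
```

The same multiplication applies to garbage collection or any operation touching many files in succession, which is why the per-op latency, not raw bandwidth, dominates there.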
Or take Blockbridge, they're providing fast remote storage through NVMe
over TCP. So by counterexample, including our *own* benchmarks, I think
we really can establish as a fact that there can be remote storage
setups that are fast, and I do not see any point in arguing that
further.

>
> this is the same as recommending fast disks, as one can use slow disks
> in some (small) setups successfully without problems, but it does not
> scale properly so we recommend against it.

It really isn't; recommending fast local storage in a recommended
system specs section is not the same.

> for remote storage, the vast majority of users won't probably invest
> in a super high performance nas/san box so recommending against using
> those is worth mentioning in the docs IMHO

As mentioned in my last reply, with that logic we'd have thousands of
things to recommend against: lots of old/low-power HW, some USB HW
(while some other nice one can be totally fine again), ... This would
blow up the section so much over time that almost nobody would read it
to completion, not really helping such annoying cases in the forum or
other channels (which cannot really be fixed by just adding a bullet
point; IME they're even encouraged to go further in the wrong direction
if the argumentation isn't sound (and sometimes even then..)).

> it does not have to be in the system requirements though, we could
> also put a longer explanation in e.g. the FAQ or datastore section.
> i just put it in the system requirements because we call out
> slow disks there too and i guessed this is one of the more
> read sections.

I reworked the system requirements part to my previous proposal, which
fits the style of recommending for things, not against them, and tells
the user what's actually important, not some possible correlation to
that.
https://git.proxmox.com/?p=proxmox-backup.git;a=commitdiff;h=5c15fb97b4d507c2f60428b3dba376bdbfadf116

This is getting long again, so only as a short draft that would need
some more thought and expansion: an IMO better help than recommending
against such things would be to provide a CLI command that allows users
to test some basic throughput and access times (e.g. with a cold/flushed
FS cache) and use these measurements to extrapolate on some GC/verify
examples that try to mirror real-world small/medium/big setups.

While naturally still not perfect, it would tell the user much more to
see that a workload with, e.g., 30 VMs (backup groups), each with say
~100 GB of space usage and 10 snapshots per backup group, would need
roughly X time for a GC and Y time for a verification of all data.
Surely quite a bit more complex to do sanely, but something like that
would IMO be *much* more helpful.
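The proposed extrapolation from measured access times to GC estimates could look roughly like the sketch below. Every constant here is an assumption for illustration (an ~4 MiB average chunk size, a guessed dedup factor across snapshots, and a measured per-chunk metadata latency); the function is hypothetical, not an existing PBS command:

```python
def estimate_gc_seconds(total_data_gib: float, dedup_factor: float,
                        stat_latency_ms: float,
                        avg_chunk_mib: float = 4.0) -> float:
    """Very rough model: GC cost is dominated by roughly one metadata
    operation (atime update / stat) per stored chunk, so estimate the
    unique chunk count and multiply by the measured per-op latency."""
    unique_gib = total_data_gib / dedup_factor
    chunks = unique_gib * 1024 / avg_chunk_mib
    return chunks * stat_latency_ms / 1000.0

# The example workload from the mail: 30 groups x ~100 GiB x 10
# snapshots = 30000 GiB logical data; assume 10x dedup across
# snapshots and 1 ms per metadata op on a remote mount.
logical_gib = 30 * 100 * 10
secs = estimate_gc_seconds(logical_gib, dedup_factor=10, stat_latency_ms=1.0)
print(f"estimated GC time: ~{secs / 60:.0f} min")
```

Fed with numbers from an actual latency probe instead of guesses, even a model this crude would give users a concrete "GC will take on the order of X" answer for their own setup.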
end of thread, other threads: [~2024-06-17 15:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-11  9:30 [pbs-devel] [PATCH proxmox-backup] docs: add note for not using remote storages Dominik Csapak
2024-06-11  9:42 ` [pbs-devel] applied: " Dietmar Maurer
2024-06-11 18:05 ` [pbs-devel] " Thomas Lamprecht
2024-06-12  6:39   ` Dominik Csapak
2024-06-12 15:40     ` Thomas Lamprecht
2024-06-13  8:02       ` Dominik Csapak
2024-06-17 15:58         ` Thomas Lamprecht