* [pve-devel] [PATCH storage 1/2] rbd plugin: status: drop outdated fallback
From: Fiona Ebner @ 2025-05-13 13:31 UTC
To: pve-devel

As commit e79ab52 ("Fix #2346: rbd storage shows wrong %-usage") mentions,
Ceph provides a 'stored' field since version 14.2.2 as an approximation of
the actually stored amount of user data. The commit forgot to update the
accompanying comment however.

The 'bytes_used' field refers to the raw usage without factoring out
replication (default: 3). The 'max_avail' value is after factoring out
replication, so using 'bytes_used' in the same calculation would lead to
very confusing results.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/Storage/RBDPlugin.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index 73bc97e..154fa00 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -702,9 +702,9 @@ sub status {
     }
 
     # max_avail -> max available space for data w/o replication in the pool
-    # bytes_used -> data w/o replication in the pool
+    # stored -> amount of user data w/o replication in the pool
     my $free = $d->{stats}->{max_avail};
-    my $used = $d->{stats}->{stored} // $d->{stats}->{bytes_used};
+    my $used = $d->{stats}->{stored};
     my $total = $used + $free;
 
     my $active = 1;
-- 
2.39.5

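To make the mismatch described in the commit message concrete, here is a minimal
sketch with purely hypothetical numbers for a pool using 3x replication (the
values are made up for illustration and are not taken from the patch or a real
cluster):

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stats for a 3/2 replicated pool (illustrative values only).
my $stats = {
    stored     => 100 * 1024**3,    # user data, replication factored out
    bytes_used => 300 * 1024**3,    # raw usage, replication included
    max_avail  => 100 * 1024**3,    # available space, replication factored out
};

# Mixing the raw 'bytes_used' with the non-raw 'max_avail', as the old fallback
# would when 'stored' was unavailable:
my $mixed = $stats->{bytes_used} / ($stats->{bytes_used} + $stats->{max_avail});

# Using 'stored', which is on the same (non-raw) scale as 'max_avail':
my $consistent = $stats->{stored} / ($stats->{stored} + $stats->{max_avail});

printf "mixed scales: %.0f%% used, consistent scales: %.0f%% used\n",
    100 * $mixed, 100 * $consistent;    # 75% vs. 50%
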
* [pve-devel] [PATCH storage 2/2] rbd plugin: status: explain why percentage value can be different from Ceph
From: Fiona Ebner @ 2025-05-13 13:31 UTC
To: pve-devel

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/Storage/RBDPlugin.pm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
index 154fa00..b56f8e4 100644
--- a/src/PVE/Storage/RBDPlugin.pm
+++ b/src/PVE/Storage/RBDPlugin.pm
@@ -703,6 +703,12 @@ sub status {
 
     # max_avail -> max available space for data w/o replication in the pool
     # stored -> amount of user data w/o replication in the pool
+    # NOTE These values are used because they are most natural from a user perspective.
+    # However, the %USED/percent_used value in Ceph is calculated from values before factoring out
+    # replication, namely 'bytes_used / (bytes_used + avail_raw)'. In certain setups, e.g. with LZ4
+    # compression, this percentage can be noticeably different from the percentage
+    # 'stored / (stored + max_avail)' shown in the Proxmox VE CLI/UI. See also src/mon/PGMap.cc from
+    # the Ceph source code, which also mentions that 'stored' is an approximation.
     my $free = $d->{stats}->{max_avail};
     my $used = $d->{stats}->{stored};
     my $total = $used + $free;
-- 
2.39.5

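As a rough numeric sketch of the divergence the new comment describes, assume a
3x replicated pool whose data compresses roughly 2:1 with LZ4 (all figures are
hypothetical, not taken from a real cluster):

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical values: 120 GiB of user data, ~2:1 compression, 3x replication.
my $stored     = 120 * 1024**3;      # user data before compression
my $bytes_used = $stored / 2 * 3;    # raw usage after compression, replication included
my $max_avail  = 100 * 1024**3;      # available space, replication factored out
my $avail_raw  = $max_avail * 3;     # available space, replication included

my $ceph_pct = $bytes_used / ($bytes_used + $avail_raw);    # Ceph's %USED/percent_used
my $pve_pct  = $stored / ($stored + $max_avail);            # percentage shown by the PVE CLI/UI

printf "Ceph: %.1f%%, Proxmox VE: %.1f%%\n", 100 * $ceph_pct, 100 * $pve_pct;
# Ceph: 37.5%, Proxmox VE: 54.5%
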
* Re: [pve-devel] [PATCH storage 2/2] rbd plugin: status: explain why percentage value can be different from Ceph
From: Fiona Ebner @ 2025-05-14 8:22 UTC
To: pve-devel

On 13.05.25 at 15:31, Fiona Ebner wrote:
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  src/PVE/Storage/RBDPlugin.pm | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
> index 154fa00..b56f8e4 100644
> --- a/src/PVE/Storage/RBDPlugin.pm
> +++ b/src/PVE/Storage/RBDPlugin.pm
> @@ -703,6 +703,12 @@ sub status {
>  
>      # max_avail -> max available space for data w/o replication in the pool
>      # stored -> amount of user data w/o replication in the pool
> +    # NOTE These values are used because they are most natural from a user perspective.
> +    # However, the %USED/percent_used value in Ceph is calculated from values before factoring out
> +    # replication, namely 'bytes_used / (bytes_used + avail_raw)'. In certain setups, e.g. with LZ4
> +    # compression, this percentage can be noticeably different from the percentage
> +    # 'stored / (stored + max_avail)' shown in the Proxmox VE CLI/UI. See also src/mon/PGMap.cc from
> +    # the Ceph source code, which also mentions that 'stored' is an approximation.
>      my $free = $d->{stats}->{max_avail};
>      my $used = $d->{stats}->{stored};
>      my $total = $used + $free;

Thinking about this again, I don't think continuing to use 'stored' is
best after all, because that is before compression. And this is where
the mismatch really comes from AFAICT. For highly compressible data, the
mismatch between actual usage on the storage and 'stored' can be very
big (in a quick test using the 'yes' command to fill an RBD image, I got
stored = 2 * (used / replication_count)). And here in the storage stats
we are interested in the usage on the storage, not the actual amount of
data written by the user. For ZFS we also don't use 'logicalused', but
'used'.

From src/osd/osd_types.h:

> int64_t data_stored = 0;                 ///< Bytes actually stored by the user
> int64_t data_compressed = 0;             ///< Bytes stored after compression
> int64_t data_compressed_allocated = 0;   ///< Bytes allocated for compressed data
> int64_t data_compressed_original = 0;    ///< Bytes that were compressed

* Re: [pve-devel] [PATCH storage 2/2] rbd plugin: status: explain why percentage value can be different from Ceph
From: Fabian Grünbichler @ 2025-05-14 9:06 UTC
To: Proxmox VE development discussion, Fiona Ebner

> Fiona Ebner <f.ebner@proxmox.com> wrote on 14.05.2025 at 10:22 CEST:
> 
> 
> On 13.05.25 at 15:31, Fiona Ebner wrote:
> > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> > ---
> >  src/PVE/Storage/RBDPlugin.pm | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
> > index 154fa00..b56f8e4 100644
> > --- a/src/PVE/Storage/RBDPlugin.pm
> > +++ b/src/PVE/Storage/RBDPlugin.pm
> > @@ -703,6 +703,12 @@ sub status {
> > 
> >      # max_avail -> max available space for data w/o replication in the pool
> >      # stored -> amount of user data w/o replication in the pool
> > +    # NOTE These values are used because they are most natural from a user perspective.
> > +    # However, the %USED/percent_used value in Ceph is calculated from values before factoring out
> > +    # replication, namely 'bytes_used / (bytes_used + avail_raw)'. In certain setups, e.g. with LZ4
> > +    # compression, this percentage can be noticeably different from the percentage
> > +    # 'stored / (stored + max_avail)' shown in the Proxmox VE CLI/UI. See also src/mon/PGMap.cc from
> > +    # the Ceph source code, which also mentions that 'stored' is an approximation.
> >      my $free = $d->{stats}->{max_avail};
> >      my $used = $d->{stats}->{stored};
> >      my $total = $used + $free;
> 
> Thinking about this again, I don't think continuing to use 'stored' is
> best after all, because that is before compression. And this is where
> the mismatch really comes from AFAICT. For highly compressible data, the
> mismatch between actual usage on the storage and 'stored' can be very
> big (in a quick test using the 'yes' command to fill an RBD image, I got
> stored = 2 * (used / replication_count)). And here in the storage stats
> we are interested in the usage on the storage, not the actual amount of
> data written by the user. For ZFS we also don't use 'logicalused', but
> 'used'.

but for ZFS, we actually use the "logical" view provided by `zfs list/get`,
not the "physical" view provided by `zpool list/get` (and even the latter
would already account for redundancy).

e.g., with a testpool consisting of three mirrored vdevs of size 1G, with
a single dataset filled with a file with 512MB of random data:

$ zpool list -v testpool
NAME                 SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testpool             960M   513M   447M        -         -    42%    53%  1.00x  ONLINE  -
  mirror-0           960M   513M   447M        -         -    42%  53.4%      -  ONLINE
    /tmp/vdev1.img     1G      -      -        -         -      -      -      -  ONLINE
    /tmp/vdev2.img     1G      -      -        -         -      -      -      -  ONLINE
    /tmp/vdev3.img     1G      -      -        -         -      -      -      -  ONLINE

and what we use for the storage status:

$ zfs get available,used testpool/data
NAME           PROPERTY   VALUE  SOURCE
testpool/data  available  319M   -
testpool/data  used       512M   -

if we switch away from `stored`, we'd have to account for replication
ourselves to match that, right? and we don't have that information
readily available (and also no idea how to handle EC pools?)? wouldn't
we just exchange one wrong set of numbers with another (differently)
wrong set of numbers?

FWIW, we already provide raw numbers in the pool view, and could maybe
expand that view to provide more details?

e.g., for my test rbd pool the pool view shows 50,29% used amounting to
163,43GiB, whereas the storage status says 51.38% used amounting to
61.11GB of 118.94GB, with the default 3/2 replication

ceph df detail says:

    {
        "name": "rbd",
        "id": 2,
        "stats": {
            "stored": 61108710142,                 => /1000/1000/1000 == storage used
            "stored_data": 61108699136,
            "stored_omap": 11006,
            "objects": 15579,
            "kb_used": 171373017,
            "bytes_used": 175485968635,            => /1024/1024/1024 == pool used
            "data_bytes_used": 175485935616,
            "omap_bytes_used": 33019,
            "percent_used": 0.5028545260429382,    => rounded this is the pool view percentage
            "max_avail": 57831211008,              => (this + stored)/1000/1000/1000 storage total
            "quota_objects": 0,
            "quota_bytes": 0,
            "dirty": 0,
            "rd": 253354,
            "rd_bytes": 38036885504,
            "wr": 75833,
            "wr_bytes": 33857918976,
            "compress_bytes_used": 0,
            "compress_under_bytes": 0,
            "stored_raw": 183326130176,
            "avail_raw": 173493638191
        }
    },

> From src/osd/osd_types.h:
> 
> > int64_t data_stored = 0;                 ///< Bytes actually stored by the user
> > int64_t data_compressed = 0;             ///< Bytes stored after compression
> > int64_t data_compressed_allocated = 0;   ///< Bytes allocated for compressed data
> > int64_t data_compressed_original = 0;    ///< Bytes that were compressed

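As a cross-check of the annotations in the quoted 'ceph df detail' output, a
small sketch recomputing both percentages from the numbers above (this only
reproduces the arithmetic from the discussion, it is not code from the plugin):

#!/usr/bin/perl
use strict;
use warnings;

# Values copied from the 'ceph df detail' output quoted above.
my ($stored, $max_avail)     = (61108710142, 57831211008);
my ($bytes_used, $avail_raw) = (175485968635, 173493638191);

# Percentage in the storage status (values with replication factored out):
printf "storage status: %.2f%%\n", 100 * $stored / ($stored + $max_avail);              # ~51.38%

# Percentage Ceph itself reports as percent_used (raw values, replication included):
printf "ceph percent_used: %.4f%%\n", 100 * $bytes_used / ($bytes_used + $avail_raw);   # ~50.2855%
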
* Re: [pve-devel] [PATCH storage 2/2] rbd plugin: status: explain why percentage value can be different from Ceph
From: Fiona Ebner @ 2025-05-14 9:31 UTC
To: Fabian Grünbichler, Proxmox VE development discussion

On 14.05.25 at 11:06, Fabian Grünbichler wrote:
>> Fiona Ebner <f.ebner@proxmox.com> wrote on 14.05.2025 at 10:22 CEST:
>>
>>
>> On 13.05.25 at 15:31, Fiona Ebner wrote:
>>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>>> ---
>>>  src/PVE/Storage/RBDPlugin.pm | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
>>> index 154fa00..b56f8e4 100644
>>> --- a/src/PVE/Storage/RBDPlugin.pm
>>> +++ b/src/PVE/Storage/RBDPlugin.pm
>>> @@ -703,6 +703,12 @@ sub status {
>>>
>>>      # max_avail -> max available space for data w/o replication in the pool
>>>      # stored -> amount of user data w/o replication in the pool
>>> +    # NOTE These values are used because they are most natural from a user perspective.
>>> +    # However, the %USED/percent_used value in Ceph is calculated from values before factoring out
>>> +    # replication, namely 'bytes_used / (bytes_used + avail_raw)'. In certain setups, e.g. with LZ4
>>> +    # compression, this percentage can be noticeably different from the percentage
>>> +    # 'stored / (stored + max_avail)' shown in the Proxmox VE CLI/UI. See also src/mon/PGMap.cc from
>>> +    # the Ceph source code, which also mentions that 'stored' is an approximation.
>>>      my $free = $d->{stats}->{max_avail};
>>>      my $used = $d->{stats}->{stored};
>>>      my $total = $used + $free;
>>
>> Thinking about this again, I don't think continuing to use 'stored' is
>> best after all, because that is before compression. And this is where
>> the mismatch really comes from AFAICT. For highly compressible data, the
>> mismatch between actual usage on the storage and 'stored' can be very
>> big (in a quick test using the 'yes' command to fill an RBD image, I got
>> stored = 2 * (used / replication_count)). And here in the storage stats
>> we are interested in the usage on the storage, not the actual amount of
>> data written by the user. For ZFS we also don't use 'logicalused', but
>> 'used'.
> 
> but for ZFS, we actually use the "logical" view provided by `zfs list/get`,
> not the "physical" view provided by `zpool list/get` (and even the latter
> would already account for redundancy).

But that is not the same logical view as 'logicalused' which would not
consider compression.

> 
> e.g., with a testpool consisting of three mirrored vdevs of size 1G, with
> a single dataset filled with a file with 512MB of random data:
> 
> $ zpool list -v testpool
> NAME                 SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> testpool             960M   513M   447M        -         -    42%    53%  1.00x  ONLINE  -
>   mirror-0           960M   513M   447M        -         -    42%  53.4%      -  ONLINE
>     /tmp/vdev1.img     1G      -      -        -         -      -      -      -  ONLINE
>     /tmp/vdev2.img     1G      -      -        -         -      -      -      -  ONLINE
>     /tmp/vdev3.img     1G      -      -        -         -      -      -      -  ONLINE
> 
> and what we use for the storage status:
> 
> $ zfs get available,used testpool/data
> NAME           PROPERTY   VALUE  SOURCE
> testpool/data  available  319M   -
> testpool/data  used       512M   -
> 
> if we switch away from `stored`, we'd have to account for replication
> ourselves to match that, right? and we don't have that information
> readily available (and also no idea how to handle EC pools?)? wouldn't
> we just exchange one wrong set of numbers with another (differently)
> wrong set of numbers?

I would've used avail_raw / max_avail to calculate the replication
factor and apply that to bytes_used. Sure it won't be perfect, but it
should lead to matching the percent_used reported by Ceph:

percent_used = used_bytes / (used_bytes + avail_raw)
max_avail = avail_raw / rep
(rep is called raw_used_rate in Ceph source, but I'm shortening it for readability)

Thus:
rep = avail_raw / max_avail

our_used = used_bytes / rep
our_avail = max_avail = avail_raw / rep

our_percentage = our_used / (our_used + our_avail) =
(used_bytes/rep) / (used_bytes/rep + avail_raw/rep) =
then canceling rep
= used_bytes / (used_bytes + avail_raw) = percent_used from Ceph

The point is that it'd be much better than not considering compression.

> 
> FWIW, we already provide raw numbers in the pool view, and could maybe
> expand that view to provide more details?
> 
> e.g., for my test rbd pool the pool view shows 50,29% used amounting to
> 163,43GiB, whereas the storage status says 51.38% used amounting to
> 61.11GB of 118.94GB, with the default 3/2 replication
> 
> ceph df detail says:
> 
>     {
>         "name": "rbd",
>         "id": 2,
>         "stats": {
>             "stored": 61108710142,                 => /1000/1000/1000 == storage used

But this is not really "storage used". This is the amount of user data,
before compression. The actual usage on the storage can be much lower
than this.

>             "stored_data": 61108699136,
>             "stored_omap": 11006,
>             "objects": 15579,
>             "kb_used": 171373017,
>             "bytes_used": 175485968635,            => /1024/1024/1024 == pool used
>             "data_bytes_used": 175485935616,
>             "omap_bytes_used": 33019,
>             "percent_used": 0.5028545260429382,    => rounded this is the pool view percentage
>             "max_avail": 57831211008,              => (this + stored)/1000/1000/1000 storage total
>             "quota_objects": 0,
>             "quota_bytes": 0,
>             "dirty": 0,
>             "rd": 253354,
>             "rd_bytes": 38036885504,
>             "wr": 75833,
>             "wr_bytes": 33857918976,
>             "compress_bytes_used": 0,
>             "compress_under_bytes": 0,
>             "stored_raw": 183326130176,
>             "avail_raw": 173493638191
>         }
>     },
> 
> 
>> From src/osd/osd_types.h:
>>
>>> int64_t data_stored = 0;                 ///< Bytes actually stored by the user
>>> int64_t data_compressed = 0;             ///< Bytes stored after compression
>>> int64_t data_compressed_allocated = 0;   ///< Bytes allocated for compressed data
>>> int64_t data_compressed_original = 0;    ///< Bytes that were compressed

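A minimal sketch of what the proposed calculation could look like, derived from
the formulas above and fed with the example numbers from this thread (this only
illustrates the idea, it is not the actual plugin code; the division-by-zero
guard is an added assumption):

#!/usr/bin/perl
use strict;
use warnings;

# Pool stats as returned by 'ceph df detail', here the example numbers from this thread.
my $stats = {
    bytes_used => 175485968635,
    avail_raw  => 173493638191,
    max_avail  => 57831211008,
};

# Replication factor ('raw_used_rate' in the Ceph source), derived from the
# ratio of raw to non-raw available space.
my $rep = $stats->{max_avail} ? $stats->{avail_raw} / $stats->{max_avail} : 1;

my $used  = $stats->{bytes_used} / $rep;    # usage on the storage, compression included
my $free  = $stats->{max_avail};
my $total = $used + $free;

printf "used %.2f%% of %d bytes\n", 100 * $used / $total, $total;
# ~50.29%, matching Ceph's percent_used for these numbers
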
* Re: [pve-devel] [PATCH storage 2/2] rbd plugin: status: explain why percentage value can be different from Ceph
From: Fabian Grünbichler @ 2025-05-14 11:07 UTC
To: Fiona Ebner, Proxmox VE development discussion

> Fiona Ebner <f.ebner@proxmox.com> wrote on 14.05.2025 at 11:31 CEST:
> 
> 
> On 14.05.25 at 11:06, Fabian Grünbichler wrote:
> >> Fiona Ebner <f.ebner@proxmox.com> wrote on 14.05.2025 at 10:22 CEST:
> >>
> >>
> >> On 13.05.25 at 15:31, Fiona Ebner wrote:
> >>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> >>> ---
> >>>  src/PVE/Storage/RBDPlugin.pm | 6 ++++++
> >>>  1 file changed, 6 insertions(+)
> >>>
> >>> diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
> >>> index 154fa00..b56f8e4 100644
> >>> --- a/src/PVE/Storage/RBDPlugin.pm
> >>> +++ b/src/PVE/Storage/RBDPlugin.pm
> >>> @@ -703,6 +703,12 @@ sub status {
> >>>
> >>>      # max_avail -> max available space for data w/o replication in the pool
> >>>      # stored -> amount of user data w/o replication in the pool
> >>> +    # NOTE These values are used because they are most natural from a user perspective.
> >>> +    # However, the %USED/percent_used value in Ceph is calculated from values before factoring out
> >>> +    # replication, namely 'bytes_used / (bytes_used + avail_raw)'. In certain setups, e.g. with LZ4
> >>> +    # compression, this percentage can be noticeably different from the percentage
> >>> +    # 'stored / (stored + max_avail)' shown in the Proxmox VE CLI/UI. See also src/mon/PGMap.cc from
> >>> +    # the Ceph source code, which also mentions that 'stored' is an approximation.
> >>>      my $free = $d->{stats}->{max_avail};
> >>>      my $used = $d->{stats}->{stored};
> >>>      my $total = $used + $free;
> >>
> >> Thinking about this again, I don't think continuing to use 'stored' is
> >> best after all, because that is before compression. And this is where
> >> the mismatch really comes from AFAICT. For highly compressible data, the
> >> mismatch between actual usage on the storage and 'stored' can be very
> >> big (in a quick test using the 'yes' command to fill an RBD image, I got
> >> stored = 2 * (used / replication_count)). And here in the storage stats
> >> we are interested in the usage on the storage, not the actual amount of
> >> data written by the user. For ZFS we also don't use 'logicalused', but
> >> 'used'.
> > 
> > but for ZFS, we actually use the "logical" view provided by `zfs list/get`,
> > not the "physical" view provided by `zpool list/get` (and even the latter
> > would already account for redundancy).
> 
> But that is not the same logical view as 'logicalused' which would not
> consider compression.

yes!

> > e.g., with a testpool consisting of three mirrored vdevs of size 1G, with
> > a single dataset filled with a file with 512MB of random data:
> > 
> > $ zpool list -v testpool
> > NAME                 SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> > testpool             960M   513M   447M        -         -    42%    53%  1.00x  ONLINE  -
> >   mirror-0           960M   513M   447M        -         -    42%  53.4%      -  ONLINE
> >     /tmp/vdev1.img     1G      -      -        -         -      -      -      -  ONLINE
> >     /tmp/vdev2.img     1G      -      -        -         -      -      -      -  ONLINE
> >     /tmp/vdev3.img     1G      -      -        -         -      -      -      -  ONLINE
> > 
> > and what we use for the storage status:
> > 
> > $ zfs get available,used testpool/data
> > NAME           PROPERTY   VALUE  SOURCE
> > testpool/data  available  319M   -
> > testpool/data  used       512M   -
> > 
> > if we switch away from `stored`, we'd have to account for replication
> > ourselves to match that, right? and we don't have that information
> > readily available (and also no idea how to handle EC pools?)? wouldn't
> > we just exchange one wrong set of numbers with another (differently)
> > wrong set of numbers?
> 
> I would've used avail_raw / max_avail to calculate the replication
> factor and apply that to bytes_used. Sure it won't be perfect, but it
> should lead to matching the percent_used reported by Ceph:

that seems to work, even though `rep` has unexpected values for (some?)
EC pools, but that already affects the current calculation as well since
we use max_avail there (unless it's avail_raw that's wrong?).

e.g., for k=2 m=4 (expected overhead 3x) rep was 2.43 for my pool

for k=4 m=2 (expected overhead 1.5x) rep is 1.499999 when empty,
avail_raw 175736710881 vs max_avail 117157814272. after I've moved an
8GB volume to it, avail_raw is 164205461484 and max_avail 109470310400
(still 1.499999), but the old avail_raw minus the new bytes_used doesn't
add up to the new avail_raw (probably because of additional metadata
usage in the non-EC pool?).

for k=2 m=2 (expected overhead 2x) it also seems to check out.

with a fresh k=2 m=4 pool when empty it starts off with 2.999999, and
while transferring the volume the value fluctuates:

133680923334/46138830848 2.89736
132897417462/44299137024 3.00000

so no idea what this means for a busy ceph cluster :-P

the numbers here will be quite weird anyway as soon as you have more
than one pool, similar to how multiple ZFS datasets will influence each
other's available space unless you have reservations/quotas in place..

the only remaining downside I see with your proposed approach is that
the used value might fluctuate in case of OSD failures

e.g.:

    {
        "name": "rbd",
        "id": 2,
        "stats": {
            "stored": 65403557916,
            "stored_data": 65403547648,
            "stored_omap": 10268,
            "objects": 16790,
            "kb_used": 191588275,
            "bytes_used": 196186392661,
            "data_bytes_used": 196186361856,
            "omap_bytes_used": 30805,
            "percent_used": 0.5661442279815674,
            "max_avail": 50114789376,
            "quota_objects": 0,
            "quota_bytes": 0,
            "dirty": 0,
            "rd": 334578,
            "rd_bytes": 64360384512,
            "wr": 140721,
            "wr_bytes": 55799670784,
            "compress_bytes_used": 0,
            "compress_under_bytes": 0,
            "stored_raw": 196210688000,
            "avail_raw": 150344373357
        }
    },

with one node down:

        "name": "rbd",
        "id": 2,
        "stats": {
            "stored": 93193132327,
            "stored_data": 93193117696,
            "stored_omap": 14631,
            "objects": 16790,
            "kb_used": 191588275,
            "bytes_used": 196186392661,
            "data_bytes_used": 196186361856,
            "omap_bytes_used": 30805,
            "percent_used": 0.5661342740058899,
            "max_avail": 71411171328,
            "quota_objects": 0,
            "quota_bytes": 0,
            "dirty": 0,
            "rd": 333857,
            "rd_bytes": 63033369600,
            "wr": 140721,
            "wr_bytes": 55799670784,
            "compress_bytes_used": 0,
            "compress_under_bytes": 0,
            "stored_raw": 196210671616,
            "avail_raw": 150350502408
        }
    },

with the node back up:

        "name": "rbd",
        "id": 2,
        "stats": {
            "stored": 65403557478,
            "stored_data": 65403547648,
            "stored_omap": 9830,
            "objects": 16790,
            "kb_used": 191588273,
            "bytes_used": 196186391346,
            "data_bytes_used": 196186361856,
            "omap_bytes_used": 29490,
            "percent_used": 0.5661147832870483,
            "max_avail": 50120798208,
            "quota_objects": 0,
            "quota_bytes": 0,
            "dirty": 0,
            "rd": 333857,
            "rd_bytes": 63033369600,
            "wr": 140721,
            "wr_bytes": 55799670784,
            "compress_bytes_used": 0,
            "compress_under_bytes": 0,
            "stored_raw": 196210671616,
            "avail_raw": 150362387436
        }
    },

but it's actually "stored" and "max_avail" that get messed up in that
case, so the existing values are wrong already anyway, as taking OSDs
down will cause a bump in the used and total value :-/

> our_used = used_bytes / rep
> our_avail = max_avail = avail_raw / rep

but this would still be wrong, as rep and max_avail would change with
OSDs going down, but used_bytes remains the same..

e.g. in the example above, used would (still) jump from 61G to 86G to 61G

what a mess

> our_percentage = our_used / (our_used + our_avail) =
> (used_bytes/rep) / (used_bytes/rep + avail_raw/rep) =
> then canceling rep
> = used_bytes / (used_bytes + avail_raw) = percent_used from Ceph
> 
> The point is that it'd be much better than not considering compression.
> 
> > 
> > FWIW, we already provide raw numbers in the pool view, and could maybe
> > expand that view to provide more details?
> > 
> > e.g., for my test rbd pool the pool view shows 50,29% used amounting to
> > 163,43GiB, whereas the storage status says 51.38% used amounting to
> > 61.11GB of 118.94GB, with the default 3/2 replication
> > 
> > ceph df detail says:
> > 
> >     {
> >         "name": "rbd",
> >         "id": 2,
> >         "stats": {
> >             "stored": 61108710142,                 => /1000/1000/1000 == storage used
> 
> But this is not really "storage used". This is the amount of user data,
> before compression. The actual usage on the storage can be much lower
> than this.
> 
> >             "stored_data": 61108699136,
> >             "stored_omap": 11006,
> >             "objects": 15579,
> >             "kb_used": 171373017,
> >             "bytes_used": 175485968635,            => /1024/1024/1024 == pool used
> >             "data_bytes_used": 175485935616,
> >             "omap_bytes_used": 33019,
> >             "percent_used": 0.5028545260429382,    => rounded this is the pool view percentage
> >             "max_avail": 57831211008,              => (this + stored)/1000/1000/1000 storage total
> >             "quota_objects": 0,
> >             "quota_bytes": 0,
> >             "dirty": 0,
> >             "rd": 253354,
> >             "rd_bytes": 38036885504,
> >             "wr": 75833,
> >             "wr_bytes": 33857918976,
> >             "compress_bytes_used": 0,
> >             "compress_under_bytes": 0,
> >             "stored_raw": 183326130176,
> >             "avail_raw": 173493638191
> >         }
> >     },
> > 
> > 
> >> From src/osd/osd_types.h:
> >>
> >>> int64_t data_stored = 0;                 ///< Bytes actually stored by the user
> >>> int64_t data_compressed = 0;             ///< Bytes stored after compression
> >>> int64_t data_compressed_allocated = 0;   ///< Bytes allocated for compressed data
> >>> int64_t data_compressed_original = 0;    ///< Bytes that were compressed

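To make the fluctuation concrete, a small sketch recomputing the relevant values
from the three snapshots quoted above (again just the arithmetic discussed in the
thread, not plugin code):

#!/usr/bin/perl
use strict;
use warnings;

# (bytes_used, avail_raw, max_avail, stored) from the three snapshots above:
# healthy, one node down, node back up.
my @snapshots = (
    [196186392661, 150344373357, 50114789376, 65403557916],
    [196186392661, 150350502408, 71411171328, 93193132327],
    [196186391346, 150362387436, 50120798208, 65403557478],
);

for my $s (@snapshots) {
    my ($bytes_used, $avail_raw, $max_avail, $stored) = @$s;
    my $rep = $avail_raw / $max_avail;
    printf "rep %.2f: bytes_used/rep = %.1f GiB, stored = %.1f GiB\n",
        $rep, $bytes_used / $rep / 1024**3, $stored / 1024**3;
}
# Both the proposed bytes_used/rep and the current 'stored' go from roughly
# 61 GiB to roughly 87 GiB and back as the node goes down and comes up again,
# i.e. the jump described above.
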