From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Stefan Hanreich <s.hanreich@proxmox.com>
Cc: pdm-devel@lists.proxmox.com
Subject: Re: [pdm-devel] [PATCH proxmox 1/3] pve-api-types: add FixedString type
Date: Wed, 12 Nov 2025 11:50:01 +0100 [thread overview]
Message-ID: <azh3ifpvho4xs4qqti6wbmkjn5atenshwbap3ycijv3mx3badm@uxj5bjp33xx3> (raw)
In-Reply-To: <20251112092225.17890-2-s.hanreich@proxmox.com>
On Wed, Nov 12, 2025 at 10:22:05AM +0100, Stefan Hanreich wrote:
> A custom, immutable, string type, that is copiable, which can hold up
> to 23 characters. It will be used later for storing the name of
> unknown enum variants when parsing the return value of enum
> properties. The main reason for introducing this type is that its copy
> as opposed to String, which lets us keep the Copy implementation for
> the existing enums.
>
> Main reason for choosing 23 characters is the fact that String is 24
> bytes large as well (ptr, len, capacity).
>
> Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
> ---
> pve-api-types/src/types/fixed_string.rs | 331 ++++++++++++++++++++++++
> pve-api-types/src/types/mod.rs | 3 +
> 2 files changed, 334 insertions(+)
> create mode 100644 pve-api-types/src/types/fixed_string.rs
>
> diff --git a/pve-api-types/src/types/fixed_string.rs b/pve-api-types/src/types/fixed_string.rs
> new file mode 100644
> index 00000000..f1692227
> --- /dev/null
> +++ b/pve-api-types/src/types/fixed_string.rs
> @@ -0,0 +1,331 @@
> +use std::cmp::Ordering;
> +use std::error::Error;
> +use std::fmt::{Debug, Display, Formatter};
> +use std::ops::Deref;
> +use std::str::FromStr;
> +
> +use serde::{Deserialize, Serialize};
> +
> +/// Error type used by constructors of [`FixedString`]
> +#[derive(Clone, Copy, Debug)]
> +pub enum FixedStringError {
> + TooLong,
> +}
IMO this does not need to be an enum. `pub struct TooLongError` should
be enough.
> +
> +impl Error for FixedStringError {}
> +
> +impl Display for FixedStringError {
> + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
^ you're importing `Formatter`.
And on that note: personally I prefer to import `std::fmt` and prefix
its items with `fmt::*`.
> + f.write_str(match self {
> + Self::TooLong => "string is longer than 23 characters",
> + })
> + }
> +}
> +
> +/// An immutable string type with a maximum size of 23 bytes.
> +///
> +/// After construction it is guaranteed that its contents are:
> +/// * valid utf-8
> +/// * not longer than 23 characters
> +///
> +/// FixedString is immutable, therefore it is sufficient to validate the invariants only at
> +/// construction time to guarantee that they will always hold during the lifecycle of the
> +/// struct.
> +#[derive(Clone, Copy)]
> +pub struct FixedString {
> + buf: [u8; 23],
> + len: u8,
> +}
> +
> +impl FixedString {
> + /// Creates a new FixedString instance from a str reference.
> + ///
> + /// # Errors
> + /// This function will return an error if:
> + /// * The passed string is longer than 23 bytes
> + pub fn new(value: &str) -> Result<Self, FixedStringError> {
> + if value.len() > 23 {
> + return Err(FixedStringError::TooLong);
> + }
> +
> + let mut buf = [0; 23];
> + buf[..value.len()].copy_from_slice(value.as_bytes());
> +
> + Ok(Self {
> + buf,
> + // SAFETY: self.len is at least 0 and at most 23, which fits into u8
> + len: value.len() as u8,
> + })
> + }
> +
> + /// Creates a new FixedString instance from a str reference, truncating if required
> + ///
> + /// It works by copying the bytes from the supplied value into its own local buffer. If the
> + /// string is longer than 23 bytes, then this methods will truncate the string at the nearest
> + /// utf-8 boundary, so even if a string longer than 23 characters is passed in, it is possible
> + /// that it gets truncated to a length less than 23 if the utf-8 boundary doesn't perfectly
> + /// align with the 23 bytes limit.
> + pub fn new_truncated(value: &str) -> Self {
I don't think we should keep this unless we also have a way of checking
for this.
In case we do want to use this: since values up to 31 (a potential
still-cheap-enough extension to this `FixedString`'s length (since it
would then just be 4 pointers in size)) only require 5 bits, we have 3
bits remaining in the length byte which could be used to signal that
this was a truncated string, which could be returned via a
`.is_truncated()` method, and we could refuse to serialize such a string
without calling some extra method first (and include some marker in the
Debug impl)?
But IMO: for now, we just don't need it.
> + // TODO: replace, once [`str::floor_char_boundary`] has been stabilized
> + let len = if value.len() > 23 {
> + let mut index = 23;
> +
> + while index > 0 && !value.is_char_boundary(index) {
> + index = index - 1;
index -= 1; ;-)
or:
panic!("Who let Dominik add emojis to enums?")
> + }
> +
> + index
> + } else {
> + value.len()
> + };
^ Actually the above could just be shortened to
let mut len = value.len();
if len > 23 {
len = 23;
while ... { len -= 1 }
}
> +
> + let mut buf = [0; 23];
> + buf[..len].copy_from_slice(&value.as_bytes()[..len]);
> +
> + Self {
> + buf,
> + // SAFETY: self.len is at least 0 and at most 23, which fits into u8
> + len: len as u8,
> + }
> + }
> +
> + /// Returns a str reference to the stored data
> + #[inline]
> + pub fn as_str(&self) -> &str {
> + // SAFETY: self.buf must be a valid utf-8 string by construction
> + unsafe { str::from_utf8_unchecked(self.as_bytes()) }
> + }
> +
> + /// Returns a reference to the set bytes in the stored buffer
> + #[inline]
> + pub fn as_bytes(&self) -> &[u8] {
> + // SAFETY: self.len >= 0 and self.len <= 23 by construction
> + unsafe { self.buf.get_unchecked(..self.len as usize) }
> + }
> +}
> +
> +impl PartialEq for FixedString {
IMO - regarding ordering where it matters - it would be easier to read
such impls when they're purely forwarded, eg.:
PartialEq::eq(self.as_str(), other)
since then all implementations look the same (no need for `Some()`
wrapping in the `PartialOrd` impl, for instance.
Also, it could be generated with a simple forwarding macro which would
IMO also make review easier than having to go through each of them
individually:
macro_rules! forward_impl_to_bytes {
($($trait:ident {$fn:ident -> $out:ty })+) => {
$(
impl $trait for FixedString {
fn $fn(&self, other: &Self) -> $out {
<[u8] as $trait>::$fn(self.as_bytes(), other.as_bytes())
}
}
)+
};
}
forward_impl_to_bytes! {
PartialEq { eq -> bool }
PartialOrd { partial_cmp -> Option<Ordering> }
Ord { cmp -> Ordering }
}
And for the ones below:
macro_rules! forward_impl_to_str_bidir {
($($trait:ident {$fn:ident -> $out:ty })+) => {
$(
impl $trait<str> for FixedString {
fn $fn(&self, other: &str) -> $out {
<str as $trait>::$fn(self.as_str(), other)
}
}
impl $trait<FixedString> for &str {
fn $fn(&self, other: &FixedString) -> $out {
<str as $trait>::$fn(self, other.as_str())
}
}
)+
};
}
forward_impl_to_str_bidir! {
PartialEq { eq -> bool }
PartialOrd { partial_cmp -> Option<Ordering> }
}
Yes I realize that for 2 traits this doesn't save much, but it would
make the intention clearer and IMO is still easier to review.
> + #[inline]
> + fn eq(&self, other: &FixedString) -> bool {
> + self.as_bytes() == other.as_bytes()
> + }
> +}
> +
> +impl Eq for FixedString {}
> +
> +impl PartialOrd for FixedString {
> + fn partial_cmp(&self, other: &FixedString) -> Option<Ordering> {
> + Some(self.cmp(other))
> + }
> +}
> +
> +impl Ord for FixedString {
> + #[inline]
> + fn cmp(&self, other: &FixedString) -> Ordering {
> + self.as_bytes().cmp(other.as_bytes())
> + }
> +}
> +
> +impl Display for FixedString {
> + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
> + f.write_str(self.as_str())
Consider forwarding to &str's fmt instead so that things such as
alignment etc. work:
Display::fmt(self.as_str(), f)
so that `println!("{the_enum:>8}")` will print " foo" instead of
just "foo", and `{x:0^8}` produces "00foo000" ;-)
> + }
> +}
> +
> +impl Debug for FixedString {
> + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
> + f.write_str(self.as_str())
Same here with `Debug::fmt(self.as_str(), f)`.
> + }
> +}
> +
> +impl Deref for FixedString {
> + type Target = str;
> +
> + fn deref(&self) -> &str {
> + self.as_str()
> + }
> +}
> +
> +impl AsRef<str> for FixedString {
> + fn as_ref(&self) -> &str {
> + self.as_str()
> + }
> +}
I'd also add `AsRef<[u8]>`, and `Borrow<str>`. (The latter would be
required to allow a `HashMap<FixedString, T>` to use `.get("key")` with
`&str` for a key - not that that's something we'd expect to see...
> +
> +impl TryFrom<String> for FixedString {
> + type Error = FixedStringError;
> +
> + fn try_from(value: String) -> Result<Self, Self::Error> {
> + FixedString::new(value.as_str())
> + }
> +}
> +
> +impl FromStr for FixedString {
> + type Err = FixedStringError;
> +
> + fn from_str(value: &str) -> Result<Self, Self::Err> {
> + FixedString::new(value)
> + }
> +}
> +
> +impl TryFrom<&str> for FixedString {
> + type Error = FixedStringError;
> +
> + fn try_from(value: &str) -> Result<Self, Self::Error> {
> + FixedString::new(value)
> + }
> +}
> +
> +impl PartialEq<str> for FixedString {
> + fn eq(&self, other: &str) -> bool {
> + self.as_str().eq(other)
> + }
> +}
> +
> +impl PartialEq<FixedString> for &str {
> + fn eq(&self, other: &FixedString) -> bool {
> + self.eq(&other.as_str())
> + }
> +}
> +
> +impl PartialOrd<str> for FixedString {
> + fn partial_cmp(&self, other: &str) -> Option<Ordering> {
> + Some(self.as_str().cmp(&other))
> + }
> +}
> +
> +impl PartialOrd<FixedString> for &str {
> + fn partial_cmp(&self, other: &FixedString) -> Option<Ordering> {
> + Some(self.cmp(&other.as_str()))
> + }
> +}
> +
> +impl Serialize for FixedString {
> + fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
> + where
> + S: serde::Serializer,
> + {
> + serializer.serialize_str(self.as_str())
> + }
> +}
> +
> +impl<'de> Deserialize<'de> for FixedString {
> + fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
> + where
> + D: serde::Deserializer<'de>,
> + {
> + deserializer.deserialize_str(FixedStringVisitor)
> + }
> +}
> +
> +struct FixedStringVisitor;
^ I'd prefer to move the type and Visitor impl into the `fn deserialize`.
> +
> +impl<'de> serde::de::Visitor<'de> for FixedStringVisitor {
> + type Value = FixedString;
> +
> + fn expecting(&self, f: &mut Formatter) -> std::fmt::Result {
> + f.write_str("a string that is at most 23 bytes long")
> + }
> +
> + fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
> + where
> + E: serde::de::Error,
> + {
> + v.try_into().map_err(E::custom)
> + }
> +
> + fn visit_string<E>(self, v: String) -> Result<Self::Value, E>
> + where
> + E: serde::de::Error,
> + {
^ You can drop this. `visit_string` and `visit_borrowed_str` by default
forward to `visit_str`.
> + v.try_into().map_err(E::custom)
> + }
> +}
> +
> +#[cfg(test)]
> +mod tests {
> + use super::*;
> +
> + use serde_plain;
> +
> + #[test]
> + fn test_construct() {
> + let fixed_string = FixedString::new("").expect("empty string is valid");
> + assert_eq!("", fixed_string);
> +
> + let fixed_string = FixedString::new("a").expect("valid string");
> + assert_eq!("a", fixed_string);
> +
> + let fixed_string = FixedString::new("🌏🌏🌏🌏🌏").expect("valid string");
> + assert_eq!("🌏🌏🌏🌏🌏", fixed_string);
> +
> + let fixed_string =
> + FixedString::new("aaaaaaaaaaaaaaaaaaaaaaa").expect("23 characters are allowed");
> + assert_eq!("aaaaaaaaaaaaaaaaaaaaaaa", fixed_string);
> +
> + FixedString::new("🌏🌏🌏🌏🌏🌏").expect_err("string too long");
> + FixedString::new("aaaaaaaaaaaaaaaaaaaaaaaa").expect_err("string too long");
> + }
> +
> + #[test]
> + fn test_serialize_deserialize() {
> + let valid_string = "aaaaaaaaaaaaaaaaaaaaaaa";
> +
> + let fixed_string: FixedString =
> + serde_plain::from_str("aaaaaaaaaaaaaaaaaaaaaaa").expect("deserialization works");
> + assert_eq!(valid_string, fixed_string);
> +
> + let serialized_string =
> + serde_plain::to_string(&fixed_string).expect("can be serialized into a string");
> + assert_eq!(valid_string, serialized_string);
> +
> + serde_plain::from_str::<FixedString>("aaaaaaaaaaaaaaaaaaaaaaaa")
> + .expect_err("cannot deserialize string that is too long");
> + }
> +
> + #[test]
> + fn test_ord() {
> + let fixed_string = FixedString::new("abc").expect("valid string");
> +
> + assert!(fixed_string == fixed_string);
> + assert!(fixed_string >= fixed_string);
> + assert!(fixed_string <= fixed_string);
> +
> + assert!("ab" < fixed_string);
> + assert!("abc" == fixed_string);
> + assert!("abcd" > fixed_string);
> +
> + let larger_fixed_string = FixedString::new("abcde").expect("valid string");
> +
> + assert!(larger_fixed_string > fixed_string);
> + assert!(fixed_string < larger_fixed_string);
> + }
> +
> + #[test]
> + fn test_truncate() {
> + let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaaaaaaaaa");
> + assert_eq!("aaaaaaaaaaaaaaaaaaaaaaa", fixed_string);
> +
> + let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaa💯");
> + assert_eq!("aaaaaaaaaaaaaaaaaa💯", fixed_string);
> +
> + // '💯' is 4 bytes long
> + // first byte of '💯' is inside the 23 byte limit, but it's not a valid utf-8 boundary so
> + // it should get truncated
> + let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaa💯");
> + assert_eq!("aaaaaaaaaaaaaaaaaaaaa", fixed_string);
> +
> + let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaaa💯");
> + assert_eq!("aaaaaaaaaaaaaaaaaaaaaa", fixed_string);
> +
> + let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaaaa💯");
> + assert_eq!("aaaaaaaaaaaaaaaaaaaaaaa", fixed_string);
> +
> + // '🌏' is 4 bytes long
> + let fixed_string = FixedString::new_truncated("🌏🌏🌏🌏🌏");
> + assert_eq!("🌏🌏🌏🌏🌏", fixed_string);
> +
> + let fixed_string = FixedString::new_truncated("🌏🌏🌏🌏🌏🌏");
> + assert_eq!("🌏🌏🌏🌏🌏", fixed_string);
> + }
> +}
> diff --git a/pve-api-types/src/types/mod.rs b/pve-api-types/src/types/mod.rs
> index fe52a169..9653dd46 100644
> --- a/pve-api-types/src/types/mod.rs
> +++ b/pve-api-types/src/types/mod.rs
> @@ -17,6 +17,9 @@ pub mod array;
> pub mod stringlist;
> pub mod verifiers;
>
> +mod fixed_string;
> +pub use fixed_string::{FixedString, FixedStringError};
> +
> include!("../generated/types.rs");
>
> /// A PVE Upid, contrary to a PBS Upid, contains no 'task-id' number.
> --
> 2.47.3
_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel
next prev parent reply other threads:[~2025-11-12 10:49 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-12 9:22 [pdm-devel] [RFC proxmox{, -yew-comp, -datacenter-manager}/yew-mobile-gui 0/6] Add fallback variant to enum properties in Proxmox VE Rust API types Stefan Hanreich
2025-11-12 9:22 ` [pdm-devel] [PATCH proxmox 1/3] pve-api-types: add FixedString type Stefan Hanreich
2025-11-12 10:50 ` Wolfgang Bumiller [this message]
2025-11-12 16:39 ` Stefan Hanreich
2025-11-12 9:22 ` [pdm-devel] [PATCH proxmox 2/3] pve-api-types: generate fallback variant for enums Stefan Hanreich
2025-11-12 9:22 ` [pdm-devel] [PATCH proxmox 3/3] pve-api-types: regenerate Stefan Hanreich
2025-11-12 9:22 ` [pdm-devel] [PATCH proxmox-yew-comp 1/1] pve: qemu: handle fallback enum variants Stefan Hanreich
2025-11-12 9:22 ` [pdm-devel] [PATCH proxmox-datacenter-manager 1/1] tree-wide: handle new unknown " Stefan Hanreich
2025-11-12 12:25 ` Lukas Wagner
2025-11-12 9:22 ` [pdm-devel] [PATCH pve-yew-mobile-gui 1/1] tree-wide: handle fallback enum values Stefan Hanreich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=azh3ifpvho4xs4qqti6wbmkjn5atenshwbap3ycijv3mx3badm@uxj5bjp33xx3 \
--to=w.bumiller@proxmox.com \
--cc=pdm-devel@lists.proxmox.com \
--cc=s.hanreich@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox