all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Stefan Hanreich <s.hanreich@proxmox.com>
To: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: pdm-devel@lists.proxmox.com
Subject: Re: [pdm-devel] [PATCH proxmox 1/3] pve-api-types: add FixedString type
Date: Wed, 12 Nov 2025 17:39:46 +0100	[thread overview]
Message-ID: <0786d417-db35-4c7e-96e2-66620ad26e9c@proxmox.com> (raw)
In-Reply-To: <azh3ifpvho4xs4qqti6wbmkjn5atenshwbap3ycijv3mx3badm@uxj5bjp33xx3>

thanks for the review! Incorporated all your suggestions, waiting with
the v2 til tomorrow, in case something else comes up.

On 11/12/25 11:49 AM, Wolfgang Bumiller wrote:
> On Wed, Nov 12, 2025 at 10:22:05AM +0100, Stefan Hanreich wrote:
>> A custom, immutable, string type, that is copiable, which can hold up
>> to 23 characters. It will be used later for storing the name of
>> unknown enum variants when parsing the return value of enum
>> properties. The main reason for introducing this type is that its copy
>> as opposed to String, which lets us keep the Copy implementation for
>> the existing enums.
>>
>> Main reason for choosing 23 characters is the fact that String is 24
>> bytes large as well (ptr, len, capacity).
>>
>> Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
>> ---
>>  pve-api-types/src/types/fixed_string.rs | 331 ++++++++++++++++++++++++
>>  pve-api-types/src/types/mod.rs          |   3 +
>>  2 files changed, 334 insertions(+)
>>  create mode 100644 pve-api-types/src/types/fixed_string.rs
>>
>> diff --git a/pve-api-types/src/types/fixed_string.rs b/pve-api-types/src/types/fixed_string.rs
>> new file mode 100644
>> index 00000000..f1692227
>> --- /dev/null
>> +++ b/pve-api-types/src/types/fixed_string.rs
>> @@ -0,0 +1,331 @@
>> +use std::cmp::Ordering;
>> +use std::error::Error;
>> +use std::fmt::{Debug, Display, Formatter};
>> +use std::ops::Deref;
>> +use std::str::FromStr;
>> +
>> +use serde::{Deserialize, Serialize};
>> +
>> +/// Error type used by constructors of [`FixedString`]
>> +#[derive(Clone, Copy, Debug)]
>> +pub enum FixedStringError {
>> +    TooLong,
>> +}
> 
> IMO this does not need to be an enum. `pub struct TooLongError` should
> be enough.

also thought about it, wanted it to be potentially extendable without
too much hassle - but works for me as well. Additional error variants
are unlikely anyway..

>> +
>> +impl Error for FixedStringError {}
>> +
>> +impl Display for FixedStringError {
>> +    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
> 
> ^ you're importing `Formatter`.
> And on that note: personally I prefer to import `std::fmt` and prefix
> its items with `fmt::*`.
> 
>> +        f.write_str(match self {
>> +            Self::TooLong => "string is longer than 23 characters",
>> +        })
>> +    }
>> +}
>> +
>> +/// An immutable string type with a maximum size of 23 bytes.
>> +///
>> +/// After construction it is guaranteed that its contents are:
>> +/// * valid utf-8
>> +/// * not longer than 23 characters
>> +///
>> +/// FixedString is immutable, therefore it is sufficient to validate the invariants only at
>> +/// construction time to guarantee that they will always hold during the lifecycle of the
>> +/// struct.
>> +#[derive(Clone, Copy)]
>> +pub struct FixedString {
>> +    buf: [u8; 23],
>> +    len: u8,
>> +}
>> +
>> +impl FixedString {
>> +    /// Creates a new FixedString instance from a str reference.
>> +    ///
>> +    /// # Errors
>> +    /// This function will return an error if:
>> +    /// * The passed string is longer than 23 bytes
>> +    pub fn new(value: &str) -> Result<Self, FixedStringError> {
>> +        if value.len() > 23 {
>> +            return Err(FixedStringError::TooLong);
>> +        }
>> +
>> +        let mut buf = [0; 23];
>> +        buf[..value.len()].copy_from_slice(value.as_bytes());
>> +
>> +        Ok(Self {
>> +            buf,
>> +            // SAFETY: self.len is at least 0 and at most 23, which fits into u8
>> +            len: value.len() as u8,
>> +        })
>> +    }
>> +
>> +    /// Creates a new FixedString instance from a str reference, truncating if required
>> +    ///
>> +    /// It works by copying the bytes from the supplied value into its own local buffer. If the
>> +    /// string is longer than 23 bytes, then this methods will truncate the string at the nearest
>> +    /// utf-8 boundary, so even if a string longer than 23 characters is passed in, it is possible
>> +    /// that it gets truncated to a length less than 23 if the utf-8 boundary doesn't perfectly
>> +    /// align with the 23 bytes limit.
>> +    pub fn new_truncated(value: &str) -> Self {
> 
> I don't think we should keep this unless we also have a way of checking
> for this.
> In case we do want to use this: since values up to 31 (a potential
> still-cheap-enough extension to this `FixedString`'s length (since it
> would then just be 4 pointers in size)) only require 5 bits, we have 3
> bits remaining in the length byte which could be used to signal that
> this was a truncated string, which could be returned via a
> `.is_truncated()` method, and we could refuse to serialize such a string
> without calling some extra method first (and include some marker in the
> Debug impl)?
> 
> But IMO: for now, we just don't need it.
> 

dropped it now then

>> +        // TODO: replace, once [`str::floor_char_boundary`] has been stabilized
>> +        let len = if value.len() > 23 {
>> +            let mut index = 23;
>> +
>> +            while index > 0 && !value.is_char_boundary(index) {
>> +                index = index - 1;
> 
> index -= 1; ;-)
> 
> or:
>     panic!("Who let Dominik add emojis to enums?")
> 
>> +            }
>> +
>> +            index
>> +        } else {
>> +            value.len()
>> +        };
> 
> ^ Actually the above could just be shortened to
> 
> let mut len = value.len();
> if len > 23 {
>     len = 23;
>     while ... { len -= 1 }
> }
> 
>> +
>> +        let mut buf = [0; 23];
>> +        buf[..len].copy_from_slice(&value.as_bytes()[..len]);
>> +
>> +        Self {
>> +            buf,
>> +            // SAFETY: self.len is at least 0 and at most 23, which fits into u8
>> +            len: len as u8,
>> +        }
>> +    }
>> +
>> +    /// Returns a str reference to the stored data
>> +    #[inline]
>> +    pub fn as_str(&self) -> &str {
>> +        // SAFETY: self.buf must be a valid utf-8 string by construction
>> +        unsafe { str::from_utf8_unchecked(self.as_bytes()) }
>> +    }
>> +
>> +    /// Returns a reference to the set bytes in the stored buffer
>> +    #[inline]
>> +    pub fn as_bytes(&self) -> &[u8] {
>> +        // SAFETY: self.len >= 0 and self.len <= 23 by construction
>> +        unsafe { self.buf.get_unchecked(..self.len as usize) }
>> +    }
>> +}
>> +
>> +impl PartialEq for FixedString {
> 
> IMO - regarding ordering where it matters - it would be easier to read
> such impls when they're purely forwarded, eg.:
> 
>     PartialEq::eq(self.as_str(), other)
> 
> since then all implementations look the same (no need for `Some()`
> wrapping in the `PartialOrd` impl, for instance.

yeah, that's fair - idk why I didn't do it in the first place...

> Also, it could be generated with a simple forwarding macro which would
> IMO also make review easier than having to go through each of them
> individually:
> 
>     macro_rules! forward_impl_to_bytes {
>         ($($trait:ident {$fn:ident -> $out:ty })+) => {
>             $(
>                 impl $trait for FixedString {
>                     fn $fn(&self, other: &Self) -> $out {
>                         <[u8] as $trait>::$fn(self.as_bytes(), other.as_bytes())
>                     }
>                 }
>             )+
>         };
>     }
> 
>     forward_impl_to_bytes! {
>         PartialEq { eq -> bool }
>         PartialOrd { partial_cmp -> Option<Ordering> }
>         Ord { cmp -> Ordering }
>     }
> 
> 
> And for the ones below:
> 
>     macro_rules! forward_impl_to_str_bidir {
>         ($($trait:ident {$fn:ident -> $out:ty })+) => {
>             $(
>                 impl $trait<str> for FixedString {
>                     fn $fn(&self, other: &str) -> $out {
>                         <str as $trait>::$fn(self.as_str(), other)
>                     }
>                 }
> 
>                 impl $trait<FixedString> for &str {
>                     fn $fn(&self, other: &FixedString) -> $out {
>                         <str as $trait>::$fn(self, other.as_str())
>                     }
>                 }
>             )+
>         };
>     }
> 
>     forward_impl_to_str_bidir! {
>         PartialEq { eq -> bool }
>         PartialOrd { partial_cmp -> Option<Ordering> }
>     }
> 
> Yes I realize that for 2 traits this doesn't save much, but it would
> make the intention clearer and IMO is still easier to review.

still, this makes adding additional traits a breeze and I'm sure we
might need to do that at some point. I kept the inline for both cases
though - what do you think? makes sense imo.

>> +    #[inline]
>> +    fn eq(&self, other: &FixedString) -> bool {
>> +        self.as_bytes() == other.as_bytes()
>> +    }
>> +}
>> +
>> +impl Eq for FixedString {}
>> +
>> +impl PartialOrd for FixedString {
>> +    fn partial_cmp(&self, other: &FixedString) -> Option<Ordering> {
>> +        Some(self.cmp(other))
>> +    }
>> +}
>> +
>> +impl Ord for FixedString {
>> +    #[inline]
>> +    fn cmp(&self, other: &FixedString) -> Ordering {
>> +        self.as_bytes().cmp(other.as_bytes())
>> +    }
>> +}
>> +
>> +impl Display for FixedString {
>> +    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
>> +        f.write_str(self.as_str())
> 
> Consider forwarding to &str's fmt instead so that things such as
> alignment etc. work:
> 
>     Display::fmt(self.as_str(), f)
> 
> so that `println!("{the_enum:>8}")` will print "     foo" instead of
> just "foo", and `{x:0^8}` produces "00foo000" ;-)
> 

done

>> +    }
>> +}
>> +
>> +impl Debug for FixedString {
>> +    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
>> +        f.write_str(self.as_str())
> 
> Same here with `Debug::fmt(self.as_str(), f)`.
> 

done

>> +    }
>> +}
>> +
>> +impl Deref for FixedString {
>> +    type Target = str;
>> +
>> +    fn deref(&self) -> &str {
>> +        self.as_str()
>> +    }
>> +}
>> +
>> +impl AsRef<str> for FixedString {
>> +    fn as_ref(&self) -> &str {
>> +        self.as_str()
>> +    }
>> +}
> 
> I'd also add `AsRef<[u8]>`, and `Borrow<str>`. (The latter would be
> required to allow a `HashMap<FixedString, T>` to use `.get("key")` with
> `&str` for a key - not that that's something we'd expect to see...
> 

done

>> +
>> +impl TryFrom<String> for FixedString {
>> +    type Error = FixedStringError;
>> +
>> +    fn try_from(value: String) -> Result<Self, Self::Error> {
>> +        FixedString::new(value.as_str())
>> +    }
>> +}
>> +
>> +impl FromStr for FixedString {
>> +    type Err = FixedStringError;
>> +
>> +    fn from_str(value: &str) -> Result<Self, Self::Err> {
>> +        FixedString::new(value)
>> +    }
>> +}
>> +
>> +impl TryFrom<&str> for FixedString {
>> +    type Error = FixedStringError;
>> +
>> +    fn try_from(value: &str) -> Result<Self, Self::Error> {
>> +        FixedString::new(value)
>> +    }
>> +}
>> +
>> +impl PartialEq<str> for FixedString {
>> +    fn eq(&self, other: &str) -> bool {
>> +        self.as_str().eq(other)
>> +    }
>> +}
>> +
>> +impl PartialEq<FixedString> for &str {
>> +    fn eq(&self, other: &FixedString) -> bool {
>> +        self.eq(&other.as_str())
>> +    }
>> +}
>> +
>> +impl PartialOrd<str> for FixedString {
>> +    fn partial_cmp(&self, other: &str) -> Option<Ordering> {
>> +        Some(self.as_str().cmp(&other))
>> +    }
>> +}
>> +
>> +impl PartialOrd<FixedString> for &str {
>> +    fn partial_cmp(&self, other: &FixedString) -> Option<Ordering> {
>> +        Some(self.cmp(&other.as_str()))
>> +    }
>> +}
>> +
>> +impl Serialize for FixedString {
>> +    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
>> +    where
>> +        S: serde::Serializer,
>> +    {
>> +        serializer.serialize_str(self.as_str())
>> +    }
>> +}
>> +
>> +impl<'de> Deserialize<'de> for FixedString {
>> +    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
>> +    where
>> +        D: serde::Deserializer<'de>,
>> +    {
>> +        deserializer.deserialize_str(FixedStringVisitor)
>> +    }
>> +}
>> +
>> +struct FixedStringVisitor;
> 
> ^ I'd prefer to move the type and Visitor impl into the `fn deserialize`.
> 

done

>> +
>> +impl<'de> serde::de::Visitor<'de> for FixedStringVisitor {
>> +    type Value = FixedString;
>> +
>> +    fn expecting(&self, f: &mut Formatter) -> std::fmt::Result {
>> +        f.write_str("a string that is at most 23 bytes long")
>> +    }
>> +
>> +    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
>> +    where
>> +        E: serde::de::Error,
>> +    {
>> +        v.try_into().map_err(E::custom)
>> +    }
>> +
>> +    fn visit_string<E>(self, v: String) -> Result<Self::Value, E>
>> +    where
>> +        E: serde::de::Error,
>> +    {
> 
> ^ You can drop this. `visit_string` and `visit_borrowed_str` by default
> forward to `visit_str`.
> 

done

>> +        v.try_into().map_err(E::custom)
>> +    }
>> +}
>> +
>> +#[cfg(test)]
>> +mod tests {
>> +    use super::*;
>> +
>> +    use serde_plain;
>> +
>> +    #[test]
>> +    fn test_construct() {
>> +        let fixed_string = FixedString::new("").expect("empty string is valid");
>> +        assert_eq!("", fixed_string);
>> +
>> +        let fixed_string = FixedString::new("a").expect("valid string");
>> +        assert_eq!("a", fixed_string);
>> +
>> +        let fixed_string = FixedString::new("🌏🌏🌏🌏🌏").expect("valid string");
>> +        assert_eq!("🌏🌏🌏🌏🌏", fixed_string);
>> +
>> +        let fixed_string =
>> +            FixedString::new("aaaaaaaaaaaaaaaaaaaaaaa").expect("23 characters are allowed");
>> +        assert_eq!("aaaaaaaaaaaaaaaaaaaaaaa", fixed_string);
>> +
>> +        FixedString::new("🌏🌏🌏🌏🌏🌏").expect_err("string too long");
>> +        FixedString::new("aaaaaaaaaaaaaaaaaaaaaaaa").expect_err("string too long");
>> +    }
>> +
>> +    #[test]
>> +    fn test_serialize_deserialize() {
>> +        let valid_string = "aaaaaaaaaaaaaaaaaaaaaaa";
>> +
>> +        let fixed_string: FixedString =
>> +            serde_plain::from_str("aaaaaaaaaaaaaaaaaaaaaaa").expect("deserialization works");
>> +        assert_eq!(valid_string, fixed_string);
>> +
>> +        let serialized_string =
>> +            serde_plain::to_string(&fixed_string).expect("can be serialized into a string");
>> +        assert_eq!(valid_string, serialized_string);
>> +
>> +        serde_plain::from_str::<FixedString>("aaaaaaaaaaaaaaaaaaaaaaaa")
>> +            .expect_err("cannot deserialize string that is too long");
>> +    }
>> +
>> +    #[test]
>> +    fn test_ord() {
>> +        let fixed_string = FixedString::new("abc").expect("valid string");
>> +
>> +        assert!(fixed_string == fixed_string);
>> +        assert!(fixed_string >= fixed_string);
>> +        assert!(fixed_string <= fixed_string);
>> +
>> +        assert!("ab" < fixed_string);
>> +        assert!("abc" == fixed_string);
>> +        assert!("abcd" > fixed_string);
>> +
>> +        let larger_fixed_string = FixedString::new("abcde").expect("valid string");
>> +
>> +        assert!(larger_fixed_string > fixed_string);
>> +        assert!(fixed_string < larger_fixed_string);
>> +    }
>> +
>> +    #[test]
>> +    fn test_truncate() {
>> +        let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaaaaaaaaa");
>> +        assert_eq!("aaaaaaaaaaaaaaaaaaaaaaa", fixed_string);
>> +
>> +        let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaa💯");
>> +        assert_eq!("aaaaaaaaaaaaaaaaaa💯", fixed_string);
>> +
>> +        // '💯' is 4 bytes long
>> +        // first byte of '💯' is inside the 23 byte limit, but it's not a valid utf-8 boundary so
>> +        // it should get truncated
>> +        let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaa💯");
>> +        assert_eq!("aaaaaaaaaaaaaaaaaaaaa", fixed_string);
>> +
>> +        let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaaa💯");
>> +        assert_eq!("aaaaaaaaaaaaaaaaaaaaaa", fixed_string);
>> +
>> +        let fixed_string = FixedString::new_truncated("aaaaaaaaaaaaaaaaaaaaaaa💯");
>> +        assert_eq!("aaaaaaaaaaaaaaaaaaaaaaa", fixed_string);
>> +
>> +        // '🌏' is 4 bytes long
>> +        let fixed_string = FixedString::new_truncated("🌏🌏🌏🌏🌏");
>> +        assert_eq!("🌏🌏🌏🌏🌏", fixed_string);
>> +
>> +        let fixed_string = FixedString::new_truncated("🌏🌏🌏🌏🌏🌏");
>> +        assert_eq!("🌏🌏🌏🌏🌏", fixed_string);
>> +    }
>> +}
>> diff --git a/pve-api-types/src/types/mod.rs b/pve-api-types/src/types/mod.rs
>> index fe52a169..9653dd46 100644
>> --- a/pve-api-types/src/types/mod.rs
>> +++ b/pve-api-types/src/types/mod.rs
>> @@ -17,6 +17,9 @@ pub mod array;
>>  pub mod stringlist;
>>  pub mod verifiers;
>>  
>> +mod fixed_string;
>> +pub use fixed_string::{FixedString, FixedStringError};
>> +
>>  include!("../generated/types.rs");
>>  
>>  /// A PVE Upid, contrary to a PBS Upid, contains no 'task-id' number.
>> -- 
>> 2.47.3



_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel

  reply	other threads:[~2025-11-12 16:39 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-12  9:22 [pdm-devel] [RFC proxmox{, -yew-comp, -datacenter-manager}/yew-mobile-gui 0/6] Add fallback variant to enum properties in Proxmox VE Rust API types Stefan Hanreich
2025-11-12  9:22 ` [pdm-devel] [PATCH proxmox 1/3] pve-api-types: add FixedString type Stefan Hanreich
2025-11-12 10:50   ` Wolfgang Bumiller
2025-11-12 16:39     ` Stefan Hanreich [this message]
2025-11-12  9:22 ` [pdm-devel] [PATCH proxmox 2/3] pve-api-types: generate fallback variant for enums Stefan Hanreich
2025-11-12  9:22 ` [pdm-devel] [PATCH proxmox 3/3] pve-api-types: regenerate Stefan Hanreich
2025-11-12  9:22 ` [pdm-devel] [PATCH proxmox-yew-comp 1/1] pve: qemu: handle fallback enum variants Stefan Hanreich
2025-11-12  9:22 ` [pdm-devel] [PATCH proxmox-datacenter-manager 1/1] tree-wide: handle new unknown " Stefan Hanreich
2025-11-12 12:25   ` Lukas Wagner
2025-11-12  9:22 ` [pdm-devel] [PATCH pve-yew-mobile-gui 1/1] tree-wide: handle fallback enum values Stefan Hanreich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0786d417-db35-4c7e-96e2-66620ad26e9c@proxmox.com \
    --to=s.hanreich@proxmox.com \
    --cc=pdm-devel@lists.proxmox.com \
    --cc=w.bumiller@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal