* [PATCH proxmox-backup] fix #7303: tape: handle NUL bytes in SCSI strings better
@ 2026-02-11 14:59 Dominik Csapak
  2026-02-12 15:13 ` Fabian Grünbichler
  0 siblings, 1 reply; 4+ messages in thread
From: Dominik Csapak @ 2026-02-11 14:59 UTC (permalink / raw)
  To: pbs-devel

When dealing with ASCII fields in answers from drives and changers, we
assumed that the data is simply ascii characters padded by spaces, with
potentially a NUL byte at the end. This is indicated for example by the
IBM library documentation about the Primary Volume Tag Information[0]:

```
This is a 36 byte ASCII field that contains the cartridge bar code
label, left-adjusted and padded on the right with blanks.
```

Some changers may reverse that though, and have a NUL terminated string
followed by space padding (e.g. "FOO\0 ").

When looking into the LTO-9 SCSI reference from IBM[1], they describe an
ASCII field as follows:

```
When used to describe a field, indicates that the field contains only
ASCII printable characters (i.e., code values 20h to 7Eh) and may be
terminated with one or more ASCII null (00h) characters.
```

To be on the safe side here, limit the string to before the first NUL
byte, and then trim the result.

Added some tests to verify the desired behavior here.

0: https://www.ibm.com/docs/en/ts4500-tape-library/1.12.1?topic=storage-dvcidb0
1: https://www.ibm.com/support/pages/node/6490249

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 pbs-tape/src/sgutils2.rs | 44 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/pbs-tape/src/sgutils2.rs b/pbs-tape/src/sgutils2.rs
index 6fec6ed5c..270420ce5 100644
--- a/pbs-tape/src/sgutils2.rs
+++ b/pbs-tape/src/sgutils2.rs
@@ -697,10 +697,12 @@ impl<'a, F: AsRawFd> SgRaw<'a, F> {
 
 /// Converts SCSI ASCII text into String, trim zero and spaces
 pub fn scsi_ascii_to_string(data: &[u8]) -> String {
-    String::from_utf8_lossy(data)
-        .trim_matches(char::from(0))
-        .trim()
-        .to_string()
+    let mut view = data;
+
+    if let Some(idx) = data.iter().position(|c| *c == 0u8) {
+        view = &view[..idx];
+    }
+    String::from_utf8_lossy(view).trim().to_string()
 }
 
 /// Read SCSI Inquiry page
@@ -1012,3 +1014,37 @@ pub fn scsi_request_sense<F: AsRawFd>(file: &mut F) -> Result<RequestSenseFixed,
 
     Ok(sense)
 }
+
+#[cfg(test)]
+mod test {
+    use crate::sgutils2::scsi_ascii_to_string;
+
+    #[test]
+    fn test_scsi_ascii_to_string() {
+        fn test(input: &'static str, expected: &'static str) {
+            let output = scsi_ascii_to_string(input.as_bytes());
+            assert_eq!(&output, expected);
+        }
+
+        test("TAPE00L1", "TAPE00L1");
+        test("TAPE00L1  ", "TAPE00L1");
+        test("TAPE00L1               ", "TAPE00L1");
+        test("TAPE00L1 \0", "TAPE00L1");
+        test("TAPE00L1  \0\0", "TAPE00L1");
+        test("TAPE00L1\0", "TAPE00L1");
+        test("TAPE00L1\0 ", "TAPE00L1");
+        test("TAPE00L1\0\0  ", "TAPE00L1");
+        test("TAPE0\0L1\0\0  ", "TAPE0");
+        test("TAPE0 \0L1  ", "TAPE0");
+        test("\0TAPE00L1", "");
+        test(" TAPE00L1", "TAPE00L1");
+        test("", "");
+        test(" ", "");
+        test("  ", "");
+        test("\0", "");
+        test("\0\0", "");
+        test("\0 ", "");
+        test(" \0 ", "");
+        test("  \0\0  ", "");
+    }
+}
-- 
2.47.3






* Re: [PATCH proxmox-backup] fix #7303: tape: handle NUL bytes in SCSI strings better
  2026-02-11 14:59 [PATCH proxmox-backup] fix #7303: tape: handle NUL bytes in SCSI strings better Dominik Csapak
@ 2026-02-12 15:13 ` Fabian Grünbichler
  2026-02-12 15:29   ` Dominik Csapak
  0 siblings, 1 reply; 4+ messages in thread
From: Fabian Grünbichler @ 2026-02-12 15:13 UTC (permalink / raw)
  To: Dominik Csapak, pbs-devel

On February 11, 2026 3:59 pm, Dominik Csapak wrote:
> When dealing with ASCII fields in answers from drives and changers, we
> assumed that the data is simply ascii characters padded by spaces, with
> potentially a NUL byte at the end. This is indicated for example by the
> IBM library documentation about the Primary Volume Tag Information[0]:
> 
> ```
> This is a 36 byte ASCII field that contains the cartridge bar code
> label, left-adjusted and padded on the right with blanks.
> ```

doesn't this clash with the trimming below (before and after this
patch), which will remove whitespace from both start and end?

> Some changers may reverse that though, and have a NUL terminated string
> followed by space padding (e.g. "FOO\0 ").

what about "FOO\0BAR" ? this would now be truncated to "FOO" as well,
whereas before it was treated as "FOOBAR"?

maybe it would be safer to trim_end_matches using \0 and ' ', and then
assert the result doesn't have any bytes outside of the specified range,
and only then convert to a string?
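
A minimal sketch of that alternative, for comparison only. The name
`scsi_ascii_to_string_strict` and the `Option` return type are assumptions
made for illustration; this is not what the patch implements:

```rust
/// Hypothetical stricter variant (not the applied patch): strip trailing
/// NUL/space padding, then reject anything outside printable ASCII.
pub fn scsi_ascii_to_string_strict(data: &[u8]) -> Option<String> {
    // Find the end of the payload by skipping trailing NUL and space bytes.
    let end = data
        .iter()
        .rposition(|&b| b != 0 && b != b' ')
        .map_or(0, |idx| idx + 1);
    let payload = &data[..end];

    // Only 20h..=7Eh is allowed, so an embedded NUL makes the field invalid.
    if payload.iter().all(|b| (0x20..=0x7e).contains(b)) {
        // Printable ASCII is always valid UTF-8, so nothing gets replaced here.
        Some(String::from_utf8_lossy(payload).into_owned())
    } else {
        None
    }
}
```

With this variant "FOO \0 " yields `Some("FOO")`, while "FOO\0BAR" is rejected
with `None` instead of being truncated, and leading spaces are kept.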

what do other implementations accept here/how do they handle this?

> 
> When looking into the LTO-9 SCSI reference from IBM[1], they describe an
> ASCII field as follows:
> 
> ```
> When used to describe a field, indicates that the field contains only
> ASCII printable characters (i.e., code values 20h to 7Eh) and may be
> terminated with one or more ASCII null (00h) characters.
> ```
> 
> To be on the safe side here, limit the string to before the first NUL
> byte, and then trim the result.
> 
> Added some tests to verify the desired behavior here.
> 
> 0: https://www.ibm.com/docs/en/ts4500-tape-library/1.12.1?topic=storage-dvcidb0
> 1: https://www.ibm.com/support/pages/node/6490249
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  pbs-tape/src/sgutils2.rs | 44 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/pbs-tape/src/sgutils2.rs b/pbs-tape/src/sgutils2.rs
> index 6fec6ed5c..270420ce5 100644
> --- a/pbs-tape/src/sgutils2.rs
> +++ b/pbs-tape/src/sgutils2.rs
> @@ -697,10 +697,12 @@ impl<'a, F: AsRawFd> SgRaw<'a, F> {
>  
>  /// Converts SCSI ASCII text into String, trim zero and spaces
>  pub fn scsi_ascii_to_string(data: &[u8]) -> String {
> -    String::from_utf8_lossy(data)
> -        .trim_matches(char::from(0))
> -        .trim()
> -        .to_string()
> +    let mut view = data;
> +
> +    if let Some(idx) = data.iter().position(|c| *c == 0u8) {
> +        view = &view[..idx];
> +    }
> +    String::from_utf8_lossy(view).trim().to_string()
>  }
>  
>  /// Read SCSI Inquiry page
> @@ -1012,3 +1014,37 @@ pub fn scsi_request_sense<F: AsRawFd>(file: &mut F) -> Result<RequestSenseFixed,
>  
>      Ok(sense)
>  }
> +
> +#[cfg(test)]
> +mod test {
> +    use crate::sgutils2::scsi_ascii_to_string;
> +
> +    #[test]
> +    fn test_scsi_ascii_to_string() {
> +        fn test(input: &'static str, expected: &'static str) {
> +            let output = scsi_ascii_to_string(input.as_bytes());
> +            assert_eq!(&output, expected);
> +        }
> +
> +        test("TAPE00L1", "TAPE00L1");
> +        test("TAPE00L1  ", "TAPE00L1");
> +        test("TAPE00L1               ", "TAPE00L1");
> +        test("TAPE00L1 \0", "TAPE00L1");
> +        test("TAPE00L1  \0\0", "TAPE00L1");
> +        test("TAPE00L1\0", "TAPE00L1");
> +        test("TAPE00L1\0 ", "TAPE00L1");
> +        test("TAPE00L1\0\0  ", "TAPE00L1");
> +        test("TAPE0\0L1\0\0  ", "TAPE0");
> +        test("TAPE0 \0L1  ", "TAPE0");
> +        test("\0TAPE00L1", "");
> +        test(" TAPE00L1", "TAPE00L1");
> +        test("", "");
> +        test(" ", "");
> +        test("  ", "");
> +        test("\0", "");
> +        test("\0\0", "");
> +        test("\0 ", "");
> +        test(" \0 ", "");
> +        test("  \0\0  ", "");
> +    }
> +}
> -- 
> 2.47.3
> 
> 
> 
> 
> 
> 





* Re: [PATCH proxmox-backup] fix #7303: tape: handle NUL bytes in SCSI strings better
  2026-02-12 15:13 ` Fabian Grünbichler
@ 2026-02-12 15:29   ` Dominik Csapak
  2026-02-13  9:22     ` applied: " Fabian Grünbichler
  0 siblings, 1 reply; 4+ messages in thread
From: Dominik Csapak @ 2026-02-12 15:29 UTC (permalink / raw)
  To: Fabian Grünbichler, pbs-devel



On 2/12/26 4:13 PM, Fabian Grünbichler wrote:
> On February 11, 2026 3:59 pm, Dominik Csapak wrote:
>> When dealing with ASCII fields in answers from drives and changers, we
>> assumed that the data is simply ascii characters padded by spaces, with
>> potentially a NUL byte at the end. This is indicated for example by the
>> IBM library documentation about the Primary Volume Tag Information[0]:
>>
>> ```
>> This is a 36 byte ASCII field that contains the cartridge bar code
>> label, left-adjusted and padded on the right with blanks.
>> ```
> 
> doesn't this clash with the trimming below (before and after this
> patch), which will remove whitespace from both start and end?

well, yes, but if the label is "invalid", there is probably something
else wrong going on, so it shouldn't make much difference.

i can of course change it to 'trim_end'

> 
>> Some changers may reverse that though, and have a NUL terminated string
>> followed by space padding (e.g. "FOO\0 ").
> 
> what about "FOO\0BAR" ? this would now be truncated to "FOO" as well,
> whereas before it was treated as "FOOBAR"?

no, before it would be treated as 'FOO\0BAR' (since trim only removes it
from beginning and end, not in the middle).
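
To make the difference concrete, here are both versions side by side. The
function bodies follow the removed and added lines of the diff; the `old_` and
`new_` names are only used here for illustration:

```rust
/// Behaviour before the patch: trim NULs, then whitespace, at both ends only.
fn old_scsi_ascii_to_string(data: &[u8]) -> String {
    String::from_utf8_lossy(data)
        .trim_matches(char::from(0))
        .trim()
        .to_string()
}

/// Behaviour after the patch: cut at the first NUL, then trim whitespace.
fn new_scsi_ascii_to_string(data: &[u8]) -> String {
    let view = match data.iter().position(|&c| c == 0) {
        Some(idx) => &data[..idx],
        None => data,
    };
    String::from_utf8_lossy(view).trim().to_string()
}
```

For "FOO\0BAR" the old version returns the input unchanged, embedded NUL
included, while the new one returns "FOO". For "FOO\0 " the old version returns
"FOO\0", because the trailing space shields the NUL from `trim_matches` and the
later `trim` only removes whitespace.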

I sadly have no evidence this ever occurs, but seeing the issue in the
bug, my assumption is that the hardware simply overwrites the buffer
with its internal string representation which might be NUL terminated.

in that case i can imagine a codepath not prefilling the buffer with
spaces, leading to e.g. first writing:

'FOOBAR\0'

and afterwards

'FOO\0'

which would lead to 'FOO\0AR\0'. in this case we should only use 'FOO'..
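
A small model of that scenario, purely hypothetical since nothing is known
about the actual firmware. The helper names and the 8-byte buffer are made up
for illustration:

```rust
/// Hypothetical model of a device that copies a NUL-terminated string into
/// a reused response buffer without clearing it first.
/// Panics if `s` plus its terminator does not fit into `buf`.
fn write_c_string(buf: &mut [u8], s: &str) {
    let bytes = s.as_bytes();
    buf[..bytes.len()].copy_from_slice(bytes);
    buf[bytes.len()] = 0; // terminator; bytes after it are left untouched
}

/// Returns the bytes before the first NUL, or the whole buffer if there is none.
fn up_to_first_nul(buf: &[u8]) -> &[u8] {
    let end = buf.iter().position(|&b| b == 0).unwrap_or(buf.len());
    &buf[..end]
}
```

Writing "FOOBAR" and then "FOO" into the same buffer leaves 'FOO\0AR\0'
behind, and cutting at the first NUL recovers 'FOO'.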


> 
> maybe it would be safer to trim_end_matches using \0 and ' ', and then
> assert the result doesn't have any bytes outside of the specified range,
> and only then convert to a string?
> 
> what do other implementations accept here/how do they handle this?

The only implementation i know of is in mtx, which looks like this:

---
void copy_barcode(unsigned char *src, unsigned char *dest)
{
         int i;

         for (i=0; i < 36; i++)
         {
                 *dest = *src++;

                 if ((*dest < 32) || (*dest > 127))
                 {
                         *dest = '\0';
                 }

                 dest++;
         }
         *dest = 0; /* null-terminate */
}
---

so they overwrite all 'invalid' characters with null bytes
and use it as a C string (so it'll only read until the first 
invalid/null byte)

but they only use it for display, whereas we need it to reference our
internal barcodes, so i think the logic used here makes sense?
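
For comparison, a rough Rust rendering of what the mtx snippet effectively
yields once the destination is read as a C string. This is an illustration
only; `mtx_like_barcode` is a made-up name and not part of mtx or of this
patch:

```rust
/// Approximates the visible result of mtx's copy_barcode: consider at most
/// 36 bytes and stop at the first byte outside 32..=127, since that byte
/// becomes '\0' and ends the C string.
fn mtx_like_barcode(src: &[u8]) -> String {
    src.iter()
        .take(36)
        .take_while(|&&b| (32..=127).contains(&b))
        .map(|&b| char::from(b))
        .collect()
}
```

Unlike the function in this patch, this keeps space padding (and byte 127),
so "TAPE00L1  " stays "TAPE00L1  ", which is fine for display but not for
matching labels.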


> 
>>
>> When looking into the LTO-9 SCSI reference from IBM[1], they describe an
>> ASCII field as follows:
>>
>> ```
>> When used to describe a field, indicates that the field contains only
>> ASCII printable characters (i.e., code values 20h to 7Eh) and may be
>> terminated with one or more ASCII null (00h) characters.
>> ```
>>
>> To be on the safe side here, limit the string to before the first NUL
>> byte, and then trim the result.
>>
>> Added some tests to verify the desired behavior here.
>>
>> 0: https://www.ibm.com/docs/en/ts4500-tape-library/1.12.1?topic=storage-dvcidb0
>> 1: https://www.ibm.com/support/pages/node/6490249
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>>   pbs-tape/src/sgutils2.rs | 44 ++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 40 insertions(+), 4 deletions(-)
>>
>> diff --git a/pbs-tape/src/sgutils2.rs b/pbs-tape/src/sgutils2.rs
>> index 6fec6ed5c..270420ce5 100644
>> --- a/pbs-tape/src/sgutils2.rs
>> +++ b/pbs-tape/src/sgutils2.rs
>> @@ -697,10 +697,12 @@ impl<'a, F: AsRawFd> SgRaw<'a, F> {
>>   
>>   /// Converts SCSI ASCII text into String, trim zero and spaces
>>   pub fn scsi_ascii_to_string(data: &[u8]) -> String {
>> -    String::from_utf8_lossy(data)
>> -        .trim_matches(char::from(0))
>> -        .trim()
>> -        .to_string()
>> +    let mut view = data;
>> +
>> +    if let Some(idx) = data.iter().position(|c| *c == 0u8) {
>> +        view = &view[..idx];
>> +    }
>> +    String::from_utf8_lossy(view).trim().to_string()
>>   }
>>   
>>   /// Read SCSI Inquiry page
>> @@ -1012,3 +1014,37 @@ pub fn scsi_request_sense<F: AsRawFd>(file: &mut F) -> Result<RequestSenseFixed,
>>   
>>       Ok(sense)
>>   }
>> +
>> +#[cfg(test)]
>> +mod test {
>> +    use crate::sgutils2::scsi_ascii_to_string;
>> +
>> +    #[test]
>> +    fn test_scsi_ascii_to_string() {
>> +        fn test(input: &'static str, expected: &'static str) {
>> +            let output = scsi_ascii_to_string(input.as_bytes());
>> +            assert_eq!(&output, expected);
>> +        }
>> +
>> +        test("TAPE00L1", "TAPE00L1");
>> +        test("TAPE00L1  ", "TAPE00L1");
>> +        test("TAPE00L1               ", "TAPE00L1");
>> +        test("TAPE00L1 \0", "TAPE00L1");
>> +        test("TAPE00L1  \0\0", "TAPE00L1");
>> +        test("TAPE00L1\0", "TAPE00L1");
>> +        test("TAPE00L1\0 ", "TAPE00L1");
>> +        test("TAPE00L1\0\0  ", "TAPE00L1");
>> +        test("TAPE0\0L1\0\0  ", "TAPE0");
>> +        test("TAPE0 \0L1  ", "TAPE0");
>> +        test("\0TAPE00L1", "");
>> +        test(" TAPE00L1", "TAPE00L1");
>> +        test("", "");
>> +        test(" ", "");
>> +        test("  ", "");
>> +        test("\0", "");
>> +        test("\0\0", "");
>> +        test("\0 ", "");
>> +        test(" \0 ", "");
>> +        test("  \0\0  ", "");
>> +    }
>> +}
>> -- 
>> 2.47.3
>>
>>
>>
>>
>>
>>






* applied: [PATCH proxmox-backup] fix #7303: tape: handle NUL bytes in SCSI strings better
  2026-02-12 15:29   ` Dominik Csapak
@ 2026-02-13  9:22     ` Fabian Grünbichler
  0 siblings, 0 replies; 4+ messages in thread
From: Fabian Grünbichler @ 2026-02-13  9:22 UTC (permalink / raw)
  To: Dominik Csapak, pbs-devel

On February 12, 2026 4:29 pm, Dominik Csapak wrote:
> 
> 
> On 2/12/26 4:13 PM, Fabian Grünbichler wrote:
>> On February 11, 2026 3:59 pm, Dominik Csapak wrote:
>>> When dealing with ASCII fields in answers from drives and changers, we
>>> assumed that the data is simply ascii characters padded by spaces, with
>>> potentially a NUL byte at the end. This is indicated for example by the
>>> IBM library documentation about the Primary Volume Tag Information[0]:
>>>
>>> ```
>>> This is a 36 byte ASCII field that contains the cartridge bar code
>>> label, left-adjusted and padded on the right with blanks.
>>> ```
>> 
>> doesn't this clash with the trimming below (before and after this
>> patch), which will remove whitespace from both start and end?
> 
> well, yes, but if the label is "invalid", there is probably something
> else wrong going on, so it shouldn't make much difference.
> 
> i can of course change it to 'trim_end'
> 
>> 
>>> Some changers may reverse that though, and have a NUL terminated string
>>> followed by space padding (e.g. "FOO\0 ").
>> 
>> what about "FOO\0BAR" ? this would now be truncated to "FOO" as well,
>> whereas before it was treated as "FOOBAR"?
> 
> no, before it would be treated as 'FOO\0BAR' (since trim only removes it
> from beginning and end, not in the middle).

right, printing the str will omit the \0, but it is actually preserved
by from_utf8_lossy :)

> I sadly have no evidence this ever occurs, but seeing the issue in the
> bug, my assumption is that the hardware simply overwrites the buffer
> with its internal string representation which might be NUL terminated.
> 
> in that case i can imagine a codepath not prefilling the buffer with
> spaces, leading to e.g. first writing:
> 
> 'FOOBAR\0'
> 
> and afterwards
> 
> 'FOO\0'
> 
> which would lead to 'FOO\0AR\0'. in this case we should only use 'FOO'..

ack, I guess it makes sense (as much as things can make sense here ;))

Applied, thanks!

[1/1] fix #7303: tape: handle NUL bytes in SCSI strings better
      commit: baa2077c4b867d686423c6b22be5cb2ee12cad7d

Best regards,
-- 
Fabian Grünbichler <f.gruenbichler@proxmox.com>



