public inbox for pbs-devel@lists.proxmox.com
From: Dominik Csapak <d.csapak@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
	pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup] fix #7303: tape: handle NUL bytes in SCSI strings better
Date: Thu, 12 Feb 2026 16:29:21 +0100	[thread overview]
Message-ID: <67209c23-4945-46f9-bd18-847652b2452d@proxmox.com> (raw)
In-Reply-To: <1770908450.3jphpww7do.astroid@yuna.none>



On 2/12/26 4:13 PM, Fabian Grünbichler wrote:
> On February 11, 2026 3:59 pm, Dominik Csapak wrote:
>> When dealing with ASCII fields in answers from drives and changers, we
>> assumed that the data is simply ascii characters padded by spaces, with
>> potentially a NUL byte at the end. This is indicated for example by the
>> IBM library documentation about the Primary Volume Tag Information[0]:
>>
>> ```
>> This is a 36 byte ASCII field that contains the cartridge bar code
>> label, left-adjusted and padded on the right with blanks.
>> ```
> 
> doesn't this clash with the trimming below (before and after this
> patch), which will remove whitespace from both start and end?

well, yes, but if the label is "invalid", there is probably something
else wrong going on, so it shouldn't make much difference.

i can of course change it to 'trim_end'
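
as a rough sketch of what the 'trim_end' variant could look like (the
function name is made up for illustration, this is not what the patch
contains):

```rust
/// Hypothetical variant of the patched helper: cut at the first NUL byte,
/// then strip only trailing whitespace, so leading blanks are kept.
fn scsi_ascii_to_string_trim_end(data: &[u8]) -> String {
    // everything from the first NUL byte onwards is ignored
    let end = data.iter().position(|c| *c == 0u8).unwrap_or(data.len());
    String::from_utf8_lossy(&data[..end]).trim_end().to_string()
}
```

with this, " TAPE00L1  " would become " TAPE00L1" instead of "TAPE00L1",
while "TAPE00L1\0 " would still become "TAPE00L1".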

> 
>> Some changers may reverse that though, and have a NUL terminated string
>> followed by space padding (e.g. "FOO\0 ").
> 
> what about "FOO\0BAR" ? this would now be truncated to "FOO" as well,
> whereas before it was treated as "FOOBAR"?

no, before it would be treated as 'FOO\0BAR' (since trim only removes it
from beginning and end, not in the middle).
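
to illustrate, here are the old and new bodies from the diff side by side
(the 'old_'/'new_' prefixes are only there for this comparison, in the
code both are called scsi_ascii_to_string):

```rust
/// Behavior before the patch: strip NUL bytes from both ends, then
/// whitespace from both ends. NUL bytes in the middle survive.
fn old_scsi_ascii_to_string(data: &[u8]) -> String {
    String::from_utf8_lossy(data)
        .trim_matches(char::from(0))
        .trim()
        .to_string()
}

/// Behavior after the patch: only look at the bytes before the first NUL,
/// then trim whitespace.
fn new_scsi_ascii_to_string(data: &[u8]) -> String {
    let mut view = data;

    if let Some(idx) = data.iter().position(|c| *c == 0u8) {
        view = &view[..idx];
    }
    String::from_utf8_lossy(view).trim().to_string()
}
```

for "FOO\0BAR" the old version returns the input unchanged (embedded NUL
included) and the new one returns "FOO". for "FOO\0 " the old version
returns "FOO\0", since the trailing space prevents the NUL from being
trimmed.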

I sadly have no evidence this ever occurs, but seeing the issue in the
bug, my assumption is that the hardware simply overwrites the buffer
with its internal string representation, which might be NUL terminated.

in that case i can imagine a codepath not prefilling the buffer with
spaces, leading to e.g. first writing:

'FOOBAR\0'

and afterwards

'FOO\0'

which would lead to 'FOO\0AR\0'. in this case we should only use 'FOO'..
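
a small model of that scenario (purely hypothetical, i have no idea how
any firmware actually does this; write_c_string and stale_buffer are
made-up names):

```rust
/// Hypothetical firmware behavior: copy a NUL-terminated string to the
/// start of the reply buffer without clearing the rest of it first.
/// Panics if `buf` is shorter than `s` plus the terminator.
fn write_c_string(buf: &mut [u8], s: &str) {
    let bytes = s.as_bytes();
    buf[..bytes.len()].copy_from_slice(bytes);
    buf[bytes.len()] = 0;
}

/// Write a long label first, then a shorter one into the same buffer.
fn stale_buffer() -> [u8; 7] {
    let mut buf = [0u8; 7];
    write_c_string(&mut buf, "FOOBAR");
    write_c_string(&mut buf, "FOO");
    buf
}

/// Same idea as the patch: only use the bytes before the first NUL.
fn until_first_nul(data: &[u8]) -> String {
    let end = data.iter().position(|c| *c == 0u8).unwrap_or(data.len());
    String::from_utf8_lossy(&data[..end]).trim().to_string()
}
```

stale_buffer() ends up as 'FOO\0AR\0', and cutting at the first NUL gives
'FOO' without the leftover 'AR'.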


> 
> maybe it would be safer to trim_end_matches using \0 and ' ', and then
> assert the result doesn't have any bytes outside of the specified range,
> and only then convert to a string?
> 
> what do other implementations accept here/how do they handle this?

The only implementation i know of is in mtx, which looks like this:

---
void copy_barcode(unsigned char *src, unsigned char *dest)
{
         int i;

         for (i=0; i < 36; i++)
         {
                 *dest = *src++;

                 if ((*dest < 32) || (*dest > 127))
                 {
                         *dest = '\0';
                 }

                 dest++;
         }
         *dest = 0; /* null-terminate */
}
---

so they overwrite all 'invalid' characters with null bytes
and use it as a C string (so it'll only read until the first 
invalid/null byte)

but they're only using it for display, while we need it to reference
our internal barcodes, so i think the logic used here makes sense?
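
for comparison, my reading of the effective mtx behavior as a rust sketch
(not a literal port: mtx always copies exactly 36 bytes, here the slice
length decides, and the function name is made up):

```rust
/// Rough equivalent of mtx's copy_barcode plus reading the result as a
/// C string: stop at the first byte outside of 32..=127. Note that mtx
/// accepts 127 (DEL) and does not strip the space padding.
fn mtx_style_barcode(src: &[u8]) -> String {
    src.iter()
        .copied()
        .take_while(|b| (32..=127).contains(b))
        .map(char::from)
        .collect()
}
```

so compared to the patch, this also cuts at other control characters
(e.g. "TAPE\x01L1" becomes "TAPE"), but keeps trailing blanks
("TAPE00L1  " stays as is).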


> 
>>
>> When looking into the LTO-9 SCSI reference from IBM[1], they describe an
>> ASCII field as follows:
>>
>> ```
>> When used to describe a field, indicates that the field contains only
>> ASCII printable characters (i.e., code values 20h to 7Eh) and may be
>> terminated with one or more ASCII null (00h) characters.
>> ```
>>
>> To be on the safe side here, limit the string to before the first NUL
>> byte, and then trim the result.
>>
>> Added some tests to verify the desired behavior here.
>>
>> 0: https://www.ibm.com/docs/en/ts4500-tape-library/1.12.1?topic=storage-dvcidb0
>> 1: https://www.ibm.com/support/pages/node/6490249
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>>   pbs-tape/src/sgutils2.rs | 44 ++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 40 insertions(+), 4 deletions(-)
>>
>> diff --git a/pbs-tape/src/sgutils2.rs b/pbs-tape/src/sgutils2.rs
>> index 6fec6ed5c..270420ce5 100644
>> --- a/pbs-tape/src/sgutils2.rs
>> +++ b/pbs-tape/src/sgutils2.rs
>> @@ -697,10 +697,12 @@ impl<'a, F: AsRawFd> SgRaw<'a, F> {
>>   
>>   /// Converts SCSI ASCII text into String, trim zero and spaces
>>   pub fn scsi_ascii_to_string(data: &[u8]) -> String {
>> -    String::from_utf8_lossy(data)
>> -        .trim_matches(char::from(0))
>> -        .trim()
>> -        .to_string()
>> +    let mut view = data;
>> +
>> +    if let Some(idx) = data.iter().position(|c| *c == 0u8) {
>> +        view = &view[..idx];
>> +    }
>> +    String::from_utf8_lossy(view).trim().to_string()
>>   }
>>   
>>   /// Read SCSI Inquiry page
>> @@ -1012,3 +1014,37 @@ pub fn scsi_request_sense<F: AsRawFd>(file: &mut F) -> Result<RequestSenseFixed,
>>   
>>       Ok(sense)
>>   }
>> +
>> +#[cfg(test)]
>> +mod test {
>> +    use crate::sgutils2::scsi_ascii_to_string;
>> +
>> +    #[test]
>> +    fn test_scsi_ascii_to_string() {
>> +        fn test(input: &'static str, expected: &'static str) {
>> +            let output = scsi_ascii_to_string(input.as_bytes());
>> +            assert_eq!(&output, expected);
>> +        }
>> +
>> +        test("TAPE00L1", "TAPE00L1");
>> +        test("TAPE00L1  ", "TAPE00L1");
>> +        test("TAPE00L1               ", "TAPE00L1");
>> +        test("TAPE00L1 \0", "TAPE00L1");
>> +        test("TAPE00L1  \0\0", "TAPE00L1");
>> +        test("TAPE00L1\0", "TAPE00L1");
>> +        test("TAPE00L1\0 ", "TAPE00L1");
>> +        test("TAPE00L1\0\0  ", "TAPE00L1");
>> +        test("TAPE0\0L1\0\0  ", "TAPE0");
>> +        test("TAPE0 \0L1  ", "TAPE0");
>> +        test("\0TAPE00L1", "");
>> +        test(" TAPE00L1", "TAPE00L1");
>> +        test("", "");
>> +        test(" ", "");
>> +        test("  ", "");
>> +        test("\0", "");
>> +        test("\0\0", "");
>> +        test("\0 ", "");
>> +        test(" \0 ", "");
>> +        test("  \0\0  ", "");
>> +    }
>> +}
>> -- 
>> 2.47.3

Thread overview: 4+ messages
2026-02-11 14:59 Dominik Csapak
2026-02-12 15:13 ` Fabian Grünbichler
2026-02-12 15:29   ` Dominik Csapak [this message]
2026-02-13  9:22     ` applied: " Fabian Grünbichler
