From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 9B5E51FF13F for ; Thu, 12 Feb 2026 16:28:46 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 472D613E6B; Thu, 12 Feb 2026 16:29:33 +0100 (CET) Message-ID: <67209c23-4945-46f9-bd18-847652b2452d@proxmox.com> Date: Thu, 12 Feb 2026 16:29:21 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Beta Subject: Re: [PATCH proxmox-backup] fix #7303: tape: handle NUL bytes in SCSI strings better To: =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= , pbs-devel@lists.proxmox.com References: <20260211145947.645587-1-d.csapak@proxmox.com> <1770908450.3jphpww7do.astroid@yuna.none> Content-Language: en-US From: Dominik Csapak In-Reply-To: <1770908450.3jphpww7do.astroid@yuna.none> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1770910165831 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.031 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Message-ID-Hash: 2IUPEXDSL6DQ7NCZPEGHYQL7RHMWQFVG X-Message-ID-Hash: 2IUPEXDSL6DQ7NCZPEGHYQL7RHMWQFVG X-MailFrom: d.csapak@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox Backup Server development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On 2/12/26 4:13 PM, Fabian Grünbichler wrote: > On February 11, 2026 3:59 pm, Dominik Csapak wrote: >> When dealing with ASCII fields in answers from drives and changers, we >> assumed that the data is simply ascii characters padded by spaces, with >> potentially a NUL byte at the end. This is indicated for example by the >> IBM library documentation about the Primary Volume Tag Information[0]: >> >> ``` >> This is a 36 byte ASCII field that contains the cartridge bar code >> label, left-adjusted and padded on the right with blanks. >> ``` > > doesn't this clash with the trimming below (before and after this > patch), which will remove whitespace from both start and end? well, yes, but if the label is "invalid", there is probably something else wrong going on, so it shouldn't make much difference. i can of course change it to 'trim_end' > >> Some changers may reverse that though, and have a NUL terminated string >> followed by space padding (e.g. "FOO\0 "). > > what about "FOO\0BAR" ? this would now be truncated to "FOO" as well, > whereas before it was treated as "FOOBAR"? no, before it would be treated as 'FOO\0BAR' (since trim only removes it from beginning and end, not in the middle). I sadly have no evidence this ever occurs, but seeing the issue in the bug, my assumption is that the hardware simply overwrites the buffer with it's internal string representation which might be NUL terminated. in that case i can imagine a codepath not prefilling the buffer with spaces, leading to e.g. first writing: 'FOOBAR\0' and afterwards 'FOO\0' which would lead to 'FOO\0AR\0'. in this case we should only use 'FOO'.. > > maybe it would be safer to trim_end_matches using \0 and ' ', and then > assert the result doesn't have any bytes outside of the specified range, > and only then convert to a string? > > what do other implementations accept here/how do they handle this? The only implementation i know of is in mtx, which looks like this: --- void copy_barcode(unsigned char *src, unsigned char *dest) { int i; for (i=0; i < 36; i++) { *dest = *src++; if ((*dest < 32) || (*dest > 127)) { *dest = '\0'; } dest++; } *dest = 0; /* null-terminate */ } --- so they overwrite all 'invalid' characters with null bytes and use it as a C string (so it'll only read until the first invalid/null byte) but they're only using it to display it, we need it to reference it for our internal bacodes, so i think the logic used here makes sense? > >> >> When looking into the LTO-9 SCSI reference from IBM[1], they describe an >> ASCII field as follows: >> >> ``` >> When used to describe a field, indicates that the field contains only >> ASCII printable characters (i.e., code values 20h to 7Eh) and may be >> terminated with one or more ASCII null (00h) characters. >> ``` >> >> To be on the safe side here, limit the string to before the first NUL >> byte, and then trim the result. >> >> Added some tests to verify the desired behavior here. >> >> 0: https://www.ibm.com/docs/en/ts4500-tape-library/1.12.1?topic=storage-dvcidb0 >> 1: https://www.ibm.com/support/pages/node/6490249 >> >> Signed-off-by: Dominik Csapak >> --- >> pbs-tape/src/sgutils2.rs | 44 ++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 40 insertions(+), 4 deletions(-) >> >> diff --git a/pbs-tape/src/sgutils2.rs b/pbs-tape/src/sgutils2.rs >> index 6fec6ed5c..270420ce5 100644 >> --- a/pbs-tape/src/sgutils2.rs >> +++ b/pbs-tape/src/sgutils2.rs >> @@ -697,10 +697,12 @@ impl<'a, F: AsRawFd> SgRaw<'a, F> { >> >> /// Converts SCSI ASCII text into String, trim zero and spaces >> pub fn scsi_ascii_to_string(data: &[u8]) -> String { >> - String::from_utf8_lossy(data) >> - .trim_matches(char::from(0)) >> - .trim() >> - .to_string() >> + let mut view = data; >> + >> + if let Some(idx) = data.iter().position(|c| *c == 0u8) { >> + view = &view[..idx]; >> + } >> + String::from_utf8_lossy(view).trim().to_string() >> } >> >> /// Read SCSI Inquiry page >> @@ -1012,3 +1014,37 @@ pub fn scsi_request_sense(file: &mut F) -> Result> >> Ok(sense) >> } >> + >> +#[cfg(test)] >> +mod test { >> + use crate::sgutils2::scsi_ascii_to_string; >> + >> + #[test] >> + fn test_scsi_ascii_to_string() { >> + fn test(input: &'static str, expected: &'static str) { >> + let output = scsi_ascii_to_string(input.as_bytes()); >> + assert_eq!(&output, expected); >> + } >> + >> + test("TAPE00L1", "TAPE00L1"); >> + test("TAPE00L1 ", "TAPE00L1"); >> + test("TAPE00L1 ", "TAPE00L1"); >> + test("TAPE00L1 \0", "TAPE00L1"); >> + test("TAPE00L1 \0\0", "TAPE00L1"); >> + test("TAPE00L1\0", "TAPE00L1"); >> + test("TAPE00L1\0 ", "TAPE00L1"); >> + test("TAPE00L1\0\0 ", "TAPE00L1"); >> + test("TAPE0\0L1\0\0 ", "TAPE0"); >> + test("TAPE0 \0L1 ", "TAPE0"); >> + test("\0TAPE00L1", ""); >> + test(" TAPE00L1", "TAPE00L1"); >> + test("", ""); >> + test(" ", ""); >> + test(" ", ""); >> + test("\0", ""); >> + test("\0\0", ""); >> + test("\0 ", ""); >> + test(" \0 ", ""); >> + test(" \0\0 ", ""); >> + } >> +} >> -- >> 2.47.3 >> >> >> >> >> >>