From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <s.sterz@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id A2A2B63009
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Feb 2022 15:42:07 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 914192FDA6
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Feb 2022 15:41:37 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 7E8152FD98
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Feb 2022 15:41:35 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 47E2A46296
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Feb 2022 15:41:35 +0100 (CET)
Message-ID: <25de793a-af20-5ecf-0367-313dfa1446cc@proxmox.com>
Date: Wed, 23 Feb 2022 15:41:33 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.6.0
Content-Language: en-US
To: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
References: <20220222112556.3747239-1-s.sterz@proxmox.com>
 <20220222112556.3747239-4-s.sterz@proxmox.com>
 <20220223102835.liwg3ylprgudeuhn@olga.proxmox.com>
From: Stefan Sterz <s.sterz@proxmox.com>
In-Reply-To: <20220223102835.liwg3ylprgudeuhn@olga.proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.000 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.001 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
Subject: Re: [pbs-devel] [PATCH proxmox-backup 3/5] fix #3067: api: add
 multi-line comments to node.cfg
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 23 Feb 2022 14:42:07 -0000

responses inline

On 23.02.22 11:28, Wolfgang Bumiller wrote:
> some comments inline
> 
> On Tue, Feb 22, 2022 at 12:25:54PM +0100, Stefan Sterz wrote:
>> -- snip --
>>   
>> +    pub MULTI_LINE_COMMENT_REGEX = r"(?m)^([[:^cntrl:]]*)\s*$";
> 
> I don't think the trailing `\s*` is necessary?
> 

Yes, you are right, stole that from the Perl implementation, which uses 
"m/^\#(.*)\s*$/" (to parse the configuration file, hence the extra 
"\#"). I am not too familiar with Perl so that's probably my mistake, 
but just to be sure, is it necessary over there [1]? Tried playing 
around with it a bit, but couldn't figure out the purpose of "\s" since 
"." already matches any character. Unless Perl does "lazy" matching 
here, then it trims the trailing whitespace. Note that, AFAICT, both "." 
and "[:^cntrl:]" do not match the line feed character (\n) (and 
[:^cntrl:] does not match the carriage return either), but that those 
occur after the "end of line" ("$") and, thus, would not be matched by 
"\s" either.

[1]: 
https://git.proxmox.com/?p=qemu-server.git;a=blob;f=PVE/QemuServer.pm;h=a99f1a56f64a4789a9dc62184856bab2927d68c8;hb=HEAD#l2369

>> -- snip -- 
>>  
>>   use pbs_api_types::{EMAIL_SCHEMA, OPENSSL_CIPHERS_TLS_1_2_SCHEMA, OPENSSL_CIPHERS_TLS_1_3_SCHEMA,
>> -    SINGLE_LINE_COMMENT_SCHEMA};
>> +    MULTI_LINE_COMMENT_SCHEMA};
> 
> Please check how rustfmt would deal with the above `use` statement ;-)
> 

Done, although rustfmt complains about quite a few more things in those 
three files. Should I leave them or let rustfmt fix them while I am at it?

>> -- snip -- >>>> +    let comment = input.lines()
>> +        .take_while(|l| l.starts_with('#'))
>> +        .map(|l| {
>> +            let mut ch = l.chars();
>> +            ch.next();
>> +            ch.as_str()
> 
> ^ The `take_while` ensures `l` starts with '#', so you could just use
> 
>      .map(|l| &l[1..])
> 
> Alternatively, since 1.57 (and we're at 1.58 now), you could also
> combine the `.take_while` and `.map` into:
> 
>      .map_while(|l| l.strip_prefix("#"))
> 
> However...
> 
>> +        })
>> +        .fold(String::new(), |acc, l| acc + l + "\n");
>> +
>> +    if !comment.is_empty() {
>> +        config.insert("comment".to_string(), Value::String(comment));
>> +    }
>> +
>>       for (lineno, line) in input.lines().enumerate() {
> 
> ... here we're starting over, so maybe we should refactor this a little.
> Eg. we could use a `.lines().enumerate().peekable()` iterator:
> 
>      let mut lines = input.lines().enumerate().peekable();
>      let mut comments = String::new();
>      while let Some((_, line)) = iter.next_if(|(_, line)| line.starts_with('#') {
>          comments.push_str(&line[1..]);
>          comments.push('\n');
>      }
> 
>      for (lineno, line) in lines {
>      <...>
> 

Done. Thanks :-)

>> -- snip --
>>
>> +    if object.contains_key("comment") {
>> +        let comment = match object.get("comment") {
> 
> `contains_key` + `get` is somewhat wasteful and should be combined.
> 
>      if let Some(comment) = object.get("comment") {
> 
> For the type check you can then use `.as_str().ok_or_else(...)`
> 
> Or alternatively use a single match for both checks:
> 
>      match object.get("comment") {
>          Some(Value::String(comment)) => {
>              <the loop>
>          }
>          Some(_) => bail!(...),
>          None => (),
>      }
> 

Done. Combined it to:

      if let Some(Value::String(comment)) = object.get("comment") {
          /* loop here */
      }

Hope that's alright, personally I find that more legible than the match 
or `as_str().ok_or_else()`, although it is quite compact.

>> +            Some(Value::String(v)) => v,
>> +            _ => bail!("only strings can be comments"),
>> +        };
>> +
>> +        for lines in comment.lines() {
>> +            writeln!(output, "#{}", lines)?;
>> +        }
>> +    }
>> +
>>       for (key, value) in object.iter() {
> 
> Given that we type-check the comment above _and_ the data matching the
> schema is a precondition for calling this function, I'd just put an
> 
>      if key == "comment" { continue }
> 
> here rather than the conditional check limited to the `Value::String` case below.
> 

Coming back to rustfmt, it seems to want to format this as a multi-line 
if. So either, we ignore that and make it a single line if anyway, let 
rustfmt have its way, or we could do:

      match value {
              _ if key == "comment" => continue,
              Value::Null => continue,           // delete this entry
              /*..*/
      }

I'll switch it to whatever option is deemed nicer.

>> -- snip --
>> >> +
>> +#[test]
>> +fn test_with_comment() {
>> +    use proxmox_schema::ApiType;
>> +
>> +    // let's just reuse some schema we actually have available:
>> +    use crate::config::node::NodeConfig;
>> +
>> +    const NODE_INPUT: &str = "\
>> +        #this should\n\
>> +        #be included\n\
>> +        acme: account=pebble\n\
>> +        # this should not\n\
> 
> ^ I find it curous that your 'comment' section doesn't have leading
> spaces, while the "real comment" does ;-)
> 
> Should we think about trimming off up to 1 space when parsing the lines
> and prefix them with `"# "` when printing them out?
> Though I suppose it doesn't matter much as since we drop "real" comments
> on a RMW-cycle these files don't really need to bother much with being
> all that eye-pleasing I guess...

Note that in cases where indentation matters for Markdown rendering, 
such as lists, it is not trivial to distinguish between a space that is 
necessary for proper indentation and a space that a user wants for 
"cosmetic" reasons. It is currently valid Markdown to indent nested 
lists with only one space, i.e. the following note:

#- one
# - two

will render two as a nested list.

Further, AFAICT, that is how PVE handles these comments (just a 
preceding '#' no extra space etc.) [1,2]. I think there is some value in 
keeping this behavior consistent between products. My "real comment" has 
an extra space, because I find the lack of a space to be somewhat 
"claustrophobic" :-)

If someone where to add an extra space manually, that should still be 
parse-able. Extra spaces would be present, but not visible in the 
rendered HTML (due to how Markdown/HTML deal with extra spaces, with the 
exception of the cases mentioned above). From my testing that is 
consistent with PVE's behavior.

[1]: 
https://git.proxmox.com/?p=qemu-server.git;a=blob;f=PVE/QemuServer.pm;h=a99f1a56f64a4789a9dc62184856bab2927d68c8;hb=HEAD#l2369
[2]: 
https://git.proxmox.com/?p=qemu-server.git;a=blob;f=PVE/QemuServer.pm;h=a99f1a56f64a4789a9dc62184856bab2927d68c8;hb=HEAD#l2497