From: "Max Carrara"
To: "Max Carrara", "Proxmox VE development discussion"
Date: Wed, 20 Mar 2024 17:59:12 +0100
Subject: Re: [pve-devel] [PATCH v4 pve-storage 06/16] cephconfig: support line-continuations in parser
References: <20240305150758.252669-1-m.carrara@proxmox.com> <20240305150758.252669-7-m.carrara@proxmox.com> <1710839809.dxcgevda47.astroid@yuna.none>

On Tue Mar 19, 2024 at 4:59 PM CET, Max Carrara wrote:
> On Tue Mar 19, 2024 at 10:37 AM CET, Fabian Grünbichler wrote:
> > On March 5, 2024 4:07 pm, Max Carrara wrote:
> > > Ceph's docs state the following [0]:
> > >> The backslash character `\` is used as the line-continuation marker
> > >> that combines the next line with the current one.
> > >
> > > This commit implements the support of such line-continuations in our
> > > parser.
> > >
> > > The line following a line ending with a '\' has its whitespace
> > > preserved, which matches the behaviour in Ceph's original
> > > implementation [1]. In other words, leading and trailing whitespace is
> > > not stripped from a continued line.
> >
> > it's actually a bit more complicated.. ceph only supports line
> > continuations inside values (well, in key value lines after the key ;)),
> > and only if they are unquoted..

Upon further research and confirming the behaviour via `ceph-conf`
(thanks for the tip btw!) line continuations are in fact supported in
different parts as well. Consider the following example 'ceph.conf' file:

```
[clie\
nt] # some comment
foo\
\
\
\
= \
bar
```

The continued `client` section header does actually get parsed by
`ceph-conf` without any issues - the trailing comment and whitespace are
also ignored.

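To illustrate the section header case, here is a minimal Perl sketch.
`parse_section_header` is a hypothetical helper and not part of
`PVE::CephConfig`. It assumes that a backslash directly followed by a
newline is simply dropped and that comments start with `#` or `;`, which
is a simplification of what Ceph's grammar actually does.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PVE::CephConfig: parses a (possibly
# continued) section header and ignores trailing whitespace and comments.
sub parse_section_header {
    my ($text) = @_;

    # assumption: a backslash directly followed by a newline joins two lines
    $text =~ s/\\\n//g;

    # '[name]', optionally followed by whitespace and a '#' or ';' comment
    return $1 if $text =~ m/^\s*\[([^\]\n]+)\][ \t]*(?:[#;].*)?$/;
    return;
}

# the section header from the example above, spread over two lines
print parse_section_header("[clie\\\nnt] # some comment"), "\n";    # prints "client"
```

Joining the lines before matching keeps the header regex itself free of
any continuation handling.
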
Where it gets really interesting is the continuation right after 'foo':
Because keys are defined using `raw[]` [0], whatever is skipped by the
parser is still included in the parsed output [1]. This has the
consequence that the four continued lines are in fact not skipped and
instead read as literal newline characters. After the equals sign, the
line continuation is skipped as expected.

By providing literal newlines via the shell, the above can easily be
verified:

$ ceph-conf -c ceph_cancer.conf -s client foo^M^M^M^M
bar

(The ^M is a literal newline and can usually be obtained by typing
CTRL+V, Enter in your shell.)

To make matters even worse, quoted values may in fact be *directly*
followed by continuations (`ceph-conf` fails otherwise):

```
[client]
foo = "bar"\

baz = qux
```

The above is considered "correct" because the escaped newline counts as
whitespace. If you were to put some spaces into the empty line after the
"foo" key, these would be skipped as well.

For completeness's sake, this also parses:

```
[client]
foo = "bar"\
# some comment
baz = qux
```

However, the following is invalid:

```
[client]
foo = "bar"\
baz = qux
```

... because the parser sees:

```
[client]
foo = "bar"baz = qux
```

... which is not allowed, because a quoted value may only be followed by
what the grammar defines as "empty_line" [2].

So, this doesn't really make the parsing logic regarding line
continuations any simpler:

1. Section headers may contain line continuations
2. Section headers may be followed by whitespace + comments (after ']')
3. Keys are parsed "raw" and may therefore be continued
   --> Will probably just not handle this case, as there are no config
       keys that contain newline characters or anything of the sort -
       why would there be? Why would a user need this?
4. Unquoted values may contain line continuations
5. Quoted values may be *directly* followed by a line continuation
   character, as long as the remaining stuff is whitespace or a comment
6. Bonus point: Quoted values MUST NOT *contain* line continuations, as
   they're parsed as `lexeme[]`s [3]

... so, see you in v5 ;)

[0]: https://git.proxmox.com/?p=ceph.git;a=blob;f=ceph/src/common/ConfUtils.cc;h=2f78fd02bf9e27467275752e6f3bca0c5e3946ce;hb=refs/heads/master#l182
[1]: https://www.boost.org/doc/libs/1_53_0/libs/spirit/doc/html/spirit/qi/reference/directive/raw.html
[2]: https://git.proxmox.com/?p=ceph.git;a=blob;f=ceph/src/common/ConfUtils.cc;h=2f78fd02bf9e27467275752e6f3bca0c5e3946ce;hb=refs/heads/master#l188
[3]: https://www.boost.org/doc/libs/1_53_0/libs/spirit/doc/html/spirit/qi/reference/directive/lexeme.html

>
> As mentioned in my other reply, I'll probably have to revise the whole
> parsing logic to take that into account... but thanks for being so
> thorough!
>
> >
> > >
> > > [0]: https://docs.ceph.com/en/reef/rados/configuration/ceph-conf/#changes-introduced-in-octopus
> > > [1]: https://git.proxmox.com/?p=ceph.git;a=blob;f=ceph/src/common/ConfUtils.cc;h=2f78fd02bf9e27467275752e6f3bca0c5e3946ce;hb=refs/heads/master#l262
> > >
> > > Signed-off-by: Max Carrara
> > > ---
> > > Changes v2 --> v3:
> > > * new
> > > Changes v3 --> v4:
> > > * none
> > >
> > > src/PVE/CephConfig.pm | 28 ++++++++++++++++++++++++----
> > > 1 file changed, 24 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/src/PVE/CephConfig.pm b/src/PVE/CephConfig.pm
> > > index 74a92eb..80f71b0 100644
> > > --- a/src/PVE/CephConfig.pm
> > > +++ b/src/PVE/CephConfig.pm
> > > @@ -19,13 +19,33 @@ sub parse_ceph_config {
> > >      return $cfg if !defined($raw);
> > >
> > >      my @lines = split /\n/, $raw;
> > > +    my @lines_normalized;
> > > +
> > > +    my $re_comment_not_escaped = qr/(?
> > > +    my $re_leading_ws = qr/^\s+/;
> > > +    my $re_trailing_ws = qr/\s+$/;
> > > +
> > > +    while (scalar(@lines)) {
> > > +        my $line = shift(@lines);
> > > +        $line =~ s/$re_comment_not_escaped//;
> > > +        $line =~ s/$re_leading_ws//;
> > > +        $line =~ s/$re_trailing_ws//;
> > > +        next if !$line;
> > > +
> > > +        # merge lines ending with continuation character '\'
> > > +        while ($line =~ s/\\$//) {
> > > +            my $next_line = shift(@lines);
> > > +            $next_line =~ s/$re_comment_not_escaped//;
> > > +            $next_line =~ s/$re_trailing_ws//;
> > > +            $line .= $next_line;
> > > +        }
> > > +
> > > +        push(@lines_normalized, $line);
> > > +    }
> > >
> > >      my $section;
> > >
> > > -    for my $line (@lines) {
> > > -        $line =~ s/(?
> > > -        $line =~ s/^\s+//;
> > > -        $line =~ s/\s+$//;
> > > +    for my $line (@lines_normalized) {
> > >          next if !$line;
> > >
> > >          if ($line =~ m/^\[(.+)\]$/) {
> > > --
> > > 2.39.2
> > >
> > > _______________________________________________
> > > pve-devel mailing list
> > > pve-devel@lists.proxmox.com
> > > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
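
Rules 4 to 6 from the numbered list further up could be sketched roughly
like this in Perl. `parse_value` is a hypothetical helper and not part of
`PVE::CephConfig`. It assumes it is handed everything after the '=' of a
key-value line (continuations included), that comments start with `#` or
`;`, and that quoted values contain no backslashes at all, which is
stricter than Ceph's grammar. Treat it as an illustration of the rules,
not as what v5 will look like.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PVE::CephConfig: returns the parsed
# value, or undef if the input violates rules 4 to 6.
sub parse_value {
    my ($raw) = @_;

    if ($raw =~ m/^\s*"/) {
        # rule 6: a quoted value must not contain a line continuation
        # (simplification: no backslashes inside quotes at all)
        return undef if $raw !~ m/^\s*"([^"\\\n]*)"(.*)\z/s;
        my ($value, $rest) = ($1, $2);

        # rule 5: after the closing quote only continuations, whitespace
        # and an optional comment may follow
        $rest =~ s/\\\n//g;
        return undef if $rest !~ m/^\s*(?:[#;].*)?\z/;
        return $value;
    }

    # rule 4: unquoted values may be continued; inner whitespace is kept,
    # only the ends of the joined value are trimmed
    $raw =~ s/\\\n//g;
    $raw =~ s/[#;].*$//;
    $raw =~ s/^\s+|\s+$//g;
    return $raw;
}

print parse_value("long \\\nvalue"), "\n";    # prints "long value"
print defined(parse_value(qq{"bar"\\\nbaz = qux})) ? "ok" : "invalid", "\n";    # prints "invalid"
```

Handling the quoted and unquoted cases in separate branches mirrors the
observation that the two behave differently with regard to continuations.
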