From: Christian Ebner
To: Wolfgang Bumiller
Cc: pbs-devel@lists.proxmox.com
Date: Thu, 28 Sep 2023 11:27:28 +0200 (CEST)
Subject: Re: [pbs-devel] [RFC pxar 4/20] fix #3174: metadata: impl fn to calc byte size
Message-ID: <532454423.5350.1695893248395@webmail.proxmox.com>
In-Reply-To: <37sal4okdwhrkqslzsdbtxtah53zl5z6vyu6x44wv6xr3gha6n@zkbdjo42wrlj>

> On 28.09.2023 11:00 CEST Wolfgang Bumiller wrote:
>
> On Thu, Sep 28, 2023 at 10:07:40AM +0200, Christian Ebner wrote:
> > I was giving this some more thought and am not really convinced that sending
> > this through an encoder instance, which digests the encoded byte stream and counts
> > the bytes is the right approach here.
>
> How about moving the logic `encode_metadata` from `Encoder` into
> `Metadata` with an `Option<&mut SeqWrite>` parameter, not a full
> Encoder, and just having the encoding vs counting logic live right next
> to each other depending on whether the writer is Some?
> That should be as cheap as it gets?
>

Hmm, the Metadata should not be concerned with how it might be encoded in different contexts, though. Only the encoder should be concerned with that.

> >
> > The purpose of this function is to calculate the bytes, which I can easily skip over
> > *without* having to call any expensive encoding/decoding functionality.
> > I might get around this by simply calling the decoder on the byte stream, then I do
> > not need this at all (if I'm not missing something). Might that be the better approach?
>
> I'm not sure decoding is that much cheaper than dummy-encoding...
> depending on the data I'd say it could even be more expensive in some
> cases? (rare cases though, only with lots of ACLs/xattrs around I
> suppose...)
Probably so, although the data structures are somewhat simple in this case.

> >
> > Additionally, and maybe even better, I might get rid of this also by letting the
> > PXAR_APPENDIX_REF offset point to the start of the file payload entry, instead of the
> > file entry as is now, thereby being able to blindly skip over this already to begin with.
> > Although I am not sure if that is the best approach for handling the metadata, which should
> > ideally not be encoded twice, once before the PXAR_APPENDIX_REF and the PXAR_PAYLOAD.
>
> Not sure why skipping data would encode it twice? Or did you mean to
> imply that previously we pointed to metadata, but when instead pointing
> to the payload we need to instead encode it in the new archive which we
> previously did not need to do?

Yes, I was referring to the latter: also having to encode the metadata in the regular part of the archive would store the same data twice, once newly encoded and once in the appended chunks. That bloats the archive size unnecessarily.

Which brings me to another point I have not taken into consideration so far: how to handle files whose metadata changed without that change being detected. Since the catalog only contains size and mtime, only these can be compared. But I need the current xattrs, ACLs, etc. I will have to look up whether changing those actually changes the mtime as well.
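
For reference, the "encode or just count, depending on whether the writer is Some" idea from above could look roughly like the sketch below. This is only an illustration under loud assumptions: the `Metadata` struct here is a hypothetical, heavily simplified stand-in (not pxar's real type), the byte layout is made up, and `std::io::Write` is used in place of pxar's `SeqWrite`. It is written as a free function, so it could just as well live next to the encoder rather than in `Metadata`.

```rust
use std::io::{self, Write};

/// Hypothetical, heavily simplified stand-in for pxar's `Metadata`.
struct Metadata {
    mode: u64,
    mtime_secs: i64,
    /// (name, value) pairs of extended attributes.
    xattrs: Vec<(Vec<u8>, Vec<u8>)>,
}

/// Writes `data` if a writer is present and returns the number of bytes
/// that were (or would have been) written.
fn emit<W: Write>(out: &mut Option<&mut W>, data: &[u8]) -> io::Result<u64> {
    if let Some(w) = out.as_mut() {
        w.write_all(data)?;
    }
    Ok(data.len() as u64)
}

/// Encodes `meta` into `out` when it is `Some`, otherwise only counts.
/// Both cases walk the same code path, so the size cannot drift away
/// from the real encoding. The layout below is made up for this sketch.
fn encode_or_count<W: Write>(meta: &Metadata, mut out: Option<&mut W>) -> io::Result<u64> {
    let mut size = 0u64;
    size += emit(&mut out, &meta.mode.to_le_bytes())?;
    size += emit(&mut out, &meta.mtime_secs.to_le_bytes())?;
    for (name, value) in &meta.xattrs {
        // Length prefix covers name, a NUL separator and the value.
        let len = name.len() as u64 + 1 + value.len() as u64;
        size += emit(&mut out, &len.to_le_bytes())?;
        size += emit(&mut out, name)?;
        size += emit(&mut out, &[0u8])?;
        size += emit(&mut out, value)?;
    }
    Ok(size)
}
```

Calling `encode_or_count::<Vec<u8>>(&meta, None)` would then only return the size, while passing `Some(&mut buf)` writes the same bytes into `buf` and returns the same number. Whether this is actually cheaper than running the decoder over the existing byte stream would still need to be measured.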