From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id E81A993DAB for ; Wed, 10 Apr 2024 09:04:25 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id C79A38EAE for ; Wed, 10 Apr 2024 09:03:55 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Wed, 10 Apr 2024 09:03:54 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 760E34363B for ; Wed, 10 Apr 2024 09:03:54 +0200 (CEST) Date: Wed, 10 Apr 2024 09:03:49 +0200 From: Fabian =?iso-8859-1?q?Gr=FCnbichler?= To: Christian Ebner , pbs-devel@lists.proxmox.com References: <20240328123707.336951-1-c.ebner@proxmox.com> <20240328123707.336951-48-c.ebner@proxmox.com> <171230603494.1926770.11920216027784905682@yuna.proxmox.com> In-Reply-To: < MIME-Version: 1.0 User-Agent: astroid/0.16.0 (https://github.com/astroidmail/astroid) Message-Id: <1712731289.pe9iwgvz9y.astroid@yuna.none> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-SPAM-LEVEL: Spam detection results: 0 AWL 0.058 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: Re: [pbs-devel] [PATCH v3 proxmox-backup 47/58] client: pxar: add look-ahead caching X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Apr 2024 07:04:26 -0000 On April 9, 2024 4:53 pm, Christian Ebner wrote: > On 4/5/24 10:33, Fabian Gr=C3=BCnbichler wrote: >> Quoting Christian Ebner (2024-03-28 13:36:56) >>> + >>> + let boundary =3D encoder.payload_position()?; >>> + let offset =3D >>> + self.reused_chunks >>> + .insert(indices, boundary, start_padding, end_= padding); >>> + >>> + self.caching_enabled =3D true; >>> + let cache_entry =3D CacheEntry::RegEntry(CacheEntryDat= a::new( >>> + fd, >>> + c_file_name.into(), >>> + *stat, >>> + metadata.clone(), >>> + offset, >>> + )); >>> + self.cached_entries.push(cache_entry); >>> + >>> + match self.reused_chunks.suggested() { >>> + Suggested::Reuse =3D> self.flush_cached_to_archive= (encoder, true, true).await?, >>> + Suggested::Reencode =3D> { >>> + self.flush_cached_to_archive(encoder, false, t= rue).await? >>> + } >>> + Suggested::CheckNext =3D> {} >>> + } >>> + >>> + return Ok(()); >>> + } >>> + } >>> + >>> + self.flush_cached_to_archive(encoder, false, true).await?; >>> + self.add_entry(encoder, previous_metadata_accessor, fd.as_raw_= fd(), c_file_name, stat) >>> + .await >>=20 >> this part here is where I think we mishandle some edge cases, like menti= oned in >> the ReusedChunks patch comments.. even keeping back the last chunk doesn= 't save >> us from losing some re-usable files sometimes.. >=20 > Still need to have a closer look at what you mean exactly here... The=20 > code path itself should be fine I think, or am I missing your point here? >=20 > It is rather the match against the `suggested` (now called `action`)=20 > where the wrong decision might be made. yes, sorry for not phrasing that more explicitly - I just meant to say that we mishandle those wrong "Suggestions" here (because "CheckNext" is suggested too often, we then proceed with re-encoding if the next file is not re-usable. there also is the second issue about re-usable file sequences crossing chunk boundaries and padding that can cause the suggestion to take a wrong turn). not like "this code here is broken", but "it does the wrong thing based on wrong information" ;) the fix is (most likely) in the suggestion part, not in the handling part here.. but depending on the solution, it might also be necessary to adapt something here :)