From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <c.ebner@proxmox.com>
Message-ID: <46483c00-25e8-4980-a81e-1fbe0ef9fcdf@proxmox.com>
Date: Fri, 5 Apr 2024 12:26:11 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>,
 =?UTF-8?Q?Fabian_Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>
References: <20240328123707.336951-1-c.ebner@proxmox.com>
 <20240328123707.336951-39-c.ebner@proxmox.com>
 <1712235368.4ka0m21w6d.astroid@yuna.none>
Content-Language: en-US, de-DE
From: Christian Ebner <c.ebner@proxmox.com>
In-Reply-To: <1712235368.4ka0m21w6d.astroid@yuna.none>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [pbs-devel] [PATCH v3 proxmox-backup 38/58] upload stream: impl
 reused chunk injector

On 4/4/24 16:24, Fabian Grünbichler wrote:
> 
> but I'd like the following even better, since it allows us to get rid of
> the buffer altogether:
> 
>      fn poll_next(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Option<Self::Item>> {
>          let mut this = self.project();
> 
>          let mut injections = this.injection_queue.lock().unwrap();
> 
>          // check whether we have something to inject
>          if let Some(inject) = injections.pop_front() {
>              let offset = this.stream_len.load(Ordering::SeqCst) as u64;
> 
>              if inject.boundary == offset {
>                  // inject now
>                  let mut chunks = Vec::new();
>                  let mut csum = this.index_csum.lock().unwrap();
> 
>                  // account for injected chunks
>                  for chunk in inject.chunks {
>                      let offset = this
>                          .stream_len
>                          .fetch_add(chunk.size() as usize, Ordering::SeqCst)
>                          as u64;
>                      this.reused_len
>                          .fetch_add(chunk.size() as usize, Ordering::SeqCst);
>                      let digest = chunk.digest();
>                      chunks.push((offset, digest));
>                      let end_offset = offset + chunk.size();
>                      csum.update(&end_offset.to_le_bytes());
>                      csum.update(&digest);
>                  }
>                  let chunk_info = InjectedChunksInfo::Known(chunks);
>                  return Poll::Ready(Some(Ok(chunk_info)));
>              } else if inject.boundary < offset {
>                  // incoming new chunks and injections didn't line up?
>                  return Poll::Ready(Some(Err(anyhow!("invalid injection boundary"))));
>              } else {
>                  // inject later
>                  injections.push_front(inject);
>              }
>          }
> 
>          // nothing to inject now, let's see if there's further input
>          match ready!(this.input.as_mut().poll_next(cx)) {
>              None => Poll::Ready(None),
>              Some(Err(err)) => Poll::Ready(Some(Err(err))),
>              Some(Ok(raw)) if raw.is_empty() => {
>                  Poll::Ready(Some(Err(anyhow!("unexpected empty raw data"))))
>              }
>              Some(Ok(raw)) => {
>                  let offset = this.stream_len.fetch_add(raw.len(), Ordering::SeqCst) as u64;
>                  let data = InjectedChunksInfo::Raw((offset, raw));
> 
>                  Poll::Ready(Some(Ok(data)))
>              }
>          }
>      }
> 
> but technically all this accounting could move back to the backup_writer
> as well, if the injected chunk info also contained the size..
> 

Yes, this is much more compact! Also, moving the accounting to the 
backup writer as suggested should allow reducing the code there even 
further; at least in my initial refactoring it seems to behave just fine.
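
To make the idea concrete, here is a rough, untested-against-the-series 
sketch of what writer-side accounting could look like once the injected 
chunk info carries the size. Everything here is an assumption for 
illustration: the `(size, digest)` layout of `Known`, the `Accounting` 
struct, its `account` method, and `csum_input` (a plain byte buffer 
standing in for the real index checksum hasher) are hypothetical names, 
not code from the patch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical variant of the injected chunk info that also carries sizes.
pub enum InjectedChunksInfo {
    /// Reused chunks as (size, digest) pairs.
    Known(Vec<(u64, [u8; 32])>),
    /// Raw payload bytes of newly read data.
    Raw(Vec<u8>),
}

/// Hypothetical writer-side counters, mirroring stream_len / reused_len.
#[derive(Default)]
pub struct Accounting {
    pub stream_len: AtomicUsize,
    pub reused_len: AtomicUsize,
    /// Bytes that would be fed to the index checksum; a real hasher would
    /// be updated here instead of collecting the input.
    pub csum_input: Vec<u8>,
}

impl Accounting {
    /// Account for one stream item and return the start offset of each
    /// chunk (or of the raw block) it contains.
    pub fn account(&mut self, info: &InjectedChunksInfo) -> Vec<u64> {
        let mut offsets = Vec::new();
        match info {
            InjectedChunksInfo::Known(chunks) => {
                for (size, digest) in chunks {
                    let len = *size as usize;
                    // fetch_add returns the previous value, i.e. the start offset
                    let offset = self.stream_len.fetch_add(len, Ordering::SeqCst) as u64;
                    self.reused_len.fetch_add(len, Ordering::SeqCst);
                    let end_offset = offset + *size;
                    self.csum_input.extend_from_slice(&end_offset.to_le_bytes());
                    self.csum_input.extend_from_slice(digest);
                    offsets.push(offset);
                }
            }
            InjectedChunksInfo::Raw(raw) => {
                let offset = self.stream_len.fetch_add(raw.len(), Ordering::SeqCst) as u64;
                offsets.push(offset);
            }
        }
        offsets
    }
}
```

With this shape the injector stream would only have to decide when to 
emit `Known` versus `Raw`, and the offsets, reused length and checksum 
input would all be derived in one place on the writer side.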