From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pbs-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id B2A7A1FF15D
	for <inbox@lore.proxmox.com>; Thu,  8 Aug 2024 08:53:31 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 62F2C181C1;
	Thu,  8 Aug 2024 08:53:42 +0200 (CEST)
Message-ID: <0a451e8e-fa0f-44ec-ae6e-f4dcb95a82ab@proxmox.com>
Date: Thu, 8 Aug 2024 08:53:38 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird Beta
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
 Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
References: <20240805092414.1178930-1-d.csapak@proxmox.com>
 <2e54f16c-f9a3-488b-b22a-04bd45ffbd05@proxmox.com>
Content-Language: en-US
From: Dominik Csapak <d.csapak@proxmox.com>
In-Reply-To: <2e54f16c-f9a3-488b-b22a-04bd45ffbd05@proxmox.com>
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.016 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pbs-devel] applied-series: [PATCH proxmox-backup v3 0/5]
 improve compression throughput
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox Backup Server development discussion
 <pbs-devel@lists.proxmox.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: pbs-devel-bounces@lists.proxmox.com
Sender: "pbs-devel" <pbs-devel-bounces@lists.proxmox.com>

On 8/7/24 19:06, Thomas Lamprecht wrote:
> On 05/08/2024 11:24, Dominik Csapak wrote:
>> in my tests (against current master) it improved the throughput if
>> the source/target storage is fast enough (tmpfs -> tmpfs):
>>
>> Type                master (MiB/s)   with my patches (MiB/s)
>> .img file           ~614             ~767
>> pxar one big file   ~657             ~807
>> pxar small files    ~576             ~627
>>
>> (these results are also in the relevant commit message)
>>
>> It would be great, if someone else can cross check my results here.
>> Note: the pxar code being faster than the img code seems to stem
>> from better multithreading pipelining in that code or in tokio (the
>> pxar codepath scales more directly with core count than the .img codepath)
>>
>> changes from v2:
>> * use zstd_safe instead of zstd so we have access to the underlying
>>    error code
>> * add test for the error code handling since that's not part of the
>>    public zstd api, only an implementation detail (albeit one that's
>>    not likely to change soon)
>> * separated the tests for the decode(encode()) roundtrip so a failure
>>    can more easily be assigned to a specific codepath
>>
>> changes from v1:
>> * reorder patches so that the data blob writer removal is the first one
>> * add tests for DataBlob that we can decode what we encoded
>>    (to see that my patches don't mess up the chunk generation)
>> * add new patch to cleanup the `encode` function a bit
>>
>> Dominik Csapak (5):
>>    remove data blob writer
>>    datastore: test DataBlob encode/decode roundtrip
>>    datastore: data blob: add helper and test for checking zstd_safe error
>>      code
>>    datastore: data blob: increase compression throughput
>>    datastore: DataBlob encode: simplify code
>>
>>   Cargo.toml                            |   1 +
>>   pbs-datastore/Cargo.toml              |   1 +
>>   pbs-datastore/src/data_blob.rs        | 193 ++++++++++++++++-------
>>   pbs-datastore/src/data_blob_writer.rs | 212 --------------------------
>>   pbs-datastore/src/lib.rs              |   2 -
>>   tests/blob_writer.rs                  | 105 -------------
>>   6 files changed, 144 insertions(+), 370 deletions(-)
>>   delete mode 100644 pbs-datastore/src/data_blob_writer.rs
>>   delete mode 100644 tests/blob_writer.rs
>>
> 
> Applied, with some rewording of the commit message and some slight
> adaption to the test commit.
> 
> Ps, it seems the zstd crate authors aren't so sure why they use the
> 32 KB buffer either, which FWICT is the underlying issue here:
> 
> https://docs.rs/zstd/latest/src/zstd/stream/zio/writer.rs.html#41-42
> 
> But it's a bit hard to follow. To me this looks less like an allocation
> pattern issue (on its own) and more like increased overhead from
> processing in 32 KiB chunks; the extra copying itself naturally doesn't
> help either, but that's not a bad allocation pattern, rather a single
> (FWICT) avoidable allocation for the small buffer. As said, not 100%
> sure, as the code is rather over-engineered... Anyhow, I tried to
> capture these findings, including the remaining uncertainty, in the
> commit message to give some better background.
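[Editor's note: a rough, unmeasured back-of-the-envelope for the overhead described above, assuming the usual PBS chunk size of up to 4 MiB: streaming one chunk through a 32 KiB scratch buffer splits it into on the order of 128 buffered steps, each of which also copies its slice once more, instead of a single compress call over the whole slice:]

```rust
// Rough arithmetic for the 32 KiB buffering overhead: how many
// buffer-sized steps a payload gets split into when it is streamed
// through a fixed scratch buffer instead of compressed in one call.
fn buffered_steps(payload: usize, buf: usize) -> usize {
    (payload + buf - 1) / buf // ceiling division
}

fn main() {
    let chunk = 4 * 1024 * 1024; // PBS chunks are up to 4 MiB
    let steps = buffered_steps(chunk, 32 * 1024);
    // every one of these steps first copies its slice into the
    // scratch buffer, so the payload is copied one extra time overall
    println!("{chunk} B -> {steps} steps of 32 KiB"); // 128 steps
}
```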
> 
> I know you could have done this in a v4, but it felt faster to just
> amend the changes, especially since I have a few days off and would
> have to recreate the mental context anyway.


Ah ok, thanks for investigating (seemingly I was not patient enough for that..)

also thanks for amending the commit messages :)


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel