Date: Thu, 19 May 2022 15:35:15 +0200
From: Fabian Grünbichler
To: Proxmox Backup Server development discussion
References: <20220518112455.2393668-1-d.csapak@proxmox.com>
In-Reply-To: <20220518112455.2393668-1-d.csapak@proxmox.com>
Message-Id: <1652950355.dp00h7vxx4.astroid@nora.none>
Subject: Re: [pbs-devel] [RFC
 PATCH proxmox-backup] datastore: implement consitency tuning for datastores

On May 18, 2022 1:24 pm, Dominik Csapak wrote:
> currently, we don't (f)sync on chunk insertion (or at any point after
> that), which can lead to broken chunks in case of e.g. an unexpected
> powerloss. To fix that, offer a tuning option for datastores that
> controls the level of syncs it does:
>
> * None (old default): same as current state, no (f)syncs done at any point
> * Filesystem (new default): at the end of a backup, the datastore issues
>   a syncfs(2) to the filesystem of the datastore
> * File: issues an fsync on each chunk as they get inserted
>   (using our 'replace_file' helper)
>
> a small benchmark showed the following (times in mm:ss):
> setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner
>
> size                 none    filesystem    file
> 2GiB (fits in ram)   00:13   00:41         01:00
> 33GiB                05:21   05:31         13:45
>
> so if the backup fits in memory, there is a large difference between all
> of the modes (expected), but as soon as it exceeds the memory size,
> the difference between not syncing and syncing the fs at the end becomes
> much smaller.
>
> i also tested on an nvme, but there the syncs basically made no difference
>
> Signed-off-by: Dominik Csapak
> ---
> it would be nice if anybody else tries to recreate the benchmarks on
> different setups, to verify (or disprove) my findings

FWIW: randfile on tmpfs as source, backed up as fidx
randfile regenerated for every run, PBS restarted for every run

PBS in VM (8GB ram, disks on zvols on spinner + special + log), datastore on ext4:

SIZE: 4096   MODE: none        Duration: 22.51s
SIZE: 4096   MODE: filesystem  Duration: 28.11s
SIZE: 4096   MODE: file        Duration: 54.47s
SIZE: 16384  MODE: none        Duration: 202.42s
SIZE: 16384  MODE: filesystem  Duration: 275.36s
SIZE: 16384  MODE: file        Duration: 311.97s

same VM, datastore on single-disk ZFS pool:

SIZE: 1024   MODE: none        Duration: 5.03s
SIZE: 1024   MODE: file        Duration: 22.91s
SIZE: 1024   MODE: filesystem  Duration: 15.57s
SIZE: 4096   MODE: none        Duration: 41.02s
SIZE: 4096   MODE: file        Duration: 135.94s
SIZE: 4096   MODE: filesystem  Duration: 146.88s
SIZE: 16384  MODE: none        Duration: 336.10s

rest ended in tears cause of restricted resources in the VM

PBS baremetal same as source, datastore on ZFS on spinner+special+log:

SIZE: 1024   MODE: none        Duration: 4.90s
SIZE: 1024   MODE: file        Duration: 4.92s
SIZE: 1024   MODE: filesystem  Duration: 4.94s
SIZE: 4096   MODE: none        Duration: 19.56s
SIZE: 4096   MODE: file        Duration: 31.67s
SIZE: 4096   MODE: filesystem  Duration: 38.54s
SIZE: 16384  MODE: none        Duration: 189.77s
SIZE: 16384  MODE: file        Duration: 178.81s
SIZE: 16384  MODE: filesystem  Duration: 159.26s

^^ this is rather unexpected, I suspect something messed with the 'none'
case here, so I re-ran it:

SIZE: 1024   MODE: none        Duration: 4.90s
SIZE: 1024   MODE: file        Duration: 4.92s
SIZE: 1024   MODE: filesystem  Duration: 4.98s
SIZE: 4096   MODE: none        Duration: 19.77s
SIZE: 4096   MODE: file        Duration: 19.68s
SIZE: 4096   MODE: filesystem  Duration: 19.61s
SIZE: 16384  MODE: none        Duration: 133.93s
SIZE: 16384  MODE: file        Duration: 146.88s
SIZE: 16384  MODE: filesystem
Duration: 152.94s

and once more with ~30GB (ARC is just 16G):

SIZE: 30000  MODE: none        Duration: 368.58s
SIZE: 30000  MODE: file        Duration: 292.05s (!!!)
SIZE: 30000  MODE: filesystem  Duration: 431.73s

repeated once more:

SIZE: 30000  MODE: none        Duration: 419.75s
SIZE: 30000  MODE: file        Duration: 302.73s
SIZE: 30000  MODE: filesystem  Duration: 409.07s

so.. rather weird? possibly noisy measurements though, as this is on my
workstation ;)

PBS baremetal same as source, datastore on ZFS on NVME (no surprises
there):

SIZE: 1024   MODE: file        Duration: 4.92s
SIZE: 1024   MODE: filesystem  Duration: 4.95s
SIZE: 1024   MODE: none        Duration: 4.96s
SIZE: 4096   MODE: file        Duration: 19.69s
SIZE: 4096   MODE: filesystem  Duration: 19.78s
SIZE: 4096   MODE: none        Duration: 19.67s
SIZE: 16384  MODE: file        Duration: 81.39s
SIZE: 16384  MODE: filesystem  Duration: 78.86s
SIZE: 16384  MODE: none        Duration: 78.38s
SIZE: 30000  MODE: none        Duration: 142.65s
SIZE: 30000  MODE: file        Duration: 143.43s
SIZE: 30000  MODE: filesystem  Duration: 143.15s