From: Dominik Csapak <d.csapak@proxmox.com>
To: "Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Cc: Wolfgang Bumiller <w.bumiller@proxmox.com>,
	Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Dietmar Maurer <dietmar@proxmox.com>
Subject: Re: [pbs-devel] [RFC PATCH proxmox-backup] datastore: implement consistency tuning for datastores
Date: Thu, 19 May 2022 15:49:50 +0200	[thread overview]
Message-ID: <d7e3b34e-e633-a5db-af1a-2fc3f319f96e@proxmox.com> (raw)
In-Reply-To: <1652950355.dp00h7vxx4.astroid@nora.none>

On 5/19/22 15:35, Fabian Grünbichler wrote:
> On May 18, 2022 1:24 pm, Dominik Csapak wrote:
>> currently, we don't (f)sync on chunk insertion (or at any point after
>> that), which can lead to broken chunks in case of e.g. an unexpected
>> power loss. To fix that, offer a tuning option for datastores that
>> controls the level of syncing it does:
>>
>> * None (old default): same as the current state, no (f)syncs done at any point
>> * Filesystem (new default): at the end of a backup, the datastore issues
>>    a syncfs(2) on the filesystem of the datastore
>> * File: issues an fsync on each chunk as it gets inserted
>>    (using our 'replace_file' helper)
>>
>> a small benchmark showed the following (times in mm:ss):
>> setup: virtual pbs, 4 cores, 8GiB memory, ext4 on spinner
>>
>> size                 none    filesystem  file
>> 2GiB (fits in RAM)   00:13   00:41       01:00
>> 33GiB                05:21   05:31       13:45
>>
>> so if the backup fits in memory, there is a large difference between all
>> of the modes (expected), but as soon as it exceeds the memory size,
>> the difference between not syncing and syncing the fs at the end becomes
>> much smaller.
>>
>> I also tested on an NVMe, but there the syncs basically made no difference
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> it would be nice if anybody else could try to recreate the benchmarks on
>> different setups, to verify (or disprove) my findings
> 
> FWIW:
> 
> randfile on tmpfs as source, backed up as fidx
> randfile regenerated for every run, PBS restarted for every run
> 
> PBS in VM (8GB ram, disks on zvols on spinner + special + log), datastore on ext4:
> 
> SIZE: 4096 MODE: none Duration: 22.51s
> SIZE: 4096 MODE: filesystem Duration: 28.11s
> SIZE: 4096 MODE: file Duration: 54.47s
> 
> SIZE: 16384 MODE: none Duration: 202.42s
> SIZE: 16384 MODE: filesystem Duration: 275.36s
> SIZE: 16384 MODE: file Duration: 311.97s
> 
> same VM, datastore on single-disk ZFS pool:
> 
> SIZE: 1024 MODE: none Duration: 5.03s
> SIZE: 1024 MODE: file Duration: 22.91s
> SIZE: 1024 MODE: filesystem Duration: 15.57s
> 
> SIZE: 4096 MODE: none Duration: 41.02s
> SIZE: 4096 MODE: file Duration: 135.94s
> SIZE: 4096 MODE: filesystem Duration: 146.88s
> 
> SIZE: 16384 MODE: none Duration: 336.10s
> the rest ended in tears because of the restricted resources in the VM
> 
> PBS baremetal same as source, datastore on ZFS on spinner+special+log:
> 
> SIZE: 1024 MODE: none Duration: 4.90s
> SIZE: 1024 MODE: file Duration: 4.92s
> SIZE: 1024 MODE: filesystem Duration: 4.94s
> 
> SIZE: 4096 MODE: none Duration: 19.56s
> SIZE: 4096 MODE: file Duration: 31.67s
> SIZE: 4096 MODE: filesystem Duration: 38.54s
> 
> SIZE: 16384 MODE: none Duration: 189.77s
> SIZE: 16384 MODE: file Duration: 178.81s
> SIZE: 16384 MODE: filesystem Duration: 159.26s
> 
> ^^ this is rather unexpected, I suspect something messed with the 'none'
> case here, so I re-ran it:
> 
> SIZE: 1024 MODE: none Duration: 4.90s
> SIZE: 1024 MODE: file Duration: 4.92s
> SIZE: 1024 MODE: filesystem Duration: 4.98s
> SIZE: 4096 MODE: none Duration: 19.77s
> SIZE: 4096 MODE: file Duration: 19.68s
> SIZE: 4096 MODE: filesystem Duration: 19.61s
> SIZE: 16384 MODE: none Duration: 133.93s
> SIZE: 16384 MODE: file Duration: 146.88s
> SIZE: 16384 MODE: filesystem Duration: 152.94s
> 
> and once more with ~30GB (ARC is just 16G):
> 
> SIZE: 30000 MODE: none Duration: 368.58s
> SIZE: 30000 MODE: file Duration: 292.05s  (!!!)
> SIZE: 30000 MODE: filesystem Duration: 431.73s
> 
> repeated once more:
> 
> SIZE: 30000 MODE: none Duration: 419.75s
> SIZE: 30000 MODE: file Duration: 302.73s
> SIZE: 30000 MODE: filesystem Duration: 409.07s
> 
> so... rather weird? possibly noisy measurements though, as this is on my
> workstation ;)
> 
> PBS baremetal same as source, datastore on ZFS on NVME (no surprises
> there):
> 
> SIZE: 1024 MODE: file Duration: 4.92s
> SIZE: 1024 MODE: filesystem Duration: 4.95s
> SIZE: 1024 MODE: none Duration: 4.96s
> 
> SIZE: 4096 MODE: file Duration: 19.69s
> SIZE: 4096 MODE: filesystem Duration: 19.78s
> SIZE: 4096 MODE: none Duration: 19.67s
> 
> SIZE: 16384 MODE: file Duration: 81.39s
> SIZE: 16384 MODE: filesystem Duration: 78.86s
> SIZE: 16384 MODE: none Duration: 78.38s
> 
> SIZE: 30000 MODE: none Duration: 142.65s
> SIZE: 30000 MODE: file Duration: 143.43s
> SIZE: 30000 MODE: filesystem Duration: 143.15s
> 

so what I gather from these benchmarks is that on ext4, many fsyncs are more expensive than
a single syncfs, but on ZFS the two are very close together, leaning toward many fsyncs
being faster? (aside from the case where fsyncing was faster than not syncing at all ???)
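
for reference, a rough sketch of what the two strategies boil down to at the
syscall level (not the actual patch code: the real insert goes through our
'replace_file' helper, and this sketch assumes the libc crate for syncfs(2)
and skips the parent-directory fsync you would want for full durability):

    use std::fs::{self, File};
    use std::io::Write;
    use std::os::unix::io::AsRawFd;
    use std::path::Path;

    // "file" mode: write the chunk and fsync it before the insert counts as done
    fn insert_chunk_fsync(chunk_path: &Path, data: &[u8]) -> std::io::Result<()> {
        let tmp_path = chunk_path.with_extension("tmp");
        let mut tmp = File::create(&tmp_path)?;
        tmp.write_all(data)?;
        tmp.sync_all()?; // fsync(2): one write barrier per chunk
        fs::rename(&tmp_path, chunk_path)?; // publish the finished chunk
        Ok(())
    }

    // "filesystem" mode: no per-chunk sync, one syncfs(2) when the backup finishes
    fn sync_datastore_fs(datastore_root: &Path) -> std::io::Result<()> {
        let dir = File::open(datastore_root)?; // opening a dir read-only works on Linux
        if unsafe { libc::syncfs(dir.as_raw_fd()) } == -1 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(())
    }

so 'file' pays one write barrier per chunk, while 'filesystem' defers everything
to a single flush of the whole filesystem at the end, which matches the ext4
numbers above.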

in any case, doing some kind of syncing *will* slow down backups one way or another
(leaving the weird ZFS case aside for the moment), so the question is whether we
make one of the new modes the default or not...

I'd put the modes in even if we leave the default as it is, so the admin can decide how much
crash consistency their PBS has.
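
just as an illustration (the names here are made up, not what the patch
actually uses), the knob itself could be as small as an enum with the chosen
default baked in, parsed from the datastore tuning string:

    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum SyncLevel {
        /// old behavior: no explicit syncing at all
        None,
        /// one syncfs(2) on the datastore filesystem when the backup finishes
        Filesystem,
        /// fsync(2) on every chunk as it is inserted
        File,
    }

    impl Default for SyncLevel {
        // whatever we decide the default should be lives here
        fn default() -> Self {
            SyncLevel::Filesystem
        }
    }

    impl SyncLevel {
        /// parse the value an admin would put into the tuning option
        pub fn parse(value: &str) -> Option<Self> {
            match value {
                "none" => Some(SyncLevel::None),
                "filesystem" => Some(SyncLevel::Filesystem),
                "file" => Some(SyncLevel::File),
                _ => None,
            }
        }
    }

    fn main() {
        // e.g. from a hypothetical tuning string like "sync-level=filesystem"
        let level = SyncLevel::parse("filesystem").unwrap_or_default();
        println!("configured sync level: {:?}", level);
    }

that way, whichever default we pick lives in exactly one place and the admin
can still override it per datastore.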

any other input? @wolfgang, @thomas, @dietmar?




Thread overview: 4+ messages
2022-05-18 11:24 Dominik Csapak
2022-05-19 13:35 ` Fabian Grünbichler
2022-05-19 13:49   ` Dominik Csapak [this message]
2022-05-20  7:07 ` Thomas Lamprecht
