[pbs-devel] [PATCH proxmox-backup v3] docs: explain some technical details about datastores/chunks

public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed

* [pbs-devel] [PATCH proxmox-backup v3] docs: explain some technical details about datastores/chunks
@ 2020-12-11 12:17 Dominik Csapak
  2021-02-04 11:34 ` [pbs-devel] applied: " Thomas Lamprecht
  0 siblings, 1 reply; 2+ messages in thread
From: Dominik Csapak @ 2020-12-11 12:17 UTC (permalink / raw)
  To: pbs-devel

adds explanations for:
* what datastores are
* their relation with snapshots/chunks
* basic information about chunk directory structures
* fixed-/dynamically-sized chunks
* special handling of encrypted chunks
* hash collision probability
* limitation of file-based backups

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes from v2:
* incoporate @F.Ebners and @D.Whytes suggestions, thanks :)

 docs/index.rst              |   1 +
 docs/technical-overview.rst | 166 ++++++++++++++++++++++++++++++++++++
 docs/terminology.rst        |   3 +
 3 files changed, 170 insertions(+)
 create mode 100644 docs/technical-overview.rst

diff --git a/docs/index.rst b/docs/index.rst
index fffcb4fd..f3e6bf0c 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -33,6 +33,7 @@ in the section entitled "GNU Free Documentation License".
    pve-integration.rst
    pxar-tool.rst
    sysadmin.rst
+   technical-overview.rst
    faq.rst
 
 .. raw:: latex
diff --git a/docs/technical-overview.rst b/docs/technical-overview.rst
new file mode 100644
index 00000000..8a97fb9a
--- /dev/null
+++ b/docs/technical-overview.rst
@@ -0,0 +1,166 @@
+Technical Overview
+==================
+
+.. _technical_overview:
+
+Datastores
+----------
+
+A Datastore is the logical place where :ref:`Backup Snapshots <backup_snapshot>`
+and their chunks are stored. Snapshots consist of a manifest, blobs,
+dynamic- and fixed-indexes (see :ref:`terminology`), and are stored in the
+following directory structure:
+
+ <datastore-root>/<type>/<id>/<time>/
+
+The deduplication of datastores is based on reusing chunks, which are
+referenced by the indexes in a backup snapshot. This means that multiple
+indexes can reference the same chunks, reducing the amount of space
+needed to contain the data (even across backup snapshots).
+
+Chunks
+------
+
+A chunk is some (possibly encrypted) data with a CRC-32 checksum at
+the end and a type marker at the beginning. It is identified by the
+SHA-256 checksum of its content.
+
+To generate such chunks, backup data is split either into fixed-size or
+dynamically sized chunks. The same content will be hashed to the same
+checksum.
+
+The chunks of a datastore are found in
+
+ <datastore-root>/.chunks/
+
+This chunk directory is further subdivided by the first four byte of the
+chunks checksum, so the chunk with the checksum
+
+ a342e8151cbf439ce65f3df696b54c67a114982cc0aa751f2852c2f7acc19a8b
+
+lives in
+
+ <datastore-root>/.chunks/a342/
+
+This is done to reduce the number of files per directory, as having
+many files per directory can be bad for file system performance.
+
+These chunk directories ('0000'-'ffff') will be preallocated when a datastore is
+created.
+
+Fixed-sized Chunks
+^^^^^^^^^^^^^^^^^^
+
+For block based backups (like VMs), fixed-sized chunks are used. The content
+(disk image), is split into chunks of the same length (typically 4 MiB).
+
+This works very well for VM images, since the file system on the guest
+most often tries to allocate files in contiguous pieces, so new files get
+new blocks, and changing existing files changes only their own blocks.
+
+As an optimization, VMs in `Proxmox VE`_ can make use of 'dirty bitmaps',
+which can track the changed blocks of an image. Since these bitmap
+are also a representation of the image split into chunks, we have
+a direct relation between dirty blocks of the image and chunks we have
+to upload, so only modified chunks of the disk have to be uploaded for a backup.
+
+Since we always split the image into chunks of the same size, unchanged
+blocks will result in identical checksums for those chunks, so such chunks do not
+need to be backed up again. This way storage snapshots are not needed to find
+the changed blocks.
+
+For consistency, `Proxmox VE`_ uses a QEMU internal snapshot mechanism, that
+does not rely on storage snapshots either.
+
+Dynamically sized Chunks
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+If one does not want to backup block-based systems but rather file-based systems,
+using fixed-sized chunks is not a good idea, since every time a file
+would change in size, the remaining data gets shifted around and this
+would result in many chunks changing, reducing the amount of deduplication.
+
+To improve this, `Proxmox Backup`_ Server uses dynamically sized chunks
+instead. Instead of splitting an image into fixed sizes, it first generates
+a consistent file archive (:ref:`pxar <pxar-format>`) and uses a rolling hash
+over this on-the-fly generated archive to calculate chunk boundaries.
+
+We use a variant of Buzhash which is a cyclic polynomial algorithm.
+It works by continuously calculating a checksum while iterating over the
+data, and on certain conditions it triggers a hash boundary.
+
+Assuming that most files of the system that is to be backed up have not changed,
+eventually the algorithm triggers the boundary on the same data as a previous
+backup, resulting in chunks that can be reused.
+
+Encrypted Chunks
+^^^^^^^^^^^^^^^^
+
+Encrypted chunks are a special case. Both fixed- and dynamically sized
+chunks can be encrypted, and they are handled in a slightly different manner
+than normal chunks.
+
+The hashes of encrypted chunks are calculated not with the actual (encrypted)
+chunk content, but with the plaintext content concatenated with
+the encryption key. This way, two chunks of the same data encrypted with
+different keys generate two different checksums and no collisions occur for
+multiple encryption keys.
+
+This is done to speed up the client part of the backup, since it only needs
+to encrypt chunks that are actually getting uploaded. Chunks that exist
+already in the previous backup, do not need to be encrypted and uploaded.
+
+Caveats and Limitations
+-----------------------
+
+Notes on hash collisions
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Every hashing algorithm has a chance to produce collisions, meaning two (or more)
+inputs generate the same checksum. For SHA-256, this chance is negligible.
+To calculate such a collision, one can use the ideas of the 'birthday problem'
+from probability theory. For big numbers, this is actually infeasible to
+calculate with regular computers, but there is a good approximation:
+
+.. math::
+
+ p(n, d) = 1 - e^{-n^2/(2d)}
+
+Where `n` is the number of tries, and `d` is the number of possibilities.
+So for example, if we assume a large datastore of 1 PiB, and an average chunk
+size of 4 MiB, we have :math:`n = 268435456` tries, and :math:`d = 2^{256}`
+possibilities.  Using the above formula we get that the probability of a
+collision in that scenario is:
+
+.. math::
+
+ 3.1115 * 10^{-61}
+
+For context, in a lottery game of 6 of 45, the chance to correctly guess all
+6 numbers is only :math:`1.2277 * 10^{-7}`.
+
+So it is extremely unlikely that such a collision would occur by accident
+in a normal datastore.
+
+Additionally, SHA-256 is prone to length extension attacks, but since
+there is an upper limit for how big the chunk are, this is not a
+problem, since a potential attacker cannot arbitrarily add content to
+the data beyond that limit.
+
+File-based Backup
+^^^^^^^^^^^^^^^^^
+
+Since dynamically sized chunks (for file-based backups) are created on a custom
+archive format (pxar) and not over the files directly, there is no relation
+between files and the chunks. This means we have to read all files again
+for every backup, otherwise it would not be possible to generate a consistent
+pxar archive where the original chunks can be reused.
+
+Verification of encrypted chunks
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+For encrypted chunks, only the checksum of the original (plaintext) data
+is available, making it impossible for the server (without the encryption key),
+to verify its content against it. Instead only the CRC-32 checksum gets checked.
+
+
diff --git a/docs/terminology.rst b/docs/terminology.rst
index 89ec7646..3eff9780 100644
--- a/docs/terminology.rst
+++ b/docs/terminology.rst
@@ -1,3 +1,5 @@
+.. _terminology:
+
 Terminology
 ===========
 
@@ -99,6 +101,7 @@ Backup Group
 The tuple ``<type>/<ID>`` is called a backup group. Such a group
 may contain one or more backup snapshots.
 
+.. _backup_snapshot:
 
 Backup Snapshot
 ---------------
-- 
2.20.1





^ permalink raw reply	[flat|nested] 2+ messages in thread

* [pbs-devel] applied: [PATCH proxmox-backup v3] docs: explain some technical details about datastores/chunks
  2020-12-11 12:17 [pbs-devel] [PATCH proxmox-backup v3] docs: explain some technical details about datastores/chunks Dominik Csapak
@ 2021-02-04 11:34 ` Thomas Lamprecht
  0 siblings, 0 replies; 2+ messages in thread
From: Thomas Lamprecht @ 2021-02-04 11:34 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

On 11.12.20 13:17, Dominik Csapak wrote:
> adds explanations for:
> * what datastores are
> * their relation with snapshots/chunks
> * basic information about chunk directory structures
> * fixed-/dynamically-sized chunks
> * special handling of encrypted chunks
> * hash collision probability
> * limitation of file-based backups
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes from v2:
> * incoporate @F.Ebners and @D.Whytes suggestions, thanks :)
> 
>  docs/index.rst              |   1 +
>  docs/technical-overview.rst | 166 ++++++++++++++++++++++++++++++++++++
>  docs/terminology.rst        |   3 +
>  3 files changed, 170 insertions(+)
>  create mode 100644 docs/technical-overview.rst
> 
>

applied, thanks!

FYI, I made two followups:
* fixing up the line length (seemed a bit arbitrary, maybe let this get enforced by
  your editor for .rst/.adoc files)
* a few style nits to better comply with our technical writing guide, mostly
  dropping the usage of "we" (we who?). see our guide in the intranet (Avoid first
  person (plural)) for details. Threw in also some additions in that comment, was
  to much work to separate those two from each other..




^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-02-04 11:34 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-11 12:17 [pbs-devel] [PATCH proxmox-backup v3] docs: explain some technical details about datastores/chunks Dominik Csapak
2021-02-04 11:34 ` [pbs-devel] applied: " Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox

Service provided by Proxmox Server Solutions GmbH | Privacy | Legal