public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>,
	Dominik Csapak <d.csapak@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 1/2] datastore: add tuning options for the number of reading threads
Date: Thu, 2 May 2024 14:22:39 +0200	[thread overview]
Message-ID: <61aae3bc-fe47-4e5b-a411-9c5e08fec901@proxmox.com> (raw)
In-Reply-To: <20240430093939.1318786-1-d.csapak@proxmox.com>

subject should have s/reading/verification read/ and it's one option
so singular please.

On 30/04/2024 11:39, Dominik Csapak wrote:
> adds a new 'read-threads' tuning options to datastores that control
> how many threads are used for verification operations
> 
> depending on the underlying storage and the used cpu, this can make a
> big difference in verification time.

Not that I don't believe you, but actual numbers would be nice to have
here ;-)

> 
> the default remains at the current 4 threads.
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  pbs-api-types/src/datastore.rs |  9 +++++++++
>  pbs-datastore/src/datastore.rs | 14 ++++++++++++++
>  src/backup/verify.rs           |  2 +-
>  www/Utils.js                   |  5 +++++
>  www/datastore/OptionView.js    |  8 ++++++++
>  5 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/pbs-api-types/src/datastore.rs b/pbs-api-types/src/datastore.rs
> index 31767417a..2ad2ae063 100644
> --- a/pbs-api-types/src/datastore.rs
> +++ b/pbs-api-types/src/datastore.rs
> @@ -209,6 +209,13 @@ pub enum DatastoreFSyncLevel {
>              type: ChunkOrder,
>              optional: true,
>          },
> +        "read-threads": {

the name is confusing for users reading configs and devs reading the code
using this, as one would expect that this is for all read operations, e.g.,
backup restore or even GC. A API description can be great to clear up subtle
things, but it should not be required to get the very basic understanding
on what this is used for.

I'd rather have two different settings for this and the tape one, e.g.
`verification-read-threads` and `tape-backup-read-threads`

What I'd even prefer more is not having this at all, but rather auto-determine
it from the amount of cores and the disk types (and possibly its bus), e.g.:

- if hdd (/sys/block/sda/queue/rotational is 1) then always use one thread
- if sdd then depending on thread count use 2 to 4 – for example, as pseudo
  code calculate it through #threads/4.max(4).min(1)

Could be even better to have a pool per backing storage to reserve threads
from, which is naturally more complex but the default behaviour should
ideally be OK for most users, so IMO at least worth a good amount of
consideration in how to makes this smarter per default – I could imagine
that one might even get some hints from the kernel (IO) scheduler.

> +            description: "Controls how many threads are used for reading from the datastore for verification.",
> +            type: usize,
> +            optional: true,
> +            minimum: 1,
> +            default: 4,
> +        },
>      },
>  )]
>  #[derive(Serialize, Deserialize, Default)]
> @@ -220,6 +227,8 @@ pub struct DatastoreTuning {
>      pub chunk_order: Option<ChunkOrder>,
>      #[serde(skip_serializing_if = "Option::is_none")]
>      pub sync_level: Option<DatastoreFSyncLevel>,
> +    #[serde(skip_serializing_if = "Option::is_none")]
> +    pub read_threads: Option<usize>,
>  }
>  
>  pub const DATASTORE_TUNING_STRING_SCHEMA: Schema = StringSchema::new("Datastore tuning options")
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index f95da7615..2c23bd1f1 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -14,6 +14,7 @@ use proxmox_schema::ApiType;
>  use proxmox_sys::error::SysError;
>  use proxmox_sys::fs::{file_read_optional_string, replace_file, CreateOptions};
>  use proxmox_sys::fs::{lock_dir_noblock, DirLockGuard};
> +use proxmox_sys::linux::procfs::read_proc_stat;
>  use proxmox_sys::process_locker::ProcessLockSharedGuard;
>  use proxmox_sys::WorkerTaskContext;
>  use proxmox_sys::{task_log, task_warn};
> @@ -61,6 +62,7 @@ pub struct DataStoreImpl {
>      chunk_order: ChunkOrder,
>      last_digest: Option<[u8; 32]>,
>      sync_level: DatastoreFSyncLevel,
> +    read_threads: usize,

if we really add this I'd do so through a struct that impls default and added
as something 'thread_limits' here, having a `verification`, `tape_read`, ...
members.

>  }
>  
>  impl DataStoreImpl {
> @@ -75,6 +77,7 @@ impl DataStoreImpl {
>              chunk_order: Default::default(),
>              last_digest: None,
>              sync_level: Default::default(),
> +            read_threads: 0,
>          })
>      }
>  }
> @@ -313,6 +316,7 @@ impl DataStore {
>              chunk_order: tuning.chunk_order.unwrap_or_default(),
>              last_digest,
>              sync_level: tuning.sync_level.unwrap_or_default(),
> +            read_threads: tuning.read_threads.unwrap_or(4),
>          })
>      }
>  
> @@ -1377,6 +1381,16 @@ impl DataStore {
>          Ok(())
>      }
>  
> +    /// returns the number of read thread that should be used for operations
> +    /// limits to the number of available cores (if reading from /proc/stat succeeds)
> +    pub fn get_read_threads(&self) -> usize {
> +        // if we cannot get the real cpu count, don't limit us
> +        let core_count = read_proc_stat()
> +            .map(|proc| proc.cpu_count as usize)
> +            .unwrap_or(usize::max_value());
> +        self.inner.read_threads.clamp(1, core_count)
> +    }
> +
>      /// Destroy a datastore. This requires that there are no active operations on the datastore.
>      ///
>      /// This is a synchronous operation and should be run in a worker-thread.
> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
> index c972e5328..4537dbdc9 100644
> --- a/src/backup/verify.rs
> +++ b/src/backup/verify.rs
> @@ -125,7 +125,7 @@ fn verify_index_chunks(
>  
>      let decoder_pool = ParallelHandler::new(
>          "verify chunk decoder",
> -        4,
> +        datastore2.get_read_threads(),
>          move |(chunk, digest, size): (DataBlob, [u8; 32], u64)| {
>              let chunk_crypt_mode = match chunk.crypt_mode() {
>                  Err(err) => {
> diff --git a/www/Utils.js b/www/Utils.js
> index 1d7351a32..8fd102486 100644
> --- a/www/Utils.js
> +++ b/www/Utils.js
> @@ -790,6 +790,11 @@ Ext.define('PBS.Utils', {
>  	sync = PBS.Utils.tuningOptions['sync-level'][sync ?? '__default__'];
>  	options.push(`${gettext('Sync Level')}: ${sync}`);
>  
> +	let readThreads = tuning['read-threads'];
> +	delete tuning['read-threads'];
> +	readThreads ??= Proxmox.Utils.defaultText + ` (4)`;
> +	options.push(`${gettext('Read Threads')}: ${readThreads}`);
> +
>  	for (const [k, v] of Object.entries(tuning)) {
>  	    options.push(`${k}: ${v}`);
>  	}
> diff --git a/www/datastore/OptionView.js b/www/datastore/OptionView.js
> index e1f38af6f..ee90f1dca 100644
> --- a/www/datastore/OptionView.js
> +++ b/www/datastore/OptionView.js
> @@ -271,6 +271,14 @@ Ext.define('PBS.Datastore.Options', {
>  			    deleteEmpty: true,
>  			    value: '__default__',
>  			},
> +			{
> +			    xtype: 'proxmoxintegerfield',
> +			    name: 'read-threads',
> +			    fieldLabel: gettext('Read Threads'),

without any hint or tooltip this is just a completely confusingly
option.. If it would be specialized for the actual different
operations (now and future one)

> +			    min: 0,
> +			    emptyText: '4',
> +			    deleteEmpty: true,
> +			},
>  		    ],
>  		},
>  	    },



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

  parent reply	other threads:[~2024-05-02 12:23 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-30  9:39 Dominik Csapak
2024-04-30  9:39 ` [pbs-devel] [PATCH proxmox-backup 2/2] tape: use datastores 'read-thread' for tape backup Dominik Csapak
2024-04-30 11:00   ` Dominik Csapak
2024-05-02 12:22 ` Thomas Lamprecht [this message]
2024-05-02 13:14   ` [pbs-devel] [PATCH proxmox-backup 1/2] datastore: add tuning options for the number of reading threads Dominik Csapak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=61aae3bc-fe47-4e5b-a411-9c5e08fec901@proxmox.com \
    --to=t.lamprecht@proxmox.com \
    --cc=d.csapak@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal