From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Hannes Laimer <h.laimer@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH v3 proxmox-backup 3/3] add index recovery to pb-debug
Date: Tue, 9 Feb 2021 11:36:22 +0100
Message-ID: <20210209103622.hh642wejc7djz56y@olga.proxmox.com>
In-Reply-To: <20210205075806.888558-4-h.laimer@proxmox.com>
On Fri, Feb 05, 2021 at 08:58:06AM +0100, Hannes Laimer wrote:
> Adds possibility to recover data from an index file. Options:
> - chunks: path to the directory where the chunks are saved
> - file: the index file that should be recovered(must be either .fidx or
> didx)
> - [opt] keyfile: path to a keyfile, if the data was encrypted, a keyfile is
> needed
> - [opt] skip-crc: boolean, if true, read chunks wont be verified with their
> crc-sum, increases the restore speed by a lot
>
> Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
> ---
>
> src/bin/proxmox-backup-debug.rs | 6 +-
> src/bin/proxmox_backup_debug/mod.rs | 2 +
> src/bin/proxmox_backup_debug/recover.rs | 109 ++++++++++++++++++++++++
> 3 files changed, 115 insertions(+), 2 deletions(-)
> create mode 100644 src/bin/proxmox_backup_debug/recover.rs
>
> diff --git a/src/bin/proxmox-backup-debug.rs b/src/bin/proxmox-backup-debug.rs
> index 079b2bc8..8bd01a59 100644
> --- a/src/bin/proxmox-backup-debug.rs
> +++ b/src/bin/proxmox-backup-debug.rs
> @@ -22,7 +22,9 @@ pub const KEYFILE_SCHEMA: Schema = StringSchema::new(
> fn main() {
> proxmox_backup::tools::setup_safe_path_env();
>
> - let cmd_def = CliCommandMap::new().insert("inspect", inspect_commands());
> + let cmd_def = CliCommandMap::new()
> + .insert("inspect", inspect_commands())
> + .insert("recover", recover_commands());
>
> let rpcenv = CliEnvironment::new();
> run_cli_command(
> @@ -30,4 +32,4 @@ fn main() {
> rpcenv,
> Some(|future| proxmox_backup::tools::runtime::main(future)),
> );
> -}
> +}
> \ No newline at end of file
> diff --git a/src/bin/proxmox_backup_debug/mod.rs b/src/bin/proxmox_backup_debug/mod.rs
> index 644583db..62df7754 100644
> --- a/src/bin/proxmox_backup_debug/mod.rs
> +++ b/src/bin/proxmox_backup_debug/mod.rs
> @@ -1,2 +1,4 @@
> mod inspect;
> pub use inspect::*;
> +mod recover;
> +pub use recover::*;
> diff --git a/src/bin/proxmox_backup_debug/recover.rs b/src/bin/proxmox_backup_debug/recover.rs
> new file mode 100644
> index 00000000..61025adf
> --- /dev/null
> +++ b/src/bin/proxmox_backup_debug/recover.rs
> @@ -0,0 +1,109 @@
> +use std::fs::File;
> +use std::io::{Read, Write};
> +use std::path::Path;
> +
> +use anyhow::{bail, format_err, Error};
> +
> +use proxmox::api::api;
> +use proxmox::api::cli::{CliCommand, CliCommandMap, CommandLineInterface};
> +use serde_json::Value;
> +
> +use proxmox_backup::backup::{
> + load_and_decrypt_key, CryptConfig, DataBlob, DynamicIndexReader, FixedIndexReader, IndexFile,
> +};
> +use proxmox_backup::tools;
> +
> +use crate::{get_encryption_key_password, KEYFILE_SCHEMA, PATH_SCHEMA};
> +
> +use proxmox::tools::digest_to_hex;
> +
> +use std::time::Instant;
> +
> +#[api(
> + input: {
> + properties: {
> + file: {
> + schema: PATH_SCHEMA,
> + },
> + chunks: {
> + schema: PATH_SCHEMA,
> + },
> + "keyfile": {
> + schema: KEYFILE_SCHEMA,
> + optional: true,
> + },
> + "skip-crc": {
> + type: Boolean,
> + optional: true,
> + default: false,
> + description: "Skip the crc verification, increases the restore speed immensely",
> + }
> + }
> + }
> +)]
> +/// Recover a index file
'an index file'

Perhaps add some more information about what this actually does and,
in particular, what its limitations are.
> +fn recover_index(skip_crc: bool, param: Value) -> Result<Value, Error> {
> + let start = Instant::now();
> + let file_path = Path::new(tools::required_string_param(&param, "file")?);
> + let chunks_path = Path::new(tools::required_string_param(&param, "chunks")?);
> +
> + let key_file_param = param["keyfile"].as_str();
> + let key_file_path = key_file_param.map(|path| Path::new(path));
> +
> + let index: Box<dyn IndexFile> = match file_path.extension() {
> + Some(ext) if ext.eq("fidx") => Box::new(
> + FixedIndexReader::open(file_path)
> + .map_err(|e| format_err!("could not read index - {}", e))?,
> + ),
> + Some(ext) if ext.eq("didx") => Box::new(
> + DynamicIndexReader::open(file_path)
> + .map_err(|e| format_err!("could not read index - {}", e))?,
> + ),
> + _ => bail!("index file must either be a .fidx or a .didx file"),
> + };
Come to think of it, we do have magic bytes at the top of all of these
file types, so shouldn't a *debug* binary use *that* instead of the
file's *name*? ;-)
*And* perhaps warn if the name doesn't match its type...
So for this we could just have a helper `fn detect_file_type() -> FileType`,
which we could use in all the other functions of this binary as well.
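
To illustrate what I mean, here is a rough, untested-in-tree sketch. Note
that the two magic values are made-up placeholders (the real ones are the
constants we already have for these formats), and that the `FileType` enum,
the reader-based signature and `name_matches` are just assumptions for the
example, not existing code:

```rust
use std::io::{self, Read};
use std::path::Path;

// PLACEHOLDER magic values, invented for this sketch only. The real
// helper would compare against the existing format constants instead.
const MAGIC_FIXED_INDEX: [u8; 8] = *b"FIDXDEMO";
const MAGIC_DYNAMIC_INDEX: [u8; 8] = *b"DIDXDEMO";

/// Hypothetical enum; could grow more variants (blobs, ...) later.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FileType {
    FixedIndex,
    DynamicIndex,
}

impl FileType {
    /// Map the first 8 bytes of a file to a known type, if any.
    fn from_magic(magic: &[u8; 8]) -> Option<Self> {
        if magic == &MAGIC_FIXED_INDEX {
            Some(FileType::FixedIndex)
        } else if magic == &MAGIC_DYNAMIC_INDEX {
            Some(FileType::DynamicIndex)
        } else {
            None
        }
    }

    /// The file extension we would normally expect for this type.
    fn expected_extension(self) -> &'static str {
        match self {
            FileType::FixedIndex => "fidx",
            FileType::DynamicIndex => "didx",
        }
    }
}

/// Read the magic from the start of `reader` and classify it.
/// Returns an error if fewer than 8 bytes are available.
fn detect_file_type(mut reader: impl Read) -> io::Result<Option<FileType>> {
    let mut magic = [0u8; 8];
    reader.read_exact(&mut magic)?;
    Ok(FileType::from_magic(&magic))
}

/// True if the path's extension agrees with the detected type;
/// the caller can print a warning when this returns false.
fn name_matches(path: &Path, file_type: FileType) -> bool {
    path.extension().and_then(|ext| ext.to_str()) == Some(file_type.expected_extension())
}
```

The caller would then pick the index reader based on the returned type and
only use the extension for the warning.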
> +
> + let mut crypt_conf_opt = None;
> + let mut crypt_conf;
> +
> + let output_filename = file_path.file_stem().unwrap().to_str().unwrap();
> + let output_path = Path::new(output_filename);
> + let mut output_file = File::create(output_path)
> + .map_err(|e| format_err!("could not create output file - {}", e))?;
> +
I feel like the code below should already exist somewhere in some
half-way reusable form. It might be worth trying to instantiate a
`LocalDataStore` and use its `read_chunk` method, but I'm not sure if it
helps much. Plus it wouldn't reuse the buffer...
> + for pos in 0..index.index_count() {
> + let chunk_digest = index.index_digest(pos).unwrap();
> + let digest_str = digest_to_hex(chunk_digest);
> + let digest_prefix = &digest_str[0..4];
> + let chunk_path = chunks_path.join(digest_prefix).join(digest_str);
> + let mut chunk_file = std::fs::File::open(&chunk_path)
> + .map_err(|e| format_err!("could not open chunk file - {}", e))?;
> +
> + let mut data = Vec::with_capacity(1024 * 1024);
...but neither do you right now ;-)
Please move this above the loop and just use `.clear()` here (which does
not deallocate).
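
Roughly like the following stand-alone sketch. `process_chunks` and the
`handle` callback are made-up names for illustration, standing in for the
loop body above. Also note that `DataBlob::from_raw(data)` as quoted takes
the vector by value, so the real loop would additionally need a way to get
the buffer back (or a borrowing variant), which this sketch does not cover:

```rust
use std::io::{self, Read};

/// Read each source to its end through a single shared buffer and pass
/// the bytes to `handle`. Returns the total number of bytes read.
fn process_chunks<R: Read>(
    sources: impl IntoIterator<Item = R>,
    mut handle: impl FnMut(&[u8]),
) -> io::Result<usize> {
    // Allocate once, outside the loop.
    let mut data = Vec::with_capacity(1024 * 1024);
    let mut total = 0;

    for mut source in sources {
        // Resets the length to 0 but keeps the allocated capacity.
        data.clear();
        total += source.read_to_end(&mut data)?;
        handle(&data);
    }

    Ok(total)
}
```

With this, the allocation only grows when a chunk is larger than anything
seen before, instead of being redone for every chunk.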
> + chunk_file.read_to_end(&mut data)?;
> + let chunk_blob = DataBlob::from_raw(data)?;
> +
> + if !skip_crc {
> + chunk_blob.verify_crc()?;
> + }
> +
> + if key_file_path.is_some() && chunk_blob.is_encrypted() && crypt_conf_opt.is_none() {
> + let (key, _created, _fingerprint) =
> + load_and_decrypt_key(&key_file_path.unwrap(), &get_encryption_key_password)?;
> + crypt_conf = CryptConfig::new(key)?;
> + crypt_conf_opt = Some(&crypt_conf);
> + }
> +
> + output_file.write_all(
> + chunk_blob
> + .decode(crypt_conf_opt, Some(chunk_digest))?
> + .as_slice(),
> + )?;
> + }
> + println!("{} sec.", start.elapsed().as_secs_f32());
As a frequent user of the `time` shell command I'm wondering if this is
really something I want to always see? ;-)
> + Ok(Value::Null)
> +}
> +
> +pub fn recover_commands() -> CommandLineInterface {
> + let cmd_def = CliCommandMap::new().insert("index", CliCommand::new(&API_METHOD_RECOVER_INDEX));
> + cmd_def.into()
> +}
> --
> 2.20.1