public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour
@ 2022-04-26 10:13 Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 1/5] restore-daemon: start disk initialization in parallel to the api Dominik Csapak
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:13 UTC (permalink / raw)
  To: pve-devel

this series improves the behaviour of the file-restore when some mount
operations take longer than the 30 second pveproxy timeout, and improves
the startup speed of the restore vm

we do this by moving the disk init into the background of the daemon
startup, while the hyper server is starting, and by adding a timeout
option to the file-restore binary.

to be able to report the error back to the gui, we return a
'SERVICE_UNAVAILABLE' error in the timeout case, and the gui tries
again (up to 10 times).

backup patches 2,3 could be applied independently, as they even make
sense withouth the rest

pve-storage depends on a new pve-common&file-restore

widget-toolkit makes only sense with the other patches, but is
designed to not change behaviour with the old api

alternatively i could drop the 'json-error' parameter and do it
always, or when timeout it set.

changes from v1:
* rebased on master
* moved the json-error and timeout directly into pve-common (hardcoded)
  since there is only one usage of that function

proxmox-backup:

Dominik Csapak (5):
  restore-daemon: start disk initialization in parallel to the api
  restore-daemon: put blocking code into 'block_in_place'
  restore-daemon: avoid auto-mounting zpools
  file-restore: factor out 'list_files'
  file-restore: add 'timeout' and 'json-error' parameter

 proxmox-file-restore/src/main.rs              | 206 ++++++++++++------
 proxmox-restore-daemon/src/main.rs            |  31 ++-
 .../src/proxmox_restore_daemon/api.rs         |  12 +-
 .../src/proxmox_restore_daemon/disk.rs        |  20 +-
 4 files changed, 166 insertions(+), 103 deletions(-)

pve-common:

Dominik Csapak (1):
  PBSClient: file_restore_list: add json-error and timeout parameter

 src/PVE/PBSClient.pm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

pve-storage:

Dominik Csapak (1):
  api: FileRestore: decode and return proper error with new file-restore
    params

 PVE/API2/Storage/FileRestore.pm | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

proxmox-widget-toolkit:

Dominik Csapak (1):
  window/FileBrowser: try reload again when getting a 503 error

 src/window/FileBrowser.js | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH proxmox-backup v2 1/5] restore-daemon: start disk initialization in parallel to the api
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
@ 2022-04-26 10:13 ` Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 2/5] restore-daemon: put blocking code into 'block_in_place' Dominik Csapak
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:13 UTC (permalink / raw)
  To: pve-devel

this way, the vm can start up faster, and the actual disk init happens
in parallel. this avoids unnecessary timeouts when starting the vm

if the call panics, we still abort the vm with an error

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 proxmox-restore-daemon/src/main.rs | 31 ++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/proxmox-restore-daemon/src/main.rs b/proxmox-restore-daemon/src/main.rs
index 8f479de9..96e40de9 100644
--- a/proxmox-restore-daemon/src/main.rs
+++ b/proxmox-restore-daemon/src/main.rs
@@ -36,6 +36,15 @@ lazy_static! {
     };
 }
 
+fn init_disk_state() {
+    info!("scanning all disks...");
+    {
+        let _disk_state = DISK_STATE.lock().unwrap();
+    }
+
+    info!("disk scan complete.")
+}
+
 /// This is expected to be run by 'proxmox-file-restore' within a mini-VM
 fn main() -> Result<(), Error> {
     pbs_tools::setup_libc_malloc_opts();
@@ -56,15 +65,6 @@ fn main() -> Result<(), Error> {
     info!("setup basic system environment...");
     setup_system_env().map_err(|err| format_err!("system environment setup failed: {}", err))?;
 
-    // scan all attached disks now, before starting the API
-    // this will panic and stop the VM if anything goes wrong
-    info!("scanning all disks...");
-    {
-        let _disk_state = DISK_STATE.lock().unwrap();
-    }
-
-    info!("disk scan complete, starting main runtime...");
-
     proxmox_async::runtime::main(run())
 }
 
@@ -95,6 +95,13 @@ fn setup_system_env() -> Result<(), Error> {
 async fn run() -> Result<(), Error> {
     watchdog_init();
 
+    let init_future = async move {
+        match tokio::time::timeout(std::time::Duration::from_secs(120), tokio::task::spawn_blocking(init_disk_state)).await {
+            Ok(res) => res.map_err(|err| format_err!("disk init failed: {}", err)),
+            Err(_) => bail!("disk init timed out after 120 seconds"),
+        }
+    };
+
     let adaptor = StaticAuthAdapter::new()
         .map_err(|err| format_err!("reading ticket file failed: {}", err))?;
 
@@ -106,7 +113,11 @@ async fn run() -> Result<(), Error> {
     let receiver_stream = ReceiverStream::new(connections);
     let acceptor = hyper::server::accept::from_stream(receiver_stream);
 
-    hyper::Server::builder(acceptor).serve(rest_server).await?;
+    let hyper_future = async move {
+        hyper::Server::builder(acceptor).serve(rest_server).await.map_err(|err| format_err!("hyper finished with error: {}", err))
+    };
+
+    tokio::try_join!(init_future, hyper_future)?;
 
     bail!("hyper server exited");
 }
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH proxmox-backup v2 2/5] restore-daemon: put blocking code into 'block_in_place'
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 1/5] restore-daemon: start disk initialization in parallel to the api Dominik Csapak
@ 2022-04-26 10:13 ` Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 3/5] restore-daemon: avoid auto-mounting zpools Dominik Csapak
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:13 UTC (permalink / raw)
  To: pve-devel

DISK_STATE.lock() and '.resolve()' can both block since they access
the disks. Putting them into a 'block_in_place' tells tokio that
this blocks an it can act accordingly

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 .../src/proxmox_restore_daemon/api.rs                | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
index ef9d53b7..aeb5a71d 100644
--- a/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
+++ b/proxmox-restore-daemon/src/proxmox_restore_daemon/api.rs
@@ -148,8 +148,10 @@ fn list(
     let path_str = OsStr::from_bytes(&path[..]);
     let param_path_buf = Path::new(path_str);
 
-    let mut disk_state = crate::DISK_STATE.lock().unwrap();
-    let query_result = disk_state.resolve(param_path_buf)?;
+    let query_result = proxmox_async::runtime::block_in_place(move || {
+        let mut disk_state = crate::DISK_STATE.lock().unwrap();
+        disk_state.resolve(param_path_buf)
+    })?;
 
     match query_result {
         ResolveResult::Path(vm_path) => {
@@ -270,10 +272,10 @@ fn extract(
 
         let pxar = param["pxar"].as_bool().unwrap_or(true);
 
-        let query_result = {
+        let query_result = proxmox_async::runtime::block_in_place(move || {
             let mut disk_state = crate::DISK_STATE.lock().unwrap();
-            disk_state.resolve(path)?
-        };
+            disk_state.resolve(path)
+        })?;
 
         let vm_path = match query_result {
             ResolveResult::Path(vm_path) => vm_path,
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH proxmox-backup v2 3/5] restore-daemon: avoid auto-mounting zpools
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 1/5] restore-daemon: start disk initialization in parallel to the api Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 2/5] restore-daemon: put blocking code into 'block_in_place' Dominik Csapak
@ 2022-04-26 10:13 ` Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 4/5] file-restore: factor out 'list_files' Dominik Csapak
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:13 UTC (permalink / raw)
  To: pve-devel

the duration of mounting zpools not only correspond to the number of disks,
but also to the content (many subvols for example) which we cannot know
beforehand. so avoid mounting them at the start, and mount it only when
the user requests a listing/extraction with the zpool in path

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 .../src/proxmox_restore_daemon/disk.rs        | 20 +------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/proxmox-restore-daemon/src/proxmox_restore_daemon/disk.rs b/proxmox-restore-daemon/src/proxmox_restore_daemon/disk.rs
index f397ecbe..9aa849ac 100644
--- a/proxmox-restore-daemon/src/proxmox_restore_daemon/disk.rs
+++ b/proxmox-restore-daemon/src/proxmox_restore_daemon/disk.rs
@@ -453,30 +453,12 @@ impl DiskState {
         cmd.args(["import", "-d", "/dev"].iter());
         let result = run_command(cmd, None).unwrap();
         for (pool, disks) in Self::parse_zpool_import(&result) {
-            let mut bucket = Bucket::ZPool(ZFSBucketData {
+            let bucket = Bucket::ZPool(ZFSBucketData {
                 name: pool.clone(),
                 size: None,
                 mountpoint: None,
             });
 
-            // anything more than 5 disks we assume to take too long to mount, so we don't
-            // automatically - this means that no size can be reported
-            if disks.len() <= 5 {
-                let mp = filesystems.ensure_mounted(&mut bucket);
-                info!(
-                    "zpool '{}' (on: {:?}) auto-mounted at '{:?}' (size: {:?})",
-                    &pool,
-                    &disks,
-                    mp,
-                    bucket.size(0)
-                );
-            } else {
-                info!(
-                    "zpool '{}' (on: {:?}) auto-mount skipped, too many disks",
-                    &pool, &disks
-                );
-            }
-
             for disk in disks {
                 if let Some(fidx) = drive_info.get(&disk) {
                     match disk_map.get_mut(fidx) {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH proxmox-backup v2 4/5] file-restore: factor out 'list_files'
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
                   ` (2 preceding siblings ...)
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 3/5] restore-daemon: avoid auto-mounting zpools Dominik Csapak
@ 2022-04-26 10:13 ` Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 5/5] file-restore: add 'timeout' and 'json-error' parameter Dominik Csapak
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:13 UTC (permalink / raw)
  To: pve-devel

we'll want to reuse that in a later patch

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 proxmox-file-restore/src/main.rs | 156 +++++++++++++++++--------------
 1 file changed, 84 insertions(+), 72 deletions(-)

diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index ab50c636..4b327f6b 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -26,7 +26,7 @@ use pbs_client::tools::{
     },
     REPO_URL_SCHEMA,
 };
-use pbs_client::{BackupReader, RemoteChunkReader};
+use pbs_client::{BackupReader, BackupRepository, RemoteChunkReader};
 use pbs_config::key_config::decrypt_key;
 use pbs_datastore::catalog::{ArchiveEntry, CatalogReader, DirEntryAttribute};
 use pbs_datastore::dynamic_index::{BufferedDynamicReader, LocalDynamicReadAt};
@@ -92,6 +92,84 @@ fn keyfile_path(param: &Value) -> Option<String> {
     None
 }
 
+async fn list_files(
+    repo: BackupRepository,
+    snapshot: BackupDir,
+    path: ExtractPath,
+    crypt_config: Option<Arc<CryptConfig>>,
+    keyfile: Option<String>,
+    driver: Option<BlockDriverType>,
+) -> Result<Vec<ArchiveEntry>, Error> {
+    let client = connect(&repo)?;
+    let client = BackupReader::start(
+        client,
+        crypt_config.clone(),
+        repo.store(),
+        snapshot.group.ty,
+        &snapshot.group.id,
+        snapshot.time,
+        true,
+    )
+    .await?;
+
+    let (manifest, _) = client.download_manifest().await?;
+    manifest.check_fingerprint(crypt_config.as_ref().map(Arc::as_ref))?;
+
+    match path {
+        ExtractPath::ListArchives => {
+            let mut entries = vec![];
+            for file in manifest.files() {
+                if !file.filename.ends_with(".pxar.didx") && !file.filename.ends_with(".img.fidx") {
+                    continue;
+                }
+                let path = format!("/{}", file.filename);
+                let attr = if file.filename.ends_with(".pxar.didx") {
+                    // a pxar file is a file archive, so it's root is also a directory root
+                    Some(&DirEntryAttribute::Directory { start: 0 })
+                } else {
+                    None
+                };
+                entries.push(ArchiveEntry::new_with_size(
+                    path.as_bytes(),
+                    attr,
+                    Some(file.size),
+                ));
+            }
+
+            Ok(entries)
+        }
+        ExtractPath::Pxar(file, mut path) => {
+            let index = client
+                .download_dynamic_index(&manifest, CATALOG_NAME)
+                .await?;
+            let most_used = index.find_most_used_chunks(8);
+            let file_info = manifest.lookup_file_info(CATALOG_NAME)?;
+            let chunk_reader = RemoteChunkReader::new(
+                client.clone(),
+                crypt_config,
+                file_info.chunk_crypt_mode(),
+                most_used,
+            );
+            let reader = BufferedDynamicReader::new(index, chunk_reader);
+            let mut catalog_reader = CatalogReader::new(reader);
+
+            let mut fullpath = file.into_bytes();
+            fullpath.append(&mut path);
+
+            catalog_reader.list_dir_contents(&fullpath)
+        }
+        ExtractPath::VM(file, path) => {
+            let details = SnapRestoreDetails {
+                manifest,
+                repo,
+                snapshot,
+                keyfile,
+            };
+            data_list(driver, details, file, path).await
+        }
+    }
+}
+
 #[api(
    input: {
        properties: {
@@ -163,78 +241,12 @@ async fn list(snapshot: String, path: String, base64: bool, param: Value) -> Res
         }
     };
 
-    let client = connect(&repo)?;
-    let client = BackupReader::start(
-        client,
-        crypt_config.clone(),
-        repo.store(),
-        snapshot.group.ty,
-        &snapshot.group.id,
-        snapshot.time,
-        true,
-    )
-    .await?;
-
-    let (manifest, _) = client.download_manifest().await?;
-    manifest.check_fingerprint(crypt_config.as_ref().map(Arc::as_ref))?;
-
-    let result = match path {
-        ExtractPath::ListArchives => {
-            let mut entries = vec![];
-            for file in manifest.files() {
-                if !file.filename.ends_with(".pxar.didx") && !file.filename.ends_with(".img.fidx") {
-                    continue;
-                }
-                let path = format!("/{}", file.filename);
-                let attr = if file.filename.ends_with(".pxar.didx") {
-                    // a pxar file is a file archive, so it's root is also a directory root
-                    Some(&DirEntryAttribute::Directory { start: 0 })
-                } else {
-                    None
-                };
-                entries.push(ArchiveEntry::new_with_size(
-                    path.as_bytes(),
-                    attr,
-                    Some(file.size),
-                ));
-            }
-
-            Ok(entries)
-        }
-        ExtractPath::Pxar(file, mut path) => {
-            let index = client
-                .download_dynamic_index(&manifest, CATALOG_NAME)
-                .await?;
-            let most_used = index.find_most_used_chunks(8);
-            let file_info = manifest.lookup_file_info(CATALOG_NAME)?;
-            let chunk_reader = RemoteChunkReader::new(
-                client.clone(),
-                crypt_config,
-                file_info.chunk_crypt_mode(),
-                most_used,
-            );
-            let reader = BufferedDynamicReader::new(index, chunk_reader);
-            let mut catalog_reader = CatalogReader::new(reader);
-
-            let mut fullpath = file.into_bytes();
-            fullpath.append(&mut path);
+    let driver: Option<BlockDriverType> = match param.get("driver") {
+        Some(drv) => Some(serde::Deserialize::deserialize(drv)?),
+        None => None,
+    };
 
-            catalog_reader.list_dir_contents(&fullpath)
-        }
-        ExtractPath::VM(file, path) => {
-            let details = SnapRestoreDetails {
-                manifest,
-                repo,
-                snapshot,
-                keyfile,
-            };
-            let driver: Option<BlockDriverType> = match param.get("driver") {
-                Some(drv) => Some(serde::Deserialize::deserialize(drv)?),
-                None => None,
-            };
-            data_list(driver, details, file, path).await
-        }
-    }?;
+    let result = list_files(repo, snapshot, path, crypt_config, keyfile, driver).await?;
 
     let options = default_table_format_options()
         .sortby("type", false)
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH proxmox-backup v2 5/5] file-restore: add 'timeout' and 'json-error' parameter
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
                   ` (3 preceding siblings ...)
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 4/5] file-restore: factor out 'list_files' Dominik Csapak
@ 2022-04-26 10:13 ` Dominik Csapak
  2022-04-26 10:13 ` [pve-devel] [PATCH common v2 1/1] PBSClient: file_restore_list: add json-error and timeout parameter Dominik Csapak
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:13 UTC (permalink / raw)
  To: pve-devel

timeout limits the code with the given timeout in seconds, and
'json-error' return json to stdout when the call returns an error like
this:

{
    "msg": "error message",
    "error": true,
    "code": <HTTP_STATUS_CODE>, // if it was an http error
}

with both options set, a client can more easily determine if the call
ran into a timeout (since it will return a 503 error), and can poll
it again

both is done behind new parameters, so that we can stay backwards-compatible

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 proxmox-file-restore/src/main.rs | 62 ++++++++++++++++++++++++++++++--
 1 file changed, 59 insertions(+), 3 deletions(-)

diff --git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs
index 4b327f6b..6b5e65b9 100644
--- a/proxmox-file-restore/src/main.rs
+++ b/proxmox-file-restore/src/main.rs
@@ -11,6 +11,7 @@ use proxmox_router::cli::{
     get_output_format, run_cli_command, CliCommand, CliCommandMap, CliEnvironment, ColumnConfig,
     OUTPUT_FORMAT,
 };
+use proxmox_router::{http_err, HttpError};
 use proxmox_schema::api;
 use proxmox_sys::fs::{create_path, CreateOptions};
 use pxar::accessor::aio::Accessor;
@@ -211,6 +212,18 @@ async fn list_files(
                schema: OUTPUT_FORMAT,
                optional: true,
            },
+           "json-error": {
+               type: Boolean,
+               description: "If set, errors are returned as json instead of writing to stderr",
+               optional: true,
+               default: false,
+           },
+           "timeout": {
+               type: Integer,
+               description: "Defines the maximum time the call can should take.",
+               minimum: 1,
+               optional: true,
+           },
        }
    },
    returns: {
@@ -222,7 +235,14 @@ async fn list_files(
    }
 )]
 /// List a directory from a backup snapshot.
-async fn list(snapshot: String, path: String, base64: bool, param: Value) -> Result<(), Error> {
+async fn list(
+    snapshot: String,
+    path: String,
+    base64: bool,
+    json_error: bool,
+    timeout: Option<u64>,
+    param: Value,
+) -> Result<(), Error> {
     let repo = extract_repository_from_value(&param)?;
     let snapshot: BackupDir = snapshot.parse()?;
     let path = parse_path(path, base64)?;
@@ -246,7 +266,43 @@ async fn list(snapshot: String, path: String, base64: bool, param: Value) -> Res
         None => None,
     };
 
-    let result = list_files(repo, snapshot, path, crypt_config, keyfile, driver).await?;
+    let result = if let Some(timeout) = timeout {
+        match tokio::time::timeout(
+            std::time::Duration::from_secs(timeout),
+            list_files(repo, snapshot, path, crypt_config, keyfile, driver),
+        )
+        .await
+        {
+            Ok(res) => res,
+            Err(_) => Err(http_err!(SERVICE_UNAVAILABLE, "list not finished in time")),
+        }
+    } else {
+        list_files(repo, snapshot, path, crypt_config, keyfile, driver).await
+    };
+
+    let output_format = get_output_format(&param);
+
+    if let Err(err) = result {
+        if !json_error {
+            return Err(err);
+        }
+        let (msg, code) = match err.downcast_ref::<HttpError>() {
+            Some(HttpError { code, message }) => (message.clone(), Some(code)),
+            None => (err.to_string(), None),
+        };
+        let mut json_err = json!({
+            "error": true,
+            "message": msg,
+        });
+        if let Some(code) = code {
+            json_err["code"] = Value::from(code.as_u16());
+        }
+        match output_format.as_ref() {
+            "json-pretty" => println!("{}", serde_json::to_string_pretty(&json_err)?),
+            _ => println!("{}", serde_json::to_string(&json_err)?),
+        }
+        return Ok(());
+    }
 
     let options = default_table_format_options()
         .sortby("type", false)
@@ -258,7 +314,7 @@ async fn list(snapshot: String, path: String, base64: bool, param: Value) -> Res
 
     let output_format = get_output_format(&param);
     format_and_print_result_full(
-        &mut json!(result),
+        &mut json!(result.unwrap()),
         &API_METHOD_LIST.returns,
         &output_format,
         &options,
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH common v2 1/1] PBSClient: file_restore_list: add json-error and timeout parameter
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
                   ` (4 preceding siblings ...)
  2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 5/5] file-restore: add 'timeout' and 'json-error' parameter Dominik Csapak
@ 2022-04-26 10:13 ` Dominik Csapak
  2022-04-26 10:14 ` [pve-devel] [PATCH storage v2 1/1] api: FileRestore: decode and return proper error with new file-restore params Dominik Csapak
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:13 UTC (permalink / raw)
  To: pve-devel

we always want the restore_list to use the json-error and timeout here.
Set the timeout to 25 seconds so there is a little headroom between
this and pveproxys 30s one.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/PVE/PBSClient.pm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/PVE/PBSClient.pm b/src/PVE/PBSClient.pm
index 21dc363..baedea6 100644
--- a/src/PVE/PBSClient.pm
+++ b/src/PVE/PBSClient.pm
@@ -343,10 +343,14 @@ sub status {
 
 sub file_restore_list {
     my ($self, $snapshot, $filepath, $base64) = @_;
+
+    my $params = [ $snapshot, $filepath, "--base64", $base64 ? 1 : 0,
+	'--json-error', 1, '--timeout', 25];
+
     return run_client_cmd(
 	$self,
 	"list",
-	[ $snapshot, $filepath, "--base64", $base64 ? 1 : 0 ],
+	$params,
 	0,
 	"proxmox-file-restore",
     );
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH storage v2 1/1] api: FileRestore: decode and return proper error with new file-restore params
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
                   ` (5 preceding siblings ...)
  2022-04-26 10:13 ` [pve-devel] [PATCH common v2 1/1] PBSClient: file_restore_list: add json-error and timeout parameter Dominik Csapak
@ 2022-04-26 10:14 ` Dominik Csapak
  2022-04-26 10:14 ` [pve-devel] [PATCH widget-toolkit v2 1/1] window/FileBrowser: try reload again when getting a 503 error Dominik Csapak
  2022-04-27 17:43 ` [pve-devel] partiall-applied: [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Thomas Lamprecht
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:14 UTC (permalink / raw)
  To: pve-devel

pbsclient now uses a timeout of 25 seconds do decode and return the error
if one is returned in json form.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Storage/FileRestore.pm | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/PVE/API2/Storage/FileRestore.pm b/PVE/API2/Storage/FileRestore.pm
index ccc56e5..c7278c9 100644
--- a/PVE/API2/Storage/FileRestore.pm
+++ b/PVE/API2/Storage/FileRestore.pm
@@ -121,13 +121,26 @@ __PACKAGE__->register_method ({
 	my $client = PVE::PBSClient->new($scfg, $storeid);
 	my $ret = $client->file_restore_list($snap, $path, $base64);
 
-	# 'leaf' is a proper JSON boolean, map to perl-y bool
-	# TODO: make PBSClient decode all bools always as 1/0?
-	foreach my $item (@$ret) {
-	    $item->{leaf} = $item->{leaf} ? 1 : 0;
+	if (ref($ret) eq "HASH") {
+	    my $msg = $ret->{message};
+	    if ($ret->{error}) {
+		if (my $code = $ret->{code}) {
+		    die PVE::Exception->new("$msg\n", code => $code);
+		} else {
+		    die "$msg\n";
+		}
+	    }
+	} elsif (ref($ret) eq "ARRAY") {
+	    # 'leaf' is a proper JSON boolean, map to perl-y bool
+	    # TODO: make PBSClient decode all bools always as 1/0?
+	    foreach my $item (@$ret) {
+		$item->{leaf} = $item->{leaf} ? 1 : 0;
+	    }
+
+	    return $ret;
 	}
 
-	return $ret;
+	die "invalid proxmox-file-restore output";
     }});
 
 __PACKAGE__->register_method ({
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH widget-toolkit v2 1/1] window/FileBrowser: try reload again when getting a 503 error
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
                   ` (6 preceding siblings ...)
  2022-04-26 10:14 ` [pve-devel] [PATCH storage v2 1/1] api: FileRestore: decode and return proper error with new file-restore params Dominik Csapak
@ 2022-04-26 10:14 ` Dominik Csapak
  2022-04-27 17:43 ` [pve-devel] partiall-applied: [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Thomas Lamprecht
  8 siblings, 0 replies; 10+ messages in thread
From: Dominik Csapak @ 2022-04-26 10:14 UTC (permalink / raw)
  To: pve-devel

for the file restore, we return a 503 error when we were not finished
mounting a disk in the restore vm, so ignore that error and try again
(up to 10 times) so a file listing now has a "real" timeout of
up to 300 seconds (30s pveproxy timeout * 10) instead of only 30,
which should be enough for most situations.

we also increase the proxy timeout to 60 seconds, since if one has many
disks, all of them will try to load at the same time, but the browser
has a maximum request limit and will stall+queue the remaining ones. so
those will not run into the extjs timeout when we increase it here.

for older backends without the new 503 returning feature, the calls
will still run into a pveproxy timeout anyway.

we also have to reimplement the 'monStoreErrors' functionality to
get a slightly different behaviour:
we disable the default extj loadMask of the treepanel and set it
ourselves. then on 503 we leave it up, and only remove it on success
or error (for non initial loads)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/window/FileBrowser.js | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/src/window/FileBrowser.js b/src/window/FileBrowser.js
index 2efa988..f4a22b6 100644
--- a/src/window/FileBrowser.js
+++ b/src/window/FileBrowser.js
@@ -146,6 +146,9 @@ Ext.define("Proxmox.window.FileBrowser", {
 
 	errorHandler: function(error, msg) {
 	    let me = this;
+	    if (error?.status === 503) {
+		return false;
+	    }
 	    me.lookup('downloadBtn').setDisabled(true);
 	    me.lookup('downloadTar').setDisabled(true);
 	    if (me.initialLoadDone) {
@@ -167,9 +170,37 @@ Ext.define("Proxmox.window.FileBrowser", {
 	    let proxy = store.getProxy();
 
 	    let errorCallback = (error, msg) => me.errorHandler(error, msg);
-	    Proxmox.Utils.monStoreErrors(tree, store, true, errorCallback);
 	    proxy.setUrl(view.listURL);
+	    proxy.setTimeout(60*1000);
 	    proxy.setExtraParams(view.extraParams);
+
+	    tree.mon(store, 'beforeload', () => {
+		Proxmox.Utils.setErrorMask(tree, true);
+	    });
+	    tree.mon(store, 'load', (treestore, rec, success, operation, node) => {
+		if (success) {
+		    Proxmox.Utils.setErrorMask(tree, false);
+		    return;
+		}
+		if (!node.loadCount) {
+		    node.loadCount = 0; // ensure its numeric
+		}
+		// trigger a reload if we got a 503 answer from the proxy
+		if (operation?.error?.status === 503 && node.loadCount < 10) {
+		    node.collapse();
+		    node.expand();
+		    node.loadCount++;
+		    return;
+		}
+
+		let error = operation.getError();
+		let msg = Proxmox.Utils.getResponseErrorMessage(error);
+		if (!errorCallback(error, msg)) {
+		    Proxmox.Utils.setErrorMask(tree, msg);
+		} else {
+		    Proxmox.Utils.setErrorMask(tree, false);
+		}
+	    });
 	    store.load((rec, op, success) => {
 		let root = store.getRoot();
 		root.expand(); // always expand invisible root node
@@ -217,6 +248,10 @@ Ext.define("Proxmox.window.FileBrowser", {
 		},
 	    },
 
+	    viewConfig: {
+		loadMask: false,
+	    },
+
 	    columns: [
 		{
 		    text: gettext('Name'),
-- 
2.30.2





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] partiall-applied: [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour
  2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
                   ` (7 preceding siblings ...)
  2022-04-26 10:14 ` [pve-devel] [PATCH widget-toolkit v2 1/1] window/FileBrowser: try reload again when getting a 503 error Dominik Csapak
@ 2022-04-27 17:43 ` Thomas Lamprecht
  8 siblings, 0 replies; 10+ messages in thread
From: Thomas Lamprecht @ 2022-04-27 17:43 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak
  Cc: Proxmox Backup Server development discussion

On 26.04.22 12:13, Dominik Csapak wrote:
> this series improves the behaviour of the file-restore when some mount
> operations take longer than the 30 second pveproxy timeout, and improves
> the startup speed of the restore vm
> 
> we do this by moving the disk init into the background of the daemon
> startup, while the hyper server is starting, and by adding a timeout
> option to the file-restore binary.
> 
> to be able to report the error back to the gui, we return a
> 'SERVICE_UNAVAILABLE' error in the timeout case, and the gui tries
> again (up to 10 times).
> 
> backup patches 2,3 could be applied independently, as they even make
> sense withouth the rest
> 
> pve-storage depends on a new pve-common&file-restore
> 
> widget-toolkit makes only sense with the other patches, but is
> designed to not change behaviour with the old api
> 
> alternatively i could drop the 'json-error' parameter and do it
> always, or when timeout it set.
> 
> changes from v1:
> * rebased on master
> * moved the json-error and timeout directly into pve-common (hardcoded)
>   since there is only one usage of that function
> 
> proxmox-backup:
> 
> Dominik Csapak (5):
>   restore-daemon: start disk initialization in parallel to the api
>   restore-daemon: put blocking code into 'block_in_place'
>   restore-daemon: avoid auto-mounting zpools
>   file-restore: factor out 'list_files'
>   file-restore: add 'timeout' and 'json-error' parameter
> 
>  proxmox-file-restore/src/main.rs              | 206 ++++++++++++------
>  proxmox-restore-daemon/src/main.rs            |  31 ++-
>  .../src/proxmox_restore_daemon/api.rs         |  12 +-
>  .../src/proxmox_restore_daemon/disk.rs        |  20 +-
>  4 files changed, 166 insertions(+), 103 deletions(-)

applied above PBS ones.

fwiw, I'd appreciate getting PBS related patches also on the PBS list, though I somehow
deleted them when I did not find them there ^^.
And just for clarity, there's no need to always CC the full series if it's a pve/pbs
mixed one.




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-04-27 17:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-26 10:13 [pve-devel] [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Dominik Csapak
2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 1/5] restore-daemon: start disk initialization in parallel to the api Dominik Csapak
2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 2/5] restore-daemon: put blocking code into 'block_in_place' Dominik Csapak
2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 3/5] restore-daemon: avoid auto-mounting zpools Dominik Csapak
2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 4/5] file-restore: factor out 'list_files' Dominik Csapak
2022-04-26 10:13 ` [pve-devel] [PATCH proxmox-backup v2 5/5] file-restore: add 'timeout' and 'json-error' parameter Dominik Csapak
2022-04-26 10:13 ` [pve-devel] [PATCH common v2 1/1] PBSClient: file_restore_list: add json-error and timeout parameter Dominik Csapak
2022-04-26 10:14 ` [pve-devel] [PATCH storage v2 1/1] api: FileRestore: decode and return proper error with new file-restore params Dominik Csapak
2022-04-26 10:14 ` [pve-devel] [PATCH widget-toolkit v2 1/1] window/FileBrowser: try reload again when getting a 503 error Dominik Csapak
2022-04-27 17:43 ` [pve-devel] partiall-applied: [PATCH proxmox-backup/common/storage/wt v2] improve file-restore timeout behaviour Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal