* [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects @ 2025-04-08 12:58 Christian Ebner 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] client: reader: drop dead code Christian Ebner ` (5 more replies) 0 siblings, 6 replies; 13+ messages in thread From: Christian Ebner @ 2025-04-08 12:58 UTC (permalink / raw) To: pbs-devel These patches attempt to improve the server-side error handling for backup reader client disconnects. During regular operation, the server currently cannot distinguish a disconnect caused by an error from a disconnect after a finished operation. This leaves behind a task in a failed state, which is unexpected and might cause confusion [0]. To improve error handling, follow the approach taken for the backup writer tasks, letting the client signal it has successfully finished via an api call and catching the disconnect error for that case. The signaling of messages/warnings in case of client-side aborts triggered by the user, as suggested by Fabian, has not been implemented, as that is currently not transparently handled by the client (only handled by BackupReader's Drop impl) and therefore seemed out of scope for this patch series.
changes since version 3: - rebased onto current master Reported in the community forum: [0] https://forum.proxmox.com/threads/158306/ Christian Ebner (5): client: reader: drop dead code backup debug: diff: refactor backup reader creation api: reader: handle reader client disconnects client: reader: add finish method to signal client state to server client: backup reader: call finish before dropping backup readers examples/download-speed.rs | 2 ++ pbs-client/src/backup_reader.rs | 10 +++--- proxmox-backup-client/src/catalog.rs | 2 ++ proxmox-backup-client/src/main.rs | 7 ++++ proxmox-backup-client/src/mount.rs | 3 ++ proxmox-file-restore/src/main.rs | 19 +++++++--- src/api2/reader/environment.rs | 20 ++++++++++- src/api2/reader/mod.rs | 24 ++++++++++--- src/bin/proxmox_backup_debug/diff.rs | 53 +++++++++++++--------------- src/server/pull.rs | 7 +++- src/server/push.rs | 2 +- src/server/sync.rs | 30 +++++++++------- 12 files changed, 123 insertions(+), 56 deletions(-) -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
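The flow described in the cover letter can be sketched in isolation. This is a rough, self-contained model of the assumed mechanism (not the actual pbs code): the client's `finish` api call flips a shared flag, and the server suppresses the subsequent disconnect error only when that flag is set.

```rust
// Sketch of the finished-flag handling the series adds server-side.
// `ConnectionState` and `handle_connection_result` are illustrative
// stand-ins for the state in src/api2/reader/environment.rs and the
// error handling around serve_connection in src/api2/reader/mod.rs.
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct ConnectionState {
    client_finished: bool,
}

// Decide whether a connection teardown is a real error or a benign
// disconnect after the client already signaled completion.
fn handle_connection_result(
    state: &Arc<Mutex<ConnectionState>>,
    result: Result<(), String>,
) -> Result<(), String> {
    match result {
        Ok(()) => Ok(()),
        Err(err)
            if err.starts_with("connection error")
                && state.lock().unwrap().client_finished =>
        {
            // Client called `finish` first: suppress the error.
            Ok(())
        }
        Err(err) => Err(err),
    }
}

fn main() {
    let state = Arc::new(Mutex::new(ConnectionState::default()));

    // Disconnect before `finish` was called: still a task error.
    assert!(handle_connection_result(&state, Err("connection error: reset".into())).is_err());

    // After the client's `finish` call, the same disconnect is ignored.
    state.lock().unwrap().client_finished = true;
    assert!(handle_connection_result(&state, Err("connection error: reset".into())).is_ok());
}
```

Note that only errors whose message starts with "connection error" are suppressed, so genuine failures before the `finish` call still mark the reader task as failed.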
* [pbs-devel] [PATCH v4 proxmox-backup 1/5] client: reader: drop dead code 2025-04-08 12:58 [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Christian Ebner @ 2025-04-08 12:58 ` Christian Ebner 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 2/5] backup debug: diff: refactor backup reader creation Christian Ebner ` (4 subsequent siblings) 5 siblings, 0 replies; 13+ messages in thread From: Christian Ebner @ 2025-04-08 12:58 UTC (permalink / raw) To: pbs-devel The `force_close` method of `BackupReader` has no call sites. Commit dd066d28e2 ("src/api2/reader.rs: implement backup reader protocol") introduced the `force_close` method as dead code, mirroring the same logic on the client side. The subsequent reorganization by commit 9e490a74 ("src/client/backup_reader.rs: split BackupReader code into separate file") did not change that either, leaving it dead code ever since. Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- changes since version 3: - no changes pbs-client/src/backup_reader.rs | 4 ---- 1 file changed, 4 deletions(-) diff --git a/pbs-client/src/backup_reader.rs b/pbs-client/src/backup_reader.rs index 88cba599b..18442ebca 100644 --- a/pbs-client/src/backup_reader.rs +++ b/pbs-client/src/backup_reader.rs @@ -117,10 +117,6 @@ impl BackupReader { self.h2.download(path, Some(param), output).await } - pub fn force_close(self) { - self.abort.abort(); - } - /// Download backup manifest (index.json) /// /// The manifest signature is verified if we have a crypt_config. -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
* [pbs-devel] [PATCH v4 proxmox-backup 2/5] backup debug: diff: refactor backup reader creation 2025-04-08 12:58 [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Christian Ebner 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] client: reader: drop dead code Christian Ebner @ 2025-04-08 12:58 ` Christian Ebner 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 3/5] api: reader: handle reader client disconnects Christian Ebner ` (3 subsequent siblings) 5 siblings, 0 replies; 13+ messages in thread From: Christian Ebner @ 2025-04-08 12:58 UTC (permalink / raw) To: pbs-devel Move the backup reader instantiation code out of the corresponding helper function to its only call site, in preparation for adding the backup reader finish api call on termination. Moving this code allows the backup reader to be handled more easily. No functional change intended. Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- changes since version 3: - no changes src/bin/proxmox_backup_debug/diff.rs | 40 +++++++++++----------------- 1 file changed, 16 insertions(+), 24 deletions(-) diff --git a/src/bin/proxmox_backup_debug/diff.rs b/src/bin/proxmox_backup_debug/diff.rs index 02389d0e5..fc65f3120 100644 --- a/src/bin/proxmox_backup_debug/diff.rs +++ b/src/bin/proxmox_backup_debug/diff.rs @@ -249,7 +249,22 @@ async fn open_dynamic_index( archive_name: &BackupArchiveName, params: &RepoParams, ) -> Result<(DynamicIndexReader, Accessor), Error> { - let backup_reader = create_backup_reader(snapshot, params).await?; + let backup_dir = match snapshot.parse::<BackupPart>()?
{ + BackupPart::Dir(dir) => dir, + BackupPart::Group(_group) => { + bail!("A full snapshot path must be provided."); + } + }; + let client = connect(¶ms.repo)?; + let backup_reader = BackupReader::start( + &client, + params.crypt_config.clone(), + params.repo.store(), + ¶ms.namespace, + &backup_dir, + false, + ) + .await?; let (manifest, _) = backup_reader.download_manifest().await?; manifest.check_fingerprint(params.crypt_config.as_ref().map(Arc::as_ref))?; @@ -279,29 +294,6 @@ async fn open_dynamic_index( Ok((lookup_index, accessor)) } -async fn create_backup_reader( - snapshot: &str, - params: &RepoParams, -) -> Result<Arc<BackupReader>, Error> { - let backup_dir = match snapshot.parse::<BackupPart>()? { - BackupPart::Dir(dir) => dir, - BackupPart::Group(_group) => { - bail!("A full snapshot path must be provided."); - } - }; - let client = connect(¶ms.repo)?; - let backup_reader = BackupReader::start( - &client, - params.crypt_config.clone(), - params.repo.store(), - ¶ms.namespace, - &backup_dir, - false, - ) - .await?; - Ok(backup_reader) -} - /// Get a list of chunk digests for an index file. fn chunk_digests_for_index(index: &dyn IndexFile) -> Vec<&ChunkDigest> { let mut all_chunks = Vec::new(); -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
* [pbs-devel] [PATCH v4 proxmox-backup 3/5] api: reader: handle reader client disconnects 2025-04-08 12:58 [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Christian Ebner 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] client: reader: drop dead code Christian Ebner 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 2/5] backup debug: diff: refactor backup reader creation Christian Ebner @ 2025-04-08 12:58 ` Christian Ebner 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] client: reader: add finish method to signal client state to server Christian Ebner ` (2 subsequent siblings) 5 siblings, 0 replies; 13+ messages in thread From: Christian Ebner @ 2025-04-08 12:58 UTC (permalink / raw) To: pbs-devel Currently, if a reader client disconnects after finishing its work, the connection will be closed by the client without notifying the server. The future handling the connection on the server side will then return with a connection error, and in consequence the reader worker task will end in an error state. This can cause confusion [0], as this is not an error but normal behaviour. Instead of failing, provide an api endpoint for the client to signal it has finished its operation. The server sets the connection environment state accordingly and suppresses the disconnection error if the flag has been set. This follows the same logic used for the backup writer, introduced by commit b428af97 ("backup: avoid Transport endpoint is not connected error").
Report in the community forum: [0] https://forum.proxmox.com/threads/158306/ Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- changes since version 3: - rebased onto current master src/api2/reader/environment.rs | 20 +++++++++++++++++++- src/api2/reader/mod.rs | 24 ++++++++++++++++++++---- 2 files changed, 39 insertions(+), 5 deletions(-) diff --git a/src/api2/reader/environment.rs b/src/api2/reader/environment.rs index 3b2f06f43..3cdc8e394 100644 --- a/src/api2/reader/environment.rs +++ b/src/api2/reader/environment.rs @@ -1,5 +1,5 @@ use std::collections::HashSet; -use std::sync::{Arc, RwLock}; +use std::sync::{Arc, Mutex, RwLock}; use serde_json::{json, Value}; @@ -24,6 +24,11 @@ pub struct ReaderEnvironment { pub datastore: Arc<DataStore>, pub backup_dir: BackupDir, allowed_chunks: Arc<RwLock<HashSet<[u8; 32]>>>, + connection_state: Arc<Mutex<ConnectionState>>, +} + +struct ConnectionState { + client_finished: bool, } impl ReaderEnvironment { @@ -44,6 +49,9 @@ impl ReaderEnvironment { formatter: JSON_FORMATTER, backup_dir, allowed_chunks: Arc::new(RwLock::new(HashSet::new())), + connection_state: Arc::new(Mutex::new(ConnectionState { + client_finished: false, + })), } } @@ -69,6 +77,16 @@ impl ReaderEnvironment { pub fn check_chunk_access(&self, digest: [u8; 32]) -> bool { self.allowed_chunks.read().unwrap().contains(&digest) } + + pub(crate) fn client_finished(&self) -> bool { + let state = self.connection_state.lock().unwrap(); + state.client_finished + } + + pub(crate) fn finish(&self) { + let mut state = self.connection_state.lock().unwrap(); + state.client_finished = true; + } } impl RpcEnvironment for ReaderEnvironment { diff --git a/src/api2/reader/mod.rs b/src/api2/reader/mod.rs index cc791299c..7c7c711f9 100644 --- a/src/api2/reader/mod.rs +++ b/src/api2/reader/mod.rs @@ -14,7 +14,7 @@ use proxmox_router::{ http_err, list_subdirs_api_method, ApiHandler, ApiMethod, ApiResponseFuture, Permission, Router, RpcEnvironment, SubdirMap, }; -use 
proxmox_schema::{BooleanSchema, ObjectSchema}; +use proxmox_schema::{api, BooleanSchema, ObjectSchema}; use proxmox_sortable_macro::sortable; use pbs_api_types::{ @@ -187,9 +187,16 @@ fn upgrade_to_backup_reader_protocol( http.initial_connection_window_size(window_size); http.max_frame_size(4 * 1024 * 1024); - http.serve_connection(conn, service) - .map_err(Error::from) - .await + if let Err(err) = http.serve_connection(conn, service).await { + // Avoid Transport endpoint is not connected (os error 107) + // fixme: find a better way to test for that error + if !(err.to_string().starts_with("connection error") + && env2.client_finished()) + { + return Err(Error::from(err)); + } + } + Ok(()) }; futures::select! { @@ -223,6 +230,7 @@ const READER_API_SUBDIRS: SubdirMap = &[ "download", &Router::new().download(&API_METHOD_DOWNLOAD_FILE), ), + ("finish", &Router::new().post(&API_METHOD_FINISH)), ("speedtest", &Router::new().download(&API_METHOD_SPEEDTEST)), ]; @@ -342,6 +350,14 @@ fn download_chunk( .boxed() } +#[api] +/// Signal the reader instance finished successfully +fn finish(_info: &ApiMethod, rpcenv: &mut dyn RpcEnvironment) -> Result<Value, Error> { + let env: &ReaderEnvironment = rpcenv.as_ref(); + env.finish(); + Ok(Value::Null) +} + /* this is too slow fn download_chunk_old( _parts: Parts, -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
* [pbs-devel] [PATCH v4 proxmox-backup 4/5] client: reader: add finish method to signal client state to server 2025-04-08 12:58 [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Christian Ebner ` (2 preceding siblings ...) 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 3/5] api: reader: handle reader client disconnects Christian Ebner @ 2025-04-08 12:58 ` Christian Ebner 2025-04-09 13:53 ` Max Carrara 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 5/5] client: backup reader: call finish before dropping backup readers Christian Ebner 2025-04-09 12:53 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Fiona Ebner 5 siblings, 1 reply; 13+ messages in thread From: Christian Ebner @ 2025-04-08 12:58 UTC (permalink / raw) To: pbs-devel Signal the server that the client has finished its operation and is about to close the connection. This allows the server side to react accordingly. Termination of the reader connection after successful completion is no longer logged as a connection error, which has caused confusion [0].
Report in the community forum: [0] https://forum.proxmox.com/threads/158306/ Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- changes since version 3: - no changes pbs-client/src/backup_reader.rs | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/pbs-client/src/backup_reader.rs b/pbs-client/src/backup_reader.rs index 18442ebca..3474c8ce3 100644 --- a/pbs-client/src/backup_reader.rs +++ b/pbs-client/src/backup_reader.rs @@ -77,6 +77,12 @@ impl BackupReader { Ok(BackupReader::new(h2, abort, crypt_config)) } + /// Terminate reader session by signaling server via `finish` api call before closing connection + pub async fn finish(self: Arc<Self>) -> Result<(), Error> { + let _value = self.post("finish", None).await?; + Ok(()) + } + /// Execute a GET request pub async fn get(&self, path: &str, param: Option<Value>) -> Result<Value, Error> { self.h2.get(path, param).await -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [pbs-devel] [PATCH v4 proxmox-backup 4/5] client: reader: add finish method to signal client state to server 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] client: reader: add finish method to signal client state to server Christian Ebner @ 2025-04-09 13:53 ` Max Carrara 2025-04-09 14:27 ` Christian Ebner 0 siblings, 1 reply; 13+ messages in thread From: Max Carrara @ 2025-04-09 13:53 UTC (permalink / raw) To: Proxmox Backup Server development discussion On Tue Apr 8, 2025 at 2:58 PM CEST, Christian Ebner wrote: > Signal the server that the client has finished its operation and is > about to close the connection. This allows the server side to react > accordingly. > > Termination of the reader connection after successuful completion is > now no longer logged as connection error, which has caused confusion > [0]. > > Report in the community forum: > [0] https://forum.proxmox.com/threads/158306/ > > Signed-off-by: Christian Ebner <c.ebner@proxmox.com> > --- > changes since version 3: > - no changes > > pbs-client/src/backup_reader.rs | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/pbs-client/src/backup_reader.rs b/pbs-client/src/backup_reader.rs > index 18442ebca..3474c8ce3 100644 > --- a/pbs-client/src/backup_reader.rs > +++ b/pbs-client/src/backup_reader.rs > @@ -77,6 +77,12 @@ impl BackupReader { > Ok(BackupReader::new(h2, abort, crypt_config)) > } > > + /// Terminate reader session by signaling server via `finish` api call before closing connection > + pub async fn finish(self: Arc<Self>) -> Result<(), Error> { > + let _value = self.post("finish", None).await?; > + Ok(()) > + } There are two concerns I have with this approach here: 1. While I like moving out of `self` here (I actually love it when state is represented via the type system) calling `post` here like this might cause a race: `self: Arc<Self>` might still be referenced somewhere else, as in, there might still be some other operations going on. 2. 
Calling `finish()` is not enforced. In patch 05 you're calling `finish()` in 9 locations in total if I counted correctly, which means that there are 9 locations where haphazard changes could introduce subtle bugs. What I'd instead suggest is enforcing the call to happen through the type system -- here's a *very* rough example: with_new_reader(..., |reader: &BackupReader| { // Do stuff in here ... // Return a result upon successful completion, which then signals // to with_new_reader() that finish() should be called Ok(...) }) fn with_new_reader<F>(..., func: F) -> Result<(), Error> where F: FnOnce(BackupReader) -> Result<(), Error> { // [...] set up reader, then call func() on it let reader = ... match func(&reader) { Ok(()) => reader.finish().await, Err(...) => ..., } } The idea behind this is that the closure enforces the scope in which the reader is used for operations. Once the closure ends, `finish()` is called depending on the result the closure returns. Instead of just returning `()`, you could also add some kind of enum representing the possible "exiting" values / states of the reader, in case there's more stuff to handle (now or in the future). The thing is though... implementing this would require a rather large set of changes throughout our code, because we currently pass around `Arc<BackupReader>` quite a lot (*sigh*), which really gets in the way when one wants to enforce a certain order of operations (i.e. preventing `finish()` from being called too early). Since all of the methods of `BackupReader` take `&self` you could check if you can get away with s/Arc<BackupReader>/&BackupReader/. Let me know what you think! 
> + > /// Execute a GET request > pub async fn get(&self, path: &str, param: Option<Value>) -> Result<Value, Error> { > self.h2.get(path, param).await _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
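The scope-enforcing pattern suggested in the review above can be made concrete with a small runnable sketch. The names (`with_new_reader`, `BackupReader::finish`) follow the rough example in the review and are stand-ins, not the real pbs-client API; the real implementation would be async and return the crate's `Error` type.

```rust
// Minimal synchronous model of the closure-based pattern: the reader
// only ever reaches user code as a shared reference, and the wrapper
// consumes it afterwards, so `finish` runs exactly once on success.

struct BackupReader;

impl BackupReader {
    // Stand-in for the `finish` api call from patch 4; taking `self`
    // by value makes a second call a compile error.
    fn finish(self) -> Result<(), String> {
        Ok(())
    }
}

// The closure cannot call `finish()` early because it only sees
// `&BackupReader`; the wrapper decides based on the closure's result.
fn with_new_reader<F>(func: F) -> Result<(), String>
where
    F: FnOnce(&BackupReader) -> Result<(), String>,
{
    let reader = BackupReader;
    match func(&reader) {
        Ok(()) => reader.finish(),
        // On error the reader is just dropped, no `finish` call.
        Err(err) => Err(err),
    }
}

fn main() {
    // Successful session: the wrapper calls `finish` for us.
    assert!(with_new_reader(|_reader| Ok(())).is_ok());

    // Failed session: the error propagates and `finish` is skipped.
    assert!(with_new_reader(|_reader| Err("download failed".to_string())).is_err());
}
```

This also illustrates why the `Arc<BackupReader>` usage mentioned in the review gets in the way: a by-value `finish(self)` cannot be called at all while other clones of the `Arc` are alive.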
* Re: [pbs-devel] [PATCH v4 proxmox-backup 4/5] client: reader: add finish method to signal client state to server 2025-04-09 13:53 ` Max Carrara @ 2025-04-09 14:27 ` Christian Ebner 0 siblings, 0 replies; 13+ messages in thread From: Christian Ebner @ 2025-04-09 14:27 UTC (permalink / raw) To: Proxmox Backup Server development discussion, Max Carrara On 4/9/25 15:53, Max Carrara wrote: > On Tue Apr 8, 2025 at 2:58 PM CEST, Christian Ebner wrote: >> Signal the server that the client has finished its operation and is >> about to close the connection. This allows the server side to react >> accordingly. >> >> Termination of the reader connection after successuful completion is >> now no longer logged as connection error, which has caused confusion >> [0]. >> >> Report in the community forum: >> [0] https://forum.proxmox.com/threads/158306/ >> >> Signed-off-by: Christian Ebner <c.ebner@proxmox.com> >> --- >> changes since version 3: >> - no changes >> >> pbs-client/src/backup_reader.rs | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/pbs-client/src/backup_reader.rs b/pbs-client/src/backup_reader.rs >> index 18442ebca..3474c8ce3 100644 >> --- a/pbs-client/src/backup_reader.rs >> +++ b/pbs-client/src/backup_reader.rs >> @@ -77,6 +77,12 @@ impl BackupReader { >> Ok(BackupReader::new(h2, abort, crypt_config)) >> } >> >> + /// Terminate reader session by signaling server via `finish` api call before closing connection >> + pub async fn finish(self: Arc<Self>) -> Result<(), Error> { >> + let _value = self.post("finish", None).await?; >> + Ok(()) >> + } > > There are two concerns I have with this approach here: > > 1. While I like moving out of `self` here (I actually love it when > state is represented via the type system) calling `post` here like > this might cause a race: `self: Arc<Self>` might still be > referenced somewhere else, as in, there might still be some other > operations going on. > > 2. Calling `finish()` is not enforced. 
In patch 05 you're calling > `finish()` in 9 locations in total if I counted correctly, which > means that there are 9 locations where haphazard changes could > introduce subtle bugs. > > What I'd instead suggest is enforcing the call to happen through the > type system -- here's a *very* rough example: > > with_new_reader(..., |reader: &BackupReader| { > // Do stuff in here ... > > // Return a result upon successful completion, which then signals > // to with_new_reader() that finish() should be called > Ok(...) > }) > > fn with_new_reader<F>(..., func: F) -> Result<(), Error> > where > F: FnOnce(BackupReader) -> Result<(), Error> { > > // [...] set up reader, then call func() on it > let reader = ... > > match func(&reader) { > Ok(()) => reader.finish().await, > Err(...) => ..., > } > } > > The idea behind this is that the closure enforces the scope in which the > reader is used for operations. Once the closure ends, `finish()` is > called depending on the result the closure returns. Instead of just > returning `()`, you could also add some kind of enum representing the > possible "exiting" values / states of the reader, in case there's more > stuff to handle (now or in the future). > > The thing is though... implementing this would require a rather large > set of changes throughout our code, because we currently pass around > `Arc<BackupReader>` quite a lot (*sigh*), which really gets in the way > when one wants to enforce a certain order of operations (i.e. preventing > `finish()` from being called too early). > > Since all of the methods of `BackupReader` take `&self` you could check > if you can get away with s/Arc<BackupReader>/&BackupReader/. > > Let me know what you think! Thanks for your suggestions. Given that this will, however, require more in-depth changes and has a larger regression potential, this will be postponed until after the next point release (as discussed off list with Thomas).
_______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
* [pbs-devel] [PATCH v4 proxmox-backup 5/5] client: backup reader: call finish before dropping backup readers 2025-04-08 12:58 [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Christian Ebner ` (3 preceding siblings ...) 2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] client: reader: add finish method to signal client state to server Christian Ebner @ 2025-04-08 12:58 ` Christian Ebner 2025-04-09 12:53 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Fiona Ebner 5 siblings, 0 replies; 13+ messages in thread From: Christian Ebner @ 2025-04-08 12:58 UTC (permalink / raw) To: pbs-devel Signal the backup server that the readers have terminated with success, so the server gracefully handles disconnections and does not log them as errors. Signed-off-by: Christian Ebner <c.ebner@proxmox.com> --- changes since version 3: - no changes examples/download-speed.rs | 2 ++ proxmox-backup-client/src/catalog.rs | 2 ++ proxmox-backup-client/src/main.rs | 7 +++++++ proxmox-backup-client/src/mount.rs | 3 +++ proxmox-file-restore/src/main.rs | 19 +++++++++++++----- src/bin/proxmox_backup_debug/diff.rs | 13 ++++++++---- src/server/pull.rs | 7 ++++++- src/server/push.rs | 2 +- src/server/sync.rs | 30 +++++++++++++++++----------- 9 files changed, 62 insertions(+), 23 deletions(-) diff --git a/examples/download-speed.rs b/examples/download-speed.rs index fe700982b..3583135fb 100644 --- a/examples/download-speed.rs +++ b/examples/download-speed.rs @@ -62,6 +62,8 @@ async fn run() -> Result<(), Error> { (bytes as f64) / (elapsed * 1024.0 * 1024.0) ); + client.finish().await?; + Ok(()) } diff --git a/proxmox-backup-client/src/catalog.rs b/proxmox-backup-client/src/catalog.rs index b1b22ff24..60c59137c 100644 --- a/proxmox-backup-client/src/catalog.rs +++ b/proxmox-backup-client/src/catalog.rs @@ -152,6 +152,7 @@ async fn dump_catalog(param: Value) -> Result<Value, Error> { catalog_reader.dump()?; record_repository(&repo); +
client.finish().await?; Ok(Value::Null) } @@ -287,6 +288,7 @@ async fn catalog_shell(param: Value) -> Result<(), Error> { state.shell().await?; record_repository(&repo); + client.finish().await?; Ok(()) } diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs index 44f4f5db5..9f6ff991e 100644 --- a/proxmox-backup-client/src/main.rs +++ b/proxmox-backup-client/src/main.rs @@ -1005,6 +1005,7 @@ async fn create_backup( let mut catalog = None; let mut catalog_result_rx = None; + let mut prev_backup_reader = None; let log_file = |desc: &str, file: &str, target: &str| { let what = if dry_run { "Would upload" } else { "Upload" }; @@ -1109,6 +1110,8 @@ async fn create_backup( true, ) .await?; + // Allows to finish the backup reader instance + prev_backup_reader = Some(backup_reader.clone()); previous_ref = prepare_reference( &target, manifest.clone(), @@ -1256,6 +1259,9 @@ async fn create_backup( .await?; client.finish().await?; + if let Some(backup_reader) = prev_backup_reader { + backup_reader.finish().await?; + } let end_time = std::time::Instant::now(); let elapsed = end_time.duration_since(start_time); @@ -1751,6 +1757,7 @@ async fn restore( ) .await?; } + client.finish().await?; Ok(Value::Null) } diff --git a/proxmox-backup-client/src/mount.rs b/proxmox-backup-client/src/mount.rs index a5fee8329..275339a4f 100644 --- a/proxmox-backup-client/src/mount.rs +++ b/proxmox-backup-client/src/mount.rs @@ -299,6 +299,8 @@ async fn mount_do(param: Value, pipe: Option<OwnedFd>) -> Result<Value, Error> { // exit on interrupted } } + + client.finish().await?; } else if server_archive_name.archive_type() == ArchiveType::FixedIndex { let file_info = manifest.lookup_file_info(&server_archive_name)?; let index = client @@ -361,6 +363,7 @@ async fn mount_do(param: Value, pipe: Option<OwnedFd>) -> Result<Value, Error> { } log::info!("Image unmapped"); + client.finish().await?; } else { bail!("unknown archive file extension (expected .pxar or .img)"); } diff 
--git a/proxmox-file-restore/src/main.rs b/proxmox-file-restore/src/main.rs index 7888c38d6..8d87b2352 100644 --- a/proxmox-file-restore/src/main.rs +++ b/proxmox-file-restore/src/main.rs @@ -121,7 +121,7 @@ async fn list_files( let (manifest, _) = client.download_manifest().await?; manifest.check_fingerprint(crypt_config.as_ref().map(Arc::as_ref))?; - match path { + let result = match path { ExtractPath::ListArchives => { let mut entries = vec![]; for file in manifest.files() { @@ -206,7 +206,11 @@ async fn list_files( }; data_list(driver, details, file, path).await } - } + }; + + client.finish().await?; + + result } #[api( @@ -486,9 +490,13 @@ async fn extract( .await?; let reader = if let Some(payload_archive_name) = payload_archive_name { - let (payload_reader, payload_size) = - get_remote_pxar_reader(&payload_archive_name, client, &manifest, crypt_config) - .await?; + let (payload_reader, payload_size) = get_remote_pxar_reader( + &payload_archive_name, + client.clone(), + &manifest, + crypt_config, + ) + .await?; pxar::PxarVariant::Split(reader, (payload_reader, payload_size)) } else { pxar::PxarVariant::Unified(reader) @@ -541,6 +549,7 @@ async fn extract( bail!("cannot extract '{orig_path}'"); } } + client.finish().await?; Ok(()) } diff --git a/src/bin/proxmox_backup_debug/diff.rs b/src/bin/proxmox_backup_debug/diff.rs index fc65f3120..4462a1187 100644 --- a/src/bin/proxmox_backup_debug/diff.rs +++ b/src/bin/proxmox_backup_debug/diff.rs @@ -163,8 +163,10 @@ async fn diff_archive( compare_contents: bool, output_params: &OutputParams, ) -> Result<(), Error> { - let (index_a, accessor_a) = open_dynamic_index(snapshot_a, file_name, repo_params).await?; - let (index_b, accessor_b) = open_dynamic_index(snapshot_b, file_name, repo_params).await?; + let (index_a, accessor_a, backup_reader_a) = + open_dynamic_index(snapshot_a, file_name, repo_params).await?; + let (index_b, accessor_b, backup_reader_b) = + open_dynamic_index(snapshot_b, file_name, 
repo_params).await?; // vecs of chunk digests, in their correct order let chunks_a = chunk_digests_for_index(&index_a); @@ -217,6 +219,9 @@ async fn diff_archive( show_file_list(&added_files, &deleted_files, &modified_files, output_params)?; + backup_reader_a.finish().await?; + backup_reader_b.finish().await?; + Ok(()) } @@ -248,7 +253,7 @@ async fn open_dynamic_index( snapshot: &str, archive_name: &BackupArchiveName, params: &RepoParams, -) -> Result<(DynamicIndexReader, Accessor), Error> { +) -> Result<(DynamicIndexReader, Accessor, Arc<BackupReader>), Error> { let backup_dir = match snapshot.parse::<BackupPart>()? { BackupPart::Dir(dir) => dir, BackupPart::Group(_group) => { @@ -291,7 +296,7 @@ async fn open_dynamic_index( let reader: Arc<dyn ReadAt + Send + Sync> = Arc::new(LocalDynamicReadAt::new(reader)); let accessor = Accessor::new(pxar::PxarVariant::Unified(reader), archive_size).await?; - Ok((lookup_index, accessor)) + Ok((lookup_index, accessor, backup_reader)) } /// Get a list of chunk digests for an index file. 
diff --git a/src/server/pull.rs b/src/server/pull.rs index 8fb491cd4..56e5a74e9 100644 --- a/src/server/pull.rs +++ b/src/server/pull.rs @@ -635,7 +635,7 @@ async fn pull_group( .store .backup_dir(target_ns.clone(), from_snapshot.clone())?; - let reader = params + let (reader, backup_reader) = params .source .reader(source_namespace, &from_snapshot) .await?; @@ -648,6 +648,11 @@ async fn pull_group( ) .await; + if let Some(backup_reader) = backup_reader { + // ignore errors + let _result = backup_reader.finish().await; + } + progress.done_snapshots = pos as u64 + 1; info!("percentage done: {progress}"); diff --git a/src/server/push.rs b/src/server/push.rs index e71012ed8..62e13d48e 100644 --- a/src/server/push.rs +++ b/src/server/push.rs @@ -797,7 +797,7 @@ pub(crate) async fn push_snapshot( .backup_dir(namespace.clone(), snapshot.clone())?; // Reader locks the snapshot - let reader = params.source.reader(namespace, snapshot).await?; + let (reader, _) = params.source.reader(namespace, snapshot).await?; // Does not lock the manifest, but the reader already assures a locked snapshot let source_manifest = match backup_dir.load_manifest() { diff --git a/src/server/sync.rs b/src/server/sync.rs index 528e2054c..47ce96c0b 100644 --- a/src/server/sync.rs +++ b/src/server/sync.rs @@ -257,7 +257,7 @@ pub(crate) trait SyncSource: Send + Sync { &self, ns: &BackupNamespace, dir: &BackupDir, - ) -> Result<Arc<dyn SyncSourceReader>, Error>; + ) -> Result<(Arc<dyn SyncSourceReader>, Option<Arc<BackupReader>>), Error>; } pub(crate) struct RemoteSource { @@ -404,13 +404,16 @@ impl SyncSource for RemoteSource { &self, ns: &BackupNamespace, dir: &BackupDir, - ) -> Result<Arc<dyn SyncSourceReader>, Error> { + ) -> Result<(Arc<dyn SyncSourceReader>, Option<Arc<BackupReader>>), Error> { let backup_reader = BackupReader::start(&self.client, None, self.repo.store(), ns, dir, true).await?; - Ok(Arc::new(RemoteSourceReader { - backup_reader, - dir: dir.clone(), - })) + Ok(( + 
Arc::new(RemoteSourceReader { + backup_reader: backup_reader.clone(), + dir: dir.clone(), + }), + Some(backup_reader), + )) } } @@ -477,16 +480,19 @@ impl SyncSource for LocalSource { &self, ns: &BackupNamespace, dir: &BackupDir, - ) -> Result<Arc<dyn SyncSourceReader>, Error> { + ) -> Result<(Arc<dyn SyncSourceReader>, Option<Arc<BackupReader>>), Error> { let dir = self.store.backup_dir(ns.clone(), dir.clone())?; let guard = dir .lock() .with_context(|| format!("while reading snapshot '{dir:?}' for a sync job"))?; - Ok(Arc::new(LocalSourceReader { - _dir_lock: Arc::new(Mutex::new(guard)), - path: dir.full_path(), - datastore: dir.datastore().clone(), - })) + Ok(( + Arc::new(LocalSourceReader { + _dir_lock: Arc::new(Mutex::new(guard)), + path: dir.full_path(), + datastore: dir.datastore().clone(), + }), + None, + )) } } -- 2.39.5 _______________________________________________ pbs-devel mailing list pbs-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects
  2025-04-08 12:58 [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Christian Ebner
                   ` (4 preceding siblings ...)
  2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 5/5] client: backup reader: call finish before dropping backup readers Christian Ebner
@ 2025-04-09 12:53 ` Fiona Ebner
  2025-04-09 12:55   ` Fiona Ebner
  2025-04-09 13:25   ` Christian Ebner
  5 siblings, 2 replies; 13+ messages in thread
From: Fiona Ebner @ 2025-04-09 12:53 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner

Am 08.04.25 um 14:58 schrieb Christian Ebner:
> These patches attempt to improve the server side error handling for
> backup reader client disconnects.
>
> During regular operation, the server currently cannot distinguish a
> disconnect because of error from a disconnect because of finished
> operation. This leaves behind a task in failed state, which is
> unexpected and might cause confusion [0].
>
> To improve error handling, follow the approach taken for the backup
> writer tasks, letting the client signal it has successfully finished
> via an api call and catch the disconnect error for that case.
>
> The signaling of messages/warnings in case of client side aborts
> triggered by the user as suggested by Fabian has not been
> implemented, as that is currently not transparently handled by the
> client (only handled by BackupReader's Drop impl), therefore seemed
> out of scope for this patch series.
>
> changes since version 3:
> - rebased onto current master
>
> Reported in the community forum:
> [0] https://forum.proxmox.com/threads/158306/
>

While I wasn't able to reproduce the exact issue as reported in the
forum thread, I was able to run into a slightly different one by issuing
the following two commands:

proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example
proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example --change-detection-mode metadata

resulting in a backup reader task that had to read no chunks (and that
was the correct timing for me to trigger the race, I guess):

> 2025-04-09T14:36:21+02:00: starting new backup reader datastore 'bigone': "/mnt/datastore/bigone"
> 2025-04-09T14:36:21+02:00: protocol upgrade done
> 2025-04-09T14:36:21+02:00: TASK ERROR: connection error: connection reset: connection reset

While testing that the refactor for "proxmox-backup-debug diff" did not
mess anything up, I ran into a completely different issue, namely that
it doesn't work for split archives:

> root@pbs1:~# proxmox-backup-debug diff archive ct/131/2025-02-11T12:40:25Z ct/131/2025-04-09T11:29:09Z root.pxar --repository localhost:bigone
> Error: Unable to open dynamic index "/mnt/datastore/bigone/ct/131/2025-04-09T11:29:09Z/root.pxar.didx" - No such file or directory (os error 2)
> root@pbs1:~# ls /mnt/datastore/bigone/ct/131/2025-04-09T11:29:09Z
> client.log.blob  index.json.blob  pct.conf.blob  root.mpxar.didx  root.ppxar.didx

Another issue I found is that using a patched client but an unpatched
server will result in a client error:

> Error: Path '/finish' not found.

and a non-zero exit code. Can the client check whether the server
supports the new endpoint and handle this gracefully?
* Re: [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects
  2025-04-09 12:53 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Fiona Ebner
@ 2025-04-09 12:55   ` Fiona Ebner
  2025-04-09 13:20     ` Thomas Lamprecht
  2025-04-09 13:25   ` Christian Ebner
  1 sibling, 1 reply; 13+ messages in thread
From: Fiona Ebner @ 2025-04-09 12:55 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Christian Ebner

Am 09.04.25 um 14:53 schrieb Fiona Ebner:
> Am 08.04.25 um 14:58 schrieb Christian Ebner:
>> Reported in the community forum:
>> [0] https://forum.proxmox.com/threads/158306/
>>
>
> While I wasn't able to reproduce the exact issue as reported in the
> forum thread I was able to run into a slightly different one by issuing
> the following two commands
>
> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example
> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example --change-detection-mode metadata
>
> resulting in a backup reader task that had to read no chunks (and that
> was the correct timing for me to trigger the race I guess):
>
>> 2025-04-09T14:36:21+02:00: starting new backup reader datastore 'bigone': "/mnt/datastore/bigone"
>> 2025-04-09T14:36:21+02:00: protocol upgrade done
>> 2025-04-09T14:36:21+02:00: TASK ERROR: connection error: connection reset: connection reset

Totally forgot to mention that this issue was gone with both a patched
client and a patched server :)
* Re: [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects
  2025-04-09 12:55 ` Fiona Ebner
@ 2025-04-09 13:20   ` Thomas Lamprecht
  2025-04-09 13:35     ` Fiona Ebner
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Lamprecht @ 2025-04-09 13:20 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Fiona Ebner, Christian Ebner

Am 09.04.25 um 14:55 schrieb Fiona Ebner:
> Am 09.04.25 um 14:53 schrieb Fiona Ebner:
>> Am 08.04.25 um 14:58 schrieb Christian Ebner:
>>> Reported in the community forum:
>>> [0] https://forum.proxmox.com/threads/158306/
>>>
>>
>> While I wasn't able to reproduce the exact issue as reported in the
>> forum thread I was able to run into a slightly different one by issuing
>> the following two commands
>>
>> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example
>> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example --change-detection-mode metadata
>>
>> resulting in a backup reader task that had to read no chunks (and that
>> was the correct timing for me to trigger the race I guess):
>>
>>> 2025-04-09T14:36:21+02:00: starting new backup reader datastore 'bigone': "/mnt/datastore/bigone"
>>> 2025-04-09T14:36:21+02:00: protocol upgrade done
>>> 2025-04-09T14:36:21+02:00: TASK ERROR: connection error: connection reset: connection reset
>
> Totally forgot to mention that this issue was gone with both patched
> client and patched server :)

As in T-b? :-)
* Re: [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects
  2025-04-09 13:20 ` Thomas Lamprecht
@ 2025-04-09 13:35   ` Fiona Ebner
  0 siblings, 0 replies; 13+ messages in thread
From: Fiona Ebner @ 2025-04-09 13:35 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox Backup Server development discussion, Christian Ebner

Am 09.04.25 um 15:20 schrieb Thomas Lamprecht:
> Am 09.04.25 um 14:55 schrieb Fiona Ebner:
>> Am 09.04.25 um 14:53 schrieb Fiona Ebner:
>>> Am 08.04.25 um 14:58 schrieb Christian Ebner:
>>>> Reported in the community forum:
>>>> [0] https://forum.proxmox.com/threads/158306/
>>>>
>>>
>>> While I wasn't able to reproduce the exact issue as reported in the
>>> forum thread I was able to run into a slightly different one by issuing
>>> the following two commands
>>>
>>> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example
>>> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example --change-detection-mode metadata
>>>
>>> resulting in a backup reader task that had to read no chunks (and that
>>> was the correct timing for me to trigger the race I guess):
>>>
>>>> 2025-04-09T14:36:21+02:00: starting new backup reader datastore 'bigone': "/mnt/datastore/bigone"
>>>> 2025-04-09T14:36:21+02:00: protocol upgrade done
>>>> 2025-04-09T14:36:21+02:00: TASK ERROR: connection error: connection reset: connection reset
>>
>> Totally forgot to mention that this issue was gone with both patched
>> client and patched server :)
>
> As in T-b? :-)

I wanted to write a sentence that it's fixed, together with the T-b
trailer, but then noticed the issue when the client is new and the
server is old, and then wrote neither of those, because I consider that
issue important enough to be addressed first.
* Re: [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects
  2025-04-09 12:53 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Fiona Ebner
  2025-04-09 12:55   ` Fiona Ebner
@ 2025-04-09 13:25   ` Christian Ebner
  1 sibling, 0 replies; 13+ messages in thread
From: Christian Ebner @ 2025-04-09 13:25 UTC (permalink / raw)
To: Fiona Ebner, Proxmox Backup Server development discussion

On 4/9/25 14:53, Fiona Ebner wrote:
> Am 08.04.25 um 14:58 schrieb Christian Ebner:
>> These patches attempt to improve the server side error handling for
>> backup reader client disconnects.
>>
>> During regular operation, the server currently cannot distinguish a
>> disconnect because of error from a disconnect because of finished
>> operation. This leaves behind a task in failed state, which is
>> unexpected and might cause confusion [0].
>>
>> To improve error handling, follow the approach taken for the backup
>> writer tasks, letting the client signal it has successfully finished
>> via an api call and catch the disconnect error for that case.
>>
>> The signaling of messages/warnings in case of client side aborts
>> triggered by the user as suggested by Fabian has not been
>> implemented, as that is currently not transparently handled by the
>> client (only handled by BackupReader's Drop impl), therefore seemed
>> out of scope for this patch series.
>>
>> changes since version 3:
>> - rebased onto current master
>>
>> Reported in the community forum:
>> [0] https://forum.proxmox.com/threads/158306/
>>
>
> While I wasn't able to reproduce the exact issue as reported in the
> forum thread I was able to run into a slightly different one by issuing
> the following two commands
>
> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example
> proxmox-backup-client backup --repository 10.10.100.180:bigone ex.pxar:example --change-detection-mode metadata
>
> resulting in a backup reader task that had to read no chunks (and that
> was the correct timing for me to trigger the race I guess):
>
>> 2025-04-09T14:36:21+02:00: starting new backup reader datastore 'bigone': "/mnt/datastore/bigone"
>> 2025-04-09T14:36:21+02:00: protocol upgrade done
>> 2025-04-09T14:36:21+02:00: TASK ERROR: connection error: connection reset: connection reset
>
> While testing that the refactor for "proxmox-backup-debug diff" did not
> mess anything up I ran into a completely different issue, namely, that
> it doesn't work for split archives:
>
>> root@pbs1:~# proxmox-backup-debug diff archive ct/131/2025-02-11T12:40:25Z ct/131/2025-04-09T11:29:09Z root.pxar --repository localhost:bigone
>> Error: Unable to open dynamic index "/mnt/datastore/bigone/ct/131/2025-04-09T11:29:09Z/root.pxar.didx" - No such file or directory (os error 2)
>> root@pbs1:~# ls /mnt/datastore/bigone/ct/131/2025-04-09T11:29:09Z
>> client.log.blob  index.json.blob  pct.conf.blob  root.mpxar.didx  root.ppxar.didx

Hmm, okay, will open an issue for this, as it will require bigger
changes.

>
> Another issue I found is that using a patched client but unpatched
> server will result in a client error:
>
>> Error: Path '/finish' not found.
>
> and non-zero exit code. Can the client check whether the server supports
> the new endpoint and handle this gracefully?

Yeah, good point! I will see how to fix this; I should be able to
extract the information I need from the version endpoint.
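The graceful fallback suggested above could look roughly like the sketch below: query the server's existing version endpoint first, and only call the new `/finish` endpoint when the reported version is recent enough. This is purely illustrative — the cutoff version `3.4.1` and the accepted version-string formats are assumptions for the example, not values taken from the patch series.

```rust
// Parse a "major.minor.patch" version string (optionally with a Debian-style
// "-release" suffix, e.g. "3.5.0-1") into a comparable tuple.
fn parse_version(v: &str) -> Option<(u32, u32, u32)> {
    let mut parts = v
        .split(|c: char| c == '.' || c == '-')
        .filter_map(|p| p.parse::<u32>().ok());
    Some((parts.next()?, parts.next()?, parts.next()?))
}

// Decide whether the server is new enough to offer the /finish endpoint.
// The minimum version here is a made-up placeholder.
fn server_supports_finish(server_version: &str) -> bool {
    const MIN: (u32, u32, u32) = (3, 4, 1); // assumed first version with /finish
    match parse_version(server_version) {
        Some(v) => v >= MIN,
        // Unknown format: be conservative and skip the extra API call, so an
        // old or odd server never sees a request it cannot answer.
        None => false,
    }
}

fn main() {
    for v in ["3.3.2", "3.4.1", "3.5.0-1", "garbage"] {
        println!("{v}: finish supported = {}", server_supports_finish(v));
    }
}
```

Gating on the advertised version keeps old servers working unchanged: the client simply drops the connection as before, and only newer servers receive the explicit finish signal.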
end of thread, other threads:[~2025-04-09 14:28 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-08 12:58 [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Christian Ebner
2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] client: reader: drop dead code Christian Ebner
2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 2/5] backup debug: diff: refactor backup reader creation Christian Ebner
2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 3/5] api: reader: handle reader client disconnects Christian Ebner
2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] client: reader: add finish method to signal client state to server Christian Ebner
2025-04-09 13:53   ` Max Carrara
2025-04-09 14:27     ` Christian Ebner
2025-04-08 12:58 ` [pbs-devel] [PATCH v4 proxmox-backup 5/5] client: backup reader: call finish before dropping backup readers Christian Ebner
2025-04-09 12:53 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] handle reader client disconnects Fiona Ebner
2025-04-09 12:55   ` Fiona Ebner
2025-04-09 13:20     ` Thomas Lamprecht
2025-04-09 13:35       ` Fiona Ebner
2025-04-09 13:25   ` Christian Ebner