public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium
@ 2024-07-09 10:47 Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 1/5] bump proxmox-apt to 0.11 and adapt to changes Stoiko Ivanov
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-07-09 10:47 UTC (permalink / raw)
  To: pve-devel

supersedes: https://lists.proxmox.com/pipermail/pve-devel/2024-June/064278.html

v1->v2:
* had quite a few chats with Fabian off-list, as noted by him - Thanks!
* noticed that the core-issue was probably an error in using path.is_empty
  (which checks for an empty path '', not for an empty directory) fixed with
  patch 3/5
* noticed that the current gc-code removes orphaned files (a.k.a. files
  not hardlinked to a checksum file) in any case - so dropped the idea of
  an explicit command-line switch (as sync_pool calls gc on the target
  pool as well) - can gladly rework both to add this as explicit
  safe-guard, but currently do not think it has much merit
* patch 1/5 addresses the recent changes to proxmox-apt - I hope it's ok
  as is.
* patch 2/5 is unrelated, but it confused me enough while going through
  the code.

tested with a local setup.

original cover-letter for v1:

This patchset fixes a small glitch that we noticed in a pom-setup, creating
regular snapshots, without cleaning them up regularly.

Eventually medium sync becomes quite slow.
After removing many snapshots and running garbage collection both on the
mirror as well as on the medium the run-time for the sync still took quite
long. strace showed that the process still walked through the directories
for each snapshot on the medium - they were not cleaned up after all the
files inside were removed.

tested the patch locally (which is the reason for patch 1/2).


Stoiko Ivanov (5):
  bump proxmox-apt to 0.11 and adapt to changes.
  pool: drop superfluous check for impossible path combination
  pool: unlink_file: fix check for empty directory
  pool: gc: remove empty directories under link_dir
  pool: remove unused imports

 Cargo.toml     |  3 ++-
 debian/control |  3 ++-
 src/lib.rs     |  5 +++--
 src/mirror.rs  | 11 +++++------
 src/pool.rs    | 24 ++++++++++++++----------
 5 files changed, 26 insertions(+), 20 deletions(-)

-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH proxmox-offline-mirror v2 1/5] bump proxmox-apt to 0.11 and adapt to changes.
  2024-07-09 10:47 [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Stoiko Ivanov
@ 2024-07-09 10:47 ` Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 2/5] pool: drop superfluous check for impossible path combination Stoiko Ivanov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-07-09 10:47 UTC (permalink / raw)
  To: pve-devel

The recent changes to proxmox-apt along with the introduction of the
proxmox_apt_apit_types crate led to the following changes.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 Cargo.toml     |  3 ++-
 debian/control |  3 ++-
 src/lib.rs     |  5 +++--
 src/mirror.rs  | 11 +++++------
 4 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/Cargo.toml b/Cargo.toml
index edcdb87..39461c5 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -26,7 +26,8 @@ sequoia-openpgp = { version = "1.12" }
 walkdir = "2.3.1"
 xz2 = "0.1"
 
-proxmox-apt = { version = "0.10.9" }
+proxmox-apt = "0.11"
+proxmox-apt-api-types = "1.0"
 proxmox-async = "0.4"
 proxmox-http = { version = "0.9", features = [ "client-sync", "client-trait" ]}
 proxmox-router = { version = "2", features = [ "cli" ], default_features = false }
diff --git a/debian/control b/debian/control
index 86b6cfb..8063c2b 100644
--- a/debian/control
+++ b/debian/control
@@ -17,7 +17,8 @@ Build-Depends: bash-completion,
                librust-lazy-static-1+default-dev (>= 1.4-~~),
                librust-nix-0.26+default-dev (>= 0.26.1-~~),
                librust-openssl-0.10+default-dev,
-               librust-proxmox-apt-0.10+default-dev (>= 0.10.9~~),
+               librust-proxmox-apt-api-types-dev,
+               librust-proxmox-apt-0.11+default-dev,
                librust-proxmox-async-0.4+default-dev,
                librust-proxmox-http-0.9+client-sync-dev,
                librust-proxmox-http-0.9+client-trait-dev,
diff --git a/src/lib.rs b/src/lib.rs
index 8de1f33..8d5ca3f 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -16,7 +16,8 @@ use std::{
 
 use anyhow::{format_err, Error};
 use medium::MirrorInfo;
-use proxmox_apt::repositories::{APTRepository, APTRepositoryFile, APTRepositoryFileType};
+use proxmox_apt::repositories::{APTRepositoryFileImpl, APTRepositoryImpl};
+use proxmox_apt_api_types::{APTRepository, APTRepositoryFile, APTRepositoryFileType};
 use types::Snapshot;
 
 /// Main configuration file containing definitions of mirrors, external media and subscription keys.
@@ -140,7 +141,7 @@ pub fn generate_repo_file_line(
     repo.uris = vec![format!("file://{}", snapshot_path)];
 
     repo.options
-        .push(proxmox_apt::repositories::APTRepositoryOption {
+        .push(proxmox_apt_api_types::APTRepositoryOption {
             key: "check-valid-until".to_string(),
             values: vec!["false".to_string()],
         });
diff --git a/src/mirror.rs b/src/mirror.rs
index 073df86..40ee120 100644
--- a/src/mirror.rs
+++ b/src/mirror.rs
@@ -20,13 +20,12 @@ use crate::{
     types::{Diff, Snapshot, SNAPSHOT_REGEX},
     FetchResult, Progress,
 };
-use proxmox_apt::{
-    deb822::{
-        CheckSums, CompressionType, FileReference, FileReferenceType, PackagesFile, ReleaseFile,
-        SourcesFile,
-    },
-    repositories::{APTRepository, APTRepositoryPackageType},
+
+use proxmox_apt::deb822::{
+    CheckSums, CompressionType, FileReference, FileReferenceType, PackagesFile, ReleaseFile,
+    SourcesFile,
 };
+use proxmox_apt_api_types::{APTRepository, APTRepositoryPackageType};
 
 use crate::helpers;
 
-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH proxmox-offline-mirror v2 2/5] pool: drop superfluous check for impossible path combination
  2024-07-09 10:47 [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 1/5] bump proxmox-apt to 0.11 and adapt to changes Stoiko Ivanov
@ 2024-07-09 10:47 ` Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 3/5] pool: unlink_file: fix check for empty directory Stoiko Ivanov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-07-09 10:47 UTC (permalink / raw)
  To: pve-devel

commit c598cb154ef3fa6e1a8af840ded40f713d973d3d changed the pool
layout to have the pool directory (.pool for a mirror) on the same
level as the link directory (instead of below), to enable pool-sharing
across multiple mirrors.

the condition will never be true, drop the if statement to avoid
confusion in the future.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/pool.rs | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/pool.rs b/src/pool.rs
index f20f5ea..611dcc9 100644
--- a/src/pool.rs
+++ b/src/pool.rs
@@ -265,9 +265,6 @@ impl PoolLockGuard<'_> {
 
         for link_entry in WalkDir::new(&self.pool.link_dir).into_iter() {
             let path = link_entry?.into_path();
-            if self.path_in_pool(&path) {
-                continue;
-            };
 
             let meta = path.metadata()?;
             if !meta.is_file() {
-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH proxmox-offline-mirror v2 3/5] pool: unlink_file: fix check for empty directory
  2024-07-09 10:47 [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 1/5] bump proxmox-apt to 0.11 and adapt to changes Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 2/5] pool: drop superfluous check for impossible path combination Stoiko Ivanov
@ 2024-07-09 10:47 ` Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 4/5] pool: gc: remove empty directories under link_dir Stoiko Ivanov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-07-09 10:47 UTC (permalink / raw)
  To: pve-devel

path.is_empty() checks for the empty-path, not an empty directory [0].
as the check that the path is below the link_dir happens anyways in
the if we can directly call std::fs::remove_dir (which is even safer
than the std::fs::remove_dir_all call used in pool::remove_dir()).

the oversight seems to have been in place since the intial commit. I
ran across the issue when removing many snapshots of a Debian Bookworm
repository, syncing this to a medium, and still having a vast amount
of empty directories left behind (as debian has one directory per
package), which in turn increases the sync run-time.

[0] https://docs.rs/nix/latest/nix/trait.NixPath.html#tymethod.is_empty

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/pool.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/pool.rs b/src/pool.rs
index 611dcc9..b4f2a6a 100644
--- a/src/pool.rs
+++ b/src/pool.rs
@@ -428,11 +428,11 @@ impl PoolLockGuard<'_> {
         while let Some(parent) = path.parent() {
             path = parent;
 
-            if !self.pool.path_in_link_dir(path) || !path.is_empty() {
+            if !self.pool.path_in_link_dir(path) || path.read_dir()?.next().is_some() {
                 break;
             }
 
-            remove_dir(path)?;
+            std::fs::remove_dir(path)?;
         }
 
         Ok(())
-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH proxmox-offline-mirror v2 4/5] pool: gc: remove empty directories under link_dir
  2024-07-09 10:47 [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Stoiko Ivanov
                   ` (2 preceding siblings ...)
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 3/5] pool: unlink_file: fix check for empty directory Stoiko Ivanov
@ 2024-07-09 10:47 ` Stoiko Ivanov
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 5/5] pool: remove unused imports Stoiko Ivanov
  2024-07-10 10:42 ` [pve-devel] applied-series: [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Fabian Grünbichler
  5 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-07-09 10:47 UTC (permalink / raw)
  To: pve-devel

garbage collection currently is quite aggressive in removing all files
under the link_dir, which are not a hard-link to a checksum file.
removing directories that remain empty below the link_dir should thus
not too dangerous.

without this patch, removing a snapshot on a mirror, running gc there,
and syncing everything to a medium, leaves the medium with an
hierarchy of empty directories below the removed snapshot (the files
get cleaned up the directories remain).

using WalkDir::content_first() seems better than to check for
emptiness after each file-removal [0]

[0] https://docs.rs/walkdir/latest/walkdir/struct.WalkDir.html#method.contents_first

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/pool.rs | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/pool.rs b/src/pool.rs
index b4f2a6a..56dcbb4 100644
--- a/src/pool.rs
+++ b/src/pool.rs
@@ -451,6 +451,7 @@ impl PoolLockGuard<'_> {
     /// Run a garbage collection, removing
     /// - any checksum files that have no links outside of `pool_dir`
     /// - any files in `link_dir` that have no corresponding checksum files
+    /// - any empty directories below `link_dir` remaining after the file removal
     pub(crate) fn gc(&self) -> Result<(usize, u64), Error> {
         let (inode_map, _link_count) = self.get_inode_csum_map()?;
 
@@ -459,7 +460,8 @@ impl PoolLockGuard<'_> {
 
         let handle_entry = |entry: Result<walkdir::DirEntry, walkdir::Error>,
                             count: &mut usize,
-                            size: &mut u64|
+                            size: &mut u64,
+                            remove_empty_dir: bool|
          -> Result<(), Error> {
             let path = entry?.into_path();
             if path == self.lock_path() {
@@ -467,6 +469,10 @@ impl PoolLockGuard<'_> {
             }
 
             let meta = path.metadata()?;
+            if remove_empty_dir && meta.is_dir() && path.read_dir()?.next().is_none() {
+                std::fs::remove_dir(path)?;
+                return Ok(());
+            }
             if !meta.is_file() {
                 return Ok(());
             };
@@ -507,11 +513,12 @@ impl PoolLockGuard<'_> {
         };
 
         WalkDir::new(&self.pool.link_dir)
+            .contents_first(true)
             .into_iter()
-            .try_for_each(|entry| handle_entry(entry, &mut count, &mut size))?;
+            .try_for_each(|entry| handle_entry(entry, &mut count, &mut size, true))?;
         WalkDir::new(&self.pool.pool_dir)
             .into_iter()
-            .try_for_each(|entry| handle_entry(entry, &mut count, &mut size))?;
+            .try_for_each(|entry| handle_entry(entry, &mut count, &mut size, false))?;
 
         Ok((count, size))
     }
-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] [PATCH proxmox-offline-mirror v2 5/5] pool: remove unused imports
  2024-07-09 10:47 [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Stoiko Ivanov
                   ` (3 preceding siblings ...)
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 4/5] pool: gc: remove empty directories under link_dir Stoiko Ivanov
@ 2024-07-09 10:47 ` Stoiko Ivanov
  2024-07-10 10:42 ` [pve-devel] applied-series: [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Fabian Grünbichler
  5 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-07-09 10:47 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/pool.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/pool.rs b/src/pool.rs
index 56dcbb4..7db976e 100644
--- a/src/pool.rs
+++ b/src/pool.rs
@@ -1,14 +1,14 @@
 use std::{
     cmp::max,
     collections::{hash_map::Entry, HashMap},
-    fs::{hard_link, remove_dir, File, Metadata},
+    fs::{hard_link, File, Metadata},
     ops::Deref,
     os::linux::fs::MetadataExt,
     path::{Path, PathBuf},
 };
 
 use anyhow::{bail, format_err, Error};
-use nix::{unistd, NixPath};
+use nix::unistd;
 
 use proxmox_apt::deb822::CheckSums;
 use proxmox_sys::fs::{create_path, file_get_contents, replace_file, CreateOptions};
-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pve-devel] applied-series: [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium
  2024-07-09 10:47 [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Stoiko Ivanov
                   ` (4 preceding siblings ...)
  2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 5/5] pool: remove unused imports Stoiko Ivanov
@ 2024-07-10 10:42 ` Fabian Grünbichler
  5 siblings, 0 replies; 7+ messages in thread
From: Fabian Grünbichler @ 2024-07-10 10:42 UTC (permalink / raw)
  To: Proxmox VE development discussion

thanks!

On July 9, 2024 12:47 pm, Stoiko Ivanov wrote:
> supersedes: https://lists.proxmox.com/pipermail/pve-devel/2024-June/064278.html
> 
> v1->v2:
> * had quite a few chats with Fabian off-list, as noted by him - Thanks!
> * noticed that the core-issue was probably an error in using path.is_empty
>   (which checks for an empty path '', not for an empty directory) fixed with
>   patch 3/5
> * noticed that the current gc-code removes orphaned files (a.k.a. files
>   not hardlinked to a checksum file) in any case - so dropped the idea of
>   an explicit command-line switch (as sync_pool calls gc on the target
>   pool as well) - can gladly rework both to add this as explicit
>   safe-guard, but currently do not think it has much merit
> * patch 1/5 addresses the recent changes to proxmox-apt - I hope it's ok
>   as is.
> * patch 2/5 is unrelated, but it confused me enough while going through
>   the code.
> 
> tested with a local setup.
> 
> original cover-letter for v1:
> 
> This patchset fixes a small glitch that we noticed in a pom-setup, creating
> regular snapshots, without cleaning them up regularly.
> 
> Eventually medium sync becomes quite slow.
> After removing many snapshots and running garbage collection both on the
> mirror as well as on the medium the run-time for the sync still took quite
> long. strace showed that the process still walked through the directories
> for each snapshot on the medium - they were not cleaned up after all the
> files inside were removed.
> 
> tested the patch locally (which is the reason for patch 1/2).
> 
> 
> Stoiko Ivanov (5):
>   bump proxmox-apt to 0.11 and adapt to changes.
>   pool: drop superfluous check for impossible path combination
>   pool: unlink_file: fix check for empty directory
>   pool: gc: remove empty directories under link_dir
>   pool: remove unused imports
> 
>  Cargo.toml     |  3 ++-
>  debian/control |  3 ++-
>  src/lib.rs     |  5 +++--
>  src/mirror.rs  | 11 +++++------
>  src/pool.rs    | 24 ++++++++++++++----------
>  5 files changed, 26 insertions(+), 20 deletions(-)
> 
> -- 
> 2.39.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-07-10 10:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-09 10:47 [pve-devel] [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Stoiko Ivanov
2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 1/5] bump proxmox-apt to 0.11 and adapt to changes Stoiko Ivanov
2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 2/5] pool: drop superfluous check for impossible path combination Stoiko Ivanov
2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 3/5] pool: unlink_file: fix check for empty directory Stoiko Ivanov
2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 4/5] pool: gc: remove empty directories under link_dir Stoiko Ivanov
2024-07-09 10:47 ` [pve-devel] [PATCH proxmox-offline-mirror v2 5/5] pool: remove unused imports Stoiko Ivanov
2024-07-10 10:42 ` [pve-devel] applied-series: [PATCH proxmox-offline-mirror v2 0/5] remove snapshot directories vanished on source also on medium Fabian Grünbichler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal