* [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support
@ 2022-10-18  9:20 Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-apt 1/2] packages file: add section field Fabian Grünbichler
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Fabian Grünbichler @ 2022-10-18  9:20 UTC
  To: pve-devel

this series implements filtering based on package section (exact match)
or package name (glob), and extends mirroring support to source
packages/deb-src repositories.

technically the first patch in proxmox-apt is a breaking change, but the
only user of the changed struct is proxmox-offline-mirror, which doesn't
do any incompatible initializations.
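
To illustrate why adding a public field counts as a breaking change, here is a minimal sketch. The struct below is a hypothetical, trimmed-down stand-in for the real `PackageEntry` (it assumes a `Default` derive, which the real type may not have). An exhaustive struct literal stops compiling once a field is added, while struct update syntax keeps working:

```rust
// Hypothetical, trimmed-down stand-in for proxmox-apt's `PackageEntry`.
// The `Default` derive is an assumption made for this sketch only.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct PackageEntry {
    pub package: String,
    pub size: usize,
    // Newly added field: every exhaustive struct literal must now set it.
    pub section: String,
}

// Builds an entry without naming `section`. An exhaustive literal such as
//     PackageEntry { package: "base-files".into(), size: 65612 }
// would fail to compile after the field was added.
pub fn example_entry() -> PackageEntry {
    PackageEntry {
        package: "base-files".to_string(),
        size: 65612,
        // Struct update syntax fills any field not listed above.
        ..Default::default()
    }
}

fn main() {
    let entry = example_entry();
    println!("{entry:?}");
}
```

A downstream crate that only reads fields, or that builds values through a conversion such as `TryFrom`, is unaffected either way.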

proxmox-apt:

Fabian Grünbichler (2):
  packages file: add section field
  deb822: source index support

 src/deb822/mod.rs                             |      3 +
 src/deb822/packages_file.rs                   |      2 +
 src/deb822/release_file.rs                    |      2 +-
 src/deb822/sources_file.rs                    |    255 +
 ..._debian_dists_bullseye_main_source_Sources | 858657 +++++++++++++++
 5 files changed, 858918 insertions(+), 1 deletion(-)
 create mode 100644 src/deb822/sources_file.rs
 create mode 100644 tests/deb822/sources/deb.debian.org_debian_dists_bullseye_main_source_Sources

proxmox-offline-mirror:

Fabian Grünbichler (4):
  mirror: add exclusion of packages/sections
  mirror: implement source packages mirroring
  fix #4264: only require either Release or InRelease
  mirror: refactor fetch_binary/source_packages

 Cargo.toml                                    |   1 +
 debian/control                                |   2 +
 src/bin/proxmox-offline-mirror.rs             |   4 +-
 src/bin/proxmox_offline_mirror_cmds/config.rs |   8 +
 src/config.rs                                 |  40 +-
 src/mirror.rs                                 | 483 ++++++++++++++----
 6 files changed, 437 insertions(+), 101 deletions(-)

-- 
2.30.2






* [pve-devel] [PATCH proxmox-apt 1/2] packages file: add section field
  2022-10-18  9:20 [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Fabian Grünbichler
@ 2022-10-18  9:20 ` Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-apt 2/2] deb822: source index support Fabian Grünbichler
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Fabian Grünbichler @ 2022-10-18  9:20 UTC
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    technically a breaking change, but the only user (pom) doesn't care.
    not bumping to an incompatible version would avoid the need to bump the dep in proxmox-perl-rs

 src/deb822/packages_file.rs | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/deb822/packages_file.rs b/src/deb822/packages_file.rs
index a51f71e..90b21c6 100644
--- a/src/deb822/packages_file.rs
+++ b/src/deb822/packages_file.rs
@@ -57,6 +57,7 @@ pub struct PackageEntry {
     pub size: usize,
     pub installed_size: Option<usize>,
     pub checksums: CheckSums,
+    pub section: String,
 }
 
 #[derive(Debug, Default, PartialEq, Eq)]
@@ -83,6 +84,7 @@ impl TryFrom<PackagesFileRaw> for PackageEntry {
             size: value.size.parse::<usize>()?,
             installed_size,
             checksums: CheckSums::default(),
+            section: value.section,
         };
 
         if let Some(md5) = value.md5_sum {
-- 
2.30.2






* [pve-devel] [PATCH proxmox-apt 2/2] deb822: source index support
  2022-10-18  9:20 [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-apt 1/2] packages file: add section field Fabian Grünbichler
@ 2022-10-18  9:20 ` Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 1/4] mirror: add exclusion of packages/sections Fabian Grünbichler
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Fabian Grünbichler @ 2022-10-18  9:20 UTC
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
the test file needs to be downloaded from the referenced URL and
uncompressed (it's too big to send as patch).

its SHA256sum is e7777c1d305f5e0a31bcf2fe26e955436986edb5c211c03a362c7d557c899349 
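
For readers unfamiliar with the index format this patch parses: each line of a `Files` or `Checksums-*` field in a Sources index has the form `<hex digest> <size> <file name>`. The following is a standalone sketch of parsing one such line using only the standard library. The names `FileLine` and `parse_file_line` are made up for illustration and are not part of proxmox-apt, and the strict length check is a choice of this sketch:

```rust
/// One line of a `Files` or `Checksums-*` field (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
struct FileLine {
    digest_hex: String,
    size: usize,
    file: String,
}

/// Parse `<hex digest> <size> <file name>`, requiring a digest of exactly
/// `digest_len` bytes (two hex digits per byte).
fn parse_file_line(line: &str, digest_len: usize) -> Result<FileLine, String> {
    let mut parts = line.split_ascii_whitespace();

    let digest = parts.next().ok_or("missing checksum")?;
    if digest.len() != digest_len * 2 || !digest.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!(
            "invalid checksum '{digest}', expected {digest_len} bytes"
        ));
    }

    let size = parts
        .next()
        .ok_or("missing size")?
        .parse::<usize>()
        .map_err(|err| err.to_string())?;

    let file = parts.next().ok_or("missing file name")?.to_string();

    Ok(FileLine {
        digest_hex: digest.to_ascii_lowercase(),
        size,
        file,
    })
}

fn main() {
    // Values taken from the base-files entry used in the patch's test.
    let line = "c41a7f00d57759f27e6068240d1ea7ad80a9a752e4fb43850f7e86e967422bd3 \
                1110 base-files_11.1+deb11u5.dsc";
    match parse_file_line(line, 32) {
        Ok(parsed) => println!("{parsed:?}"),
        Err(err) => eprintln!("parse error: {err}"),
    }
}
```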

 src/deb822/mod.rs                             |      3 +
 src/deb822/release_file.rs                    |      2 +-
 src/deb822/sources_file.rs                    |    255 +
 ..._debian_dists_bullseye_main_source_Sources | 858657 +++++++++++++++
 4 files changed, 858916 insertions(+), 1 deletion(-)
 create mode 100644 src/deb822/sources_file.rs
 create mode 100644 tests/deb822/sources/deb.debian.org_debian_dists_bullseye_main_source_Sources

diff --git a/src/deb822/mod.rs b/src/deb822/mod.rs
index 7a1bb0e..59e7c21 100644
--- a/src/deb822/mod.rs
+++ b/src/deb822/mod.rs
@@ -5,6 +5,9 @@ pub use release_file::{CompressionType, FileReference, FileReferenceType, Releas
 mod packages_file;
 pub use packages_file::PackagesFile;
 
+mod sources_file;
+pub use sources_file::SourcesFile;
+
 #[derive(Copy, Clone, Debug, Default, PartialEq, Eq, PartialOrd, Ord)]
 pub struct CheckSums {
     pub md5: Option<[u8; 16]>,
diff --git a/src/deb822/release_file.rs b/src/deb822/release_file.rs
index c50c095..85d3436 100644
--- a/src/deb822/release_file.rs
+++ b/src/deb822/release_file.rs
@@ -245,7 +245,7 @@ impl FileReferenceType {
     }
 
     pub fn is_package_index(&self) -> bool {
-        matches!(self, FileReferenceType::Packages(_, _))
+        matches!(self, FileReferenceType::Packages(_, _) | FileReferenceType::Sources(_))
     }
 }
 
diff --git a/src/deb822/sources_file.rs b/src/deb822/sources_file.rs
new file mode 100644
index 0000000..a13d84f
--- /dev/null
+++ b/src/deb822/sources_file.rs
@@ -0,0 +1,255 @@
+use std::collections::HashMap;
+
+use anyhow::{bail, Error, format_err};
+use rfc822_like::de::Deserializer;
+use serde::Deserialize;
+use serde_json::Value;
+
+use super::CheckSums;
+//Uploaders
+//
+//Homepage
+//
+//Version Control System (VCS) fields
+//
+//Testsuite
+//
+//Dgit
+//
+//Standards-Version (mandatory)
+//
+//Build-Depends et al
+//
+//Package-List (recommended)
+//
+//Checksums-Sha1 and Checksums-Sha256 (mandatory)
+//
+//Files (mandatory)
+
+
+
+#[derive(Debug, Deserialize)]
+#[serde(rename_all = "PascalCase")]
+pub struct SourcesFileRaw {
+    pub format: String,
+    pub package: String,
+    pub binary: Option<Vec<String>>,
+    pub version: String,
+    pub section: Option<String>,
+    pub priority: Option<String>,
+    pub maintainer: String,
+    pub uploaders: Option<String>,
+    pub architecture: Option<String>,
+    pub directory: String,
+    pub files: String,
+    #[serde(rename = "Checksums-Sha256")]
+    pub sha256: Option<String>,
+    #[serde(rename = "Checksums-Sha512")]
+    pub sha512: Option<String>,
+    #[serde(flatten)]
+    pub extra_fields: HashMap<String, Value>,
+}
+
+#[derive(Debug, PartialEq, Eq)]
+pub struct SourcePackageEntry {
+    pub format: String,
+    pub package: String,
+    pub binary: Option<Vec<String>>,
+    pub version: String,
+    pub architecture: Option<String>,
+    pub section: Option<String>,
+    pub priority: Option<String>,
+    pub maintainer: String,
+    pub uploaders: Option<String>,
+    pub directory: String,
+    pub files: HashMap<String, SourcePackageFileReference>,
+}
+
+#[derive(Debug, PartialEq, Eq)]
+pub struct SourcePackageFileReference {
+    pub file: String,
+    pub size: usize,
+    pub checksums: CheckSums,
+}
+
+impl SourcePackageEntry {
+    pub fn size(&self) -> usize {
+        self.files.values().map(|f| f.size).sum()
+    }
+}
+
+#[derive(Debug, Default, PartialEq, Eq)]
+/// A parsed representation of a Sources file
+pub struct SourcesFile {
+    pub source_packages: Vec<SourcePackageEntry>,
+}
+
+impl TryFrom<SourcesFileRaw> for SourcePackageEntry {
+    type Error = Error;
+
+    fn try_from(value: SourcesFileRaw) -> Result<Self, Self::Error> {
+        let mut parsed = SourcePackageEntry {
+            package: value.package,
+            binary: value.binary,
+            version: value.version,
+            architecture: value.architecture,
+            files: HashMap::new(),
+            format: value.format,
+            section: value.section,
+            priority: value.priority,
+            maintainer: value.maintainer,
+            uploaders: value.uploaders,
+            directory: value.directory,
+        };
+
+        for file_reference in value.files.lines() {
+            let (file_name, size, md5) = parse_file_reference(file_reference, 16)?;
+            let entry = parsed.files.entry(file_name.clone()).or_insert_with(|| SourcePackageFileReference { file: file_name, size, checksums: CheckSums::default()});
+            entry.checksums.md5 = Some(md5.try_into().map_err(|_|format_err!("unexpected checksum length"))?);
+            if entry.size != size {
+                bail!("Size mismatch: {} != {}", entry.size, size);
+            }
+        }
+
+        if let Some(sha256) = value.sha256 {
+            for line in sha256.lines() {
+                let (file_name, size, sha256) = parse_file_reference(line, 32)?;
+                let entry = parsed.files.entry(file_name.clone()).or_insert_with(|| SourcePackageFileReference { file: file_name, size, checksums: CheckSums::default()});
+                entry.checksums.sha256 = Some(sha256.try_into().map_err(|_|format_err!("unexpected checksum length"))?);
+                if entry.size != size {
+                    bail!("Size mismatch: {} != {}", entry.size, size);
+                }
+            }
+        };
+
+        if let Some(sha512) = value.sha512 {
+            for line in sha512.lines() {
+                let (file_name, size, sha512) = parse_file_reference(line, 64)?;
+                let entry = parsed.files.entry(file_name.clone()).or_insert_with(|| SourcePackageFileReference { file: file_name, size, checksums: CheckSums::default()});
+                entry.checksums.sha512 = Some(sha512.try_into().map_err(|_|format_err!("unexpected checksum length"))?);
+                if entry.size != size {
+                    bail!("Size mismatch: {} != {}", entry.size, size);
+                }
+            }
+        };
+
+        for (file_name, reference) in &parsed.files {
+            if !reference.checksums.is_secure() {
+                bail!(
+                    "no strong checksum found for source entry '{}'",
+                    file_name
+                );
+            }
+        }
+
+        Ok(parsed)
+    }
+}
+
+impl TryFrom<String> for SourcesFile {
+    type Error = Error;
+
+    fn try_from(value: String) -> Result<Self, Self::Error> {
+        value.as_bytes().try_into()
+    }
+}
+
+impl TryFrom<&[u8]> for SourcesFile {
+    type Error = Error;
+
+    fn try_from(value: &[u8]) -> Result<Self, Self::Error> {
+        let deserialized = <Vec<SourcesFileRaw>>::deserialize(Deserializer::new(value))?;
+        deserialized.try_into()
+    }
+}
+
+impl TryFrom<Vec<SourcesFileRaw>> for SourcesFile {
+    type Error = Error;
+
+    fn try_from(value: Vec<SourcesFileRaw>) -> Result<Self, Self::Error> {
+        let mut source_packages = Vec::with_capacity(value.len());
+        for entry in value {
+            let entry: SourcePackageEntry = entry.try_into()?;
+            source_packages.push(entry);
+        }
+
+        Ok(Self { source_packages })
+    }
+}
+
+fn parse_file_reference(
+    line: &str,
+    csum_len: usize,
+) -> Result<(String, usize, Vec<u8>), Error> {
+    let mut split = line.split_ascii_whitespace();
+
+    let checksum = split
+        .next()
+        .ok_or_else(|| format_err!("Missing 'checksum' field."))?;
+    if checksum.len() != csum_len * 2 {
+        bail!(
+            "invalid checksum length: '{}', expected {} bytes",
+            checksum,
+            csum_len
+        );
+    }
+
+    let checksum = hex::decode(checksum)?;
+
+    let size = split
+        .next()
+        .ok_or_else(|| format_err!("Missing 'size' field."))?
+        .parse::<usize>()?;
+
+    let file = split
+        .next()
+        .ok_or_else(|| format_err!("Missing 'file name' field."))?
+        .to_string();
+
+    Ok((file, size, checksum))
+}
+
+#[test]
+pub fn test_deb_sources_file() {
+    let input = include_str!(concat!(
+        env!("CARGO_MANIFEST_DIR"),
+        "/tests/deb822/sources/deb.debian.org_debian_dists_bullseye_main_source_Sources"
+    ));
+
+    let deserialized =
+        <Vec<SourcesFileRaw>>::deserialize(Deserializer::new(input.as_bytes())).unwrap();
+    assert_eq!(deserialized.len(), 30953);
+
+    let parsed: SourcesFile = deserialized.try_into().unwrap();
+
+    assert_eq!(parsed.source_packages.len(), 30953);
+
+    let found = parsed.source_packages.iter().find(|source| source.package == "base-files").expect("test file contains 'base-files' entry");
+    assert_eq!(found.package, "base-files");
+    assert_eq!(found.format, "3.0 (native)");
+    assert_eq!(found.architecture.as_deref(), Some("any"));
+    assert_eq!(found.directory, "pool/main/b/base-files");
+    assert_eq!(found.section.as_deref(), Some("admin"));
+    assert_eq!(found.version, "11.1+deb11u5");
+
+    let binary_packages = found.binary.as_ref().expect("base-files source package builds base-files binary package");
+    assert_eq!(binary_packages.len(), 1);
+    assert_eq!(binary_packages[0], "base-files");
+    
+    let references = &found.files;
+    assert_eq!(references.len(), 2);
+
+    let dsc_file = "base-files_11.1+deb11u5.dsc";
+    let dsc = references.get(dsc_file).expect("base-files source package contains 'dsc' reference");
+    assert_eq!(dsc.file, dsc_file);
+    assert_eq!(dsc.size, 1110);
+    assert_eq!(dsc.checksums.md5.expect("dsc has md5 checksum"), hex::decode("741c34ac0151262a03de8d5a07bc4271").unwrap()[..]);
+    assert_eq!(dsc.checksums.sha256.expect("dsc has sha256 checksum"), hex::decode("c41a7f00d57759f27e6068240d1ea7ad80a9a752e4fb43850f7e86e967422bd3").unwrap()[..]);
+
+    let tar_file = "base-files_11.1+deb11u5.tar.xz";
+    let tar = references.get(tar_file).expect("base-files source package contains 'tar' reference");
+    assert_eq!(tar.file, tar_file);
+    assert_eq!(tar.size, 65612);
+    assert_eq!(tar.checksums.md5.expect("tar has md5 checksum"), hex::decode("995df33642118b566a4026410e1c6aac").unwrap()[..]);
+    assert_eq!(tar.checksums.sha256.expect("tar has sha256 checksum"), hex::decode("31c9e5745845a73f3d5c8a7868c379d77aaca42b81194679d7ab40cc28e3a0e9").unwrap()[..]);
+}
\ No newline at end of file
diff --git a/tests/deb822/sources/deb.debian.org_debian_dists_bullseye_main_source_Sources b/tests/deb822/sources/deb.debian.org_debian_dists_bullseye_main_source_Sources
new file mode 100644
index 0000000..2b8e387
--- /dev/null
+++ b/tests/deb822/sources/deb.debian.org_debian_dists_bullseye_main_source_Sources
@@ -0,0 +1,1 @@
+DOWNLOAD-ME-FROM: http://snapshot.debian.org/archive/debian/20221017T212657Z/dists/bullseye/main/source/Sources.xz
-- 
2.30.2






* [pve-devel] [PATCH proxmox-offline-mirror 1/4] mirror: add exclusion of packages/sections
  2022-10-18  9:20 [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-apt 1/2] packages file: add section field Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-apt 2/2] deb822: source index support Fabian Grünbichler
@ 2022-10-18  9:20 ` Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 2/4] mirror: implement source packages mirroring Fabian Grünbichler
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Fabian Grünbichler @ 2022-10-18  9:20 UTC
  To: pve-devel

to keep the size of mirror snapshots down by excluding unnecessary files
(e.g., games data, browsers, debug packages, ..).

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    requires proxmox-apt with 'section' field

    we could suggest excluding sections like 'games' in the
    wizard/docs..
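
A minimal sketch of the skip decision this patch adds: exact match on the section first, then a name pattern match. The patch itself uses the `globset` crate. To stay dependency-free, the sketch swaps in a small `*`-only wildcard matcher, which is not equivalent to full glob syntax. `wildcard_match` and `skip_reason` are hypothetical names, and the `SkipConfig` here uses plain `Vec`s instead of the patch's `Option<Vec<String>>`:

```rust
/// Stand-in for globset: supports only `*` (any run of characters).
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let (p, n) = (pattern.as_bytes(), name.as_bytes());
    let (mut pi, mut ni) = (0, 0);
    // Pattern index just after the last `*`, and the name index it covers up to.
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == b'*' {
            star = Some((pi + 1, ni));
            pi += 1;
        } else if pi < p.len() && p[pi] == n[ni] {
            pi += 1;
            ni += 1;
        } else if let Some((after_star, covered)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = after_star;
            ni = covered + 1;
            star = Some((after_star, covered + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&b| b == b'*')
}

/// Simplified version of the patch's `SkipConfig`.
struct SkipConfig {
    skip_sections: Vec<String>,
    skip_packages: Vec<String>,
}

/// Returns why a package would be skipped, or `None` if it should be fetched.
fn skip_reason(skip: &SkipConfig, package: &str, section: &str) -> Option<String> {
    if skip.skip_sections.iter().any(|s| s == section) {
        return Some(format!("section '{section}'"));
    }
    let matched: Vec<&str> = skip
        .skip_packages
        .iter()
        .map(String::as_str)
        .filter(|glob| wildcard_match(glob, package))
        .collect();
    if matched.is_empty() {
        None
    } else {
        Some(format!("package glob(s): {}", matched.join(", ")))
    }
}

fn main() {
    let skip = SkipConfig {
        skip_sections: vec!["games".to_string()],
        skip_packages: vec!["*-dbgsym".to_string(), "firefox*".to_string()],
    };
    for (package, section) in [("0ad", "games"), ("firefox-esr", "web"), ("bash", "shells")] {
        println!("{package}: {:?}", skip_reason(&skip, package, section));
    }
}
```

Checking sections first mirrors the order in the patch, so a package matching both criteria is reported by its section.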

 Cargo.toml                                    |  1 +
 debian/control                                |  2 +
 src/bin/proxmox-offline-mirror.rs             |  4 +-
 src/bin/proxmox_offline_mirror_cmds/config.rs |  8 +++
 src/config.rs                                 | 40 ++++++++++++-
 src/mirror.rs                                 | 59 ++++++++++++++++++-
 6 files changed, 111 insertions(+), 3 deletions(-)

diff --git a/Cargo.toml b/Cargo.toml
index 76791c8..b2bb188 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -13,6 +13,7 @@ anyhow = "1.0"
 base64 = "0.13"
 bzip2 = "0.4"
 flate2 = "1.0.22"
+globset = "0.4.8"
 hex = "0.4.3"
 lazy_static = "1.4"
 nix = "0.24"
diff --git a/debian/control b/debian/control
index 0741a7b..9fe6605 100644
--- a/debian/control
+++ b/debian/control
@@ -10,6 +10,7 @@ Build-Depends: debhelper (>= 12),
  librust-base64-0.13+default-dev,
  librust-bzip2-0.4+default-dev,
  librust-flate2-1+default-dev (>= 1.0.22-~~),
+ librust-globset-0.4+default-dev (>= 0.4.8-~~),
  librust-hex-0.4+default-dev (>= 0.4.3-~~),
  librust-lazy-static-1+default-dev (>= 1.4-~~),
  librust-nix-0.24+default-dev,
@@ -57,6 +58,7 @@ Depends:
  librust-base64-0.13+default-dev,
  librust-bzip2-0.4+default-dev,
  librust-flate2-1+default-dev (>= 1.0.22-~~),
+ librust-globset-0.4+default-dev (>= 0.4.8-~~),
  librust-hex-0.4+default-dev (>= 0.4.3-~~),
  librust-lazy-static-1+default-dev (>= 1.4-~~),
  librust-nix-0.24+default-dev,
diff --git a/src/bin/proxmox-offline-mirror.rs b/src/bin/proxmox-offline-mirror.rs
index 522056b..07b6ce6 100644
--- a/src/bin/proxmox-offline-mirror.rs
+++ b/src/bin/proxmox-offline-mirror.rs
@@ -13,7 +13,7 @@ use proxmox_offline_mirror::helpers::tty::{
     read_bool_from_tty, read_selection_from_tty, read_string_from_tty,
 };
 use proxmox_offline_mirror::{
-    config::{save_config, MediaConfig, MirrorConfig},
+    config::{save_config, MediaConfig, MirrorConfig, SkipConfig},
     mirror,
     types::{ProductType, MEDIA_ID_SCHEMA, MIRROR_ID_SCHEMA},
 };
@@ -387,6 +387,7 @@ fn action_add_mirror(config: &SectionConfigData) -> Result<Vec<MirrorConfig>, Er
                 base_dir: base_dir.clone(),
                 use_subscription: None,
                 ignore_errors: false,
+                skip: SkipConfig::default(), // TODO sensible default?
             });
         }
     }
@@ -401,6 +402,7 @@ fn action_add_mirror(config: &SectionConfigData) -> Result<Vec<MirrorConfig>, Er
         base_dir,
         use_subscription,
         ignore_errors: false,
+        skip: SkipConfig::default(),
     };
 
     configs.push(main_config);
diff --git a/src/bin/proxmox_offline_mirror_cmds/config.rs b/src/bin/proxmox_offline_mirror_cmds/config.rs
index 5ebf6d5..3ebf4ad 100644
--- a/src/bin/proxmox_offline_mirror_cmds/config.rs
+++ b/src/bin/proxmox_offline_mirror_cmds/config.rs
@@ -266,6 +266,14 @@ pub fn update_mirror(
         data.ignore_errors = ignore_errors
     }
 
+    if let Some(skip_packages) = update.skip.skip_packages {
+        data.skip.skip_packages = Some(skip_packages);
+    }
+
+    if let Some(skip_sections) = update.skip.skip_sections {
+        data.skip.skip_sections = Some(skip_sections);
+    }
+
     config.set_data(&id, "mirror", &data)?;
     proxmox_offline_mirror::config::save_config(&config_file, &config)?;
 
diff --git a/src/config.rs b/src/config.rs
index be8f96b..39b1193 100644
--- a/src/config.rs
+++ b/src/config.rs
@@ -14,6 +14,38 @@ use crate::types::{
     PROXMOX_SUBSCRIPTION_KEY_SCHEMA,
 };
 
+/// Skip Configuration
+#[api(
+    properties: {
+        "skip-sections": {
+            type: Array,
+            optional: true,
+            items: {
+                type: String,
+                description: "Section name",
+            },
+        },
+        "skip-packages": {
+            type: Array,
+            optional: true,
+            items: {
+                type: String,
+                description: "Package name",
+            },
+        },
+    },
+)]
+#[derive(Default, Serialize, Deserialize, Updater, Clone, Debug)]
+#[serde(rename_all = "kebab-case")]
+pub struct SkipConfig {
+    /// Sections which should be skipped
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub skip_sections: Option<Vec<String>>,
+    /// Packages which should be skipped, supports globbing
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub skip_packages: Option<Vec<String>>,
+}
+
 #[api(
     properties: {
         id: {
@@ -46,6 +78,9 @@ use crate::types::{
             optional: true,
             default: false,
         },
+        "skip": {
+            type: SkipConfig,
+        },
     }
 )]
 #[derive(Clone, Debug, Serialize, Deserialize, Updater)]
@@ -73,6 +108,9 @@ pub struct MirrorConfig {
     /// Whether to downgrade download errors to warnings
     #[serde(default)]
     pub ignore_errors: bool,
+    /// Skip package files using these criteria
+    #[serde(default, flatten)]
+    pub skip: SkipConfig,
 }
 
 #[api(
@@ -191,7 +229,7 @@ fn init() -> SectionConfig {
     let mut config = SectionConfig::new(&MIRROR_ID_SCHEMA);
 
     let mirror_schema = match MirrorConfig::API_SCHEMA {
-        Schema::Object(ref obj_schema) => obj_schema,
+        Schema::AllOf(ref all_of_schema) => all_of_schema,
         _ => unreachable!(),
     };
     let mirror_plugin = SectionConfigPlugin::new(
diff --git a/src/mirror.rs b/src/mirror.rs
index dfb4cc9..22dc716 100644
--- a/src/mirror.rs
+++ b/src/mirror.rs
@@ -7,12 +7,13 @@ use std::{
 
 use anyhow::{bail, format_err, Error};
 use flate2::bufread::GzDecoder;
+use globset::{Glob, GlobSetBuilder};
 use nix::libc;
 use proxmox_http::{client::sync::Client, HttpClient, HttpOptions};
 use proxmox_sys::fs::file_get_contents;
 
 use crate::{
-    config::{MirrorConfig, SubscriptionKey},
+    config::{MirrorConfig, SkipConfig, SubscriptionKey},
     convert_repo_line,
     pool::Pool,
     types::{Diff, Snapshot, SNAPSHOT_REGEX},
@@ -47,6 +48,7 @@ struct ParsedMirrorConfig {
     pub auth: Option<String>,
     pub client: Client,
     pub ignore_errors: bool,
+    pub skip: SkipConfig,
 }
 
 impl TryInto<ParsedMirrorConfig> for MirrorConfig {
@@ -76,6 +78,7 @@ impl TryInto<ParsedMirrorConfig> for MirrorConfig {
             auth: None,
             client,
             ignore_errors: self.ignore_errors,
+            skip: self.skip,
         })
     }
 }
@@ -664,8 +667,22 @@ pub fn create_snapshot(
         }
     }
 
+    let skipped_package_globs = if let Some(skipped_packages) = &config.skip.skip_packages {
+        let mut globs = GlobSetBuilder::new();
+        for glob in skipped_packages {
+            let glob = Glob::new(glob)?;
+            globs.add(glob);
+        }
+        let globs = globs.build()?;
+        Some(globs)
+    } else {
+        None
+    };
+
     println!("\nFetching packages..");
     let mut dry_run_progress = Progress::new();
+    let mut total_skipped_count = 0usize;
+    let mut total_skipped_bytes = 0usize;
     for (basename, references) in packages_indices {
         let total_files = references.files.len();
         if total_files == 0 {
@@ -676,7 +693,37 @@ pub fn create_snapshot(
         }
 
         let mut fetch_progress = Progress::new();
+        let mut skipped_count = 0usize;
+        let mut skipped_bytes = 0usize;
         for package in references.files {
+            if let Some(ref sections) = &config.skip.skip_sections {
+                if sections.iter().any(|section| package.section == *section) {
+                    println!(
+                        "\tskipping {} - {}b (section '{}')",
+                        package.package, package.size, package.section
+                    );
+                    skipped_count += 1;
+                    skipped_bytes += package.size;
+                    continue;
+                }
+            }
+            if let Some(skipped_package_globs) = &skipped_package_globs {
+                let matches = skipped_package_globs.matches(&package.package);
+                if !matches.is_empty() {
+                    // safety, skipped_package_globs is set based on this
+                    let globs = config.skip.skip_packages.as_ref().unwrap();
+                    let matches: Vec<String> = matches.iter().map(|i| globs[*i].clone()).collect();
+                    println!(
+                        "\tskipping {} - {}b (package glob(s): {})",
+                        package.package,
+                        package.size,
+                        matches.join(", ")
+                    );
+                    skipped_count += 1;
+                    skipped_bytes += package.size;
+                    continue;
+                }
+            }
             let url = get_repo_url(&config.repository, &package.file);
 
             if dry_run {
@@ -728,6 +775,11 @@ pub fn create_snapshot(
         } else {
             total_progress += fetch_progress;
         }
+        if skipped_count > 0 {
+            total_skipped_count += skipped_count;
+            total_skipped_bytes += skipped_bytes;
+            println!("Skipped downloading {skipped_count} packages totalling {skipped_bytes}b");
+        }
     }
 
     if dry_run {
@@ -736,6 +788,11 @@ pub fn create_snapshot(
     } else {
         println!("\nStats: {total_progress}");
     }
+    if total_skipped_count > 0 {
+        println!(
+            "Skipped downloading {total_skipped_count} packages totalling {total_skipped_bytes}b"
+        );
+    }
 
     if !warnings.is_empty() {
         eprintln!("Warnings:");
-- 
2.30.2






* [pve-devel] [PATCH proxmox-offline-mirror 2/4] mirror: implement source packages mirroring
  2022-10-18  9:20 [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Fabian Grünbichler
                   ` (2 preceding siblings ...)
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 1/4] mirror: add exclusion of packages/sections Fabian Grünbichler
@ 2022-10-18  9:20 ` Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 3/4] fix #4264: only require either Release or InRelease Fabian Grünbichler
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Fabian Grünbichler @ 2022-10-18  9:20 UTC
  To: pve-devel

similar to the binary package one, but with one additional layer since
each source package consists of 2-3 files, not a single .deb file.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    requires proxmox-apt with source index support
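
The "additional layer" can be sketched as follows: a source package owns several file references, its size is the sum of their sizes, and each download path is `<directory>/<file>`. The types are simplified, hypothetical stand-ins for the proxmox-apt structs (checksums omitted). The sample values come from the base-files entry in the proxmox-apt test:

```rust
use std::collections::HashMap;

/// Simplified stand-in for a single file belonging to a source package.
struct FileRef {
    file: String,
    size: usize,
}

/// Simplified stand-in for a source package entry.
struct SourcePackage {
    directory: String,
    files: HashMap<String, FileRef>,
}

impl SourcePackage {
    /// Total size of all files of this source package.
    fn size(&self) -> usize {
        self.files.values().map(|f| f.size).sum()
    }

    /// Repository-relative paths of all files, sorted for stable output.
    fn paths(&self) -> Vec<String> {
        let mut paths: Vec<String> = self
            .files
            .values()
            .map(|f| format!("{}/{}", self.directory, f.file))
            .collect();
        paths.sort();
        paths
    }
}

/// Builds the base-files example with its .dsc and .tar.xz references.
fn base_files_example() -> SourcePackage {
    let mut files = HashMap::new();
    for (name, size) in [
        ("base-files_11.1+deb11u5.dsc", 1110),
        ("base-files_11.1+deb11u5.tar.xz", 65612),
    ] {
        files.insert(
            name.to_string(),
            FileRef {
                file: name.to_string(),
                size,
            },
        );
    }
    SourcePackage {
        directory: "pool/main/b/base-files".to_string(),
        files,
    }
}

fn main() {
    let package = base_files_example();
    println!("total size: {}b", package.size());
    for path in package.paths() {
        println!("would fetch {path}");
    }
}
```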

 src/mirror.rs | 158 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 150 insertions(+), 8 deletions(-)

diff --git a/src/mirror.rs b/src/mirror.rs
index 22dc716..37dca97 100644
--- a/src/mirror.rs
+++ b/src/mirror.rs
@@ -22,6 +22,7 @@ use crate::{
 use proxmox_apt::{
     deb822::{
         CheckSums, CompressionType, FileReference, FileReferenceType, PackagesFile, ReleaseFile,
+        SourcesFile,
     },
     repositories::{APTRepository, APTRepositoryPackageType},
 };
@@ -598,10 +599,15 @@ pub fn create_snapshot(
 
     let mut packages_size = 0_usize;
     let mut packages_indices = HashMap::new();
+
+    let mut source_packages_indices = HashMap::new();
+
     let mut failed_references = Vec::new();
     for (component, references) in per_component {
         println!("\nFetching indices for component '{component}'");
         let mut component_deb_size = 0;
+        let mut component_dsc_size = 0;
+
         let mut fetch_progress = Progress::new();
 
         for basename in references {
@@ -642,21 +648,49 @@ pub fn create_snapshot(
                 fetch_progress.update(&res);
 
                 if package_index_data.is_none() && reference.file_type.is_package_index() {
-                    package_index_data = Some(res.data());
+                    package_index_data = Some((&reference.file_type, res.data()));
                 }
             }
-            if let Some(data) = package_index_data {
-                let packages: PackagesFile = data[..].try_into()?;
-                let size: usize = packages.files.iter().map(|p| p.size).sum();
-                println!("\t{} packages totalling {size}", packages.files.len());
-                component_deb_size += size;
-
-                packages_indices.entry(basename).or_insert(packages);
+            if let Some((reference_type, data)) = package_index_data {
+                match reference_type {
+                    FileReferenceType::Packages(_, _) => {
+                        let packages: PackagesFile = data[..].try_into()?;
+                        let size: usize = packages.files.iter().map(|p| p.size).sum();
+                        println!("\t{} packages totalling {size}", packages.files.len());
+                        component_deb_size += size;
+
+                        packages_indices.entry(basename).or_insert(packages);
+                    }
+                    FileReferenceType::Sources(_) => {
+                        let source_packages: SourcesFile = data[..].try_into()?;
+                        let size: usize = source_packages
+                            .source_packages
+                            .iter()
+                            .map(|s| s.size())
+                            .sum();
+                        println!(
+                            "\t{} source packages totalling {size}",
+                            source_packages.source_packages.len()
+                        );
+                        component_dsc_size += size;
+                        source_packages_indices
+                            .entry(basename)
+                            .or_insert(source_packages);
+                    }
+                    unknown => {
+                        eprintln!("Unknown package index '{unknown:?}', skipping processing..")
+                    }
+                }
             }
             println!("Progress: {fetch_progress}");
         }
+
         println!("Total deb size for component: {component_deb_size}");
         packages_size += component_deb_size;
+
+        println!("Total dsc size for component: {component_dsc_size}");
+        packages_size += component_dsc_size;
+
         total_progress += fetch_progress;
     }
     println!("Total deb size: {packages_size}");
@@ -782,6 +816,114 @@ pub fn create_snapshot(
         }
     }
 
+    for (basename, references) in source_packages_indices {
+        let total_source_packages = references.source_packages.len();
+        if total_source_packages == 0 {
+            println!("\n{basename} - no files, skipping.");
+            continue;
+        } else {
+            println!("\n{basename} - {total_source_packages} total source package(s)");
+        }
+
+        let mut fetch_progress = Progress::new();
+        let mut skipped_count = 0usize;
+        let mut skipped_bytes = 0usize;
+        for package in references.source_packages {
+            if let Some(ref sections) = &config.skip.skip_sections {
+                if sections
+                    .iter()
+                    .any(|section| package.section.as_ref() == Some(section))
+                {
+                    println!(
+                        "\tskipping {} - {}b (section '{}')",
+                        package.package,
+                        package.size(),
+                        package.section.as_ref().unwrap(),
+                    );
+                    skipped_count += 1;
+                    skipped_bytes += package.size();
+                    continue;
+                }
+            }
+            if let Some(skipped_package_globs) = &skipped_package_globs {
+                let matches = skipped_package_globs.matches(&package.package);
+                if !matches.is_empty() {
+                    // safety, skipped_package_globs is set based on this
+                    let globs = config.skip.skip_packages.as_ref().unwrap();
+                    let matches: Vec<String> = matches.iter().map(|i| globs[*i].clone()).collect();
+                    println!(
+                        "\tskipping {} - {}b (package glob(s): {})",
+                        package.package,
+                        package.size(),
+                        matches.join(", ")
+                    );
+                    skipped_count += 1;
+                    skipped_bytes += package.size();
+                    continue;
+                }
+            }
+
+            for file_reference in package.files.values() {
+                let path = format!("{}/{}", package.directory, file_reference.file);
+                let url = get_repo_url(&config.repository, &path);
+
+                if dry_run {
+                    if config.pool.contains(&file_reference.checksums) {
+                        fetch_progress.update(&FetchResult {
+                            data: vec![],
+                            fetched: 0,
+                        });
+                    } else {
+                        println!("\t(dry-run) GET missing '{url}' ({}b)", file_reference.size);
+                        fetch_progress.update(&FetchResult {
+                            data: vec![],
+                            fetched: file_reference.size,
+                        });
+                    }
+                } else {
+                    let mut full_path = PathBuf::from(prefix);
+                    full_path.push(&path);
+
+                    match fetch_plain_file(
+                        &config,
+                        &url,
+                        &full_path,
+                        file_reference.size,
+                        &file_reference.checksums,
+                        false,
+                        dry_run,
+                    ) {
+                        Ok(res) => fetch_progress.update(&res),
+                        Err(err) if config.ignore_errors => {
+                            let msg = format!(
+                                "{}: failed to fetch package '{}' - {}",
+                                basename, file_reference.file, err,
+                            );
+                            eprintln!("{msg}");
+                            warnings.push(msg);
+                        }
+                        Err(err) => return Err(err),
+                    }
+                }
+
+                if fetch_progress.file_count() % (max(total_source_packages / 100, 1)) == 0 {
+                    println!("\tProgress: {fetch_progress}");
+                }
+            }
+        }
+        println!("\tProgress: {fetch_progress}");
+        if dry_run {
+            dry_run_progress += fetch_progress;
+        } else {
+            total_progress += fetch_progress;
+        }
+        if skipped_count > 0 {
+            total_skipped_count += skipped_count;
+            total_skipped_bytes += skipped_bytes;
+            println!("Skipped downloading {skipped_count} packages totalling {skipped_bytes}b");
+        }
+    }
+
     if dry_run {
         println!("\nDry-run Stats (indices, downloaded but not persisted):\n{total_progress}");
         println!("\nDry-run stats (packages, new == missing):\n{dry_run_progress}");
-- 
2.30.2






* [pve-devel] [PATCH proxmox-offline-mirror 3/4] fix #4264: only require either Release or InRelease
  2022-10-18  9:20 [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Fabian Grünbichler
                   ` (3 preceding siblings ...)
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 2/4] mirror: implement source packages mirroring Fabian Grünbichler
@ 2022-10-18  9:20 ` Fabian Grünbichler
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 4/4] mirror: refactor fetch_binary/source_packages Fabian Grünbichler
  2022-10-20 12:49 ` [pve-devel] applied-series: [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Thomas Lamprecht
  6 siblings, 0 replies; 8+ messages in thread
From: Fabian Grünbichler @ 2022-10-18  9:20 UTC (permalink / raw)
  To: pve-devel

strictly speaking InRelease is required, and Release optional, but that
might not be true for older repositories. treat failure to fetch either
as non-fatal, provided the other is available.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
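note (not part of the commit): the fallback described above can be sketched in isolation as follows. This is a minimal illustration only. `fetch`, `parse`, `Parsed` and `pick_release` are hypothetical stand-ins for the real helpers in src/mirror.rs, and plain `String` errors replace `anyhow::Error`. A fetch failure is modelled as `Ok(None)`, while parse errors stay fatal.

```rust
// Hypothetical stand-in for a parsed release file.
#[derive(Debug, PartialEq)]
struct Parsed(String);

// Hypothetical fetch helper: Ok(None) models a non-fatal fetch failure,
// Err(..) would model a fatal problem such as a bad signature.
fn fetch(name: &str, available: bool) -> Result<Option<String>, String> {
    if available {
        Ok(Some(format!("raw {name}")))
    } else {
        eprintln!("{name} fetch failure");
        Ok(None)
    }
}

// Hypothetical parse helper: parsing errors are always fatal.
fn parse(raw: String) -> Result<Parsed, String> {
    if raw.is_empty() {
        Err("empty release file".to_string())
    } else {
        Ok(Parsed(raw))
    }
}

fn pick_release(have_release: bool, have_in_release: bool) -> Result<Parsed, String> {
    // map + transpose turns Option<Result<T, E>> into Result<Option<T>, E>,
    // so `?` only propagates real errors and keeps "not available" as None.
    let release = fetch("Release", have_release)?.map(parse).transpose()?;
    let in_release = fetch("InRelease", have_in_release)?.map(parse).transpose()?;

    // At least one of the two must be available to proceed.
    release
        .or(in_release)
        .ok_or_else(|| "Neither Release(.gpg) nor InRelease available!".to_string())
}
```

in this sketch `pick_release(false, true)` falls back to the InRelease data, and `pick_release(false, false)` returns the "neither available" error. Both files are still fetched whenever possible, matching the "we want both on-disk" comment in the diff below.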
 src/mirror.rs | 70 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 19 deletions(-)

diff --git a/src/mirror.rs b/src/mirror.rs
index 37dca97..39b7f47 100644
--- a/src/mirror.rs
+++ b/src/mirror.rs
@@ -144,40 +144,61 @@ fn fetch_repo_file(
 
 /// Helper to fetch InRelease (`detached` == false) or Release/Release.gpg (`detached` == true) files from repository.
 ///
-/// Verifies the contained/detached signature, stores all fetched files under `prefix`, and returns the verified raw release file data.
+/// Verifies the contained/detached signature and stores all fetched files under `prefix`.
+/// 
+/// Returns the verified raw release file data, or None if the "fetch" part itself fails.
 fn fetch_release(
     config: &ParsedMirrorConfig,
     prefix: &Path,
     detached: bool,
     dry_run: bool,
-) -> Result<FetchResult, Error> {
+) -> Result<Option<FetchResult>, Error> {
     let (name, fetched, sig) = if detached {
         println!("Fetching Release/Release.gpg files");
-        let sig = fetch_repo_file(
+        let sig = match fetch_repo_file(
             &config.client,
             &get_dist_url(&config.repository, "Release.gpg"),
             1024 * 1024,
             None,
             config.auth.as_deref(),
-        )?;
-        let mut fetched = fetch_repo_file(
+        ) {
+            Ok(res) => res,
+            Err(err) => {
+                eprintln!("Release.gpg fetch failure: {err}");
+                return Ok(None);
+            }
+        };
+
+        let mut fetched = match fetch_repo_file(
             &config.client,
             &get_dist_url(&config.repository, "Release"),
             256 * 1024 * 1024,
             None,
             config.auth.as_deref(),
-        )?;
+        ) {
+            Ok(res) => res,
+            Err(err) => {
+                eprintln!("Release fetch failure: {err}");
+                return Ok(None);
+            }
+        };
         fetched.fetched += sig.fetched;
         ("Release(.gpg)", fetched, Some(sig.data()))
     } else {
         println!("Fetching InRelease file");
-        let fetched = fetch_repo_file(
+        let fetched = match fetch_repo_file(
             &config.client,
             &get_dist_url(&config.repository, "InRelease"),
             256 * 1024 * 1024,
             None,
             config.auth.as_deref(),
-        )?;
+        ) {
+            Ok(res) => res,
+            Err(err) => {
+                eprintln!("InRelease fetch failure: {err}");
+                return Ok(None);
+            }
+        };
         ("InRelease", fetched, None)
     };
 
@@ -193,10 +214,10 @@ fn fetch_release(
     };
 
     if dry_run {
-        return Ok(FetchResult {
+        return Ok(Some(FetchResult {
             data: verified,
             fetched: fetched.fetched,
-        });
+        }));
     }
 
     let locked = &config.pool.lock()?;
@@ -230,10 +251,10 @@ fn fetch_release(
         )?;
     }
 
-    Ok(FetchResult {
+    Ok(Some(FetchResult {
         data: verified,
         fetched: fetched.fetched,
-    })
+    }))
 }
 
 /// Helper to fetch an index file referenced by a `ReleaseFile`.
@@ -510,14 +531,25 @@ pub fn create_snapshot(
         Ok(parsed)
     };
 
-    // we want both on-disk for compat reasons
-    let res = fetch_release(&config, prefix, true, dry_run)?;
-    total_progress.update(&res);
-    let _release = parse_release(res, "Release")?;
+    // we want both on-disk for compat reasons, if both are available
+    let release = fetch_release(&config, prefix, true, dry_run)?
+        .map(|res| {
+            total_progress.update(&res);
+            parse_release(res, "Release")
+        })
+        .transpose()?;
+
+    let in_release = fetch_release(&config, prefix, false, dry_run)?
+        .map(|res| {
+            total_progress.update(&res);
+            parse_release(res, "InRelease")
+        })
+        .transpose()?;
 
-    let res = fetch_release(&config, prefix, false, dry_run)?;
-    total_progress.update(&res);
-    let release = parse_release(res, "InRelease")?;
+    // at least one must be available to proceed
+    let release = release
+        .or(in_release)
+        .ok_or_else(|| format_err!("Neither Release(.gpg) nor InRelease available!"))?;
 
     let mut per_component = HashMap::new();
     let mut others = Vec::new();
-- 
2.30.2






* [pve-devel] [PATCH proxmox-offline-mirror 4/4] mirror: refactor fetch_binary/source_packages
  2022-10-18  9:20 [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Fabian Grünbichler
                   ` (4 preceding siblings ...)
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 3/4] fix #4264: only require either Release or InRelease Fabian Grünbichler
@ 2022-10-18  9:20 ` Fabian Grünbichler
  2022-10-20 12:49 ` [pve-devel] applied-series: [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Thomas Lamprecht
  6 siblings, 0 replies; 8+ messages in thread
From: Fabian Grünbichler @ 2022-10-18  9:20 UTC (permalink / raw)
  To: pve-devel

and pull out some of the progress variables into a struct.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
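note (not part of the commit): a reduced sketch of the pattern this refactoring introduces, where one mutable progress struct is handed to each fetch helper instead of several loose local variables. The field names of `MirrorProgress` follow the diff below. The simplified `Progress` type and the `fetch_index` helper are assumptions made for illustration, and unlike the patch the sketch updates the skip counters directly rather than via per-index locals.

```rust
use std::ops::AddAssign;

// Hypothetical, simplified counterpart of the `Progress` type used in the patch.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Progress {
    files: usize,
    bytes: usize,
}

impl AddAssign for Progress {
    fn add_assign(&mut self, other: Self) {
        self.files += other.files;
        self.bytes += other.bytes;
    }
}

// Same fields as the `MirrorProgress` struct added by the patch.
#[derive(Debug, Default)]
struct MirrorProgress {
    warnings: Vec<String>,
    dry_run: Progress,
    total: Progress,
    skip_count: usize,
    skip_bytes: usize,
}

// Hypothetical helper: accounts for a list of (size, skipped) entries of one index.
fn fetch_index(entries: &[(usize, bool)], dry_run: bool, progress: &mut MirrorProgress) {
    let mut fetch_progress = Progress::default();
    for &(size, skipped) in entries {
        if skipped {
            // Skipped entries only affect the skip statistics.
            progress.skip_count += 1;
            progress.skip_bytes += size;
            continue;
        }
        fetch_progress += Progress { files: 1, bytes: size };
    }
    // Dry-run and real fetches are tracked separately, as in the patch.
    if dry_run {
        progress.dry_run += fetch_progress;
    } else {
        progress.total += fetch_progress;
    }
}
```

a caller would create one `MirrorProgress::default()` and pass `&mut progress` to each helper in turn, which is how `create_snapshot` calls `fetch_binary_packages` and `fetch_source_packages` after this patch.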
 src/mirror.rs | 520 ++++++++++++++++++++++++++++----------------------
 1 file changed, 287 insertions(+), 233 deletions(-)

diff --git a/src/mirror.rs b/src/mirror.rs
index 39b7f47..faaaa19 100644
--- a/src/mirror.rs
+++ b/src/mirror.rs
@@ -7,7 +7,7 @@ use std::{
 
 use anyhow::{bail, format_err, Error};
 use flate2::bufread::GzDecoder;
-use globset::{Glob, GlobSetBuilder};
+use globset::{Glob, GlobSet, GlobSetBuilder};
 use nix::libc;
 use proxmox_http::{client::sync::Client, HttpClient, HttpOptions};
 use proxmox_sys::fs::file_get_contents;
@@ -145,7 +145,7 @@ fn fetch_repo_file(
 /// Helper to fetch InRelease (`detached` == false) or Release/Release.gpg (`detached` == true) files from repository.
 ///
 /// Verifies the contained/detached signature and stores all fetched files under `prefix`.
-/// 
+///
 /// Returns the verified raw release file data, or None if the "fetch" part itself fails.
 fn fetch_release(
     config: &ParsedMirrorConfig,
@@ -474,6 +474,259 @@ pub fn list_snapshots(config: &MirrorConfig) -> Result<Vec<Snapshot>, Error> {
     Ok(list)
 }
 
+struct MirrorProgress {
+    warnings: Vec<String>,
+    dry_run: Progress,
+    total: Progress,
+    skip_count: usize,
+    skip_bytes: usize,
+}
+
+fn convert_to_globset(config: &ParsedMirrorConfig) -> Result<Option<GlobSet>, Error> {
+    Ok(if let Some(skipped_packages) = &config.skip.skip_packages {
+        let mut globs = GlobSetBuilder::new();
+        for glob in skipped_packages {
+            let glob = Glob::new(glob)?;
+            globs.add(glob);
+        }
+        let globs = globs.build()?;
+        Some(globs)
+    } else {
+        None
+    })
+}
+
+fn fetch_binary_packages(
+    config: &ParsedMirrorConfig,
+    packages_indices: HashMap<&String, PackagesFile>,
+    dry_run: bool,
+    prefix: &Path,
+    progress: &mut MirrorProgress,
+) -> Result<(), Error> {
+    let skipped_package_globs = convert_to_globset(config)?;
+
+    for (basename, references) in packages_indices {
+        let total_files = references.files.len();
+        if total_files == 0 {
+            println!("\n{basename} - no files, skipping.");
+            continue;
+        } else {
+            println!("\n{basename} - {total_files} total file(s)");
+        }
+
+        let mut fetch_progress = Progress::new();
+        let mut skip_count = 0usize;
+        let mut skip_bytes = 0usize;
+        for package in references.files {
+            if let Some(ref sections) = &config.skip.skip_sections {
+                if sections.iter().any(|section| package.section == *section) {
+                    println!(
+                        "\tskipping {} - {}b (section '{}')",
+                        package.package, package.size, package.section
+                    );
+                    skip_count += 1;
+                    skip_bytes += package.size;
+                    continue;
+                }
+            }
+            if let Some(skipped_package_globs) = &skipped_package_globs {
+                let matches = skipped_package_globs.matches(&package.package);
+                if !matches.is_empty() {
+                    // safety, skipped_package_globs is set based on this
+                    let globs = config.skip.skip_packages.as_ref().unwrap();
+                    let matches: Vec<String> = matches.iter().map(|i| globs[*i].clone()).collect();
+                    println!(
+                        "\tskipping {} - {}b (package glob(s): {})",
+                        package.package,
+                        package.size,
+                        matches.join(", ")
+                    );
+                    skip_count += 1;
+                    skip_bytes += package.size;
+                    continue;
+                }
+            }
+            let url = get_repo_url(&config.repository, &package.file);
+
+            if dry_run {
+                if config.pool.contains(&package.checksums) {
+                    fetch_progress.update(&FetchResult {
+                        data: vec![],
+                        fetched: 0,
+                    });
+                } else {
+                    println!("\t(dry-run) GET missing '{url}' ({}b)", package.size);
+                    fetch_progress.update(&FetchResult {
+                        data: vec![],
+                        fetched: package.size,
+                    });
+                }
+            } else {
+                let mut full_path = PathBuf::from(prefix);
+                full_path.push(&package.file);
+
+                match fetch_plain_file(
+                    config,
+                    &url,
+                    &full_path,
+                    package.size,
+                    &package.checksums,
+                    false,
+                    dry_run,
+                ) {
+                    Ok(res) => fetch_progress.update(&res),
+                    Err(err) if config.ignore_errors => {
+                        let msg = format!(
+                            "{}: failed to fetch package '{}' - {}",
+                            basename, package.file, err,
+                        );
+                        eprintln!("{msg}");
+                        progress.warnings.push(msg);
+                    }
+                    Err(err) => return Err(err),
+                }
+            }
+
+            if fetch_progress.file_count() % (max(total_files / 100, 1)) == 0 {
+                println!("\tProgress: {fetch_progress}");
+            }
+        }
+        println!("\tProgress: {fetch_progress}");
+        if dry_run {
+            progress.dry_run += fetch_progress;
+        } else {
+            progress.total += fetch_progress;
+        }
+        if skip_count > 0 {
+            progress.skip_count += skip_count;
+            progress.skip_bytes += skip_bytes;
+            println!("Skipped downloading {skip_count} packages totalling {skip_bytes}b");
+        }
+    }
+
+    Ok(())
+}
+
+fn fetch_source_packages(
+    config: &ParsedMirrorConfig,
+    source_packages_indices: HashMap<&String, SourcesFile>,
+    dry_run: bool,
+    prefix: &Path,
+    progress: &mut MirrorProgress,
+) -> Result<(), Error> {
+    let skipped_package_globs = convert_to_globset(config)?;
+
+    for (basename, references) in source_packages_indices {
+        let total_source_packages = references.source_packages.len();
+        if total_source_packages == 0 {
+            println!("\n{basename} - no files, skipping.");
+            continue;
+        } else {
+            println!("\n{basename} - {total_source_packages} total source package(s)");
+        }
+
+        let mut fetch_progress = Progress::new();
+        let mut skip_count = 0usize;
+        let mut skip_bytes = 0usize;
+        for package in references.source_packages {
+            if let Some(ref sections) = &config.skip.skip_sections {
+                if sections
+                    .iter()
+                    .any(|section| package.section.as_ref() == Some(section))
+                {
+                    println!(
+                        "\tskipping {} - {}b (section '{}')",
+                        package.package,
+                        package.size(),
+                        package.section.as_ref().unwrap(),
+                    );
+                    skip_count += 1;
+                    skip_bytes += package.size();
+                    continue;
+                }
+            }
+            if let Some(skipped_package_globs) = &skipped_package_globs {
+                let matches = skipped_package_globs.matches(&package.package);
+                if !matches.is_empty() {
+                    // safety, skipped_package_globs is set based on this
+                    let globs = config.skip.skip_packages.as_ref().unwrap();
+                    let matches: Vec<String> = matches.iter().map(|i| globs[*i].clone()).collect();
+                    println!(
+                        "\tskipping {} - {}b (package glob(s): {})",
+                        package.package,
+                        package.size(),
+                        matches.join(", ")
+                    );
+                    skip_count += 1;
+                    skip_bytes += package.size();
+                    continue;
+                }
+            }
+
+            for file_reference in package.files.values() {
+                let path = format!("{}/{}", package.directory, file_reference.file);
+                let url = get_repo_url(&config.repository, &path);
+
+                if dry_run {
+                    if config.pool.contains(&file_reference.checksums) {
+                        fetch_progress.update(&FetchResult {
+                            data: vec![],
+                            fetched: 0,
+                        });
+                    } else {
+                        println!("\t(dry-run) GET missing '{url}' ({}b)", file_reference.size);
+                        fetch_progress.update(&FetchResult {
+                            data: vec![],
+                            fetched: file_reference.size,
+                        });
+                    }
+                } else {
+                    let mut full_path = PathBuf::from(prefix);
+                    full_path.push(&path);
+
+                    match fetch_plain_file(
+                        config,
+                        &url,
+                        &full_path,
+                        file_reference.size,
+                        &file_reference.checksums,
+                        false,
+                        dry_run,
+                    ) {
+                        Ok(res) => fetch_progress.update(&res),
+                        Err(err) if config.ignore_errors => {
+                            let msg = format!(
+                                "{}: failed to fetch package '{}' - {}",
+                                basename, file_reference.file, err,
+                            );
+                            eprintln!("{msg}");
+                            progress.warnings.push(msg);
+                        }
+                        Err(err) => return Err(err),
+                    }
+                }
+
+                if fetch_progress.file_count() % (max(total_source_packages / 100, 1)) == 0 {
+                    println!("\tProgress: {fetch_progress}");
+                }
+            }
+        }
+        println!("\tProgress: {fetch_progress}");
+        if dry_run {
+            progress.dry_run += fetch_progress;
+        } else {
+            progress.total += fetch_progress;
+        }
+        if skip_count > 0 {
+            progress.skip_count += skip_count;
+            progress.skip_bytes += skip_bytes;
+            println!("Skipped downloading {skip_count} packages totalling {skip_bytes}b");
+        }
+    }
+
+    Ok(())
+}
+
 /// Create a new snapshot of the remote repository, fetching and storing files as needed.
 ///
 /// Operates in three phases:
@@ -518,8 +771,13 @@ pub fn create_snapshot(
     let prefix = format!("{snapshot}.tmp");
     let prefix = Path::new(&prefix);
 
-    let mut total_progress = Progress::new();
-    let mut warnings = Vec::new();
+    let mut progress = MirrorProgress {
+        warnings: Vec::new(),
+        skip_count: 0,
+        skip_bytes: 0,
+        dry_run: Progress::new(),
+        total: Progress::new(),
+    };
 
     let parse_release = |res: FetchResult, name: &str| -> Result<ReleaseFile, Error> {
         println!("Parsing {name}..");
@@ -534,14 +792,14 @@ pub fn create_snapshot(
     // we want both on-disk for compat reasons, if both are available
     let release = fetch_release(&config, prefix, true, dry_run)?
         .map(|res| {
-            total_progress.update(&res);
+            progress.total.update(&res);
             parse_release(res, "Release")
         })
         .transpose()?;
 
     let in_release = fetch_release(&config, prefix, false, dry_run)?
         .map(|res| {
-            total_progress.update(&res);
+            progress.total.update(&res);
             parse_release(res, "InRelease")
         })
         .transpose()?;
@@ -671,7 +929,7 @@ pub fn create_snapshot(
                             reference.file_type, reference.path
                         );
                         eprintln!("{msg}");
-                        warnings.push(msg);
+                        progress.warnings.push(msg);
                         failed_references.push(reference);
                         continue;
                     }
@@ -723,7 +981,7 @@ pub fn create_snapshot(
         println!("Total dsc size for component: {component_dsc_size}");
         packages_size += component_dsc_size;
 
-        total_progress += fetch_progress;
+        progress.total += fetch_progress;
     }
     println!("Total deb size: {packages_size}");
     if !failed_references.is_empty() {
@@ -733,244 +991,40 @@ pub fn create_snapshot(
         }
     }
 
-    let skipped_package_globs = if let Some(skipped_packages) = &config.skip.skip_packages {
-        let mut globs = GlobSetBuilder::new();
-        for glob in skipped_packages {
-            let glob = Glob::new(glob)?;
-            globs.add(glob);
-        }
-        let globs = globs.build()?;
-        Some(globs)
-    } else {
-        None
-    };
-
     println!("\nFetching packages..");
-    let mut dry_run_progress = Progress::new();
-    let mut total_skipped_count = 0usize;
-    let mut total_skipped_bytes = 0usize;
-    for (basename, references) in packages_indices {
-        let total_files = references.files.len();
-        if total_files == 0 {
-            println!("\n{basename} - no files, skipping.");
-            continue;
-        } else {
-            println!("\n{basename} - {total_files} total file(s)");
-        }
-
-        let mut fetch_progress = Progress::new();
-        let mut skipped_count = 0usize;
-        let mut skipped_bytes = 0usize;
-        for package in references.files {
-            if let Some(ref sections) = &config.skip.skip_sections {
-                if sections.iter().any(|section| package.section == *section) {
-                    println!(
-                        "\tskipping {} - {}b (section '{}')",
-                        package.package, package.size, package.section
-                    );
-                    skipped_count += 1;
-                    skipped_bytes += package.size;
-                    continue;
-                }
-            }
-            if let Some(skipped_package_globs) = &skipped_package_globs {
-                let matches = skipped_package_globs.matches(&package.package);
-                if !matches.is_empty() {
-                    // safety, skipped_package_globs is set based on this
-                    let globs = config.skip.skip_packages.as_ref().unwrap();
-                    let matches: Vec<String> = matches.iter().map(|i| globs[*i].clone()).collect();
-                    println!(
-                        "\tskipping {} - {}b (package glob(s): {})",
-                        package.package,
-                        package.size,
-                        matches.join(", ")
-                    );
-                    skipped_count += 1;
-                    skipped_bytes += package.size;
-                    continue;
-                }
-            }
-            let url = get_repo_url(&config.repository, &package.file);
-
-            if dry_run {
-                if config.pool.contains(&package.checksums) {
-                    fetch_progress.update(&FetchResult {
-                        data: vec![],
-                        fetched: 0,
-                    });
-                } else {
-                    println!("\t(dry-run) GET missing '{url}' ({}b)", package.size);
-                    fetch_progress.update(&FetchResult {
-                        data: vec![],
-                        fetched: package.size,
-                    });
-                }
-            } else {
-                let mut full_path = PathBuf::from(prefix);
-                full_path.push(&package.file);
-
-                match fetch_plain_file(
-                    &config,
-                    &url,
-                    &full_path,
-                    package.size,
-                    &package.checksums,
-                    false,
-                    dry_run,
-                ) {
-                    Ok(res) => fetch_progress.update(&res),
-                    Err(err) if config.ignore_errors => {
-                        let msg = format!(
-                            "{}: failed to fetch package '{}' - {}",
-                            basename, package.file, err,
-                        );
-                        eprintln!("{msg}");
-                        warnings.push(msg);
-                    }
-                    Err(err) => return Err(err),
-                }
-            }
-
-            if fetch_progress.file_count() % (max(total_files / 100, 1)) == 0 {
-                println!("\tProgress: {fetch_progress}");
-            }
-        }
-        println!("\tProgress: {fetch_progress}");
-        if dry_run {
-            dry_run_progress += fetch_progress;
-        } else {
-            total_progress += fetch_progress;
-        }
-        if skipped_count > 0 {
-            total_skipped_count += skipped_count;
-            total_skipped_bytes += skipped_bytes;
-            println!("Skipped downloading {skipped_count} packages totalling {skipped_bytes}b");
-        }
-    }
 
-    for (basename, references) in source_packages_indices {
-        let total_source_packages = references.source_packages.len();
-        if total_source_packages == 0 {
-            println!("\n{basename} - no files, skipping.");
-            continue;
-        } else {
-            println!("\n{basename} - {total_source_packages} total source package(s)");
-        }
+    fetch_binary_packages(&config, packages_indices, dry_run, prefix, &mut progress)?;
 
-        let mut fetch_progress = Progress::new();
-        let mut skipped_count = 0usize;
-        let mut skipped_bytes = 0usize;
-        for package in references.source_packages {
-            if let Some(ref sections) = &config.skip.skip_sections {
-                if sections
-                    .iter()
-                    .any(|section| package.section.as_ref() == Some(section))
-                {
-                    println!(
-                        "\tskipping {} - {}b (section '{}')",
-                        package.package,
-                        package.size(),
-                        package.section.as_ref().unwrap(),
-                    );
-                    skipped_count += 1;
-                    skipped_bytes += package.size();
-                    continue;
-                }
-            }
-            if let Some(skipped_package_globs) = &skipped_package_globs {
-                let matches = skipped_package_globs.matches(&package.package);
-                if !matches.is_empty() {
-                    // safety, skipped_package_globs is set based on this
-                    let globs = config.skip.skip_packages.as_ref().unwrap();
-                    let matches: Vec<String> = matches.iter().map(|i| globs[*i].clone()).collect();
-                    println!(
-                        "\tskipping {} - {}b (package glob(s): {})",
-                        package.package,
-                        package.size(),
-                        matches.join(", ")
-                    );
-                    skipped_count += 1;
-                    skipped_bytes += package.size();
-                    continue;
-                }
-            }
-
-            for file_reference in package.files.values() {
-                let path = format!("{}/{}", package.directory, file_reference.file);
-                let url = get_repo_url(&config.repository, &path);
-
-                if dry_run {
-                    if config.pool.contains(&file_reference.checksums) {
-                        fetch_progress.update(&FetchResult {
-                            data: vec![],
-                            fetched: 0,
-                        });
-                    } else {
-                        println!("\t(dry-run) GET missing '{url}' ({}b)", file_reference.size);
-                        fetch_progress.update(&FetchResult {
-                            data: vec![],
-                            fetched: file_reference.size,
-                        });
-                    }
-                } else {
-                    let mut full_path = PathBuf::from(prefix);
-                    full_path.push(&path);
-
-                    match fetch_plain_file(
-                        &config,
-                        &url,
-                        &full_path,
-                        file_reference.size,
-                        &file_reference.checksums,
-                        false,
-                        dry_run,
-                    ) {
-                        Ok(res) => fetch_progress.update(&res),
-                        Err(err) if config.ignore_errors => {
-                            let msg = format!(
-                                "{}: failed to fetch package '{}' - {}",
-                                basename, file_reference.file, err,
-                            );
-                            eprintln!("{msg}");
-                            warnings.push(msg);
-                        }
-                        Err(err) => return Err(err),
-                    }
-                }
-
-                if fetch_progress.file_count() % (max(total_source_packages / 100, 1)) == 0 {
-                    println!("\tProgress: {fetch_progress}");
-                }
-            }
-        }
-        println!("\tProgress: {fetch_progress}");
-        if dry_run {
-            dry_run_progress += fetch_progress;
-        } else {
-            total_progress += fetch_progress;
-        }
-        if skipped_count > 0 {
-            total_skipped_count += skipped_count;
-            total_skipped_bytes += skipped_bytes;
-            println!("Skipped downloading {skipped_count} packages totalling {skipped_bytes}b");
-        }
-    }
+    fetch_source_packages(
+        &config,
+        source_packages_indices,
+        dry_run,
+        prefix,
+        &mut progress,
+    )?;
 
     if dry_run {
-        println!("\nDry-run Stats (indices, downloaded but not persisted):\n{total_progress}");
-        println!("\nDry-run stats (packages, new == missing):\n{dry_run_progress}");
+        println!(
+            "\nDry-run Stats (indices, downloaded but not persisted):\n{}",
+            progress.total
+        );
+        println!(
+            "\nDry-run stats (packages, new == missing):\n{}",
+            progress.dry_run
+        );
     } else {
-        println!("\nStats: {total_progress}");
+        println!("\nStats: {}", progress.total);
     }
     if total_count > 0 {
         println!(
-            "Skipped downloading {total_skipped_count} packages totalling {total_skipped_bytes}b"
+            "Skipped downloading {} packages totalling {}b",
+            progress.skip_count, progress.skip_bytes,
         );
     }
 
-    if !warnings.is_empty() {
+    if !progress.warnings.is_empty() {
         eprintln!("Warnings:");
-        for msg in warnings {
+        for msg in progress.warnings {
             eprintln!("- {msg}");
         }
     }
-- 
2.30.2






* [pve-devel] applied-series: [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support
  2022-10-18  9:20 [pve-devel] [PATCH-SERIES 0/6] proxmox-offline-mirror filtering & deb-src support Fabian Grünbichler
                   ` (5 preceding siblings ...)
  2022-10-18  9:20 ` [pve-devel] [PATCH proxmox-offline-mirror 4/4] mirror: refactor fetch_binary/source_packages Fabian Grünbichler
@ 2022-10-20 12:49 ` Thomas Lamprecht
  6 siblings, 0 replies; 8+ messages in thread
From: Thomas Lamprecht @ 2022-10-20 12:49 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 18/10/2022 at 11:20, Fabian Grünbichler wrote:
> this series implements filtering based on package section (exact match)
> or package name (glob), and extends mirroring support to source
> packages/deb-src repositories.
> 
> technically the first patch in proxmox-apt is a breaking change, but the
> only user of the changed struct is proxmox-offline-mirror, which doesn't
> do any incompatible initializations.
> 
> proxmox-apt:
> 
> Fabian Grünbichler (2):
>   packages file: add section field
>   deb822: source index support
> 
>  src/deb822/mod.rs                             |      3 +
>  src/deb822/packages_file.rs                   |      2 +
>  src/deb822/release_file.rs                    |      2 +-
>  src/deb822/sources_file.rs                    |    255 +
>  ..._debian_dists_bullseye_main_source_Sources | 858657 +++++++++++++++
>  5 files changed, 858918 insertions(+), 1 deletion(-)
>  create mode 100644 src/deb822/sources_file.rs
>  create mode 100644 tests/deb822/sources/deb.debian.org_debian_dists_bullseye_main_source_Sources
> 
> proxmox-offline-mirror:
> 
> Fabian Grünbichler (4):
>   mirror: add exclusion of packages/sections
>   mirror: implement source packages mirroring
>   fix #4264: only require either Release or InRelease
>   mirror: refactor fetch_binary/source_packages
> 
>  Cargo.toml                                    |   1 +
>  debian/control                                |   2 +
>  src/bin/proxmox-offline-mirror.rs             |   4 +-
>  src/bin/proxmox_offline_mirror_cmds/config.rs |   8 +
>  src/config.rs                                 |  40 +-
>  src/mirror.rs                                 | 483 ++++++++++++++----
>  6 files changed, 437 insertions(+), 101 deletions(-)
> 

applied series, thanks!

Waiting for some doc patches before bumping. They should describe how to use this,
ideally with common, sensible section filters like 'games' and 'kernel', as I don't think
many people will find this in the rather hidden usage, at least not until it's "too late"
and they have already downloaded way more than they wanted (in most cases).
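
For illustration only, here is a minimal sketch of the two filter kinds the cover
letter describes: sections compared by exact match, package names compared by glob.
The names `SkipFilter`, `should_skip` and `glob_match` are hypothetical and are not
taken from proxmox-offline-mirror; the hand-written matcher only handles `*` and `?`
and merely stands in for whatever glob implementation the tool actually uses.

```rust
/// Hypothetical filter set (illustrative names, not the tool's real config types).
struct SkipFilter {
    /// Compared for equality against a package's Section field.
    sections: Vec<String>,
    /// Glob patterns compared against the package name.
    package_globs: Vec<String>,
}

/// Minimal glob matcher: `*` matches any run of characters, `?` exactly one.
/// Assumption: a stand-in for a real glob library, kept dependency-free.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position of the last `*` in the pattern and the text index it was tried at.
    let mut star: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try matching it against nothing.
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Mismatch: let the last star swallow one more character and retry.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Any trailing stars can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

impl SkipFilter {
    /// Returns true if the package should be excluded from mirroring.
    fn should_skip(&self, package: &str, section: Option<&str>) -> bool {
        // Sections are an exact match: "contrib/games" does not equal "games".
        let section_hit = section.map_or(false, |s| self.sections.iter().any(|f| f == s));
        section_hit || self.package_globs.iter().any(|g| glob_match(g, package))
    }
}
```

With `sections` set to 'games' and 'kernel', every package whose section is exactly
one of those would be skipped, while a glob such as `*-dbg` would additionally drop
packages by name regardless of their section.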





