From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	by lore.proxmox.com (Postfix) with ESMTPS id AB6B61FF191
	for <inbox@lore.proxmox.com>; Mon,  2 Jun 2025 11:33:37 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 5E4941E8F7;
	Mon,  2 Jun 2025 11:33:53 +0200 (CEST)
Mime-Version: 1.0
Date: Mon, 02 Jun 2025 11:33:19 +0200
Message-Id: <DABXZUAQK7EF.1W83RGAP8S4VI@proxmox.com>
From: "Christoph Heiss" <c.heiss@proxmox.com>
To: "Filip Schauer" <f.schauer@proxmox.com>
X-Mailer: aerc 0.20.1
References: <20250520124257.165949-1-f.schauer@proxmox.com>
 <20250520124257.165949-2-f.schauer@proxmox.com>
In-Reply-To: <20250520124257.165949-2-f.schauer@proxmox.com>
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.028 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [lib.rs]
Subject: Re: [pve-devel] [PATCH proxmox 1/9] add proxmox-oci crate
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

Some comments inline.

On Tue May 20, 2025 at 2:42 PM CEST, Filip Schauer wrote:
> [..]
> diff --git a/proxmox-oci/src/lib.rs b/proxmox-oci/src/lib.rs
> new file mode 100644
> index 00000000..57bd48b4
> --- /dev/null
> +++ b/proxmox-oci/src/lib.rs
> @@ -0,0 +1,165 @@
> [..]
> +/// Build a mapping from uncompressed layer digests (as found in the image config's `rootfs.diff_ids`)
> +/// to their corresponding compressed-layer digests (i.e. the filenames under `blobs/<algorithm>/<digest>`)
> +fn build_layer_map<R: Read + Seek>(
> +    oci_tar_image: &mut OciTarImage<R>,
> +    image_manifest: &ImageManifest,
> +) -> HashMap<oci_spec::image::Digest, oci_spec::image::Descriptor> {
> +    let mut layer_mapping = HashMap::new();
> +
> +    for layer in image_manifest.layers() {
> +        match layer.media_type() {
> +            MediaType::ImageLayer | MediaType::ImageLayerNonDistributable => {
> +                layer_mapping.insert(layer.digest().clone(), layer.clone());
> +            }
> +            MediaType::ImageLayerGzip
> +            | MediaType::ImageLayerNonDistributableGzip
> +            | MediaType::ImageLayerZstd
> +            | MediaType::ImageLayerNonDistributableZstd => {
> +                let compressed_blob = oci_tar_image.open_blob(layer.digest()).unwrap();
> +                let mut decoder: Box<dyn Read> = match layer.media_type() {
> +                    MediaType::ImageLayerGzip | MediaType::ImageLayerNonDistributableGzip => {
> +                        Box::new(GzDecoder::new(compressed_blob))
> +                    }
> +                    MediaType::ImageLayerZstd | MediaType::ImageLayerNonDistributableZstd => {
> +                        Box::new(zstd::Decoder::new(compressed_blob).unwrap())
> +                    }
> +                    _ => unreachable!(),
> +                };
> +                let mut hasher = Sha256::new();
> +                let mut buf = proxmox_io::boxed::zeroed(4096);
> +
> +                loop {
> +                    let bytes_read = decoder.read(&mut buf).unwrap();
> +                    if bytes_read == 0 {
> +                        break;
> +                    }
> +
> +                    hasher.update(&buf[..bytes_read]);
> +                }
> +
> +                let uncompressed_digest =
> +                    oci_spec::image::Sha256Digest::from_str(&format!("{:x}", hasher.finalize()))
> +                        .unwrap();

For this variant, the hashing part could be extracted into a separate
function and then the Gzip/Zstd can be proper independent match arms.
E.g. each having the decoder construction, calling the layer hashing
function and then `.insert()` mapping below.

Has the nice side-effect of avoiding things like the `unreachable!()`
above, too.

> +
> +                layer_mapping.insert(uncompressed_digest.into(), layer.clone());
> +            }
> +            _ => (),
> +        }
> +    }
> +
> +    layer_mapping
> +}
> +
> +pub enum ProxmoxOciError {
> +    NotAnOciImage,
> +    OciSpecError(OciSpecError),
> +}

Since this is only used by `parse_oci_image()`, rename it to something
like ParseError and/or ExtractError, depending on the below.

> +
> +impl From<OciSpecError> for ProxmoxOciError {
> +    fn from(oci_spec_err: OciSpecError) -> Self {
> +        Self::OciSpecError(oci_spec_err)
> +    }
> +}
> +
> +impl From<std::io::Error> for ProxmoxOciError {
> +    fn from(io_err: std::io::Error) -> Self {
> +        Self::OciSpecError(io_err.into())
> +    }
> +}
> +
> +pub fn parse_oci_image(

Seeing as this function not only parses, but also extract the contents
at the same time, this should be named something like
`parse_and_extract_image()`?

(The `_oci` is implied through the crate name already anyway.)

> +    oci_tar_path: &str,
> +    rootfs_path: &str,
> +) -> Result<Option<Config>, ProxmoxOciError> {
> +    let oci_tar_file = File::open(oci_tar_path)?;
> +    let mut oci_tar_image = match OciTarImage::new(oci_tar_file) {
> +        Ok(oci_tar_image) => oci_tar_image,
> +        Err(_) => return Err(ProxmoxOciError::NotAnOciImage),

I'd also separate the `OciSpecError::Io(_)` variant here, possible into
a separate `ProxmoxOciError::IoError` variant. Callers might want to
handle I/O errors other than e.g. actual parsing errors.

> +    };
> +    let image_manifest = oci_tar_image.image_manifest(&Arch::Amd64)?;
> +
> +    let image_config_descriptor = image_manifest.config();
> +
> +    if image_config_descriptor.media_type() != &MediaType::ImageConfig {
> +        return Err(ProxmoxOciError::OciSpecError(OciSpecError::Other(
> +            "ImageConfig has wrong media type".into(),
> +        )));

Can simply be separate variant, e.g. `ProxmoxOciError::WrongMediaType`

> +    }
> +
> +    let image_config_file = oci_tar_image.open_blob(image_config_descriptor.digest())?;
> +    let image_config = ImageConfiguration::from_reader(image_config_file)?;
> +
> +    extract_oci_image_rootfs(
> +        &mut oci_tar_image,
> +        &image_manifest,
> +        &image_config,
> +        rootfs_path.into(),
> +    )?;
> +
> +    Ok(image_config.config().clone())
> +}
> +
> +fn extract_oci_image_rootfs<R: Read + Seek>(
> +    oci_tar_image: &mut OciTarImage<R>,
> +    image_manifest: &ImageManifest,
> +    image_config: &ImageConfiguration,
> +    target_path: PathBuf,
> +) -> oci_spec::Result<()> {
> +    if !target_path.exists() {
> +        return Err(OciSpecError::Other(
> +            "Rootfs destination path not found".into(),
> +        ));
> +    }
> +
> +    let layer_map = build_layer_map(oci_tar_image, image_manifest);
> +
> +    for layer in image_config.rootfs().diff_ids() {
> +        let layer_digest = oci_spec::image::Digest::from_str(layer)?;
> +
> +        let layer_descriptor = match layer_map.get(&layer_digest) {
> +            Some(descriptor) => descriptor,
> +            None => {
> +                return Err(OciSpecError::Other(
> +                    "Unknown layer digest found in rootfs.diff_ids".into(),
> +                ));
> +            }
> +        };
> +
> +        let layer_file = oci_tar_image.open_blob(layer_descriptor.digest())?;
> +
> +        let tar_file: Box<dyn Read> = match layer_descriptor.media_type() {
> +            MediaType::ImageLayer | MediaType::ImageLayerNonDistributable => Box::new(layer_file),
> +            MediaType::ImageLayerGzip | MediaType::ImageLayerNonDistributableGzip => {
> +                Box::new(GzDecoder::new(layer_file))
> +            }
> +            MediaType::ImageLayerZstd | MediaType::ImageLayerNonDistributableZstd => {
> +                Box::new(zstd::Decoder::new(layer_file)?)
> +            }
> +            _ => {
> +                return Err(OciSpecError::Other(
> +                    "Encountered invalid media type for rootfs layer".into(),
> +                ));

Again I'd just introduce a separate error variant or even re-use
`ProxmoxOciError::WrongMediaType`, seems appropriate too.

> +            }
> +        };
> +
> +        let mut archive = Archive::new(tar_file);
> +        archive.set_preserve_ownerships(true);
> +        archive.unpack(&target_path)?;
> +    }
> +
> +    Ok(())
> +}
> diff --git a/proxmox-oci/src/oci_tar_image.rs b/proxmox-oci/src/oci_tar_image.rs
> new file mode 100644
> index 00000000..02199c75
> --- /dev/null
> +++ b/proxmox-oci/src/oci_tar_image.rs
> @@ -0,0 +1,173 @@
> [..]
> +pub struct OciTarImage<R: Read + Seek> {
> +    archive: TarArchive<R>,
> +    image_index: ImageIndex,
> +}
> +
> +impl<R: Read + Seek> OciTarImage<R> {
> [..]
> +    pub fn image_manifest(&mut self, architecture: &Arch) -> oci_spec::Result<ImageManifest> {
> +        let image_manifest_descriptor = match self.image_index.manifests().iter().find(|&x| {
> +            x.media_type() == &MediaType::ImageManifest
> +                && x.platform()
> +                    .as_ref()
> +                    .is_none_or(|platform| platform.architecture() == architecture)
> +        }) {
> +            Some(descriptor) => descriptor,
> +            None => {
> +                return Err(OciSpecError::Io(std::io::Error::new(
> +                    std::io::ErrorKind::NotFound,
> +                    "Image manifest not found for architecture",
> +                )));

Doesn't really seem like an I/O error?

> +            }
> +        };
> +
> +        let image_manifest_file = self.open_blob(&image_manifest_descriptor.digest().clone())?;
> +
> +        ImageManifest::from_reader(image_manifest_file)
> +    }
> +}
> +
> +fn get_blob_path(digest: &Digest) -> PathBuf {
> +    let algorithm = digest.algorithm().as_ref();
> +    let digest = digest.digest();
> +    PathBuf::from(format!("blobs/{algorithm}/{digest}"))

The path can also be constructed using the primivites from `PathBuf`
itself, e.g.

  let mut path = PathBuf::new();
  path.push(digest.algorithm().as_ref());
  ..

> +}



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel