public inbox for pbs-devel@lists.proxmox.com
From: Lukas Wagner <l.wagner@proxmox.com>
To: Maximiliano Sandoval <m.sandoval@proxmox.com>,
	pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate
Date: Wed, 27 Mar 2024 09:59:55 +0100
Message-ID: <fa68d0ce-11c0-4acf-89ea-835c746e4cb4@proxmox.com>
In-Reply-To: <20240326152818.639452-2-m.sandoval@proxmox.com>

Hello, thanks for tackling this!
Most of my comments also apply to the first commit.

Regarding the commit message, I think it would be good to
mention the `Accept-Encoding` and `Content-Encoding` headers (e.g.
that you set `Accept-Encoding` on the request and decode the response
body based on `Content-Encoding`).
Both are well known, and naming them makes it clearer what these
commits are about.

Thanks for including some tests, that's always good. Of course
it's hard to unit-test this in a more 'realistic' scenario. :)

On 2024-03-26 16:28, Maximiliano Sandoval wrote:
> The Backup Server can speak deflate so we implement that.
> 
> Note that the spec [1] allows the server to encode the content multiple
> times with different algorithms.
> 
> [1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding
> 
> Suggested-by: Lukas Wagner <l.wagner@proxmox.com>
> Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> ---
>  proxmox-http/src/client/simple.rs | 98 ++++++++++++++++++++++++-------
>  1 file changed, 78 insertions(+), 20 deletions(-)
> 
> diff --git a/proxmox-http/src/client/simple.rs b/proxmox-http/src/client/simple.rs
> index b33154be..c3afa8d0 100644
> --- a/proxmox-http/src/client/simple.rs
> +++ b/proxmox-http/src/client/simple.rs
> @@ -4,7 +4,7 @@ use std::io::Read;
>  #[cfg(all(feature = "client-trait", feature = "proxmox-async"))]
>  use std::str::FromStr;
>  
> -use flate2::read::GzDecoder;
> +use flate2::read::{DeflateDecoder, GzDecoder};
>  
>  use futures::*;
>  #[cfg(all(feature = "client-trait", feature = "proxmox-async"))]
> @@ -76,7 +76,7 @@ impl Client {
>  
>          request.headers_mut().insert(
>              hyper::header::ACCEPT_ENCODING,
> -            HeaderValue::from_static("gzip"),
> +            HeaderValue::from_static("gzip, deflate"),
>          );
>          request
>              .headers_mut()
> @@ -149,22 +149,24 @@ impl Client {
>          match response {
>              Ok(res) => {
>                  let (mut parts, body) = res.into_parts();
> -                let is_gzip_encoded = parts
> -                    .headers
> -                    .remove(&hyper::header::CONTENT_ENCODING)
> -                    .is_some_and(|h| h == "gzip");
> -
> -                let buf = hyper::body::to_bytes(body).await?;
> -                let new_body = if is_gzip_encoded {
> -                    let mut gz = GzDecoder::new(&buf[..]);
> -                    let mut s = String::new();
> -                    gz.read_to_string(&mut s)?;
> -                    s
> -                } else {
> -                    String::from_utf8(buf.to_vec())
> -                        .map_err(|err| format_err!("Error converting HTTP result data: {}", err))?
> +                let mut buf = hyper::body::to_bytes(body).await?.to_vec();
> +                let content_encoding = parts.headers.remove(&hyper::header::CONTENT_ENCODING);
> +
> +                if let Some(content_encoding) = content_encoding {
> +                    let encodings = content_encoding.to_str()?;
> +                    for encoding in encodings.rsplit([',', ' ']) {
> +                        buf = match encoding {
> +                            "" => buf,  // "a, b" splits into ["a", "", "b"].
> +                            "gzip" => decode_gzip(&buf[..])?,
> +                            "deflate" => decode_deflate(&buf[..])?,
> +                            other => anyhow::bail!("Unknown format: {other}"),

`anyhow::bail!` is already in scope, so you can just use `bail!`.

> +                        }
> +                    }

I would suggest moving the decompression into the `request`
method (maybe as a separate helper function, though),
transforming the `Response<Body>` into another `Response<Body>`
with the body decompressed.

Right now, the decompression only happens in `convert_body_to_string`,
which means that this breaks callers that
  - use the public `Client::request` directly (e.g. the `proxmox-client` crate)
  - use the `HttpClient<Body, Body>` trait impl of `Client`

If the decompression happens directly in `request`, users of the crate
should not notice any difference, at least as far as I understand :)
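
To make the helper idea a bit more concrete, here is a rough, std-only
sketch of how such a helper could work out which codings to undo and in
which order. `Coding` and `decode_order` are made-up names for
illustration, not something from the patch or from `proxmox-http`. It
relies on `Content-Encoding` listing the codings in the order they were
applied (so they are undone back to front) and on coding names being
case-insensitive. Splitting on ',' and trimming each token also avoids
the empty-token case that splitting on both ',' and ' ' produces.

```rust
/// Content codings this sketch knows how to undo.
/// Hypothetical type for illustration, not part of the patch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Coding {
    Gzip,
    Deflate,
}

/// Turn a `Content-Encoding` header value into the codings to undo,
/// in the order they have to be undone (last applied coding first).
fn decode_order(header: &str) -> Result<Vec<Coding>, String> {
    header
        .split(',')
        .map(str::trim)
        .filter(|token| !token.is_empty())
        // The header lists codings in application order, so reverse it.
        .rev()
        // Coding names are case-insensitive, so compare in lowercase.
        .map(|token| match token.to_ascii_lowercase().as_str() {
            "gzip" => Ok(Coding::Gzip),
            "deflate" => Ok(Coding::Deflate),
            other => Err(format!("unknown content encoding: {other}")),
        })
        .collect()
}

fn main() {
    // Body was deflated first and gzipped second, so gzip is undone first.
    println!("{:?}", decode_order("deflate, gzip"));
}
```

The real helper would then run the matching decoder (`GzDecoder` or
`DeflateDecoder`) for each step before rebuilding the `Response<Body>`.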



>                  };
>  
> +                let new_body = String::from_utf8(buf)
> +                    .map_err(|err| format_err!("Error converting HTTP result data: {}", err))?;
> +
>                  Ok(Response::from_parts(parts, new_body))
>              }
>              Err(err) => Err(err),
> @@ -267,6 +269,10 @@ impl crate::HttpClient<String, String> for Client {
>  mod test {
>      use super::*;
>  
> +    use flate2::write::{DeflateEncoder, GzEncoder};
> +    use flate2::Compression;
> +    use std::io::Write;
> +
>      const BODY: &str = "hello world";
>  
>      #[tokio::test]
> @@ -288,14 +294,66 @@ mod test {
>          assert_eq!(Client::response_body_string(response).await.unwrap(), BODY);
>      }
>  
> -    fn encode_gzip(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> {
> -        use flate2::write::GzEncoder;
> -        use flate2::Compression;
> -        use std::io::Write;
> +    #[tokio::test]
> +    async fn test_parse_response_deflate() {
> +        let encoded = encode_deflate(BODY.as_bytes()).unwrap();
> +        let body = Body::from(encoded);
> +
> +        let response = Response::builder()
> +            .header(hyper::header::CONTENT_ENCODING, "deflate")
> +            .body(body)
> +            .unwrap();
> +        assert_eq!(Client::response_body_string(response).await.unwrap(), BODY);
> +    }
> +
> +    #[tokio::test]
> +    async fn test_parse_response_deflate_gzip() {
> +        let deflate_encoded = encode_deflate(BODY.as_bytes()).unwrap();
> +        let gzip_encoded = encode_gzip(&deflate_encoded).unwrap();
> +        let body = Body::from(gzip_encoded);
> +
> +        let response = Response::builder()
> +            .header(hyper::header::CONTENT_ENCODING, "deflate, gzip")
> +            .body(body)
> +            .unwrap();
> +        assert_eq!(Client::response_body_string(response).await.unwrap(), BODY);
>  
> +        let gzip_encoded = encode_gzip(BODY.as_bytes()).unwrap();
> +        let deflate_encoded = encode_deflate(&gzip_encoded).unwrap();
> +        let body = Body::from(deflate_encoded);
> +
> +        let response = Response::builder()
> +            .header(hyper::header::CONTENT_ENCODING, "gzip, deflate")
> +            .body(body)
> +            .unwrap();
> +        assert_eq!(Client::response_body_string(response).await.unwrap(), BODY);
> +    }
> +
> +    fn encode_deflate(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> {
> +        let mut e = DeflateEncoder::new(Vec::new(), Compression::default());
> +        e.write_all(bytes).unwrap();
> +
> +        e.finish()
> +    }
> +
> +    fn encode_gzip(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> {
>          let mut e = GzEncoder::new(Vec::new(), Compression::default());
>          e.write_all(bytes).unwrap();
>  
>          e.finish()
>      }
>  }
> +
> +fn decode_gzip(buf: &[u8]) -> Result<Vec<u8>, std::io::Error> {
> +    let mut dec = GzDecoder::new(buf);
> +    let mut v = Vec::new();
> +    dec.read_to_end(&mut v)?;
> +    Ok(v)
> +}
> +
> +fn decode_deflate(buf: &[u8]) -> Result<Vec<u8>, std::io::Error> {
> +    let mut dec = DeflateDecoder::new(buf);
> +    let mut v = Vec::new();
> +    dec.read_to_end(&mut v)?;
> +    Ok(v)
> +}

^ I'd put both of them into the `impl Client` block (as associated helper
functions rather than methods, so no `self` parameter), but no hard feelings.
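
Roughly what I mean, as a std-only sketch: `Client` below is only a
stand-in for the real struct, and `read_all` is a made-up name. Taking
any `Read` is an extra assumption on my side; it would let one helper
serve both `GzDecoder::new(buf)` and `DeflateDecoder::new(buf)`, but
two separate associated functions work just as well.

```rust
use std::io::{self, Read};

/// Stand-in for the crate's `Client` (the real struct has fields).
struct Client;

impl Client {
    /// Associated function: there is no `self` parameter, so it is
    /// called as `Client::read_all(..)` rather than on an instance.
    /// In the patch, `reader` would be a `GzDecoder` or `DeflateDecoder`.
    fn read_all<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
        let mut out = Vec::new();
        reader.read_to_end(&mut out)?;
        Ok(out)
    }
}

fn main() -> io::Result<()> {
    // `&[u8]` implements `Read`, which is enough to exercise the helper.
    let bytes = Client::read_all(&b"hello world"[..])?;
    assert_eq!(bytes, b"hello world".to_vec());
    Ok(())
}
```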




-- 
- Lukas





Thread overview: 4+ messages
2024-03-26 15:28 [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Maximiliano Sandoval
2024-03-26 15:28 ` [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate Maximiliano Sandoval
2024-03-27  8:59   ` Lukas Wagner [this message]
2024-03-27 11:44 ` [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Max Carrara
