* [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip @ 2024-03-26 15:28 Maximiliano Sandoval 2024-03-26 15:28 ` [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate Maximiliano Sandoval 2024-03-27 11:44 ` [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Max Carrara 0 siblings, 2 replies; 4+ messages in thread From: Maximiliano Sandoval @ 2024-03-26 15:28 UTC (permalink / raw) To: pbs-devel Proxmox VE already supports using gzip to encode and decode the body of http requests. We teach the proxmox-http how to do the same. This can be also tested against http://eu.httpbin.org/#/Response_formats/get_gzip via ```rust async fn test_client() { use crate::client_trait::HttpClient; let client = Client::new(); let response = client.get_string("http://eu.httpbin.org/gzip", None).await.unwrap(); assert_eq!(response, ".."); } ``` Suggested-by: Lukas Wagner <l.wagner@proxmox.com> Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com> --- proxmox-http/Cargo.toml | 5 +++ proxmox-http/src/client/simple.rs | 62 +++++++++++++++++++++++++++++-- 2 files changed, 63 insertions(+), 4 deletions(-) diff --git a/proxmox-http/Cargo.toml b/proxmox-http/Cargo.toml index 11c6f0b0..9ec32e7f 100644 --- a/proxmox-http/Cargo.toml +++ b/proxmox-http/Cargo.toml @@ -21,12 +21,16 @@ tokio = { workspace = true, features = [], optional = true } tokio-openssl = { workspace = true, optional = true } ureq = { version = "2.4", features = ["native-certs"], optional = true } url = { workspace = true, optional = true } +flate2 = { workspace = true, optional = true } proxmox-async = { workspace = true, optional = true } proxmox-sys = { workspace = true, optional = true } proxmox-io = { workspace = true, optional = true } proxmox-lang = { workspace = true, optional = true } +[dev-dependencies] +tokio = { workspace = true, features = [ "macros" ] } + [features] default = [] @@ -39,6 +43,7 @@ rate-limited-stream = [ "tokio?/time", ] client = [ + "dep:flate2", "dep:futures", "dep:hyper", "dep:openssl", diff --git a/proxmox-http/src/client/simple.rs b/proxmox-http/src/client/simple.rs index e9910802..b33154be 100644 --- a/proxmox-http/src/client/simple.rs +++ b/proxmox-http/src/client/simple.rs @@ -1,9 +1,11 @@ use anyhow::{bail, format_err, Error}; use std::collections::HashMap; - +use std::io::Read; #[cfg(all(feature = "client-trait", feature = "proxmox-async"))] use std::str::FromStr; +use flate2::read::GzDecoder; + use futures::*; #[cfg(all(feature = "client-trait", feature = "proxmox-async"))] use http::header::HeaderName; @@ -72,6 +74,10 @@ impl Client { HeaderValue::from_str(Self::DEFAULT_USER_AGENT_STRING)? }; + request.headers_mut().insert( + hyper::header::ACCEPT_ENCODING, + HeaderValue::from_static("gzip"), + ); request .headers_mut() .insert(hyper::header::USER_AGENT, user_agent); @@ -142,11 +148,22 @@ impl Client { ) -> Result<Response<String>, Error> { match response { Ok(res) => { - let (parts, body) = res.into_parts(); + let (mut parts, body) = res.into_parts(); + let is_gzip_encoded = parts + .headers + .remove(&hyper::header::CONTENT_ENCODING) + .is_some_and(|h| h == "gzip"); let buf = hyper::body::to_bytes(body).await?; - let new_body = String::from_utf8(buf.to_vec()) - .map_err(|err| format_err!("Error converting HTTP result data: {}", err))?; + let new_body = if is_gzip_encoded { + let mut gz = GzDecoder::new(&buf[..]); + let mut s = String::new(); + gz.read_to_string(&mut s)?; + s + } else { + String::from_utf8(buf.to_vec()) + .map_err(|err| format_err!("Error converting HTTP result data: {}", err))? + }; Ok(Response::from_parts(parts, new_body)) } @@ -245,3 +262,40 @@ impl crate::HttpClient<String, String> for Client { }) } } + +#[cfg(test)] +mod test { + use super::*; + + const BODY: &str = "hello world"; + + #[tokio::test] + async fn test_parse_response_plain_text() { + let body = Body::from(BODY); + let response = Response::new(body); + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); + } + + #[tokio::test] + async fn test_parse_response_gzip() { + let encoded = encode_gzip(BODY.as_bytes()).unwrap(); + let body = Body::from(encoded); + + let response = Response::builder() + .header(hyper::header::CONTENT_ENCODING, "gzip") + .body(body) + .unwrap(); + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); + } + + fn encode_gzip(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> { + use flate2::write::GzEncoder; + use flate2::Compression; + use std::io::Write; + + let mut e = GzEncoder::new(Vec::new(), Compression::default()); + e.write_all(bytes).unwrap(); + + e.finish() + } +} -- 2.39.2 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate 2024-03-26 15:28 [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Maximiliano Sandoval @ 2024-03-26 15:28 ` Maximiliano Sandoval 2024-03-27 8:59 ` Lukas Wagner 2024-03-27 11:44 ` [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Max Carrara 1 sibling, 1 reply; 4+ messages in thread From: Maximiliano Sandoval @ 2024-03-26 15:28 UTC (permalink / raw) To: pbs-devel The Backup Server can speak deflate so we implement that. Note that the spec [1] allows the server to encode the content multiple times with different algorithms. [1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding Suggested-by: Lukas Wagner <l.wagner@proxmox.com> Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com> --- proxmox-http/src/client/simple.rs | 98 ++++++++++++++++++++++++------- 1 file changed, 78 insertions(+), 20 deletions(-) diff --git a/proxmox-http/src/client/simple.rs b/proxmox-http/src/client/simple.rs index b33154be..c3afa8d0 100644 --- a/proxmox-http/src/client/simple.rs +++ b/proxmox-http/src/client/simple.rs @@ -4,7 +4,7 @@ use std::io::Read; #[cfg(all(feature = "client-trait", feature = "proxmox-async"))] use std::str::FromStr; -use flate2::read::GzDecoder; +use flate2::read::{DeflateDecoder, GzDecoder}; use futures::*; #[cfg(all(feature = "client-trait", feature = "proxmox-async"))] @@ -76,7 +76,7 @@ impl Client { request.headers_mut().insert( hyper::header::ACCEPT_ENCODING, - HeaderValue::from_static("gzip"), + HeaderValue::from_static("gzip, deflate"), ); request .headers_mut() @@ -149,22 +149,24 @@ impl Client { match response { Ok(res) => { let (mut parts, body) = res.into_parts(); - let is_gzip_encoded = parts - .headers - .remove(&hyper::header::CONTENT_ENCODING) - .is_some_and(|h| h == "gzip"); - - let buf = hyper::body::to_bytes(body).await?; - let new_body = if is_gzip_encoded { - let mut gz = GzDecoder::new(&buf[..]); - let mut s = String::new(); - gz.read_to_string(&mut s)?; - s - } else { - String::from_utf8(buf.to_vec()) - .map_err(|err| format_err!("Error converting HTTP result data: {}", err))? + let mut buf = hyper::body::to_bytes(body).await?.to_vec(); + let content_encoding = parts.headers.remove(&hyper::header::CONTENT_ENCODING); + + if let Some(content_encoding) = content_encoding { + let encodings = content_encoding.to_str()?; + for encoding in encodings.rsplit([',', ' ']) { + buf = match encoding { + "" => buf, // "a, b" splits into ["a", "", "b"]. + "gzip" => decode_gzip(&buf[..])?, + "deflate" => decode_deflate(&buf[..])?, + other => anyhow::bail!("Unknown format: {other}"), + } + } }; + let new_body = String::from_utf8(buf) + .map_err(|err| format_err!("Error converting HTTP result data: {}", err))?; + Ok(Response::from_parts(parts, new_body)) } Err(err) => Err(err), @@ -267,6 +269,10 @@ impl crate::HttpClient<String, String> for Client { mod test { use super::*; + use flate2::write::{DeflateEncoder, GzEncoder}; + use flate2::Compression; + use std::io::Write; + const BODY: &str = "hello world"; #[tokio::test] @@ -288,14 +294,66 @@ mod test { assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); } - fn encode_gzip(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> { - use flate2::write::GzEncoder; - use flate2::Compression; - use std::io::Write; + #[tokio::test] + async fn test_parse_response_deflate() { + let encoded = encode_deflate(BODY.as_bytes()).unwrap(); + let body = Body::from(encoded); + + let response = Response::builder() + .header(hyper::header::CONTENT_ENCODING, "deflate") + .body(body) + .unwrap(); + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); + } + + #[tokio::test] + async fn test_parse_response_deflate_gzip() { + let deflate_encoded = encode_deflate(BODY.as_bytes()).unwrap(); + let gzip_encoded = encode_gzip(&deflate_encoded).unwrap(); + let body = Body::from(gzip_encoded); + + let response = Response::builder() + .header(hyper::header::CONTENT_ENCODING, "deflate, gzip") + .body(body) + .unwrap(); + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); + let gzip_encoded = encode_gzip(BODY.as_bytes()).unwrap(); + let deflate_encoded = encode_deflate(&gzip_encoded).unwrap(); + let body = Body::from(deflate_encoded); + + let response = Response::builder() + .header(hyper::header::CONTENT_ENCODING, "gzip, deflate") + .body(body) + .unwrap(); + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); + } + + fn encode_deflate(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> { + let mut e = DeflateEncoder::new(Vec::new(), Compression::default()); + e.write_all(bytes).unwrap(); + + e.finish() + } + + fn encode_gzip(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> { let mut e = GzEncoder::new(Vec::new(), Compression::default()); e.write_all(bytes).unwrap(); e.finish() } } + +fn decode_gzip(buf: &[u8]) -> Result<Vec<u8>, std::io::Error> { + let mut dec = GzDecoder::new(buf); + let mut v = Vec::new(); + dec.read_to_end(&mut v)?; + Ok(v) +} + +fn decode_deflate(buf: &[u8]) -> Result<Vec<u8>, std::io::Error> { + let mut dec = DeflateDecoder::new(buf); + let mut v = Vec::new(); + dec.read_to_end(&mut v)?; + Ok(v) +} -- 2.39.2 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate 2024-03-26 15:28 ` [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate Maximiliano Sandoval @ 2024-03-27 8:59 ` Lukas Wagner 0 siblings, 0 replies; 4+ messages in thread From: Lukas Wagner @ 2024-03-27 8:59 UTC (permalink / raw) To: Maximiliano Sandoval, pbs-devel Hello, thanks for tackling this! Most of my comments also apply to the first commit. Regarding the commit message, I think it would be good to mention the `Accept-Encoding` and `Content-Encoding` headers (e.g that you set `Accept-Encoding` on the request on decode the response body based on `Content-Encoding`). These are both quite well-known and it makes it clearer what these commits are about. Thanks for including some tests, that's always good. Of course it's hard to unit-test this in a more 'realistic' scenario. :) On 2024-03-26 16:28, Maximiliano Sandoval wrote: > The Backup Server can speak deflate so we implement that. > > Note that the spec [1] allows the server to encode the content multiple > times with different algorithms. > > [1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding > > Suggested-by: Lukas Wagner <l.wagner@proxmox.com> > Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com> > --- > proxmox-http/src/client/simple.rs | 98 ++++++++++++++++++++++++------- > 1 file changed, 78 insertions(+), 20 deletions(-) > > diff --git a/proxmox-http/src/client/simple.rs b/proxmox-http/src/client/simple.rs > index b33154be..c3afa8d0 100644 > --- a/proxmox-http/src/client/simple.rs > +++ b/proxmox-http/src/client/simple.rs > @@ -4,7 +4,7 @@ use std::io::Read; > #[cfg(all(feature = "client-trait", feature = "proxmox-async"))] > use std::str::FromStr; > > -use flate2::read::GzDecoder; > +use flate2::read::{DeflateDecoder, GzDecoder}; > > use futures::*; > #[cfg(all(feature = "client-trait", feature = "proxmox-async"))] > @@ -76,7 +76,7 @@ impl Client { > > request.headers_mut().insert( > hyper::header::ACCEPT_ENCODING, > - HeaderValue::from_static("gzip"), > + HeaderValue::from_static("gzip, deflate"), > ); > request > .headers_mut() > @@ -149,22 +149,24 @@ impl Client { > match response { > Ok(res) => { > let (mut parts, body) = res.into_parts(); > - let is_gzip_encoded = parts > - .headers > - .remove(&hyper::header::CONTENT_ENCODING) > - .is_some_and(|h| h == "gzip"); > - > - let buf = hyper::body::to_bytes(body).await?; > - let new_body = if is_gzip_encoded { > - let mut gz = GzDecoder::new(&buf[..]); > - let mut s = String::new(); > - gz.read_to_string(&mut s)?; > - s > - } else { > - String::from_utf8(buf.to_vec()) > - .map_err(|err| format_err!("Error converting HTTP result data: {}", err))? > + let mut buf = hyper::body::to_bytes(body).await?.to_vec(); > + let content_encoding = parts.headers.remove(&hyper::header::CONTENT_ENCODING); > + > + if let Some(content_encoding) = content_encoding { > + let encodings = content_encoding.to_str()?; > + for encoding in encodings.rsplit([',', ' ']) { > + buf = match encoding { > + "" => buf, // "a, b" splits into ["a", "", "b"]. > + "gzip" => decode_gzip(&buf[..])?, > + "deflate" => decode_deflate(&buf[..])?, > + other => anyhow::bail!("Unknown format: {other}"), `anyhow::bail!` is already in scope, so you can just use `bail!` > + } > + } I would suggest moving the decompression to the `request` method (maybe as a separate helper function though), transforming the `Response<Body` into another `Response<Body>`, with the body decompressed. Right now, the decompression only happens in `convert_body_to_string`, which means that this breaks users - which use the public `Client::request` directly (e.g. the `proxmox-client` crate) - which use the `HttpClient<Body,Body>` trait impl of `Client` If the decompression happens directly in `request`, the users for the crate should not notice any difference, at least from my understanding :) > }; > > + let new_body = String::from_utf8(buf) > + .map_err(|err| format_err!("Error converting HTTP result data: {}", err))?; > + > Ok(Response::from_parts(parts, new_body)) > } > Err(err) => Err(err), > @@ -267,6 +269,10 @@ impl crate::HttpClient<String, String> for Client { > mod test { > use super::*; > > + use flate2::write::{DeflateEncoder, GzEncoder}; > + use flate2::Compression; > + use std::io::Write; > + > const BODY: &str = "hello world"; > > #[tokio::test] > @@ -288,14 +294,66 @@ mod test { > assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); > } > > - fn encode_gzip(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> { > - use flate2::write::GzEncoder; > - use flate2::Compression; > - use std::io::Write; > + #[tokio::test] > + async fn test_parse_response_deflate() { > + let encoded = encode_deflate(BODY.as_bytes()).unwrap(); > + let body = Body::from(encoded); > + > + let response = Response::builder() > + .header(hyper::header::CONTENT_ENCODING, "deflate") > + .body(body) > + .unwrap(); > + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); > + } > + > + #[tokio::test] > + async fn test_parse_response_deflate_gzip() { > + let deflate_encoded = encode_deflate(BODY.as_bytes()).unwrap(); > + let gzip_encoded = encode_gzip(&deflate_encoded).unwrap(); > + let body = Body::from(gzip_encoded); > + > + let response = Response::builder() > + .header(hyper::header::CONTENT_ENCODING, "deflate, gzip") > + .body(body) > + .unwrap(); > + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); > > + let gzip_encoded = encode_gzip(BODY.as_bytes()).unwrap(); > + let deflate_encoded = encode_deflate(&gzip_encoded).unwrap(); > + let body = Body::from(deflate_encoded); > + > + let response = Response::builder() > + .header(hyper::header::CONTENT_ENCODING, "gzip, deflate") > + .body(body) > + .unwrap(); > + assert_eq!(Client::response_body_string(response).await.unwrap(), BODY); > + } > + > + fn encode_deflate(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> { > + let mut e = DeflateEncoder::new(Vec::new(), Compression::default()); > + e.write_all(bytes).unwrap(); > + > + e.finish() > + } > + > + fn encode_gzip(bytes: &[u8]) -> Result<Vec<u8>, std::io::Error> { > let mut e = GzEncoder::new(Vec::new(), Compression::default()); > e.write_all(bytes).unwrap(); > > e.finish() > } > } > + > +fn decode_gzip(buf: &[u8]) -> Result<Vec<u8>, std::io::Error> { > + let mut dec = GzDecoder::new(buf); > + let mut v = Vec::new(); > + dec.read_to_end(&mut v)?; > + Ok(v) > +} > + > +fn decode_deflate(buf: &[u8]) -> Result<Vec<u8>, std::io::Error> { > + let mut dec = DeflateDecoder::new(buf); > + let mut v = Vec::new(); > + dec.read_to_end(&mut v)?; > + Ok(v) > +} ^ I'd put both of them into the `impl Client` block (as associated static helper functions, not methods (so no self parameter) - but no hard feelings -- - Lukas ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip 2024-03-26 15:28 [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Maximiliano Sandoval 2024-03-26 15:28 ` [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate Maximiliano Sandoval @ 2024-03-27 11:44 ` Max Carrara 1 sibling, 0 replies; 4+ messages in thread From: Max Carrara @ 2024-03-27 11:44 UTC (permalink / raw) To: Proxmox Backup Server development discussion On Tue Mar 26, 2024 at 4:28 PM CET, Maximiliano Sandoval wrote: > Proxmox VE already supports using gzip to encode and decode the body of > http requests. We teach the proxmox-http how to do the same. I like the general idea of this, but there are a few things I would like you to reconsider: 1. Your patches only affect the async client (`simple.rs`), but not the synchronous version (`sync.rs`) 2. The compression library you're using is fundamentally blocking (as in, doesn't provide an `async` interface) * While that probably doesn't pose much of a problem if the response's content is rather small, decompressing larger response bodies might cause the underlying executor to block unexpectedly 3. Your patches assume that the client *always* wants to accept compressed content, which is something you don't necessarily want. In particular, there was a compression side-channel attack on HTTPS that exploited compression being always on [0] that you should check out. I've also another example that might explain it a litte better [1]. This isn't to say that I'm fundamentally againt implementing better support for HTTP compression - in fact, I'm glad someone's tackling this! But I do want to encourage you to consider where you want compression and where you don't. Also, even if the difference seems negligible, you should refrain from calling blocking code when doing IO like that. So, IMO, some good opportunities arise here: 1. You could actually check where / when our server compresses data - it shouldn't e.g. compress sensitive information over HTTPS (if you read the two pages I referred to above). 2. You should check out `proxmox-compression` and see if there are any async wrappers for the compression algorithms you added. Should there be any, you can just use them obviously - if there are none, you should make some yourself (IMO). 3. This is a somewhat minor point, but since not everything can or should be compressed, maybe you can make support for compression composable somehow? As in, let the caller of the API decide when compression is sensible and when it isn't. So, I'm curious to see what you'll come up with! :) Apart from the above, there's not really else for me to mention - except that I'm always happy to see more testing being done. Very dank. FWIW, if you really want to test a couple complete request-response round trips with multiple different scenarios regarding compression, you can always spin up an executor with its own HTTP server in a separate thread, even in tests. Should your tests require any additional dependencies that are not used for building the crate, you can of course always tell cargo [2]. [0]: https://www.breachattack.com/ [1]: https://security.stackexchange.com/a/20407 [2]: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#development-dependencies ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-03-27 11:44 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-03-26 15:28 [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Maximiliano Sandoval 2024-03-26 15:28 ` [pbs-devel] [PATCH proxmox 2/2] http: Teach client how to speak deflate Maximiliano Sandoval 2024-03-27 8:59 ` Lukas Wagner 2024-03-27 11:44 ` [pbs-devel] [PATCH proxmox 1/2] http: teach the Client how to speak gzip Max Carrara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox