* [pbs-devel] [RFC v1 proxmox-backup] fix: #3847 pipe from STDIN to proxmox-backup-client
@ 2025-11-21 15:39 Robert Obkircher
2025-11-28 10:40 ` Christian Ebner
0 siblings, 1 reply; 2+ messages in thread
From: Robert Obkircher @ 2025-11-21 15:39 UTC (permalink / raw)
To: pbs-devel
It was requested [0] that the backup client should suport reading data
from stdin. This could be used to back up files from an unsupported
remote system via ssh or to store the output of mysqldump/pg_dump
commands.
This is currently not supported, because writing fixed indices needs
to know the size ahead of time.
This patch adds experimental support for commands like:
cat "$FILE" | proxmox-backup-client backup data.img:/dev/stdin
proxmox-backup-client backup a.img:<(cmd1) b.img:<(cmd2)
It uses write_at (pwrite64) instead of mmap, because it seemed easier
to get a prototype this way, but it would of course also be possible
to continue using mmap.
Before I spend more time on this I wanted to ask for some opinions:
Does it actually make sense to support this at all?
If so, should it be a separate endpoint/writer or unified?
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=3847
Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
pbs-datastore/src/datastore.rs | 2 +-
pbs-datastore/src/fixed_index.rs | 109 +++++++++++++-----------------
proxmox-backup-client/src/main.rs | 16 ++---
src/api2/backup/environment.rs | 3 +
src/api2/backup/mod.rs | 7 +-
5 files changed, 63 insertions(+), 74 deletions(-)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 0a517923..33e3ba1a 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -549,7 +549,7 @@ impl DataStore {
pub fn create_fixed_writer<P: AsRef<Path>>(
&self,
filename: P,
- size: usize,
+ size: Option<usize>,
chunk_size: usize,
) -> Result<FixedIndexWriter, Error> {
let index = FixedIndexWriter::create(
diff --git a/pbs-datastore/src/fixed_index.rs b/pbs-datastore/src/fixed_index.rs
index 6c3be2d4..fdd7ea1b 100644
--- a/pbs-datastore/src/fixed_index.rs
+++ b/pbs-datastore/src/fixed_index.rs
@@ -1,6 +1,7 @@
use std::fs::File;
-use std::io::Write;
+use std::io::{self, BufReader, Write};
use std::io::{Seek, SeekFrom};
+use std::os::unix::fs::FileExt;
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};
use std::ptr::NonNull;
@@ -219,20 +220,15 @@ pub struct FixedIndexWriter {
chunk_size: usize,
size: usize,
index_length: usize,
- index: *mut u8,
+ growable_size: bool,
+ write_size_on_close: bool,
pub uuid: [u8; 16],
pub ctime: i64,
}
-// `index` is mmap()ed which cannot be thread-local so should be sendable
-unsafe impl Send for FixedIndexWriter {}
-
impl Drop for FixedIndexWriter {
fn drop(&mut self) {
let _ = std::fs::remove_file(&self.tmp_filename); // ignore errors
- if let Err(err) = self.unmap() {
- log::error!("Unable to unmap file {:?} - {}", self.tmp_filename, err);
- }
}
}
@@ -242,7 +238,7 @@ impl FixedIndexWriter {
pub fn create(
store: Arc<ChunkStore>,
path: &Path,
- size: usize,
+ known_size: Option<usize>,
chunk_size: usize,
) -> Result<Self, Error> {
let full_path = store.relative_path(path);
@@ -264,6 +260,7 @@ impl FixedIndexWriter {
}
let ctime = proxmox_time::epoch_i64();
+ let size = known_size.unwrap_or(0);
let uuid = Uuid::generate();
@@ -281,31 +278,15 @@ impl FixedIndexWriter {
file.write_all(&buffer)?;
let index_length = size.div_ceil(chunk_size);
- let index_size = index_length * 32;
- nix::unistd::ftruncate(&file, (header_size + index_size) as i64)?;
-
- let data = unsafe {
- nix::sys::mman::mmap(
- None,
- std::num::NonZeroUsize::new(index_size)
- .ok_or_else(|| format_err!("invalid index size"))?,
- nix::sys::mman::ProtFlags::PROT_READ | nix::sys::mman::ProtFlags::PROT_WRITE,
- nix::sys::mman::MapFlags::MAP_SHARED,
- &file,
- header_size as i64,
- )
- }?
- .as_ptr()
- .cast::<u8>();
-
Ok(Self {
file,
filename: full_path,
tmp_filename: tmp_path,
chunk_size,
size,
+ growable_size: known_size.is_none(),
+ write_size_on_close: known_size.is_none(),
index_length,
- index: data,
ctime,
uuid: *uuid.as_bytes(),
})
@@ -315,36 +296,32 @@ impl FixedIndexWriter {
self.index_length
}
- fn unmap(&mut self) -> Result<(), Error> {
- let Some(index) = NonNull::new(self.index as *mut std::ffi::c_void) else {
- return Ok(());
- };
-
- let index_size = self.index_length * 32;
-
- if let Err(err) = unsafe { nix::sys::mman::munmap(index, index_size) } {
- bail!("unmap file {:?} failed - {}", self.tmp_filename, err);
- }
-
- self.index = std::ptr::null_mut();
-
- Ok(())
- }
-
pub fn close(&mut self) -> Result<[u8; 32], Error> {
- if self.index.is_null() {
- bail!("cannot close already closed index file.");
+ let header_size = std::mem::size_of::<FixedIndexHeader>();
+ self.file.seek(SeekFrom::Start(header_size as u64))?;
+
+ struct MyHasher(openssl::sha::Sha256);
+ impl std::io::Write for MyHasher {
+ fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
+ self.0.update(buf);
+ Ok(buf.len())
+ }
+ fn flush(&mut self) -> io::Result<()> {
+ Ok(())
+ }
}
-
- let index_size = self.index_length * 32;
- let data = unsafe { std::slice::from_raw_parts(self.index, index_size) };
- let index_csum = openssl::sha::sha256(data);
-
- self.unmap()?;
+ let mut hasher = MyHasher(openssl::sha::Sha256::new());
+ let mut reader = BufReader::new(&mut self.file);
+ io::copy(&mut reader, &mut hasher)?;
+ let index_csum = hasher.0.finish();
let csum_offset = std::mem::offset_of!(FixedIndexHeader, index_csum);
- self.file.seek(SeekFrom::Start(csum_offset as u64))?;
- self.file.write_all(&index_csum)?;
+ self.file.write_all_at(&index_csum, csum_offset as u64)?;
+ if self.write_size_on_close {
+ let size_offset = std::mem::offset_of!(FixedIndexHeader, size);
+ self.file
+ .write_all_at(&(self.size as u64).to_le_bytes(), size_offset as u64)?;
+ }
self.file.flush()?;
if let Err(err) = std::fs::rename(&self.tmp_filename, &self.filename) {
@@ -354,13 +331,25 @@ impl FixedIndexWriter {
Ok(index_csum)
}
- pub fn check_chunk_alignment(&self, offset: usize, chunk_len: usize) -> Result<usize, Error> {
+ pub fn check_chunk_alignment(
+ &mut self,
+ offset: usize,
+ chunk_len: usize,
+ ) -> Result<usize, Error> {
if offset < chunk_len {
bail!("got chunk with small offset ({} < {}", offset, chunk_len);
}
let pos = offset - chunk_len;
+ if self.growable_size && self.size < offset {
+ self.size = offset;
+ self.index_length = pos / self.chunk_size + 1;
+ if chunk_len != self.chunk_size {
+ self.growable_size = false;
+ }
+ }
+
if offset > self.size {
bail!("chunk data exceeds size ({} >= {})", offset, self.size);
}
@@ -393,16 +382,10 @@ impl FixedIndexWriter {
);
}
- if self.index.is_null() {
- bail!("cannot write to closed index file.");
- }
-
let index_pos = index * 32;
- unsafe {
- let dst = self.index.add(index_pos);
- dst.copy_from_nonoverlapping(digest.as_ptr(), 32);
- }
-
+ let header_size = std::mem::size_of::<FixedIndexHeader>();
+ let offset = header_size + index_pos;
+ self.file.write_all_at(digest, offset as u64)?;
Ok(())
}
diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
index 999e5020..2e13950c 100644
--- a/proxmox-backup-client/src/main.rs
+++ b/proxmox-backup-client/src/main.rs
@@ -296,7 +296,7 @@ async fn backup_image<P: AsRef<Path>>(
let stream = FixedChunkStream::new(stream, chunk_size.unwrap_or(4 * 1024 * 1024));
if upload_options.fixed_size.is_none() {
- bail!("cannot backup image with dynamic chunk size!");
+ //bail!("cannot backup image with dynamic chunk size!");
}
let stats = client
@@ -859,15 +859,15 @@ async fn create_backup(
upload_list.push((BackupSpecificationType::PXAR, filename, target, "didx", 0));
}
BackupSpecificationType::IMAGE => {
- if !(file_type.is_file() || file_type.is_block_device()) {
- bail!("got unexpected file type (expected file or block device)");
+ if !(file_type.is_file() || file_type.is_block_device() || file_type.is_fifo()) {
+ bail!("got unexpected file type (expected file, block device or fifo");
}
- let size = image_size(&PathBuf::from(&filename))?;
-
- if size == 0 {
- bail!("got zero-sized file '{}'", filename);
- }
+ let size = if file_type.is_fifo() {
+ 0
+ } else {
+ image_size(&PathBuf::from(&filename))?
+ };
upload_list.push((
BackupSpecificationType::IMAGE,
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index bd9c5211..6f8beda2 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -444,6 +444,9 @@ impl BackupEnvironment {
let end = (offset as usize) + (size as usize);
let idx = data.index.check_chunk_alignment(end, size as usize)?;
+ if end > data.size {
+ data.size = end;
+ }
data.chunk_count += 1;
diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
index 3e6b7a95..d7140612 100644
--- a/src/api2/backup/mod.rs
+++ b/src/api2/backup/mod.rs
@@ -457,7 +457,8 @@ pub const API_METHOD_CREATE_FIXED_INDEX: ApiMethod = ApiMethod::new(
(
"size",
false,
- &IntegerSchema::new("File size.").minimum(1).schema()
+ // temporary hack: 0=dynamic
+ &IntegerSchema::new("File size.").minimum(0).schema()
),
(
"reuse-csum",
@@ -528,7 +529,9 @@ fn create_fixed_index(
reader = Some(index);
}
- let mut writer = env.datastore.create_fixed_writer(&path, size, chunk_size)?;
+ let mut writer =
+ env.datastore
+ .create_fixed_writer(&path, Some(size).filter(|s| *s != 0), chunk_size)?;
if let Some(reader) = reader {
writer.clone_data_from(&reader)?;
--
2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [pbs-devel] [RFC v1 proxmox-backup] fix: #3847 pipe from STDIN to proxmox-backup-client
2025-11-21 15:39 [pbs-devel] [RFC v1 proxmox-backup] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
@ 2025-11-28 10:40 ` Christian Ebner
0 siblings, 0 replies; 2+ messages in thread
From: Christian Ebner @ 2025-11-28 10:40 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Robert Obkircher
Hi, thanks for having a look at this issue, this can be an interesting
feature.
On 11/21/25 4:41 PM, Robert Obkircher wrote:
> It was requested [0] that the backup client should suport reading data
> from stdin. This could be used to back up files from an unsupported
> remote system via ssh or to store the output of mysqldump/pg_dump
> commands.
>
> This is currently not supported, because writing fixed indices needs
> to know the size ahead of time.
>
> This patch adds experimental support for commands like:
> cat "$FILE" | proxmox-backup-client backup data.img:/dev/stdin
> proxmox-backup-client backup a.img:<(cmd1) b.img:<(cmd2)
>
> It uses write_at (pwrite64) instead of mmap, because it seemed easier
> to get a prototype this way, but it would of course also be possible
> to continue using mmap.
>
> Before I spend more time on this I wanted to ask for some opinions:
>
> Does it actually make sense to support this at all?
Yes, the requested dumping of contents from e.g. a database into a
stream which can then be directly be backed up without storing an
temporary file are a valid usecase I think. Adding a support for this
justified.
> If so, should it be a separate endpoint/writer or unified?
Using a different endpoint is not required, in my opinion this might
however be approach a bit differently than what your initial draft does.
I think it would make sense to have the following options for the
proxmox-backup-client, both of which can be implemented independent from
each other:
- Allow the input stream to be a generic raw byte stream by using the
img archive name extension, chunking it using the fixed size chunker and
upload this as fixed index to the proxmox backup server.
- Allow the user to set the mode based on pxar archive name extension
(and check the first bytes of the stream for matching pxar magic number)
to pass in a pre-encoded pxar stream. This would then however also
require to continuosuly check the consistency of the provided pxar
stream, aborting if inconsistencies are detected and is more involved.
For both cases the chunking, upload and encryption can still be
performed using the pre-existing logic.
For the FixedIndexWriter as drafted by your patch, I would however
suggest to try to approach this a bit differently. This could maybe be
done by defining a trait which clearly defines the required method
interfaces first, and then implement this trait for the
FixedIndexWriterSized and maybe a FixedIndexWriterUnsized?
Both implementations could then reuse as much as possible from the
pre-existing logic and allow to not interfer with the already existing
implementation. Also, the mmapped logic for the writer with already
known size must be maintained.
What do you think?
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2025-11-28 10:40 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-21 15:39 [pbs-devel] [RFC v1 proxmox-backup] fix: #3847 pipe from STDIN to proxmox-backup-client Robert Obkircher
2025-11-28 10:40 ` Christian Ebner
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.