From: Stefan Reiter <s.reiter@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH v4 proxmox-backup 15/20] file-restore: add basic VM/block device support
Date: Thu,  1 Apr 2021 17:43:52 +0200
Message-ID: <20210401154352.4143-1-s.reiter@proxmox.com>
In-Reply-To: <20210331102202.14767-16-s.reiter@proxmox.com>

Includes methods to start, stop and list QEMU file-restore VMs, as well
as CLI commands for the latter two (start is implicit).

The implementation is abstracted behind the concept of a
"BlockRestoreDriver", so other methods can be implemented later (e.g.
mapping directly to loop devices on the host, using other hypervisors
than QEMU, etc.).
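
As a rough illustration of that abstraction, here is a simplified, synchronous
sketch using only std. It is not the code from this patch: RestoreDriver,
MockDriver and stop_by_name are hypothetical names, and the real
BlockRestoreDriver trait returns boxed futures. It only shows how a 'stop'
request can be dispatched to whichever driver lists the given name.

```rust
use std::cell::RefCell;

/// Simplified, synchronous stand-in for the patch's BlockRestoreDriver trait.
trait RestoreDriver {
    /// Names of all mappings this driver currently knows about.
    fn list(&self) -> Vec<String>;
    /// Stop/unmap the mapping with the given name.
    fn stop(&self, id: &str) -> Result<(), String>;
}

/// Hypothetical in-memory driver, used purely for illustration.
struct MockDriver {
    mappings: RefCell<Vec<String>>,
}

impl MockDriver {
    /// Names are prefixed with the driver type so they cannot collide
    /// between different drivers.
    fn new(prefix: &str, names: &[&str]) -> Self {
        let mappings = names
            .iter()
            .map(|n| format!("{}_{}", prefix, n))
            .collect();
        MockDriver {
            mappings: RefCell::new(mappings),
        }
    }
}

impl RestoreDriver for MockDriver {
    fn list(&self) -> Vec<String> {
        self.mappings.borrow().clone()
    }

    fn stop(&self, id: &str) -> Result<(), String> {
        let mut mappings = self.mappings.borrow_mut();
        match mappings.iter().position(|n| n == id) {
            Some(idx) => {
                mappings.remove(idx);
                Ok(())
            }
            None => Err(format!("mapping '{}' not found", id)),
        }
    }
}

/// Ask every driver whether it knows the name and stop it on the first match.
fn stop_by_name(drivers: &[Box<dyn RestoreDriver>], name: &str) -> Result<(), String> {
    for drv in drivers {
        if drv.list().iter().any(|n| n == name) {
            return drv.stop(name);
        }
    }
    Err(format!("no mapping with name '{}' found", name))
}
```

Because every returned name carries its driver prefix, the caller never has to
say which driver a mapping belongs to.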

Starting VMs is currently unused but will be needed for further changes.

The design for the QEMU driver uses a locked 'map' file
(/run/proxmox-backup/$UID/restore-vm-map.json) containing a JSON
encoding of currently running VMs. VMs are addressed by a 'name', which
is a systemd-unit encoded combination of repository and snapshot string,
thus uniquely identifying it.
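
For illustration, the per-VM state and the way a fresh vsock CID is derived
from the map (following the logic in ensure_running() in this patch) could be
sketched as below. This is a std-only sketch: VmState is a simplified stand-in
for the patch's VMState, and next_cid is a hypothetical helper name, not a
function in the patch.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the per-VM entry stored in the map file.
#[derive(Clone, Debug, PartialEq)]
struct VmState {
    pid: i32,
    cid: i32,
    ticket: String,
}

/// Pick a CID for a new VM: one above the highest CID in the map, offset by
/// the calling user's id, and never below 10 since low CIDs are reserved.
fn next_cid(map: &HashMap<String, VmState>, uid: u32) -> i32 {
    let base = map
        .values()
        .map(|vm| vm.cid)
        .max()
        .unwrap_or(0)
        .wrapping_add(1);

    base.wrapping_add(uid as i32).max(10)
}
```

The value computed this way is only a first guess. When QEMU reports the CID as
already in use, start_vm() in this patch retries with the next higher one.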

Note that currently you need to run proxmox-file-restore as root to use
this method of restoring.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---

!!! NOTE:
This replaces BOTH patches 14/20 and 15/20 of v3!


v4:
* change state directory to /run/proxmox-backup/$UID since XDG_RUNTIME_DIR is
  not available for root if not logged in on a tty
* create statefile with correct 0600 permissions
* remove setuid binary, start VM directly again - reuses most of the code, but
  without the separate binary

The last change was made after realizing that we cannot call the binary as
www-data unprivileged anyway, since the PBS password file is in /etc/pve/priv...
So instead let's run this as root again, avoid the setuid binary, and instead
focus on making sure the pveproxy<->pvedaemon communication is optimized away
differently - e.g. passing file descriptors or similar.
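
The 0600 state file creation mentioned in the v4 notes can be sketched with std
alone. This is a minimal sketch assuming a Unix target; open_state_file is a
hypothetical name, and the patch itself does this in VMStateMap::open_file_raw().

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

/// Open the state file for reading and writing, creating it if it is missing.
/// The mode is only applied when the file is newly created; an existing file
/// keeps whatever permissions it already has.
fn open_state_file(path: &Path) -> io::Result<File> {
    OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .mode(0o600) // owner read/write only, the map contains VM tickets
        .open(path)
}
```

Setting the mode at creation time avoids a window in which the file exists with
wider permissions, which a separate chmod after creation would leave open.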


 src/bin/proxmox-file-restore.rs               |  12 +-
 src/bin/proxmox_client_tools/mod.rs           |  13 +
 src/bin/proxmox_file_restore/block_driver.rs  | 163 +++++++++++
 .../proxmox_file_restore/block_driver_qemu.rs | 277 ++++++++++++++++++
 src/bin/proxmox_file_restore/mod.rs           |   6 +
 src/bin/proxmox_file_restore/qemu_helper.rs   | 274 +++++++++++++++++
 src/buildcfg.rs                               |  17 ++
 7 files changed, 761 insertions(+), 1 deletion(-)
 create mode 100644 src/bin/proxmox_file_restore/block_driver.rs
 create mode 100644 src/bin/proxmox_file_restore/block_driver_qemu.rs
 create mode 100644 src/bin/proxmox_file_restore/mod.rs
 create mode 100644 src/bin/proxmox_file_restore/qemu_helper.rs

diff --git a/src/bin/proxmox-file-restore.rs b/src/bin/proxmox-file-restore.rs
index f8affc03..de2cb971 100644
--- a/src/bin/proxmox-file-restore.rs
+++ b/src/bin/proxmox-file-restore.rs
@@ -35,6 +35,9 @@ use proxmox_client_tools::{
     REPO_URL_SCHEMA,
 };
 
+mod proxmox_file_restore;
+use proxmox_file_restore::*;
+
 enum ExtractPath {
     ListArchives,
     Pxar(String, Vec<u8>),
@@ -369,9 +372,16 @@ fn main() {
         .completion_cb("snapshot", complete_group_or_snapshot)
         .completion_cb("target", tools::complete_file_name);
 
+    let status_cmd_def = CliCommand::new(&API_METHOD_STATUS);
+    let stop_cmd_def = CliCommand::new(&API_METHOD_STOP)
+        .arg_param(&["name"])
+        .completion_cb("name", complete_block_driver_ids);
+
     let cmd_def = CliCommandMap::new()
         .insert("list", list_cmd_def)
-        .insert("extract", restore_cmd_def);
+        .insert("extract", restore_cmd_def)
+        .insert("status", status_cmd_def)
+        .insert("stop", stop_cmd_def);
 
     let rpcenv = CliEnvironment::new();
     run_cli_command(
diff --git a/src/bin/proxmox_client_tools/mod.rs b/src/bin/proxmox_client_tools/mod.rs
index 73744ba2..1cdcf0df 100644
--- a/src/bin/proxmox_client_tools/mod.rs
+++ b/src/bin/proxmox_client_tools/mod.rs
@@ -13,6 +13,7 @@ use proxmox::{
 use proxmox_backup::api2::access::user::UserWithTokens;
 use proxmox_backup::api2::types::*;
 use proxmox_backup::backup::BackupDir;
+use proxmox_backup::buildcfg;
 use proxmox_backup::client::*;
 use proxmox_backup::tools;
 
@@ -372,3 +373,15 @@ pub fn place_xdg_file(
         .and_then(|base| base.place_config_file(file_name).map_err(Error::from))
         .with_context(|| format!("failed to place {} in xdg home", description))
 }
+
+/// Returns a runtime dir owned by the current user.
+/// Note that XDG_RUNTIME_DIR is not always available, especially for non-login users like
+/// "www-data", so we use a custom one in /run/proxmox-backup/<uid> instead.
+pub fn get_user_run_dir() -> Result<std::path::PathBuf, Error> {
+    let uid = nix::unistd::Uid::current();
+    let mut path: std::path::PathBuf = buildcfg::PROXMOX_BACKUP_RUN_DIR.into();
+    path.push(uid.to_string());
+    tools::create_run_dir()?;
+    std::fs::create_dir_all(&path)?;
+    Ok(path)
+}
diff --git a/src/bin/proxmox_file_restore/block_driver.rs b/src/bin/proxmox_file_restore/block_driver.rs
new file mode 100644
index 00000000..9c6fc5ac
--- /dev/null
+++ b/src/bin/proxmox_file_restore/block_driver.rs
@@ -0,0 +1,163 @@
+//! Abstraction layer over different methods of accessing a block backup
+use anyhow::{bail, Error};
+use serde::{Deserialize, Serialize};
+use serde_json::{json, Value};
+
+use std::collections::HashMap;
+use std::future::Future;
+use std::hash::BuildHasher;
+use std::pin::Pin;
+
+use proxmox_backup::backup::{BackupDir, BackupManifest};
+use proxmox_backup::client::BackupRepository;
+
+use proxmox::api::{api, cli::*};
+
+use super::block_driver_qemu::QemuBlockDriver;
+
+/// Contains details about a snapshot that is to be accessed by block file restore
+pub struct SnapRestoreDetails {
+    pub repo: BackupRepository,
+    pub snapshot: BackupDir,
+    pub manifest: BackupManifest,
+}
+
+/// Return value of a BlockRestoreDriver.status() call, 'id' must be valid for .stop(id)
+pub struct DriverStatus {
+    pub id: String,
+    pub data: Value,
+}
+
+pub type Async<R> = Pin<Box<dyn Future<Output = R> + Send>>;
+
+/// An abstract implementation for retrieving data out of a block file backup
+pub trait BlockRestoreDriver {
+    /// Return status of all running/mapped images, result value is (id, extra data), where id must
+    /// match with the ones returned from list()
+    fn status(&self) -> Async<Result<Vec<DriverStatus>, Error>>;
+    /// Stop/Close a running restore method
+    fn stop(&self, id: String) -> Async<Result<(), Error>>;
+    /// Returned ids must be prefixed with driver type so that they cannot collide between drivers,
+    /// the returned values must be passable to stop()
+    fn list(&self) -> Vec<String>;
+}
+
+#[api()]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Clone, Copy)]
+pub enum BlockDriverType {
+    /// Uses a small QEMU/KVM virtual machine to map images securely. Requires PVE-patched QEMU.
+    Qemu,
+}
+
+impl BlockDriverType {
+    fn resolve(&self) -> impl BlockRestoreDriver {
+        match self {
+            BlockDriverType::Qemu => QemuBlockDriver {},
+        }
+    }
+}
+
+const DEFAULT_DRIVER: BlockDriverType = BlockDriverType::Qemu;
+const ALL_DRIVERS: &[BlockDriverType] = &[BlockDriverType::Qemu];
+
+#[api(
+   input: {
+       properties: {
+            "driver": {
+                type: BlockDriverType,
+                optional: true,
+            },
+            "output-format": {
+                schema: OUTPUT_FORMAT,
+                optional: true,
+            },
+        },
+   },
+)]
+/// Retrieve status information about currently running/mapped restore images
+pub async fn status(driver: Option<BlockDriverType>, param: Value) -> Result<(), Error> {
+    let output_format = get_output_format(&param);
+    let text = output_format == "text";
+
+    let mut ret = json!({});
+
+    for dt in ALL_DRIVERS {
+        if driver.is_some() && &driver.unwrap() != dt {
+            continue;
+        }
+
+        let drv_name = format!("{:?}", dt);
+        let drv = dt.resolve();
+        match drv.status().await {
+            Ok(data) if data.is_empty() => {
+                if text {
+                    println!("{}: no mappings", drv_name);
+                } else {
+                    ret[drv_name] = json!({});
+                }
+            }
+            Ok(data) => {
+                if text {
+                    println!("{}:", &drv_name);
+                }
+
+                ret[&drv_name]["ids"] = json!({});
+                for status in data {
+                    if text {
+                        println!("{} \t({})", status.id, status.data);
+                    } else {
+                        ret[&drv_name]["ids"][status.id] = status.data;
+                    }
+                }
+            }
+            Err(err) => {
+                if text {
+                    eprintln!("error getting status from driver '{}' - {}", drv_name, err);
+                } else {
+                    ret[drv_name] = json!({ "error": format!("{}", err) });
+                }
+            }
+        }
+    }
+
+    if !text {
+        format_and_print_result(&ret, &output_format);
+    }
+
+    Ok(())
+}
+
+#[api(
+   input: {
+       properties: {
+            "name": {
+                type: String,
+                description: "The name of the VM to stop.",
+            },
+        },
+   },
+)]
+/// Immediately stop/unmap a given image. Not typically necessary, as VMs will stop themselves
+/// after a timer anyway.
+pub async fn stop(name: String) -> Result<(), Error> {
+    for drv in ALL_DRIVERS.iter().map(BlockDriverType::resolve) {
+        if drv.list().contains(&name) {
+            return drv.stop(name).await;
+        }
+    }
+
+    bail!("no mapping with name '{}' found", name);
+}
+
+/// Autocompletion handler for block mappings
+pub fn complete_block_driver_ids<S: BuildHasher>(
+    _arg: &str,
+    _param: &HashMap<String, String, S>,
+) -> Vec<String> {
+    ALL_DRIVERS
+        .iter()
+        .map(BlockDriverType::resolve)
+        .map(|d| d.list())
+        .flatten()
+        .collect()
+}
diff --git a/src/bin/proxmox_file_restore/block_driver_qemu.rs b/src/bin/proxmox_file_restore/block_driver_qemu.rs
new file mode 100644
index 00000000..f66d7738
--- /dev/null
+++ b/src/bin/proxmox_file_restore/block_driver_qemu.rs
@@ -0,0 +1,277 @@
+//! Block file access via a small QEMU restore VM using the PBS block driver in QEMU
+use anyhow::{bail, Error};
+use futures::FutureExt;
+use serde::{Deserialize, Serialize};
+use serde_json::json;
+
+use std::collections::HashMap;
+use std::fs::{File, OpenOptions};
+use std::io::{prelude::*, SeekFrom};
+
+use proxmox::tools::fs::lock_file;
+use proxmox_backup::backup::BackupDir;
+use proxmox_backup::client::*;
+use proxmox_backup::tools;
+
+use super::block_driver::*;
+use crate::proxmox_client_tools::get_user_run_dir;
+
+const RESTORE_VM_MAP: &str = "restore-vm-map.json";
+
+pub struct QemuBlockDriver {}
+
+#[derive(Clone, Hash, Serialize, Deserialize)]
+struct VMState {
+    pid: i32,
+    cid: i32,
+    ticket: String,
+}
+
+struct VMStateMap {
+    map: HashMap<String, VMState>,
+    file: File,
+}
+
+impl VMStateMap {
+    fn open_file_raw(write: bool) -> Result<File, Error> {
+        use std::os::unix::fs::OpenOptionsExt;
+        let mut path = get_user_run_dir()?;
+        path.push(RESTORE_VM_MAP);
+        OpenOptions::new()
+            .read(true)
+            .write(write)
+            .create(write)
+            .mode(0o600)
+            .open(path)
+            .map_err(Error::from)
+    }
+
+    /// Acquire a lock on the state map and retrieve a deserialized version
+    fn load() -> Result<Self, Error> {
+        let mut file = Self::open_file_raw(true)?;
+        lock_file(&mut file, true, Some(std::time::Duration::from_secs(5)))?;
+        let map = serde_json::from_reader(&file).unwrap_or_default();
+        Ok(Self { map, file })
+    }
+
+    /// Load a read-only copy of the current VM map. Only use for informational purposes, like
+    /// shell auto-completion, for anything requiring consistency use load() !
+    fn load_read_only() -> Result<HashMap<String, VMState>, Error> {
+        let file = Self::open_file_raw(false)?;
+        Ok(serde_json::from_reader(&file).unwrap_or_default())
+    }
+
+    /// Write back a potentially modified state map, consuming the held lock
+    fn write(mut self) -> Result<(), Error> {
+        self.file.seek(SeekFrom::Start(0))?;
+        self.file.set_len(0)?;
+        serde_json::to_writer(self.file, &self.map)?;
+
+        // drop ourselves including file lock
+        Ok(())
+    }
+
+    /// Return the map, but drop the lock immediately
+    fn read_only(self) -> HashMap<String, VMState> {
+        self.map
+    }
+}
+
+fn make_name(repo: &BackupRepository, snap: &BackupDir) -> String {
+    let full = format!("qemu_{}/{}", repo, snap);
+    tools::systemd::escape_unit(&full, false)
+}
+
+/// remove non-responsive VMs from given map, returns 'true' if map was modified
+async fn cleanup_map(map: &mut HashMap<String, VMState>) -> bool {
+    let mut to_remove = Vec::new();
+    for (name, state) in map.iter() {
+        let client = VsockClient::new(state.cid, DEFAULT_VSOCK_PORT, Some(state.ticket.clone()));
+        let res = client
+            .get("api2/json/status", Some(json!({"keep-timeout": true})))
+            .await;
+        if res.is_err() {
+            // VM is not reachable, remove from map and inform user
+            to_remove.push(name.clone());
+            println!(
+                "VM '{}' (pid: {}, cid: {}) was not reachable, removing from map",
+                name, state.pid, state.cid
+            );
+        }
+    }
+
+    for tr in &to_remove {
+        map.remove(tr);
+    }
+
+    !to_remove.is_empty()
+}
+
+fn new_ticket() -> String {
+    proxmox::tools::Uuid::generate().to_string()
+}
+
+async fn ensure_running(details: &SnapRestoreDetails) -> Result<VsockClient, Error> {
+    let name = make_name(&details.repo, &details.snapshot);
+    let mut state = VMStateMap::load()?;
+
+    cleanup_map(&mut state.map).await;
+
+    let new_cid;
+    let vms = match state.map.get(&name) {
+        Some(vm) => {
+            let client = VsockClient::new(vm.cid, DEFAULT_VSOCK_PORT, Some(vm.ticket.clone()));
+            let res = client.get("api2/json/status", None).await;
+            match res {
+                Ok(_) => {
+                    // VM is running and we just reset its timeout, nothing to do
+                    return Ok(client);
+                }
+                Err(err) => {
+                    println!("stale VM detected, restarting ({})", err);
+                    // VM is dead, restart
+                    let vms = start_vm(vm.cid, details).await?;
+                    new_cid = vms.cid;
+                    state.map.insert(name, vms.clone());
+                    vms
+                }
+            }
+        }
+        None => {
+            let mut cid = state
+                .map
+                .iter()
+                .map(|v| v.1.cid)
+                .max()
+                .unwrap_or(0)
+                .wrapping_add(1);
+
+            // offset cid by user id, to avoid unnecessary retries
+            let running_uid = nix::unistd::Uid::current();
+            cid = cid.wrapping_add(running_uid.as_raw() as i32);
+
+            // some low CIDs have special meaning, start at 10 to avoid them
+            cid = cid.max(10);
+
+            let vms = start_vm(cid, details).await?;
+            new_cid = vms.cid;
+            state.map.insert(name, vms.clone());
+            vms
+        }
+    };
+
+    state.write()?;
+    Ok(VsockClient::new(
+        new_cid,
+        DEFAULT_VSOCK_PORT,
+        Some(vms.ticket.clone()),
+    ))
+}
+
+async fn start_vm(cid_request: i32, details: &SnapRestoreDetails) -> Result<VMState, Error> {
+    let ticket = new_ticket();
+    let files = details
+        .manifest
+        .files()
+        .iter()
+        .map(|file| file.filename.clone())
+        .filter(|name| name.ends_with(".img.fidx"));
+    let (pid, cid) =
+        super::qemu_helper::start_vm((cid_request.abs() & 0xFFFF) as u16, details, files, &ticket)
+            .await?;
+    Ok(VMState { pid, cid, ticket })
+}
+
+impl BlockRestoreDriver for QemuBlockDriver {
+    fn status(&self) -> Async<Result<Vec<DriverStatus>, Error>> {
+        async move {
+            let mut state_map = VMStateMap::load()?;
+            let modified = cleanup_map(&mut state_map.map).await;
+            let map = if modified {
+                let m = state_map.map.clone();
+                state_map.write()?;
+                m
+            } else {
+                state_map.read_only()
+            };
+            let mut result = Vec::new();
+
+            for (n, s) in map.iter() {
+                let client = VsockClient::new(s.cid, DEFAULT_VSOCK_PORT, Some(s.ticket.clone()));
+                let resp = client
+                    .get("api2/json/status", Some(json!({"keep-timeout": true})))
+                    .await;
+                let name = tools::systemd::unescape_unit(n)
+                    .unwrap_or_else(|_| "<invalid name>".to_owned());
+                let mut extra = json!({"pid": s.pid, "cid": s.cid});
+
+                match resp {
+                    Ok(status) => match status["data"].as_object() {
+                        Some(map) => {
+                            for (k, v) in map.iter() {
+                                extra[k] = v.clone();
+                            }
+                        }
+                        None => {
+                            let err = format!(
+                                "invalid JSON received from /status call: {}",
+                                status.to_string()
+                            );
+                            extra["error"] = json!(err);
+                        }
+                    },
+                    Err(err) => {
+                        let err = format!("error during /status API call: {}", err);
+                        extra["error"] = json!(err);
+                    }
+                }
+
+                result.push(DriverStatus {
+                    id: name,
+                    data: extra,
+                });
+            }
+
+            Ok(result)
+        }
+        .boxed()
+    }
+
+    fn stop(&self, id: String) -> Async<Result<(), Error>> {
+        async move {
+            let name = tools::systemd::escape_unit(&id, false);
+            let mut map = VMStateMap::load()?;
+            let map_mod = cleanup_map(&mut map.map).await;
+            match map.map.get(&name) {
+                Some(state) => {
+                    let client =
+                        VsockClient::new(state.cid, DEFAULT_VSOCK_PORT, Some(state.ticket.clone()));
+                    // ignore errors, this either fails because:
+                    // * the VM is unreachable/dead, in which case we don't want it in the map
+                    // * the call was successful and the connection reset when the VM stopped
+                    let _ = client.get("api2/json/stop", None).await;
+                    map.map.remove(&name);
+                    map.write()?;
+                }
+                None => {
+                    if map_mod {
+                        map.write()?;
+                    }
+                    bail!("VM with name '{}' not found", name);
+                }
+            }
+            Ok(())
+        }
+        .boxed()
+    }
+
+    fn list(&self) -> Vec<String> {
+        match VMStateMap::load_read_only() {
+            Ok(state) => state
+                .iter()
+                .filter_map(|(name, _)| tools::systemd::unescape_unit(&name).ok())
+                .collect(),
+            Err(_) => Vec::new(),
+        }
+    }
+}
diff --git a/src/bin/proxmox_file_restore/mod.rs b/src/bin/proxmox_file_restore/mod.rs
new file mode 100644
index 00000000..aa65b664
--- /dev/null
+++ b/src/bin/proxmox_file_restore/mod.rs
@@ -0,0 +1,6 @@
+//! Block device drivers and tools for single file restore
+pub mod block_driver;
+pub use block_driver::*;
+
+mod qemu_helper;
+mod block_driver_qemu;
diff --git a/src/bin/proxmox_file_restore/qemu_helper.rs b/src/bin/proxmox_file_restore/qemu_helper.rs
new file mode 100644
index 00000000..22563263
--- /dev/null
+++ b/src/bin/proxmox_file_restore/qemu_helper.rs
@@ -0,0 +1,274 @@
+//! Helper to start a QEMU VM for single file restore.
+use std::fs::{File, OpenOptions};
+use std::io::prelude::*;
+use std::os::unix::io::{AsRawFd, FromRawFd};
+use std::path::PathBuf;
+use std::time::Duration;
+
+use anyhow::{bail, format_err, Error};
+use tokio::time;
+
+use nix::sys::signal::{kill, Signal};
+use nix::unistd::Pid;
+
+use proxmox::tools::{
+    fd::Fd,
+    fs::{create_path, file_read_string, make_tmp_file, CreateOptions},
+};
+
+use proxmox_backup::backup::backup_user;
+use proxmox_backup::client::{VsockClient, DEFAULT_VSOCK_PORT};
+use proxmox_backup::{buildcfg, tools};
+
+use super::SnapRestoreDetails;
+
+const PBS_VM_NAME: &str = "pbs-restore-vm";
+const MAX_CID_TRIES: u64 = 32;
+
+fn create_restore_log_dir() -> Result<String, Error> {
+    let logpath = format!("{}/file-restore", buildcfg::PROXMOX_BACKUP_LOG_DIR);
+
+    proxmox::try_block!({
+        let backup_user = backup_user()?;
+        let opts = CreateOptions::new()
+            .owner(backup_user.uid)
+            .group(backup_user.gid);
+
+        let opts_root = CreateOptions::new()
+            .owner(nix::unistd::ROOT)
+            .group(nix::unistd::Gid::from_raw(0));
+
+        create_path(buildcfg::PROXMOX_BACKUP_LOG_DIR, None, Some(opts))?;
+        create_path(&logpath, None, Some(opts_root))?;
+        Ok(())
+    })
+    .map_err(|err: Error| format_err!("unable to create file-restore log dir - {}", err))?;
+
+    Ok(logpath)
+}
+
+fn validate_img_existance() -> Result<(), Error> {
+    let kernel = PathBuf::from(buildcfg::PROXMOX_BACKUP_KERNEL_FN);
+    let initramfs = PathBuf::from(buildcfg::PROXMOX_BACKUP_INITRAMFS_FN);
+    if !kernel.exists() || !initramfs.exists() {
+        bail!("cannot run file-restore VM: package 'proxmox-file-restore' is not (correctly) installed");
+    }
+    Ok(())
+}
+
+fn try_kill_vm(pid: i32) -> Result<(), Error> {
+    let pid = Pid::from_raw(pid);
+    if let Ok(()) = kill(pid, None) {
+        // process is running (and we could kill it), check if it is actually ours
+        // (if it errors assume we raced with the process's death and ignore it)
+        if let Ok(cmdline) = file_read_string(format!("/proc/{}/cmdline", pid)) {
+            if cmdline.split('\0').any(|a| a == PBS_VM_NAME) {
+                // yes, it's ours, kill it brutally with SIGKILL, no reason to take
+                // any chances - in this state it's most likely broken anyway
+                if let Err(err) = kill(pid, Signal::SIGKILL) {
+                    bail!(
+                        "reaping broken VM (pid {}) with SIGKILL failed: {}",
+                        pid,
+                        err
+                    );
+                }
+            }
+        }
+    }
+
+    Ok(())
+}
+
+async fn create_temp_initramfs(ticket: &str) -> Result<(Fd, String), Error> {
+    use std::ffi::CString;
+    use tokio::fs::File;
+
+    let (tmp_fd, tmp_path) =
+        make_tmp_file("/tmp/file-restore-qemu.initramfs.tmp", CreateOptions::new())?;
+    nix::unistd::unlink(&tmp_path)?;
+    tools::fd_change_cloexec(tmp_fd.0, false)?;
+
+    let mut f = File::from_std(unsafe { std::fs::File::from_raw_fd(tmp_fd.0) });
+    let mut base = File::open(buildcfg::PROXMOX_BACKUP_INITRAMFS_FN).await?;
+
+    tokio::io::copy(&mut base, &mut f).await?;
+
+    let name = CString::new("ticket").unwrap();
+    tools::cpio::append_file(
+        &mut f,
+        ticket.as_bytes(),
+        &name,
+        0,
+        (libc::S_IFREG | 0o400) as u16,
+        0,
+        0,
+        0,
+        ticket.len() as u32,
+    )
+    .await?;
+    tools::cpio::append_trailer(&mut f).await?;
+
+    // forget the tokio file, we close the file descriptor via the returned Fd
+    std::mem::forget(f);
+
+    let path = format!("/dev/fd/{}", &tmp_fd.0);
+    Ok((tmp_fd, path))
+}
+
+pub async fn start_vm(
+    // u16 so we can do wrapping_add without going too high
+    mut cid: u16,
+    details: &SnapRestoreDetails,
+    files: impl Iterator<Item = String>,
+    ticket: &str,
+) -> Result<(i32, i32), Error> {
+    validate_img_existance()?;
+
+    if let Err(_) = std::env::var("PBS_PASSWORD") {
+        bail!("environment variable PBS_PASSWORD has to be set for QEMU VM restore");
+    }
+    if let Err(_) = std::env::var("PBS_FINGERPRINT") {
+        bail!("environment variable PBS_FINGERPRINT has to be set for QEMU VM restore");
+    }
+
+    let pid;
+    let (pid_fd, pid_path) = make_tmp_file("/tmp/file-restore-qemu.pid.tmp", CreateOptions::new())?;
+    nix::unistd::unlink(&pid_path)?;
+    tools::fd_change_cloexec(pid_fd.0, false)?;
+
+    let (_ramfs_pid, ramfs_path) = create_temp_initramfs(ticket).await?;
+
+    let logpath = create_restore_log_dir()?;
+    let logfile = &format!("{}/qemu.log", logpath);
+    let mut logrotate = tools::logrotate::LogRotate::new(logfile, false)
+        .ok_or_else(|| format_err!("could not get QEMU log file names"))?;
+
+    if let Err(err) = logrotate.do_rotate(CreateOptions::default(), Some(16)) {
+        eprintln!("warning: logrotate for QEMU log file failed - {}", err);
+    }
+
+    let mut logfd = OpenOptions::new()
+        .append(true)
+        .create_new(true)
+        .open(logfile)?;
+    tools::fd_change_cloexec(logfd.as_raw_fd(), false)?;
+
+    // preface log file with start timestamp so one can see how long QEMU took to start
+    writeln!(logfd, "[{}] PBS file restore VM log", {
+        let now = proxmox::tools::time::epoch_i64();
+        proxmox::tools::time::epoch_to_rfc3339(now)?
+    },)?;
+
+    let base_args = [
+        "-chardev",
+        &format!(
+            "file,id=log,path=/dev/null,logfile=/dev/fd/{},logappend=on",
+            logfd.as_raw_fd()
+        ),
+        "-serial",
+        "chardev:log",
+        "-vnc",
+        "none",
+        "-enable-kvm",
+        "-m",
+        "512",
+        "-kernel",
+        buildcfg::PROXMOX_BACKUP_KERNEL_FN,
+        "-initrd",
+        &ramfs_path,
+        "-append",
+        "quiet",
+        "-daemonize",
+        "-pidfile",
+        &format!("/dev/fd/{}", pid_fd.as_raw_fd()),
+        "-name",
+        PBS_VM_NAME,
+    ];
+
+    // Generate drive arguments for all fidx files in backup snapshot
+    let mut drives = Vec::new();
+    let mut id = 0;
+    for file in files {
+        if !file.ends_with(".img.fidx") {
+            continue;
+        }
+        drives.push("-drive".to_owned());
+        drives.push(format!(
+            "file=pbs:repository={},,snapshot={},,archive={},read-only=on,if=none,id=drive{}",
+            details.repo, details.snapshot, file, id
+        ));
+        drives.push("-device".to_owned());
+        // drive serial is used by VM to map .fidx files to /dev paths
+        drives.push(format!("virtio-blk-pci,drive=drive{},serial={}", id, file));
+        id += 1;
+    }
+
+    // Try starting QEMU in a loop to retry if we fail because of a bad 'cid' value
+    let mut attempts = 0;
+    loop {
+        let mut qemu_cmd = std::process::Command::new("qemu-system-x86_64");
+        qemu_cmd.args(base_args.iter());
+        qemu_cmd.args(&drives);
+        qemu_cmd.arg("-device");
+        qemu_cmd.arg(format!(
+            "vhost-vsock-pci,guest-cid={},disable-legacy=on",
+            cid
+        ));
+
+        qemu_cmd.stdout(std::process::Stdio::null());
+        qemu_cmd.stderr(std::process::Stdio::piped());
+
+        let res = tokio::task::block_in_place(|| qemu_cmd.spawn()?.wait_with_output())?;
+
+        if res.status.success() {
+            // at this point QEMU is already daemonized and running, so if anything fails we
+            // technically leave behind a zombie-VM... this shouldn't matter, as it will stop
+            // itself soon enough (timer), and the following operations are unlikely to fail
+            let mut pid_file = unsafe { File::from_raw_fd(pid_fd.as_raw_fd()) };
+            std::mem::forget(pid_fd); // FD ownership is now in pid_file/File
+            let mut pidstr = String::new();
+            pid_file.read_to_string(&mut pidstr)?;
+            pid = pidstr.trim_end().parse().map_err(|err| {
+                format_err!("cannot parse PID returned by QEMU ('{}'): {}", &pidstr, err)
+            })?;
+            break;
+        } else {
+            let out = String::from_utf8_lossy(&res.stderr);
+            if out.contains("unable to set guest cid: Address already in use") {
+                attempts += 1;
+                if attempts >= MAX_CID_TRIES {
+                    bail!("CID '{}' in use, but max attempts reached, aborting", cid);
+                }
+                // CID in use, try next higher one
+                eprintln!("CID '{}' in use by other VM, attempting next one", cid);
+                // skip special-meaning low values
+                cid = cid.wrapping_add(1).max(10);
+            } else {
+                eprint!("{}", out);
+                bail!("Starting VM failed. See output above for more information.");
+            }
+        }
+    }
+
+    // QEMU has started successfully, now wait for virtio socket to become ready
+    let pid_t = Pid::from_raw(pid);
+    for _ in 0..60 {
+        let client = VsockClient::new(cid as i32, DEFAULT_VSOCK_PORT, Some(ticket.to_owned()));
+        if let Ok(Ok(_)) =
+            time::timeout(Duration::from_secs(2), client.get("api2/json/status", None)).await
+        {
+            return Ok((pid, cid as i32));
+        }
+        if kill(pid_t, None).is_err() {
+            // QEMU exited
+            bail!("VM exited before connection could be established");
+        }
+        time::sleep(Duration::from_millis(200)).await;
+    }
+
+    // start failed
+    if let Err(err) = try_kill_vm(pid) {
+        eprintln!("killing failed VM failed: {}", err);
+    }
+    bail!("starting VM timed out");
+}
diff --git a/src/buildcfg.rs b/src/buildcfg.rs
index 4f333288..b0f61efb 100644
--- a/src/buildcfg.rs
+++ b/src/buildcfg.rs
@@ -10,6 +10,14 @@ macro_rules! PROXMOX_BACKUP_RUN_DIR_M { () => ("/run/proxmox-backup") }
 #[macro_export]
 macro_rules! PROXMOX_BACKUP_LOG_DIR_M { () => ("/var/log/proxmox-backup") }
 
+#[macro_export]
+macro_rules! PROXMOX_BACKUP_CACHE_DIR_M { () => ("/var/cache/proxmox-backup") }
+
+#[macro_export]
+macro_rules! PROXMOX_BACKUP_FILE_RESTORE_BIN_DIR_M {
+    () => ("/usr/lib/x86_64-linux-gnu/proxmox-backup/file-restore")
+}
+
 /// namespaced directory for in-memory (tmpfs) run state
 pub const PROXMOX_BACKUP_RUN_DIR: &str = PROXMOX_BACKUP_RUN_DIR_M!();
 
@@ -30,6 +38,15 @@ pub const PROXMOX_BACKUP_PROXY_PID_FN: &str = concat!(PROXMOX_BACKUP_RUN_DIR_M!(
 /// the PID filename for the privileged api daemon
 pub const PROXMOX_BACKUP_API_PID_FN: &str = concat!(PROXMOX_BACKUP_RUN_DIR_M!(), "/api.pid");
 
+/// filename of the cached initramfs to use for booting single file restore VMs, this file is
+/// automatically created by APT hooks
+pub const PROXMOX_BACKUP_INITRAMFS_FN: &str =
+    concat!(PROXMOX_BACKUP_CACHE_DIR_M!(), "/file-restore-initramfs.img");
+
+/// filename of the kernel to use for booting single file restore VMs
+pub const PROXMOX_BACKUP_KERNEL_FN: &str =
+    concat!(PROXMOX_BACKUP_FILE_RESTORE_BIN_DIR_M!(), "/bzImage");
+
 /// Prepend configuration directory to a file name
 ///
 /// This is a simply way to get the full path for configuration files.
-- 
2.20.1





  reply	other threads:[~2021-04-01 15:44 UTC|newest]

Thread overview: 32+ messages
2021-03-31 10:21 [pbs-devel] [PATCH v3 00/20] Single file restore for VM images Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 pxar 01/20] decoder/aio: add contents() and content_size() calls Stefan Reiter
2021-03-31 11:54   ` [pbs-devel] applied: " Wolfgang Bumiller
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 02/20] vsock_client: remove wrong comment Stefan Reiter
2021-04-01  9:53   ` [pbs-devel] applied: " Thomas Lamprecht
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 03/20] vsock_client: remove some &mut restrictions and rustfmt Stefan Reiter
2021-04-01  9:54   ` [pbs-devel] applied: " Thomas Lamprecht
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 04/20] vsock_client: support authorization header Stefan Reiter
2021-04-01  9:54   ` [pbs-devel] applied: " Thomas Lamprecht
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 05/20] proxmox_client_tools: move common key related functions to key_source.rs Stefan Reiter
2021-04-01  9:54   ` [pbs-devel] applied: " Thomas Lamprecht
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 06/20] file-restore: add binary and basic commands Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 07/20] file-restore: allow specifying output-format Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 08/20] server/rest: extract auth to seperate module Stefan Reiter
2021-04-01  9:55   ` [pbs-devel] applied: " Thomas Lamprecht
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 09/20] server/rest: add ApiAuth trait to make user auth generic Stefan Reiter
2021-03-31 12:55   ` Wolfgang Bumiller
2021-03-31 14:07     ` Thomas Lamprecht
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 10/20] file-restore-daemon: add binary with virtio-vsock API server Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 11/20] file-restore-daemon: add watchdog module Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 12/20] file-restore-daemon: add disk module Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 13/20] add tools/cpio encoding module Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 14/20] file-restore: add qemu-helper setuid binary Stefan Reiter
2021-03-31 14:15   ` Oguz Bektas
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 15/20] file-restore: add basic VM/block device support Stefan Reiter
2021-04-01 15:43   ` Stefan Reiter [this message]
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 16/20] debian/client: add postinst hook to rebuild file-restore initramfs Stefan Reiter
2021-03-31 10:21 ` [pbs-devel] [PATCH v3 proxmox-backup 17/20] file-restore(-daemon): implement list API Stefan Reiter
2021-03-31 10:22 ` [pbs-devel] [PATCH v3 proxmox-backup 18/20] pxar/extract: add sequential variant of extract_sub_dir Stefan Reiter
2021-03-31 10:22 ` [pbs-devel] [PATCH v3 proxmox-backup 19/20] tools/zip: add zip_directory helper Stefan Reiter
2021-03-31 10:22 ` [pbs-devel] [PATCH v3 proxmox-backup 20/20] file-restore: add 'extract' command for VM file restore Stefan Reiter
2021-04-08 14:44 ` [pbs-devel] applied: [PATCH v3 00/20] Single file restore for VM images Thomas Lamprecht
