From: "Lukas Wagner"
To: "Thomas Lamprecht", "Proxmox Datacenter Manager development discussion", "Lukas Wagner"
Date: Thu, 20 Nov 2025 10:30:21 +0100
Subject: Re: [pdm-devel] [PATCH proxmox] rrd: restrict archive path via regex
References: <20251119111105.174145-1-l.wagner@proxmox.com> <8686399d-c28c-4ae5-8565-94f5608fdfd6@proxmox.com>
In-Reply-To: <8686399d-c28c-4ae5-8565-94f5608fdfd6@proxmox.com>

On Wed Nov 19, 2025 at 7:30 PM CET, Thomas Lamprecht wrote:
> On 19.11.25 at 12:11, Lukas Wagner wrote:
>> The `rel_path` parameter is used as a relative path inside the
>> `rrdb` base directory to build the final path for the archive file. Usually,
>> this is something like 'node/localhost/cpu_avg1'. For PBS, this is fine,
>> since these paths are hardcoded or derived from safe datastore names. In
>> PDM however, these paths are built from potentially 'untrusted' (as in,
>> one could 'pretend' to be a PBS/PVE remote and send malicious data)
>> metric data points - so we should have additional safe guards in place
>> to disallow potentially dangerous paths like '../abc' which would escape
>> the base directory.
>
> thanks for tackling this.
>
>> diff --git a/proxmox-rrd/src/cache.rs b/proxmox-rrd/src/cache.rs
>> index 29d46ed5..042b4213 100644
>> --- a/proxmox-rrd/src/cache.rs
>> +++ b/proxmox-rrd/src/cache.rs
>> @@ -8,8 +8,11 @@ use std::thread::spawn;
>>  use std::time::SystemTime;
>>
>>  use anyhow::{bail, format_err, Error};
>> +use const_format::concatcp;
>>  use crossbeam_channel::{bounded, TryRecvError};
>>
>> +use proxmox_schema::api_types::SAFE_ID_REGEX_STR;
>> +use proxmox_schema::const_regex;
>>  use proxmox_sys::fs::{create_path, CreateOptions};
>>
>>  use crate::rrd::{AggregationFn, DataSourceType, Database};
>> @@ -21,6 +24,10 @@ use journal::*;
>>  mod rrd_map;
>>  use rrd_map::*;
>>
>> +const_regex! {
>> +    DATAPOINT_PATH_REGEX = concatcp!(r"^", SAFE_ID_REGEX_STR, r"(/", SAFE_ID_REGEX_STR, r")+$");
>> +}
>> +
>>  /// RRD cache - keep RRD data in RAM, but write updates to disk
>>  ///
>>  /// This cache is designed to run as single instance (no concurrent
>> @@ -214,6 +221,10 @@ impl Cache {
>>          dst: DataSourceType,
>>          new_only: bool,
>>      ) -> Result<(), Error> {
>> +        if !DATAPOINT_PATH_REGEX.is_match(rel_path) {
>
> Hmm, not really sure if we want to couple this to SAFE_ID here, especially if the
> main goal is to avoid breaking out the filesystem.
> This approach could probably get away with forbidding `../` explicitly.
>

I don't have any hard feelings about this; just checking for '../' is
also a valid option (and might be much more efficient, and it also avoids
pulling in additional deps).

One could argue that the `rel_path` parameter is in reality just a
generic hierarchical identifier for the stored time series, and that the
fact that this identifier maps directly to a filesystem path in the end
is an implementation detail that should not really matter to the
caller. If one approaches it like this, then the SAFE_ID option makes
sense in my head. But as I said, no hard feelings here.

> That said, IMO this is a bit overfitted to the current usage and problem, we have
> quite a few other public function that allow passing rel_path, which might be used
> in the future for these things.
>
> For these it's IMO often better to ensure the actual file operations are contained,
> i.e. open these rel_path's using openat2 [0] with a dirfd from the basedir directory
> and the open_how RESOLVE_BENEATH mode used, so that it's anchored to the correct
> directory. nix has bindings for this syscall [1].
>
> We could combine that with your approach (favoring just bailing on matching "../")
> to get some better UX, but the "definitive" protection would come from the openat2
> usage.
>
> [0]: man openat2 or https://manpages.debian.org/trixie/manpages-dev/openat2.2.en.html
> [1]: https://docs.rs/nix/latest/nix/fcntl/fn.openat2.html
>

TIL; I was vaguely familiar with openat, but the possibility of using it
to solve this particular kind of problem was not on my radar. Thanks!

I'm not sure if I have the capacity right now to implement the openat2
approach for proxmox-rrd, so I'd suggest that we add the path check
right now (I'd send a v2 checking for `../` ASAP) as a quick safeguard
and then later, when I have some time to spare, I'd go in and try to make
it use openat2.

Would this be okay for you?

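For illustration only, a quick check of this kind could look roughly like
the sketch below. It is not the v2 patch: the helper name `check_rel_path`
and the `String` error type are hypothetical, and it uses only the Rust
standard library instead of the crates touched by the diff. It rejects any
path that contains a `..` component, starts with `/` or `./`, or is empty.

```rust
use std::path::{Component, Path};

/// Hypothetical helper, not part of proxmox-rrd: accept only non-empty
/// relative paths that consist purely of normal components.
fn check_rel_path(rel_path: &str) -> Result<(), String> {
    if rel_path.is_empty() {
        return Err("empty datapoint path".to_string());
    }

    for component in Path::new(rel_path).components() {
        match component {
            // Plain names such as `node`, `localhost`, `cpu_avg1`.
            Component::Normal(_) => {}
            // Everything else is refused: `..` (ParentDir), a leading `/`
            // (RootDir), a leading `./` (CurDir) or a Windows prefix.
            other => {
                return Err(format!(
                    "invalid datapoint path '{rel_path}': component {other:?} not allowed"
                ));
            }
        }
    }

    Ok(())
}
```

With this sketch, `check_rel_path("node/localhost/cpu_avg1")` returns `Ok(())`,
while `check_rel_path("../abc")` and `check_rel_path("/etc/passwd")` return an
error. The check is purely lexical and does not look at symlinks below the
base directory, which is why openat2 with RESOLVE_BENEATH would still be the
definitive protection.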
>> +            bail!("invalid datapoint path: {rel_path}");
>> +        }
>> +
>>          let journal_applied = self.apply_journal()?;
>>
>>          self.state