From: "Lukas Wagner" <l.wagner@proxmox.com>
To: "Thomas Lamprecht" <t.lamprecht@proxmox.com>,
"Proxmox Datacenter Manager development discussion"
<pdm-devel@lists.proxmox.com>,
"Lukas Wagner" <l.wagner@proxmox.com>
Subject: Re: [pdm-devel] [PATCH proxmox] rrd: restrict archive path via regex
Date: Thu, 20 Nov 2025 10:30:21 +0100 [thread overview]
Message-ID: <DEDF0QJVBDZF.15AFDROYA1BC4@proxmox.com> (raw)
In-Reply-To: <8686399d-c28c-4ae5-8565-94f5608fdfd6@proxmox.com>
On Wed Nov 19, 2025 at 7:30 PM CET, Thomas Lamprecht wrote:
> Am 19.11.25 um 12:11 schrieb Lukas Wagner:
>> The `rel_path` parameter is used as a relative path inside the `rrdb`
>> base directory to build the final path for the archive file. Usually,
>> this is something like 'node/localhost/cpu_avg1'. For PBS, this is fine,
>> since these paths are hardcoded or derived from safe datastore names. In
>> PDM however, these paths are built from potentially 'untrusted' (as in,
>> one could 'pretend' to be a PBS/PVE remote and send malicious data)
>> metric data points - so we should have additional safe guards in place
>> to disallow potentially dangerous paths like '../abc' which would escape
>> the base directory.
>
> thanks for tackling this.
>
>> diff --git a/proxmox-rrd/src/cache.rs b/proxmox-rrd/src/cache.rs
>> index 29d46ed5..042b4213 100644
>> --- a/proxmox-rrd/src/cache.rs
>> +++ b/proxmox-rrd/src/cache.rs
>> @@ -8,8 +8,11 @@ use std::thread::spawn;
>> use std::time::SystemTime;
>>
>> use anyhow::{bail, format_err, Error};
>> +use const_format::concatcp;
>> use crossbeam_channel::{bounded, TryRecvError};
>>
>> +use proxmox_schema::api_types::SAFE_ID_REGEX_STR;
>> +use proxmox_schema::const_regex;
>> use proxmox_sys::fs::{create_path, CreateOptions};
>>
>> use crate::rrd::{AggregationFn, DataSourceType, Database};
>> @@ -21,6 +24,10 @@ use journal::*;
>> mod rrd_map;
>> use rrd_map::*;
>>
>> +const_regex! {
>> + DATAPOINT_PATH_REGEX = concatcp!(r"^", SAFE_ID_REGEX_STR, r"(/", SAFE_ID_REGEX_STR, r")+$");
>> +}
>> +
>> /// RRD cache - keep RRD data in RAM, but write updates to disk
>> ///
>> /// This cache is designed to run as single instance (no concurrent
>> @@ -214,6 +221,10 @@ impl Cache {
>> dst: DataSourceType,
>> new_only: bool,
>> ) -> Result<(), Error> {
>> + if !DATAPOINT_PATH_REGEX.is_match(rel_path) {
>
> Hmm, not really sure if we want to couple this to SAFE_ID here, especially if the
> main goal is to avoid breaking out the filesystem.
> This approach could probably get away with forbidding `../` explicitly.
>
I don't have any hard feelings about this, just checking for '../' is
also a valid option (and might be much more efficient, and also avoids
pulling in additional deps).
One could argue that the `rel_path` parameter is in reality just a
generic hierarchical identifier for the stored time series and the fact
that this identifier directly maps to a filesystem path in the end is
rather an implementation detail that should not really matter to the
caller. If one approaches it like this, then the SAFE_ID option makes
sense in my head. But as I said, no hard feelings here.
> That said, IMO this is a bit overfitted to the current usage and problem, we have
> quite a few other public function that allow passing rel_path, which might be used
> in the future for these things.
>
> For these it's IMO often better to ensure the actual file operations are contained,
> i.e. open these rel_path's using openat2 [0] with a dirfd from the basedir directory
> and the open_how RESOLVE_BENEATH mode used, so that it's anchored to the correct
> directory. nix has bindings for this syscall [1].
>
> We could combine that with your approach (favoring just bailing on matching "../")
> to get some better UX, but the "definitive" protection would come from the openat2
> usage.
>
> [0]: man openat2 or https://manpages.debian.org/trixie/manpages-dev/openat2.2.en.html
> [1]: https://docs.rs/nix/latest/nix/fcntl/fn.openat2.html
>
TIL; I was vaguely familiar with openat, but the possibility of using it
to solve this particular kind of problem was not on my radar. Thanks!
I'm not sure if I have the capacity right now to implement the openat
approach for proxmox-rrd, so I'd suggest that we add the path check
right now (I'd send a v2 checking for `../` ASAP) as a quick safe guard
and then later, when I have some time to spare, I'd go in the try to
make it use openat2. Would this be okay for you?
>> + bail!("invalid datapoint path: {rel_path}");
>> + }
>> +
>> let journal_applied = self.apply_journal()?;
>>
>> self.state
_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel
next prev parent reply other threads:[~2025-11-20 9:30 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-19 11:11 Lukas Wagner
2025-11-19 18:30 ` Thomas Lamprecht
2025-11-20 9:30 ` Lukas Wagner [this message]
2025-11-20 9:55 ` Thomas Lamprecht
2025-11-20 11:01 ` [pdm-devel] superseded: " Lukas Wagner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DEDF0QJVBDZF.15AFDROYA1BC4@proxmox.com \
--to=l.wagner@proxmox.com \
--cc=pdm-devel@lists.proxmox.com \
--cc=t.lamprecht@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox