From: Dominik Csapak <d.csapak@proxmox.com>
To: "Robert Obkircher" <r.obkircher@proxmox.com>,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
pbs-devel@lists.proxmox.com
Subject: Re: [PATCH v1 proxmox-backup 11/11] tape: media catalog: use Flock wrapper instead of deprecated function
Date: Fri, 6 Mar 2026 09:25:33 +0100 [thread overview]
Message-ID: <64e8e02b-abaa-4739-ba49-5dfcf35500b0@proxmox.com> (raw)
In-Reply-To: <99147604-bffe-4c88-a5f7-a91f4e9fc821@proxmox.com>
On 3/5/26 3:38 PM, Robert Obkircher wrote:
>
> On 3/5/26 14:13, Dominik Csapak wrote:
>>
>>
>> On 3/5/26 12:08 PM, Fabian Grünbichler wrote:
>>> On February 26, 2026 3:40 pm, Robert Obkircher wrote:
>>>> Switch to the new wrapper type to fix the deprecation warnings. Panic
>>>> if unlock fails, because there is no easy way to return the File that
>>>> was temporarily moved out of the field.
>>>>
>>>> An alternative to moving out of the optional field would be to use
>>>> `File::try_clone` but that isn't ideal either because duplicated file
>>>> descriptors share the same lock status. That also doesn't solve the
>>>> problem that propagating the error without leaking the locked file
>>>> descriptor requires unsafe code or a change to nix.
>>>>
>>>> [1]
>>>> https://lore.proxmox.com/pbs-devel/1763634535.5sqd1r4buu.astroid@yuna.none/
>>>>
>>>> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
>>>> ---
>>>> src/tape/media_catalog.rs | 49
>>>> +++++++++++++++++++++------------------
>>>> 1 file changed, 27 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/src/tape/media_catalog.rs b/src/tape/media_catalog.rs
>>>> index 63329a65..372bf59d 100644
>>>> --- a/src/tape/media_catalog.rs
>>>> +++ b/src/tape/media_catalog.rs
>>>> @@ -6,6 +6,7 @@ use std::path::{Path, PathBuf};
>>>> use anyhow::{bail, format_err, Error};
>>>> use endian_trait::Endian;
>>>> +use nix::fcntl::{Flock, FlockArg};
>>>> use proxmox_sys::fs::read_subdir;
>>>> @@ -194,7 +195,7 @@ impl MediaCatalog {
>>>> let me = proxmox_lang::try_block!({
>>>> Self::create_basedir(base_path)?;
>>>> - let mut file = std::fs::OpenOptions::new()
>>>> + let file = std::fs::OpenOptions::new()
>>>> .read(true)
>>>> .write(write)
>>>> .create(create)
>>>> @@ -219,9 +220,10 @@ impl MediaCatalog {
>>>> };
>>>> // Note: lock file, to get a consistent view with
>>>> load_catalog
>>>> - nix::fcntl::flock(file.as_raw_fd(),
>>>> nix::fcntl::FlockArg::LockExclusive)?;
>>>> + let mut file = Flock::lock(file,
>>>> FlockArg::LockExclusive)
>>>> + .map_err(|e| format_err!("flock failed: {e:?}"))?;
>>>> let result = me.load_catalog(&mut file,
>>>> media_id.media_set_label.as_ref());
>>>> - nix::fcntl::flock(file.as_raw_fd(),
>>>> nix::fcntl::FlockArg::Unlock)?;
>>>> + let file = file.unlock().expect("shouldn't fail for
>>>> valid fd");
>>>
>>> here we could just return an error instead of panicing? we only read
>>> from the file anyway, and bubbling up the error should be fine..
>>
>> full agree, we bubbled up the error before, so we should keep that
>> behavior
> Naively returning the error would likely panic when the Flock wrapper
> is dropped, and we also can't just mem::forget it, because that would
> leak the File (unless we retrieve it with unsafe ptr::read).
>
> Should I try to send a PR to nix with a `Flock::unwrap(self) -> T`
> method that could be used in map_err? (Changing unlock to return T
> instead of Self might be an alternative, but that would be weird if
> errno is EINTR...)
well since it can return EINTR anyways, we have to provide some kind of
wrapper that retries in case of EWOULDBLOCK/EINTR anyway?
i get that bubbling up the error would not achieve anything, but
as fabian said, crashing the process (and potentially poisoning held
mutex locks is far from ideal...
>
> It would also be useful if they could add an impl Flockable for &mut
> File. I think that should be safe.
>>
>>>
>>>> let (found_magic_number, _) = result?;
>>>> @@ -367,27 +369,30 @@ impl MediaCatalog {
>>>> return Ok(());
>>>> }
>>>> - match self.file {
>>>> - Some(ref mut file) => {
>>>> - let pending = &self.pending;
>>>> - // Note: lock file, to get a consistent view with
>>>> load_catalog
>>>> - nix::fcntl::flock(file.as_raw_fd(),
>>>> nix::fcntl::FlockArg::LockExclusive)?;
>>>> - let result: Result<(), Error> =
>>>> proxmox_lang::try_block!({
>>>> - file.write_all(pending)?;
>>>> - file.flush()?;
>>>> - file.sync_data()?;
>>>> - Ok(())
>>>> - });
>>>> - nix::fcntl::flock(file.as_raw_fd(),
>>>> nix::fcntl::FlockArg::Unlock)?;
>>>> -
>>>> - result?;
>>>> - }
>>>> - None => bail!("media catalog not writable (opened
>>>> read only)"),
>>>> - }
>>>> + let Some(file) = self.file.take() else {
>>>> + bail!("media catalog not writable (opened read only)");
>>>> + };
>>>> +
>>>> + let mut file = Flock::lock(file,
>>>> FlockArg::LockExclusive).map_err(|(f, e)| {
>>>> + self.file = Some(f);
>>>> + format_err!("flock failed: {:?} - {e:?}", self.file)
>>>> + })?;
>>>> - self.pending = Vec::new();
>>>> + let pending = &self.pending;
>>>> + // Note: lock file, to get a consistent view with
>>>> load_catalog
>>>> + let result: Result<(), Error> = proxmox_lang::try_block!({
>>>> + file.write_all(pending)?;
>>>> + file.flush()?;
>>>> + file.sync_data()?;
>>>> + Ok(())
>>>> + });
>>>> - Ok(())
>>>> + self.file = Some(file.unlock().expect("shouldn't fail for
>>>> valid fd"));
>>>
>>> here the situation is a bit more complicated, since we wrote to the
>>> file
>>> above.. but looking at the call graphs/error behaviour, it seems to me
>>> that we are often operating on temporary files (though we do not
>>> seem to
>>> clean them up in the error path?), or failure to commit herer would
>>> bubble up and abort writing to tape anyway..
>>>
>>> triggering a panic here doesn't help in any case - the possibly
>>> inconsistent writes/reads have already happened, at best we could
>>> maybe
>>> (try to) invalidate the catalog as part of error handling to force
>>> re-reading it from tape? or we could bubble up the error to abort
>>> whatever operation is attempting to commit, and log the error
>>> prominently - taking down the whole PBS instance doesn't buy us
>>> anything
>>> AFAICT?
> Theoretically it buys us slightly better protection against deadlocks,
> but it doesn't look like the callers are doing anything potentially
> dangerous like holding on to the file descriptor across await points.
> (Leaving self.file as None would also prevent that, but it would be a
> change of behavior as well.)
>
>>
>> also here agreeing regarding panicking, we did simply bubble up the
>> error before, no reason to change that IMO.
>>
>> looking at the code i think we should maybe revamp the catalog
>> handling?
>>
>> out of scope for this series, but I think we should try to write the
>> catalog atomically, (meaning we should always writing into a tmp
>> file and renaming into place, like we do with most of the other
>> files we
>> write)
>>
>> then I think we'd need no lock for reading the catalog at all,
>> but we'd have to copy the old catalog to a tmp one first when
>> we want to append data. (this might be a bad idea since
>> they can get quite large)
> Not sure if it is relevant, but copying wouldn't be enough if there
> were multiple writers.
how so?
only one should be able to write during the lock, so the file
is always consistent?
>>
>> but as i wrote, not really relevant for this series, just
>> wanted it to point out
>>
>>>
>>>> +
>>>> + if result.is_ok() {
>>>> + self.pending = Vec::new();
>>>> + }
>>>> + result
>>>> }
>>>> /// Conditionally commit if in pending data is large (> 1Mb)
>>>> --
>>>> 2.47.3
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>
next prev parent reply other threads:[~2026-03-06 8:25 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-26 14:40 [PATCH v1 proxmox-backup 00/11] fix various warnings Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 01/11] datastore: remove allow(clippy::cast_ptr_alignment) attribute Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 02/11] api: use checked_div for compression ratio calculation Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 03/11] datastore+server: sort by key instead of using comparison functions Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 04/11] api: remove unnecessary ampersand to fix clippy warning Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 05/11] bin: debug: use pattern match instead of is_some+unwrap Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 06/11] bin: proxy: fix clippy warning about unnecesary use of find_map Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 07/11] datastore: silence warning about too many arguments Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 08/11] client: catalog shell: avoid unnecessary block_on in async code Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 09/11] client: catalog shell: combine multiple block_on calls into one Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 10/11] client: catalog shell: avoid unsafe transmute Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 11/11] tape: media catalog: use Flock wrapper instead of deprecated function Robert Obkircher
2026-03-05 11:08 ` Fabian Grünbichler
2026-03-05 13:13 ` Dominik Csapak
2026-03-05 14:39 ` Robert Obkircher
2026-03-06 8:25 ` Dominik Csapak [this message]
2026-03-05 11:20 ` partially-applied: [PATCH v1 proxmox-backup 00/11] fix various warnings Fabian Grünbichler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=64e8e02b-abaa-4739-ba49-5dfcf35500b0@proxmox.com \
--to=d.csapak@proxmox.com \
--cc=f.gruenbichler@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
--cc=r.obkircher@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox