From: Robert Obkircher <r.obkircher@proxmox.com>
To: "Dominik Csapak" <d.csapak@proxmox.com>,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
pbs-devel@lists.proxmox.com
Subject: Re: [PATCH v1 proxmox-backup 11/11] tape: media catalog: use Flock wrapper instead of deprecated function
Date: Thu, 5 Mar 2026 15:39:06 +0100 [thread overview]
Message-ID: <99147604-bffe-4c88-a5f7-a91f4e9fc821@proxmox.com> (raw)
In-Reply-To: <54a5a607-215c-44e8-bbb8-4f7219412934@proxmox.com>
On 3/5/26 14:13, Dominik Csapak wrote:
>
>
> On 3/5/26 12:08 PM, Fabian Grünbichler wrote:
>> On February 26, 2026 3:40 pm, Robert Obkircher wrote:
>>> Switch to the new wrapper type to fix the deprecation warnings. Panic
>>> if unlock fails, because there is no easy way to return the File that
>>> was temporarily moved out of the field.
>>>
>>> An alternative to moving out of the optional field would be to use
>>> `File::try_clone` but that isn't ideal either because duplicated file
>>> descriptors share the same lock status. That also doesn't solve the
>>> problem that propagating the error without leaking the locked file
>>> descriptor requires unsafe code or a change to nix.
>>>
>>> [1]
>>> https://lore.proxmox.com/pbs-devel/1763634535.5sqd1r4buu.astroid@yuna.none/
>>>
>>> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
>>> ---
>>> src/tape/media_catalog.rs | 49
>>> +++++++++++++++++++++------------------
>>> 1 file changed, 27 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/src/tape/media_catalog.rs b/src/tape/media_catalog.rs
>>> index 63329a65..372bf59d 100644
>>> --- a/src/tape/media_catalog.rs
>>> +++ b/src/tape/media_catalog.rs
>>> @@ -6,6 +6,7 @@ use std::path::{Path, PathBuf};
>>> use anyhow::{bail, format_err, Error};
>>> use endian_trait::Endian;
>>> +use nix::fcntl::{Flock, FlockArg};
>>> use proxmox_sys::fs::read_subdir;
>>> @@ -194,7 +195,7 @@ impl MediaCatalog {
>>> let me = proxmox_lang::try_block!({
>>> Self::create_basedir(base_path)?;
>>> - let mut file = std::fs::OpenOptions::new()
>>> + let file = std::fs::OpenOptions::new()
>>> .read(true)
>>> .write(write)
>>> .create(create)
>>> @@ -219,9 +220,10 @@ impl MediaCatalog {
>>> };
>>> // Note: lock file, to get a consistent view with
>>> load_catalog
>>> - nix::fcntl::flock(file.as_raw_fd(),
>>> nix::fcntl::FlockArg::LockExclusive)?;
>>> + let mut file = Flock::lock(file,
>>> FlockArg::LockExclusive)
>>> + .map_err(|e| format_err!("flock failed: {e:?}"))?;
>>> let result = me.load_catalog(&mut file,
>>> media_id.media_set_label.as_ref());
>>> - nix::fcntl::flock(file.as_raw_fd(),
>>> nix::fcntl::FlockArg::Unlock)?;
>>> + let file = file.unlock().expect("shouldn't fail for
>>> valid fd");
>>
>> here we could just return an error instead of panicing? we only read
>> from the file anyway, and bubbling up the error should be fine..
>
> full agree, we bubbled up the error before, so we should keep that
> behavior
Naively returning the error would likely panic when the Flock wrapper
is dropped, and we also can't just mem::forget it, because that would
leak the File (unless we retrieve it with unsafe ptr::read).
Should I try to send a PR to nix with a `Flock::unwrap(self) -> T`
method that could be used in map_err? (Changing unlock to return T
instead of Self might be an alternative, but that would be weird if
errno is EINTR...)
It would also be useful if they could add an impl Flockable for &mut
File. I think that should be safe.
>
>>
>>> let (found_magic_number, _) = result?;
>>> @@ -367,27 +369,30 @@ impl MediaCatalog {
>>> return Ok(());
>>> }
>>> - match self.file {
>>> - Some(ref mut file) => {
>>> - let pending = &self.pending;
>>> - // Note: lock file, to get a consistent view with
>>> load_catalog
>>> - nix::fcntl::flock(file.as_raw_fd(),
>>> nix::fcntl::FlockArg::LockExclusive)?;
>>> - let result: Result<(), Error> =
>>> proxmox_lang::try_block!({
>>> - file.write_all(pending)?;
>>> - file.flush()?;
>>> - file.sync_data()?;
>>> - Ok(())
>>> - });
>>> - nix::fcntl::flock(file.as_raw_fd(),
>>> nix::fcntl::FlockArg::Unlock)?;
>>> -
>>> - result?;
>>> - }
>>> - None => bail!("media catalog not writable (opened
>>> read only)"),
>>> - }
>>> + let Some(file) = self.file.take() else {
>>> + bail!("media catalog not writable (opened read only)");
>>> + };
>>> +
>>> + let mut file = Flock::lock(file,
>>> FlockArg::LockExclusive).map_err(|(f, e)| {
>>> + self.file = Some(f);
>>> + format_err!("flock failed: {:?} - {e:?}", self.file)
>>> + })?;
>>> - self.pending = Vec::new();
>>> + let pending = &self.pending;
>>> + // Note: lock file, to get a consistent view with
>>> load_catalog
>>> + let result: Result<(), Error> = proxmox_lang::try_block!({
>>> + file.write_all(pending)?;
>>> + file.flush()?;
>>> + file.sync_data()?;
>>> + Ok(())
>>> + });
>>> - Ok(())
>>> + self.file = Some(file.unlock().expect("shouldn't fail for
>>> valid fd"));
>>
>> here the situation is a bit more complicated, since we wrote to the
>> file
>> above.. but looking at the call graphs/error behaviour, it seems to me
>> that we are often operating on temporary files (though we do not
>> seem to
>> clean them up in the error path?), or failure to commit herer would
>> bubble up and abort writing to tape anyway..
>>
>> triggering a panic here doesn't help in any case - the possibly
>> inconsistent writes/reads have already happened, at best we could
>> maybe
>> (try to) invalidate the catalog as part of error handling to force
>> re-reading it from tape? or we could bubble up the error to abort
>> whatever operation is attempting to commit, and log the error
>> prominently - taking down the whole PBS instance doesn't buy us
>> anything
>> AFAICT?
Theoretically it buys us slightly better protection against deadlocks,
but it doesn't look like the callers are doing anything potentially
dangerous like holding on to the file descriptor across await points.
(Leaving self.file as None would also prevent that, but it would be a
change of behavior as well.)
>
> also here agreeing regarding panicking, we did simply bubble up the
> error before, no reason to change that IMO.
>
> looking at the code i think we should maybe revamp the catalog
> handling?
>
> out of scope for this series, but I think we should try to write the
> catalog atomically, (meaning we should always writing into a tmp
> file and renaming into place, like we do with most of the other
> files we
> write)
>
> then I think we'd need no lock for reading the catalog at all,
> but we'd have to copy the old catalog to a tmp one first when
> we want to append data. (this might be a bad idea since
> they can get quite large)
Not sure if it is relevant, but copying wouldn't be enough if there
were multiple writers.
>
> but as i wrote, not really relevant for this series, just
> wanted it to point out
>
>>
>>> +
>>> + if result.is_ok() {
>>> + self.pending = Vec::new();
>>> + }
>>> + result
>>> }
>>> /// Conditionally commit if in pending data is large (> 1Mb)
>>> --
>>> 2.47.3
>>>
>>>
>>>
>>>
>>>
>>>
>
next prev parent reply other threads:[~2026-03-05 14:38 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-26 14:40 [PATCH v1 proxmox-backup 00/11] fix various warnings Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 01/11] datastore: remove allow(clippy::cast_ptr_alignment) attribute Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 02/11] api: use checked_div for compression ratio calculation Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 03/11] datastore+server: sort by key instead of using comparison functions Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 04/11] api: remove unnecessary ampersand to fix clippy warning Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 05/11] bin: debug: use pattern match instead of is_some+unwrap Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 06/11] bin: proxy: fix clippy warning about unnecesary use of find_map Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 07/11] datastore: silence warning about too many arguments Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 08/11] client: catalog shell: avoid unnecessary block_on in async code Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 09/11] client: catalog shell: combine multiple block_on calls into one Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 10/11] client: catalog shell: avoid unsafe transmute Robert Obkircher
2026-02-26 14:40 ` [PATCH v1 proxmox-backup 11/11] tape: media catalog: use Flock wrapper instead of deprecated function Robert Obkircher
2026-03-05 11:08 ` Fabian Grünbichler
2026-03-05 13:13 ` Dominik Csapak
2026-03-05 14:39 ` Robert Obkircher [this message]
2026-03-05 11:20 ` partially-applied: [PATCH v1 proxmox-backup 00/11] fix various warnings Fabian Grünbichler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=99147604-bffe-4c88-a5f7-a91f4e9fc821@proxmox.com \
--to=r.obkircher@proxmox.com \
--cc=d.csapak@proxmox.com \
--cc=f.gruenbichler@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.