* [pbs-devel] [PATCH proxmox-backup] fix #5967: datastore lookup: check for disappeared datastores
@ 2024-12-05 9:12 Fabian Grünbichler
2024-12-06 14:03 ` Christian Ebner
0 siblings, 1 reply; 3+ messages in thread
From: Fabian Grünbichler @ 2024-12-05 9:12 UTC (permalink / raw)
To: pbs-devel
some code paths will only read the datastore base - if the datastore
disappeared (e.g., a lazy unmount, or some network storage disappearing without
blocking) then we should at least detect the issue at lookup time before
re-using an already opened datastore.
any code path that tried to access the chunks themselves would already have
failed without this additional safeguard, but in particular queries listing
namespaces or groups in the root namespace don't, which might cause a remote
client such as pull sync to mistakenly think that the datastore is empty.
limit the additional check to `Read` or `Write` lookups, since any code path
doing those will most likely want to do I/O on the datastore afterwards, as
opposed to a `Lookup` lookup or one without an operation, which might not
require the datastore to be fully functional.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
there is of course still a gap beween the lookup and the readdir after it.. we
could track down all API endpoints that access the datastore but don't touch
the chunk dir, and before returning, check a second time that the datastore
hasn't vanished, but I am not sure that is worth the effort?
pbs-datastore/src/chunk_store.rs | 2 +-
pbs-datastore/src/datastore.rs | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 29d5874a1..189c4ebcd 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -158,7 +158,7 @@ impl ChunkStore {
/// Check if the chunkstore path is absolute and that we can
/// access it. Returns the absolute '.chunks' path on success.
- fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
+ pub(crate) fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
if !base.is_absolute() {
bail!("expected absolute path - got {:?}", base);
}
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 0801b4bf6..bb9b4471f 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -232,6 +232,12 @@ impl DataStore {
// reuse chunk store so that we keep using the same process locker instance!
let chunk_store = if let Some(datastore) = &entry {
let last_digest = datastore.last_digest.as_ref();
+ match operation {
+ Some(Operation::Read) | Some(Operation::Write) => {
+ ChunkStore::chunk_dir_accessible(&datastore.chunk_store.base_path())?;
+ }
+ Some(Operation::Lookup) | None => {}
+ }
if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
if let Some(operation) = operation {
update_active_operations(name, operation, 1)?;
--
2.39.5
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup] fix #5967: datastore lookup: check for disappeared datastores
2024-12-05 9:12 [pbs-devel] [PATCH proxmox-backup] fix #5967: datastore lookup: check for disappeared datastores Fabian Grünbichler
@ 2024-12-06 14:03 ` Christian Ebner
2024-12-11 9:32 ` Fabian Grünbichler
0 siblings, 1 reply; 3+ messages in thread
From: Christian Ebner @ 2024-12-06 14:03 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Fabian Grünbichler
Had a look at this, one comment inline.
On 12/5/24 10:12, Fabian Grünbichler wrote:
> some code paths will only read the datastore base - if the datastore
> disappeared (e.g., a lazy unmount, or some network storage disappearing without
> blocking) then we should at least detect the issue at lookup time before
> re-using an already opened datastore.
>
> any code path that tried to access the chunks themselves would already have
> failed without this additional safeguard, but in particular queries listing
> namespaces or groups in the root namespace don't, which might cause a remote
> client such as pull sync to mistakenly think that the datastore is empty.
>
> limit the additional check to `Read` or `Write` lookups, since any code path
> doing those will most likely want to do I/O on the datastore afterwards, as
> opposed to a `Lookup` lookup or one without an operation, which might not
> require the datastore to be fully functional.
>
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> there is of course still a gap beween the lookup and the readdir after it.. we
> could track down all API endpoints that access the datastore but don't touch
> the chunk dir, and before returning, check a second time that the datastore
> hasn't vanished, but I am not sure that is worth the effort?
>
> pbs-datastore/src/chunk_store.rs | 2 +-
> pbs-datastore/src/datastore.rs | 6 ++++++
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 29d5874a1..189c4ebcd 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -158,7 +158,7 @@ impl ChunkStore {
>
> /// Check if the chunkstore path is absolute and that we can
> /// access it. Returns the absolute '.chunks' path on success.
> - fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
> + pub(crate) fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
`chunk_dir_accessible` tries to get the metadata for the chunk directory
in order to check if the chunk store is available.
So this could be used to not only check if there is a chunk directory,
but rather that the chunk directory has the same device ID and i-node
number. For that check, one would only need to get and store the chunk
directory metadata on `ChunkStore::open`. That would be a rather non
invasive check to also detect if the datastore has been over-mounted,
not just if it vanished.
What do you think? Is this feasible? Not fully sure about the
implications for removable datastores, but that should be fine as well,
as the device id and i-node should not change as long as the datastore
remains online.
> if !base.is_absolute() {
> bail!("expected absolute path - got {:?}", base);
> }
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 0801b4bf6..bb9b4471f 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -232,6 +232,12 @@ impl DataStore {
> // reuse chunk store so that we keep using the same process locker instance!
> let chunk_store = if let Some(datastore) = &entry {
> let last_digest = datastore.last_digest.as_ref();
> + match operation {
> + Some(Operation::Read) | Some(Operation::Write) => {
> + ChunkStore::chunk_dir_accessible(&datastore.chunk_store.base_path())?;
> + }
> + Some(Operation::Lookup) | None => {}
> + }
> if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
> if let Some(operation) = operation {
> update_active_operations(name, operation, 1)?;
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [pbs-devel] [PATCH proxmox-backup] fix #5967: datastore lookup: check for disappeared datastores
2024-12-06 14:03 ` Christian Ebner
@ 2024-12-11 9:32 ` Fabian Grünbichler
0 siblings, 0 replies; 3+ messages in thread
From: Fabian Grünbichler @ 2024-12-11 9:32 UTC (permalink / raw)
To: Christian Ebner, Proxmox Backup Server development discussion
On December 6, 2024 3:03 pm, Christian Ebner wrote:
> Had a look at this, one comment inline.
>
> On 12/5/24 10:12, Fabian Grünbichler wrote:
>> some code paths will only read the datastore base - if the datastore
>> disappeared (e.g., a lazy unmount, or some network storage disappearing without
>> blocking) then we should at least detect the issue at lookup time before
>> re-using an already opened datastore.
>>
>> any code path that tried to access the chunks themselves would already have
>> failed without this additional safeguard, but in particular queries listing
>> namespaces or groups in the root namespace don't, which might cause a remote
>> client such as pull sync to mistakenly think that the datastore is empty.
>>
>> limit the additional check to `Read` or `Write` lookups, since any code path
>> doing those will most likely want to do I/O on the datastore afterwards, as
>> opposed to a `Lookup` lookup or one without an operation, which might not
>> require the datastore to be fully functional.
>>
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>> ---
>> there is of course still a gap beween the lookup and the readdir after it.. we
>> could track down all API endpoints that access the datastore but don't touch
>> the chunk dir, and before returning, check a second time that the datastore
>> hasn't vanished, but I am not sure that is worth the effort?
>>
>> pbs-datastore/src/chunk_store.rs | 2 +-
>> pbs-datastore/src/datastore.rs | 6 ++++++
>> 2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index 29d5874a1..189c4ebcd 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -158,7 +158,7 @@ impl ChunkStore {
>>
>> /// Check if the chunkstore path is absolute and that we can
>> /// access it. Returns the absolute '.chunks' path on success.
>> - fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
>> + pub(crate) fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
>
> `chunk_dir_accessible` tries to get the metadata for the chunk directory
> in order to check if the chunk store is available.
>
> So this could be used to not only check if there is a chunk directory,
> but rather that the chunk directory has the same device ID and i-node
> number. For that check, one would only need to get and store the chunk
> directory metadata on `ChunkStore::open`. That would be a rather non
> invasive check to also detect if the datastore has been over-mounted,
> not just if it vanished.
>
> What do you think? Is this feasible? Not fully sure about the
> implications for removable datastores, but that should be fine as well,
> as the device id and i-node should not change as long as the datastore
> remains online.
detecting a changed FS and reloading the chunk store might be nice as
well, yes. would need to carefully check that this doesn't affect either
removable datastores or the ProcessLocker semantics..
>
>> if !base.is_absolute() {
>> bail!("expected absolute path - got {:?}", base);
>> }
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 0801b4bf6..bb9b4471f 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -232,6 +232,12 @@ impl DataStore {
>> // reuse chunk store so that we keep using the same process locker instance!
>> let chunk_store = if let Some(datastore) = &entry {
>> let last_digest = datastore.last_digest.as_ref();
>> + match operation {
>> + Some(Operation::Read) | Some(Operation::Write) => {
>> + ChunkStore::chunk_dir_accessible(&datastore.chunk_store.base_path())?;
>> + }
>> + Some(Operation::Lookup) | None => {}
>> + }
>> if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
>> if let Some(operation) = operation {
>> update_active_operations(name, operation, 1)?;
>
>
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-12-11 9:32 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-12-05 9:12 [pbs-devel] [PATCH proxmox-backup] fix #5967: datastore lookup: check for disappeared datastores Fabian Grünbichler
2024-12-06 14:03 ` Christian Ebner
2024-12-11 9:32 ` Fabian Grünbichler
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox