public inbox for pve-devel@lists.proxmox.com
From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC storage] zfspoolplugin: check if mounted instead of imported
Date: Thu, 18 Feb 2021 12:33:28 +0100	[thread overview]
Message-ID: <20210218113328.14488-1-s.ivanov@proxmox.com> (raw)

This patch addresses an issue we recently saw on a production machine:
* after booting a ZFS pool failed to get imported (due to an empty
  /etc/zfs/zpool.cache)
* pvestatd/guest-startall eventually tried to import the pool
* the pool was imported, yet the datasets of the pool remained
  not mounted

A bit of debugging showed that `zpool import <poolname>` is not
atomic: it forks and execs `mount` with the appropriate parameters.
If an import runs longer than the hardcoded timeout of 15s, it can
happen that the pool gets imported, but the zpool command (and the
processes it forked) is terminated due to the timeout.

Reproducing this is straightforward: set (drastic) bandwidth and IOPS
limits on a guest's disk (which contains a zpool), e.g.:
`qm set 100 -scsi1 wd:vm-100-disk-1,iops_rd=10,iops_rd_max=20,\
iops_wr=15,iops_wr_max=20,mbps_rd=10,mbps_rd_max=15,mbps_wr=10,mbps_wr_max=15`
Afterwards, running `timeout 15 zpool import <poolname>` inside the
guest reproduced the situation on my machine.

The patch changes the check in activate_storage for the ZFSPoolPlugin,
to check if the configured 'pool' (which can also be a sub-dataset) is
mounted (via `zfs get mounted <dataset>`).

* In case this errors out we try to import the pool (and run `zfs
  mount -a` if the dataset is not mounted after importing)
* In case it succeeds but reports the dataset as not mounted,
  we run `zfs mount -a`
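
In pseudo-code, the two cases above amount to the following flow (a
Python sketch for illustration only, not the Perl from the patch; the
three callables stand in for the actual zfs/zpool invocations, and the
recheck for a racing import on failure is omitted for brevity):

```python
def activate_storage(get_mounted, zpool_import, zfs_mount_all):
    """Sketch of the patched activation flow.

    get_mounted():   returns "yes"/"no"; raises if `zfs get mounted` fails
                     (e.g. because the pool is not imported)
    zpool_import():  `zpool import -d /dev/disk/by-id/ -o cachefile=none <pool>`
    zfs_mount_all(): `zfs mount -a`
    """
    try:
        mounted = get_mounted()
    except RuntimeError:
        # `zfs get mounted` errored out - import the pool, then recheck
        zpool_import()
        mounted = get_mounted()
    if mounted != "yes":
        # imported (or freshly imported) but not mounted - mount everything
        zfs_mount_all()
    return True
```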

The choice of running `zfs mount -a` has potential for regression in
case someone manually unmounts a dataset after booting the system
(but does not set the canmount option).

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
* sending as RFC since I have the feeling that I'm missing quite a few
  corner cases.
* an alternative to getting the mounted property via `zfs get` might be
  parsing /proc/mounts (but `zfs get mounted` mostly does just that, plus
  stat'ing files in /sys and two (potentially blocking) ioctls)
* I could reproduce the issue with ZFS 2.0.1 and 0.8.6 - so my current
  guess is that the issue on the production box might be related to some
  other performance regression in its disk subsystem (maybe introduced
  with the new kernel)
* huge thanks to Dominik for many suggestions (including the idea of
  limiting the disk-speed via our stack instead of dm-delay :)
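
For the /proc/mounts alternative mentioned above, a minimal sketch of
the check (Python for illustration, hypothetical helper name; octal
escapes like \040 in mount paths are ignored for brevity):

```python
def dataset_mounted(dataset, mounts_file="/proc/mounts"):
    """Return True if `dataset` (e.g. "tank/subvol") appears as the
    source of a mounted zfs filesystem in /proc/mounts."""
    with open(mounts_file) as f:
        for line in f:
            fields = line.split()
            # /proc/mounts format: source mountpoint fstype options dump pass
            if len(fields) >= 3 and fields[2] == "zfs" and fields[0] == dataset:
                return True
    return False
```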

 PVE/Storage/ZFSPoolPlugin.pm | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/PVE/Storage/ZFSPoolPlugin.pm b/PVE/Storage/ZFSPoolPlugin.pm
index 105d802..32ad1c9 100644
--- a/PVE/Storage/ZFSPoolPlugin.pm
+++ b/PVE/Storage/ZFSPoolPlugin.pm
@@ -525,8 +525,17 @@ sub activate_storage {
     my ($class, $storeid, $scfg, $cache) = @_;
 
     # Note: $scfg->{pool} can include dataset <pool>/<dataset>
-    my $pool = $scfg->{pool};
-    $pool =~ s!/.*$!!;
+    my $dataset = $scfg->{pool};
+    my $pool = ($dataset =~ s!/.*$!!r);
+
+    my $dataset_mounted = sub {
+	my $mounted = eval { $class->zfs_get_properties($scfg, 'mounted', "$dataset") };
+	if ($@) {
+	    warn "$@\n";
+	    return undef;
+	}
+	return defined($mounted) && $mounted =~ m/^yes$/;
+    };
 
     my $pool_imported = sub {
 	my @param = ('-o', 'name', '-H', "$pool");
@@ -538,13 +547,21 @@ sub activate_storage {
 	return defined($res) && $res =~ m/$pool/;
     };
 
-    if (!$pool_imported->()) {
+    if (!defined($dataset_mounted->())) {
+	# error from zfs get mounted - pool import necessary
 	# import can only be done if not yet imported!
 	my @param = ('-d', '/dev/disk/by-id/', '-o', 'cachefile=none', "$pool");
 	eval { $class->zfs_request($scfg, undef, 'zpool_import', @param) };
 	if (my $err = $@) {
 	    # just could've raced with another import, so recheck if it is imported
-	    die "could not activate storage '$storeid', $@\n" if !$pool_imported->();
+	    die "could not activate storage '$storeid', $err\n" if !$pool_imported->();
+	}
+	if (!$dataset_mounted->()) {
+	    eval { $class->zfs_request($scfg, undef, 'mount', '-a') };
+	    if (my $err = $@) {
+		# the dataset is still not mounted after the import - mount failure is fatal
+		die "could not activate storage '$storeid', $err\n";
+	    }
 	}
     }
     return 1;
-- 
2.20.1

Thread overview: 5+ messages
2021-02-18 11:33 Stoiko Ivanov [this message]
2021-02-18 12:20 ` Thomas Lamprecht
2021-02-18 13:15   ` Stoiko Ivanov
2021-02-18 13:34     ` Thomas Lamprecht
2021-02-18 13:37       ` Thomas Lamprecht
