From: Ivaylo Markov via pve-devel <pve-devel@lists.proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
"Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Cc: Ivaylo Markov <ivaylo.markov@storpool.com>, nikolay.angelov@storpool.com
Subject: Re: [pve-devel] StorPool storage plugin concerns
Date: Tue, 18 Feb 2025 13:34:47 +0200
Message-ID: <mailman.354.1739878496.293.pve-devel@lists.proxmox.com>
In-Reply-To: <709714906.7353.1739533336075@webmail.proxmox.com>
From: Ivaylo Markov <ivaylo.markov@storpool.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>, "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Cc: nikolay.angelov@storpool.com
Subject: Re: [pve-devel] StorPool storage plugin concerns
Date: Tue, 18 Feb 2025 13:34:47 +0200
Message-ID: <d1da3bd4-a717-4b74-bb67-60618d127f06@storpool.com>
Hello,
On 14/02/2025 13:42, Fabian Grünbichler wrote:
> AFAICT from the description above (not looking at code or actually testing anything), issues on your storage layer should be ruled out. But it still leaves issues with anything else, e.g. any long running task (either by PVE, or by the admin) that involves a HA-managed guest is at risk of being "split-brained". In a regular (HA) setup, another node will only recover the config (and thus ownership) of the guest once the requisite timeouts have passed, which means it *knows* the failed node must have fenced itself. In your setup, this is not the case anymore - the non-quorate node still has the VM config (since it is not quorate, it cannot notice the "theft" of the config by the HA stack running on the quorate partition of the cluster) and thus (from a local point of view) at least RO ownership of that guest. Depending on the sequence of events, such a task might have passed a quorum check earlier and not yet reached the next such check, and thus even think it still has full ownership and act accordingly! Obviously, writes to your shared storage or to /etc/pve would be blocked, but that doesn't mean that nothing dangerous can happen (e.g., local or external state being corrupted or running out of sync by writes on/from two different nodes).
>
> The only way to make this safe(r) would be to basically disallow any custom integration (to ensure no non-PVE tasks are running) and kill the whole PVE stack on quorum loss, including any spawned tasks and pmxcfs. At that point, all the configs and API would become unavailable as well, so the risk of something/somebody misinterpreting anything should become zero - if there is no information, nothing can be misinterpreted after all ;) This would basically mean "downgrading" a PVE+StorPool node to a StorPool node on quorum loss, which is your intended semantics (I think?).
>
> This approach does come with a new problem though - once this node rejoins the cluster, you'd need to bring up all of the PVE stack again in an orderly fashion.
>
> I hope the above explains why and how PVE is using self-fencing via watchdogs, and the implications of disabling that while keeping HA "enabled". If something is unclear or you have more questions, please reach out!
>
Thank you for the detailed feedback and helpful explanations. Your
suggestion is essentially what we had in mind with the "automatic
recovery" idea, and it seems like the correct direction for the watchdog
after separating it from the plugin.
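
For illustration only (this is not StorPool's or Proxmox's implementation),
the "downgrade on quorum loss, orderly bring-up on rejoin" behaviour
discussed above could be sketched as a small state machine. The unit names
and their ordering are assumptions made for the example, and the function
only plans actions; it does not execute anything.

```python
from enum import Enum


class NodeMode(Enum):
    FULL = "pve+storpool"      # whole PVE stack is running
    STORAGE_ONLY = "storpool"  # PVE stack stopped after quorum loss


# Assumed start order (hypothetical): the unit providing pmxcfs comes first,
# because the other services read their configuration from /etc/pve.
START_ORDER = ["pve-cluster", "pvedaemon", "pveproxy", "pve-ha-lrm"]


def plan(mode, quorate):
    """Return (new_mode, actions) for one observation of the quorum state."""
    if mode is NodeMode.FULL and not quorate:
        # Quorum lost: kill spawned tasks first, then stop the stack in
        # reverse start order so pmxcfs goes down last.
        actions = [("kill-tasks", "all")]
        actions += [("stop", unit) for unit in reversed(START_ORDER)]
        return NodeMode.STORAGE_ONLY, actions
    if mode is NodeMode.STORAGE_ONLY and quorate:
        # Node has rejoined: bring the stack back up in start order.
        return NodeMode.FULL, [("start", unit) for unit in START_ORDER]
    # No transition needed.
    return mode, []
```

Calling plan(NodeMode.FULL, False) yields the storage-only mode together
with the ordered stop actions; feeding that mode back in with quorate=True
yields the ordered start actions.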
Best regards,
--
Ivaylo Markov
Quality & Automation Engineer
StorPool Storage
https://www.storpool.com