From: Timo Veith <timo.veith@uni-tuebingen.de>
Message-Id: <C55F8DDF-BEEB-42B7-881D-5B18583322EF@uni-tuebingen.de>
Subject: Re: [pve-devel] iscsi and multipathing
Date: Thu, 24 Apr 2025 22:27:25 +0200
In-Reply-To: <93ffb740-7235-419b-a1ce-aab7cca36a64@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
To: Mira Limbeck <m.limbeck@proxmox.com>
References: <mailman.879.1744219341.359.pve-devel@lists.proxmox.com>
 <b19fc3b6-563c-4d0e-a766-78dd3bb28804@proxmox.com>
 <BE392B97-179A-4168-A3F0-B8ED4EF46907@uni-tuebingen.de>
 <93ffb740-7235-419b-a1ce-aab7cca36a64@proxmox.com>

> On 18.04.2025 at 10:45, Mira Limbeck <m.limbeck@proxmox.com> wrote:
> 
> On 4/15/25 16:10, Timo Veith wrote:
>> Hello Mira,
>> 
>> thank you very much for your reply.
>> 
>>> On 15.04.2025 at 11:09, Mira Limbeck <m.limbeck@proxmox.com> wrote:
>>> 
>>> Hi Timo,
>>> 
>>> At the moment I'm working on storage mapping support for iSCSI.
>>> This would allow one to configure different portals on each of the hosts
>>> that are logically the same storage.
>>> 
>>> If you tried setting up a storage via iSCSI where each host can only
>>> access a part of the portals which are announced, you probably noticed
>>> some higher pvestatd update times.
>>> The storage mapping implementation will alleviate those issues.
>>> 
>>> Other than that I'm not aware of anyone working on iSCSI improvements at
>>> the moment.
>>> We do have some open enhancement requests in our bug tracker [0]. One of
>>> which is yours [1].
>> 
>> From the list [0] you mentioned, iSCSI CHAP credentials in the GUI is something we are interested in too.
> This is probably a bit more work to implement with the current way the
> plugin works.
> Since the discoverydb is recreated constantly, you would have to set the
> credentials before each login. Or pass them to iscsiadm as options,
> which needs to make sure that no sensitive information is logged on error.

Since you mention the discoverydb: I must admit that I have never needed
``iscsiadm -m discoverydb`` to configure a storage connection; I have always
used only ``iscsiadm -m discovery``. oVirt/RHV handles this with a PostgreSQL
database on the management server, in which all storage connections and their
credentials are stored. As PVE doesn't have such a server, but does have the
distributed ``/etc/pve`` directory, the idea comes to mind to store CHAP
credentials there too. There is already the ``/etc/pve/priv`` directory, which
holds sensitive data, so that might be a good place for CHAP credentials as
well, available for iSCSI logins on all nodes, just like
``/etc/pve/storage.cfg``.
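
Just to illustrate what I have in mind, a minimal sketch of a per-node CHAP
login that reads credentials from ``/etc/pve/priv`` (the file layout under
that directory and the placeholders are purely my assumption, not an existing
PVE convention):

    # hypothetical: apply CHAP credentials kept in /etc/pve/priv before login
    iscsiadm -m node -T <target-iqn> -p <portal-ip> \
        -o update -n node.session.auth.authmethod -v CHAP
    iscsiadm -m node -T <target-iqn> -p <portal-ip> \
        -o update -n node.session.auth.username -v "$(cat /etc/pve/priv/iscsi/<storeid>.user)"
    iscsiadm -m node -T <target-iqn> -p <portal-ip> \
        -o update -n node.session.auth.password -v "$(cat /etc/pve/priv/iscsi/<storeid>.pw)"
    iscsiadm -m node -T <target-iqn> -p <portal-ip> --login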


> 
>> 
>>> 
>>> Regarding multipath handling via the GUI there hasn't been much of a
>>> discussion on how we could tackle that yet. It is quite easy to set up
>>> [2] the usual way.
>> 
>> I know that it is easy, because otherwise I wouldn't have been able to configure it ;)
>> 
>> 
>>> 
>>> 
>>> Sorry, I might have missed your bug report previously, so I'll go into a
>>> bit more detail here. (I'll add that information to the enhancement
>>> request as well)
>>> 
>>>> When adding iSCSI storage to the data center there could be the possibility
>>>> to do an iSCSI discovery multiple times against different portal IPs and
>>>> thus get multiple paths to an iSCSI SAN.
>>> 
>>> That's already the default. For each target we run the discovery on at
>>> least one portal since it should announce all other portals. We haven't
>>> encountered a setup where that is not the case.
>> 
>> I am dealing only with setups that do not announce their portals. I have to do an iSCSI discovery for every portal IP address. Those are mostly Infortrend iSCSI SAN systems, but also some from Huawei. But I think I know what you mean: some storage devices give you all portals when you do a discovery against one of their IP addresses.
>> However, it would be great to have a possibility to enter multiple portal IP addresses in the web UI, together with CHAP credentials.
> I tried just allowing multiple portals, and it didn't scale well.
> For setups where each host has access to the same portals and targets,
> it already works nicely the way it currently is.
> But for asymmetric setups where each host can only connect to different
> portals, and maybe different targets altogether, it doesn't bring any
> benefit.
> 
> That's the reason I'm currently working on a `storage mapping` solution
> where you can specify host-specific portals and targets, that all map to
> the same `logical` storage.
> 
> Do your SANs provide the same target on all portals, or is it always a
> different target for each portal?

What exactly do you mean by "it didn't scale well"?

It may work nicely, but only if you don't need to do more than one discovery
to get all portal/target records. I have put this on my todo list and will ask
our storage vendor's support whether the SANs can be configured to announce
all their portals in a single discovery. But what if they say that this is not
possible at all and the discovery has to be done once per portal IP?
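
For our SANs the workflow currently looks roughly like this (the portal
addresses are just example values):

    # run a sendtargets discovery once per portal, because the SAN does not
    # announce its other portals by itself
    for portal in 192.0.2.11 192.0.2.12 192.0.2.13 192.0.2.14; do
        iscsiadm -m discovery -t sendtargets -p "$portal"
    done
    # then log in to all discovered node records
    iscsiadm -m node --login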

Asymmetric setups? That sounds a bit odd, if you allow me to say so. Why would
one need that? If you have a virtualization cluster, you very probably want VM
live migration between all of your cluster nodes. An asymmetric setup would
only allow it between the nodes that are part of the same subgroup, is that
correct? Anyway, the same as above applies here too: it would bring the
benefit of being able to configure multiple paths by doing more than one
discovery.

Maybe your `storage mapping` solution would solve that problem as well.


I am also thinking about providing screenshots of the `add storage` dialog of
ovirt/RHV and the output of iscsiadm commands against our SAN, to show you
what I mean. If you want to see those, I could put them on a public share or a
web page.


> 
>> 
>>> 
>>>> multipathd should be updated with the paths to the LUNs. The user
>>>> would/could only need to add vendor-specific device configs
>>>> like alua or multibus settings.
>>> 
>>> For now that has to be done manually. There exists a multipath.conf
>>> setting that automatically creates a multipath mapping for devices that
>>> have at least 2 paths available: `find_multipaths yes` [3].
>> 
>> I will test `find_multipaths yes`. If I understand you correctly, the command `multipath -a <wwid>` will then not be necessary, just like written in the multipath wiki article [2].

I have tested that, and it works as expected. Thank you for pointing it out!
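
For reference, the relevant part of the multipath.conf I tested with now looks
roughly like this (the vendor/product strings and the alua settings are just
what I use for our Infortrend arrays, so treat them as an example, not a
recommendation):

    defaults {
        # create a multipath map automatically once a device has >= 2 paths,
        # so `multipath -a <wwid>` is no longer needed
        find_multipaths yes
    }
    devices {
        device {
            vendor  "IFT"
            product ".*"
            path_grouping_policy group_by_prio
            prio    alua
            failback immediate
        }
    }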



>> 
>>> 
>>>> Then when adding a certain disk to a VM, it would be good if its WWN
>>>> would be displayed instead of e.g. "CH 00 ID0 LUN0", so it would be
>>>> easier to identify the right one.
>>> 
>>> That would be a nice addition. And shouldn't be too hard to extract that
>>> information in the ISCSIPlugin and provide it as additional information
>>> via the API.
>>> That information could also be listed in the `VM Disks` page of iSCSI
>>> storages.
>>> Would you like to tackle that?
>> 
>> Are you asking me to provide the code for that?
> Since you mentioned `If there are any, what are they, what is their
> status and can they be supplemented or contributed to?` I assumed you
> were willing to contribute code as well. That's why I asked if you
> wanted to tackle that improvement.

We would like to contribute code, but we do not yet have a colleague with
Proxmox VE development skills in our team. We are currently looking for
reinforcements who could contribute code, but that seems to take more time;
maybe it is even faster to switch to a different storage protocol in the
meantime. So far we ourselves can only provide ideas and tests. Those ideas
come from many years of using ovirt/RHV and from our attempts to switch over
to Proxmox VE.
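
Coming back to the WWN idea: on the shell we currently identify the right LUN
by its WWN roughly like this (``/dev/sdX`` is just an example device node):

    # query the WWN/WWID of a SCSI device
    /lib/udev/scsi_id --whitelisted --device=/dev/sdX
    # or look at the persistent names udev already creates
    ls -l /dev/disk/by-id/ | grep wwn-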


> 
>> 
>>> 
>>>> Also when a LUN has been grown on the storage side,
>>>> it would be handy to have a button in the PVE web GUI to "refresh" the
>>>> disk in the VM. The new size should be reflected in the hardware
>>>> details of the VM, and the QEMU process should be informed of the new
>>>> disk size so the VM would not have to be shut down and restarted.
>>> 
>>> Based on experience, I doubt it would be that easy. Refreshing of the
>>> LUN sizes involves the SAN, the client, multipath and QEMU. There's
>>> always at least one place where it doesn't update even with
>>> `rescan-scsi-bus.sh`, `multipath -r`, etc.
>>> If you have a reliable way to make all sides agree on the new size,
>>> please let us know.
>> 
>> Don't get me wrong, I didn't mean that it should be possible to resize an iSCSI disk right from the PVE web GUI. I meant that once someone has grown a LUN on the SAN side with whatever steps are necessary there (e.g. with Infortrend you need to log in to the management software, find the LUN and resize it), the refreshing of that new size could be triggered by a button in the PVE web GUI. When pressing the button, an iSCSI rescan of the corresponding iSCSI session would have to be done, then a multipath map rescan like you wrote, and eventually a QEMU block device refresh (and/or the equivalent for an LXC container).
>> 
>> Even if I do all of that manually, the size of the LUN in the hardware details of the VM is not being updated.
>> 
>> I personally do not know how, but at least I know that it is possible in ovirt/RHV.
> We've seen some setups in our enterprise support where none of the above
> mentioned commands helped after a resize. The host still saw the old
> size. Only a reboot helped.
> So that's going to be difficult to do for all combinations of hardware
> and software.
> 
> Do you have a reliable set of commands that work in all your cases of a
> resize, so that the host sees the correct size, and multipath resizes
> reliably?

I must admit that I have tried a LUN resize only a single time with Proxmox
VE. I resized the LUN on the Infortrend SAN, then logged into the PVE node and
issued ``iscsiadm -m session -R`` followed by ``multipath -r``. And then, as I
couldn't remember the block refresh command for QEMU, I just stopped the test
VM and started it again. So I can only speak for this combination of hardware
and software with PVE. Now that I write this, I think I am mixing it up with
the virsh command ``virsh blockresize <domain> <fully-qualified path of block
device> --size <new volume size>``. That is not available on PVE, but there
must be a QEMU equivalent.
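
If I had to guess, the per-node sequence such a "refresh" button would have to
run looks something like this (the QEMU step is only my assumption of what the
``virsh blockresize`` equivalent could be; the drive name and exact syntax are
guesses I have not verified):

    # rescan all iSCSI sessions so the kernel sees the new LUN size
    iscsiadm -m session -R
    # resize the multipath maps on top of the rescanned paths
    multipath -r
    # presumably the running QEMU process then needs a block_resize,
    # e.g. via the monitor of the VM in question
    qm monitor <vmid>
    qm> block_resize drive-scsi0 6T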

At the very least, the new size should be updated in the PVE web GUI after a
LUN has been resized. It is simply wrong when the LUN size changed from e.g. 5
to 6 TB but the GUI still shows 5 TB, right?

On the other hand, as I already said, I can prove that ovirt/RHV can do it. We
have used ovirt/RHV together with Nimble, Huawei, NetApp and Infortrend DS/GS
systems and one TrueNAS Core storage system.

We have looked for the code that implements this in ovirt/RHV and found this
repo [4]. The folder ``lib/vdsm/storage/`` holds iscsi.py, hsm.py, and
multipath.py.

But I am too inexperienced at reading code that is split across modules and
libraries. As already said, I can provide screenshots and command outputs that
prove that it is working. We could also do a video call with a live session on
this.



>> 
>>> 
>>> [0] https://bugzilla.proxmox.com/buglist.cgi?bug_severity=enhancement&list_id=50969&resolution=---&short_desc=iscsi&short_desc_type=allwordssubstr
>>> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=6133
>>> [2] https://pve.proxmox.com/wiki/Multipath
>>> [3] https://manpages.debian.org/bookworm/multipath-tools/multipath.conf.5.en.html

[4] https://github.com/oVirt/vdsm



