From: Timo Veith <timo.veith@uni-tuebingen.de>
To: Mira Limbeck <m.limbeck@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Date: Thu, 24 Apr 2025 22:27:25 +0200
Subject: Re: [pve-devel] iscsi and multipathing

> On 18.04.2025 at 10:45, Mira Limbeck <m.limbeck@proxmox.com> wrote:
>
> On 4/15/25 16:10, Timo Veith wrote:
>> Hello Mira,
>>
>> thank you very much for your reply.
>>
>>> On 15.04.2025 at 11:09, Mira Limbeck <m.limbeck@proxmox.com> wrote:
>>>
>>> Hi Timo,
>>>
>>> At the moment I'm working on storage mapping support for iSCSI.
>>> This would allow one to configure different portals on each of the hosts
>>> that are logically the same storage.
>>>
>>> If you tried setting up a storage via iSCSI where each host can only
>>> access a part of the portals which are announced, you probably noticed
>>> some higher pvestatd update times.
>>> The storage mapping implementation will alleviate those issues.
>>>
>>> Other than that I'm not aware of anyone working on iSCSI improvements at
>>> the moment.
>>> We do have some open enhancement requests in our bug tracker [0]. One of
>>> which is yours [1].
>>
>> From the list [0] you mentioned, iSCSI CHAP credentials in the GUI are something we are interested in too.
> This is probably a bit more work to implement with the current way the
> plugin works.
> Since the discoverydb is recreated constantly, you would have to set the
> credentials before each login. Or pass them to iscsiadm as options,
> which needs to make sure that no sensitive information is logged on error.

Since you mention the discoverydb: I must admit that I have never needed ``iscsiadm -m discoverydb`` to configure a storage connection; I have always used plain ``iscsiadm -m discovery``. oVirt/RHV handles this with the help of a PostgreSQL database: all storage connections and their credentials are stored there on the management server. PVE does not have such a server, but it does have the distributed ``/etc/pve`` directory, so the idea comes to mind to store CHAP credentials there as well. There is already the ``/etc/pve/priv`` directory, which holds sensitive data, so that might be a good place for CHAP credentials too: available for iSCSI logins on all nodes, just like ``/etc/pve/storage.cfg``.
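
Just to make the idea more concrete, here is a rough sketch of what a node would have to do if the CHAP secret lived in a file under ``/etc/pve/priv`` (the file name, target and portal are made up for illustration; this is not how the ISCSIPlugin works today):

#!/usr/bin/env python3
# Illustration only: apply CHAP credentials kept on the cluster filesystem
# to an open-iscsi node record before logging in.
# /etc/pve/priv/iscsi-chap.json is a hypothetical location, not an existing
# PVE convention; target and portal are examples.
import json
import subprocess

CRED_FILE = "/etc/pve/priv/iscsi-chap.json"
TARGET = "iqn.2001-05.com.example:tgt0"
PORTAL = "192.0.2.10:3260"

def iscsiadm(*args):
    # Suppress stdout so nothing sensitive ends up in logs on success;
    # note that passing the secret as an argument is still visible in the
    # process list, which a real implementation would have to avoid.
    subprocess.run(["iscsiadm", *args], check=True, stdout=subprocess.DEVNULL)

with open(CRED_FILE) as f:
    cred = json.load(f)[TARGET]   # {"username": "...", "password": "..."}

# Write the CHAP settings into the node record, then log in.
for key, value in [
    ("node.session.auth.authmethod", "CHAP"),
    ("node.session.auth.username", cred["username"]),
    ("node.session.auth.password", cred["password"]),
]:
    iscsiadm("-m", "node", "-T", TARGET, "-p", PORTAL,
             "--op=update", "-n", key, "-v", value)

iscsiadm("-m", "node", "-T", TARGET, "-p", PORTAL, "--login")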
>
>>
>>> Regarding multipath handling via the GUI there hasn't been much of a
>>> discussion on how we could tackle that yet. It is quite easy to set up
>>> [2] the usual way.
>>
>> I know that it is easy, because otherwise I wouldn't have been able to configure it ;)
>>
>>> Sorry, I might have missed your bug report previously, so I'll go into a
>>> bit more detail here. (I'll add that information to the enhancement
>>> request as well)
>>>
>>>> When adding iSCSI storage to the data center there could be the possibility
>>>> to run an iSCSI discovery multiple times against different portal IPs and
>>>> thus get multiple paths to an iSCSI SAN.
>>>
>>> That's already the default. For each target we run the discovery on at
>>> least one portal since it should announce all other portals. We haven't
>>> encountered a setup where that is not the case.
>>
>> I am dealing only with setups that do not announce their portals; I have to run an iSCSI discovery for every portal IP address. These are mostly Infortrend iSCSI SAN systems, but also some from Huawei. I think I know what you mean, though: some storage devices return all portals when you run a discovery against one of their IP addresses.
>> However, it would be great to have the possibility to enter multiple portal IP addresses in the web UI, together with CHAP credentials.
> I tried just allowing multiple portals, and it didn't scale well.
> For setups where each host has access to the same portals and targets,
> it already works nicely the way it currently is.
> But for asymmetric setups where each host can only connect to different
> portals, and maybe different targets altogether, it doesn't bring any
> benefit.
>
> That's the reason I'm currently working on a `storage mapping` solution
> where you can specify host-specific portals and targets, that all map to
> the same `logical` storage.
>
> Do your SANs provide the same target on all portals, or is it always a
> different target for each portal?

What exactly do you mean by "it didn't scale well"?

It may work nicely, but only if you don't need to run a discovery more than once to get all portal/target records. For our SANs, "one discovery per portal" looks roughly like the sketch below.
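
This is only an illustration of the client-side behaviour, with made-up portal addresses:

#!/usr/bin/env python3
# Sketch: run one sendtargets discovery per configured portal and merge the
# results, for SANs that do not announce all of their portals themselves.
import subprocess

PORTALS = ["192.0.2.10:3260", "192.0.2.11:3260"]   # example portal IPs

def discover(portal):
    """Return the set of (portal, target) records announced by one portal."""
    out = subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        check=True, capture_output=True, text=True).stdout
    records = set()
    for line in out.splitlines():
        # Each line looks like "<ip>:<port>,<tpgt> <target-iqn>".
        portal_part, target = line.split()
        records.add((portal_part.split(",")[0], target))
    return records

all_records = set()
for portal in PORTALS:
    all_records |= discover(portal)

for portal, target in sorted(all_records):
    print(f"path: {portal} -> {target}")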
I have put this on my todo list and will ask our storage vendor's support whether the SANs can be configured so that they announce all their portals with one discovery. But what if they say that it is not possible at all and you have to run the discovery once for each portal IP?

Asymmetric setups? That sounds a bit odd, if you allow me to say so. Why would one need that? If you have a virtualization cluster, you very probably want VM live migration across all of your cluster nodes; an asymmetric setup would only allow it between the nodes that belong to the same sub-group. Is that correct? Anyway, the same as above applies here too: it would bring the benefit that you can configure multiple paths by running more than one discovery.

Maybe your `storage mapping` solution would solve that problem too.

I am also thinking about providing you with screenshots of the `add storage` dialog of oVirt/RHV and the output of iscsiadm commands against our SAN to show you what I mean. If you want to see those, I could put them on a public share or a web page.

>
>>
>>>> multipathd should be updated with the paths to the LUNs. The user
>>>> would/could only need to add vendor-specific device configs
>>>> like ALUA or multibus settings.
>>>
>>> For now that has to be done manually. There exists a multipath.conf
>>> setting that automatically creates a multipath mapping for devices that
>>> have at least 2 paths available: `find_multipaths yes` [3].
>>
>> I will test `find_multipaths yes`. If I understand you correctly, the command `multipath -a <wwid>` will then no longer be necessary, just as described in the multipath wiki article [2].

I have tested that, and it works as expected. Thank you for pointing me to it!

>>
>>>> Then, when adding a certain disk to a VM, it would be good if its WWN
>>>> were displayed instead of e.g. "CH 00 ID0 LUN0", so it would be
>>>> easier to identify the right one.
>>>
>>> That would be a nice addition. And shouldn't be too hard to extract that
>>> information in the ISCSIPlugin and provide it as additional information
>>> via the API.
>>> That information could also be listed in the `VM Disks` page of iSCSI
>>> storages.
>>> Would you like to tackle that?
>>
>> Are you asking me to provide the code for that?
> Since you mentioned `If there are any, what are they, what is their
> status and can they be supplemented or contributed to?` I assumed you
> were willing to contribute code as well. That's why I asked if you
> wanted to tackle that improvement.

We would like to contribute code, but we do not yet have a colleague with Proxmox VE development skills in our team. We are currently looking for reinforcements who could contribute code, but that seems to take more time; maybe it is even faster to switch to a different storage protocol in the meantime. So far we ourselves can only provide ideas and tests. Those ideas come from many years of using oVirt/RHV and from our trials of switching over to Proxmox VE.
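
To at least sketch the WWN idea: on a node, the WWID of each LUN behind a target can be read from sysfs via the udev by-path links, roughly like this (the target IQN is an example, and this is only an illustration, not ISCSIPlugin code):

#!/usr/bin/env python3
# Sketch: list the LUNs of an iSCSI target together with their WWIDs,
# which could be shown in the UI instead of "CH 00 ID0 LUN0".
from pathlib import Path

TARGET = "iqn.2001-05.com.example:tgt0"   # example target

for link in sorted(Path("/dev/disk/by-path").glob(f"*-iscsi-{TARGET}-lun-*")):
    if "-part" in link.name:                        # skip partition links
        continue
    dev = link.resolve().name                       # e.g. "sdb"
    wwid = Path(f"/sys/block/{dev}/device/wwid").read_text().strip()
    lun = link.name.rsplit("-lun-", 1)[1]
    print(f"LUN {lun}: {wwid} ({dev})")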
>
>>
>>>> Also, when the LUN size has been grown on the storage side, it would
>>>> be handy to have a button in the PVE web GUI to "refresh" the disk in
>>>> the VM. The new size should be reflected in the hardware details of
>>>> the VM, and the QEMU process should be informed of the new disk size
>>>> so the VM would not have to be shut down and restarted.
>>>
>>> Based on experience, I doubt it would be that easy. Refreshing of the
>>> LUN sizes involves the SAN, the client, multipath and QEMU. There's
>>> always at least one place where it doesn't update even with
>>> `rescan-scsi-bus.sh`, `multipath -r`, etc.
>>> If you have a reliable way to make all sides agree on the new size,
>>> please let us know.
>>
>> Don't get me wrong, I didn't mean that it should be possible to resize an iSCSI disk right from the PVE web GUI. I meant that once a LUN has been resized on the SAN side with whatever steps are necessary there (e.g. with Infortrend you need to log in to the management software, find the LUN and resize it), refreshing that new size could be triggered by a button in the PVE web GUI. When the button is pressed, an iSCSI rescan of the corresponding iSCSI session would have to be done, then a multipath map rescan like you wrote, and finally a QEMU block device refresh (and/or the equivalent for an LXC container).
>>
>> Even if I do all of that manually, the size of the LUN in the hardware details of the VM is not being updated.
>>
>> I personally do not know how, but at least I know that it is possible in oVirt/RHV.
> We've seen some setups in our enterprise support where none of the above
> mentioned commands helped after a resize. The host still saw the old
> size. Only a reboot helped.
> So that's going to be difficult to do for all combinations of hardware
> and software.
>
> Do you have a reliable set of commands that work in all your cases of a
> resize, so that the host sees the correct size, and multipath resizes
> reliably?

I must admit that I have only tried a LUN resize a single time with Proxmox VE. I resized the LUN on the Infortrend SAN, then logged into the PVE node and issued ``iscsiadm -m session -R``, followed by ``multipath -r``. And then, as I couldn't remember the block refresh command for QEMU, I simply stopped the test VM and started it again. So I can only speak for this combination of hardware and software with PVE. Now that I write this, I think I am mixing it up with the virsh command ``virsh blockresize <domain> <fully-qualified path of block device> --size <new volume size>``. That is not available on PVE, but there must be a QEMU equivalent to it.
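
For what it is worth, the sequence I would expect such a refresh button to run is roughly the following. This is only a sketch: the VM ID, drive ID and new size are examples, and I am assuming that the QMP socket under ``/var/run/qemu-server/<vmid>.qmp`` can be used for QEMU's ``block_resize`` command, which should be the equivalent of ``virsh blockresize``:

#!/usr/bin/env python3
# Sketch of the refresh sequence a "rescan LUN size" button could run.
import json
import socket
import subprocess

VMID = 100                      # example VM
DRIVE = "drive-scsi1"           # example QEMU drive id of the LUN
NEW_SIZE = 6 * 1024**4          # new LUN size in bytes (6 TiB)

# 1. Rescan all iSCSI sessions so the kernel sees the new LUN size.
subprocess.run(["iscsiadm", "-m", "session", "-R"], check=True)

# 2. Reload the multipath maps so the mpath device grows as well.
subprocess.run(["multipath", "-r"], check=True)

# 3. Tell the running QEMU instance about the new size via QMP.
with socket.socket(socket.AF_UNIX) as s:
    s.connect(f"/var/run/qemu-server/{VMID}.qmp")
    f = s.makefile("rw")
    f.readline()                                 # QMP greeting banner

    def qmp(cmd, **arguments):
        f.write(json.dumps({"execute": cmd, "arguments": arguments}) + "\n")
        f.flush()
        while True:                              # skip async events
            msg = json.loads(f.readline())
            if "return" in msg or "error" in msg:
                return msg

    qmp("qmp_capabilities")
    print(qmp("block_resize", device=DRIVE, size=NEW_SIZE))

Interactively, ``qm monitor <vmid>`` with its ``block_resize`` command should achieve the same. Whether step 1 reliably reports the new size on every SAN is, I suppose, exactly the problem you describe.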
At least the new size should be updated in the PVE web GUI when a LUN has been resized. It is simply wrong when the LUN size has changed from e.g. 5 to 6 TB but the GUI still shows 5 TB, right?

On the other hand, as I already said, I can prove that oVirt/RHV can do it. We have used oVirt/RHV together with Nimble, Huawei, NetApp and Infortrend DS/GS systems and one TrueNAS Core storage system.

We have looked for the code that implements this in oVirt/RHV and found this repo [4]. The folder ``lib/vdsm/storage/`` holds iscsi.py, hsm.py and multipath.py.

But I am too inexperienced at reading code that is split across modules and libraries. As already said, I can provide screenshots and command outputs that prove that it works. We could also do a video call with a live session on this.

>>
>>> [0] https://bugzilla.proxmox.com/buglist.cgi?bug_severity=enhancement&list_id=50969&resolution=---&short_desc=iscsi&short_desc_type=allwordssubstr
>>> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=6133
>>> [2] https://pve.proxmox.com/wiki/Multipath
>>> [3] https://manpages.debian.org/bookworm/multipath-tools/multipath.conf.5.en.html

[4] https://github.com/oVirt/vdsm

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel