From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Message-ID: <fa51f20a-55f5-04f1-1dbd-48052effd053@proxmox.com>
Date: Wed, 9 Mar 2022 08:31:08 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.6.2
Content-Language: en-US
To: Mark Schouten <mark@tuxis.nl>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <20211123115949.2462727-1-f.ebner@proxmox.com>
 <5CC63593-424B-4439-93FB-BFFD6B087952@tuxis.nl>
 <eaf567ab-0e0c-6061-9a57-b60997fc6747@proxmox.com>
 <E1D57A66-615B-4CA1-874C-ACD2B97B507C@tuxis.nl>
From: Fabian Ebner <f.ebner@proxmox.com>
In-Reply-To: <E1D57A66-615B-4CA1-874C-ACD2B97B507C@tuxis.nl>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: Re: [pve-devel] [PATCH kernel] Backport two io-wq fixes relevant
 for io_uring

On 08.03.22 at 17:19, Mark Schouten wrote:
> Hi,
> 
> So should I try and find someone who is able to reproduce this with a test-machine and is able to give you remote access to debug? Would that help?
>

It would certainly increase the likelihood of finding the issue. Since
it only happens on 7.x, it's likely a regression. Ideally, there would
be a snapshot of a problematic VM taken before the reboot, so that
different builds of QEMU/kernel can quickly be tested against it.
Providing such a VM with snapshot state would of course be an
alternative to remote access.
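
For anyone triaging an affected host in the meantime: the QMP timeout
message quoted further down is what distinguishes the io-wq hang from
the reboot issue in the forum thread. The following is only a minimal
sketch for scanning log lines for it. It assumes each message sits on
one log line in the wording quoted below; the helper name stuck_vmids
is made up for illustration and is not part of any Proxmox tooling.

```python
import re

# Pattern modelled on the syslog message quoted in this thread. That other
# PVE versions use exactly this wording is an assumption.
QMP_TIMEOUT_RE = re.compile(
    r"VM (?P<vmid>\d+) qmp command failed - .*"
    r"unable to connect to VM (?P=vmid) qmp socket - timeout after \d+ retries"
)


def stuck_vmids(lines):
    """Return a sorted list of VM IDs that logged a QMP connection timeout."""
    vmids = set()
    for line in lines:
        match = QMP_TIMEOUT_RE.search(line)
        if match:
            vmids.add(int(match.group("vmid")))
    return sorted(vmids)
```

It accepts any iterable of strings, e.g. an open log file. An empty
result only means that this particular message was not logged, not
that the host is unaffected by other issues.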

> — 
> Mark Schouten, CTO
> Tuxis B.V.
> mark@tuxis.nl
> 
> 
> 
>> On 8 Mar 2022, at 10:12, Fabian Ebner <f.ebner@proxmox.com> wrote:
>>
>> On 07.03.22 at 15:51, Mark Schouten wrote:
>>> Hi,
>>>
>>> Sorry for coming back to this thread after a few months, but is the Windows case mentioned here the same one that is discussed in this forum thread:
>>> https://forum.proxmox.com/threads/windows-vms-stuck-on-boot-after-proxmox-upgrade-to-7-0.100744/page-3
>>>
>>> ?
>>
>> Hi,
>> the symptoms there sound rather different. The issue addressed by this
>> patch was about a QEMU process getting completely stuck on I/O while the
>> VM was already running. "Completely" meant that e.g. connecting to the
>> display would also fail and there would be messages like
>>
>> VM 182 qmp command failed - VM 182 qmp command 'query-proxmox-support'
>> failed - unable to connect to VM 182 qmp socket - timeout after 31 retries
>>
>> in the syslog. The issue described in the forum thread reads like it
>> happens only upon reboot from inside the guest and nobody mentioned
>> messages like the above.
>>
>>>
>>> If so, should this be investigated further or are there other issues? I have personally not had the issue mentioned in the forum, but quite a few people seem to be suffering from issues with Windows VMs, which is currently holding us back from upgrading from 6.x to 7.x on a whole bunch of customer clusters.
>>
>> I also haven't seen the issue myself yet and haven't heard of it from
>> any colleagues either. Without a reproducer, it's very difficult to debug.
>>
>>>
>>> Thanks,
>>>
>>> — 
>>> Mark Schouten, CTO
>>> Tuxis B.V.
>>> mark@tuxis.nl
>>