From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id A82CB60896 for ; Tue, 11 Jan 2022 09:19:42 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 98AA71B55C for ; Tue, 11 Jan 2022 09:19:12 +0100 (CET) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id EF8341B551 for ; Tue, 11 Jan 2022 09:19:11 +0100 (CET) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id C52CF4535D; Tue, 11 Jan 2022 09:19:11 +0100 (CET) Message-ID: <554040de-09d6-974b-143a-80c2d66b9573@proxmox.com> Date: Tue, 11 Jan 2022 09:19:10 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:96.0) Gecko/20100101 Thunderbird/96.0 Content-Language: en-US To: Proxmox VE development discussion , Roland , Fabian Ebner , =?UTF-8?Q?Fabian_Gr=c3=bcnbichler?= References: <20211222135257.3242938-1-f.gruenbichler@proxmox.com> <20211222135257.3242938-17-f.gruenbichler@proxmox.com> <5a3846bc-04bc-ed4d-4e2e-38a9911390aa@proxmox.com> <462e37fd-5e74-e372-6cac-80069033a361@web.de> From: Thomas Lamprecht In-Reply-To: <462e37fd-5e74-e372-6cac-80069033a361@web.de> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-SPAM-LEVEL: Spam detection results: 0 AWL 0.059 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment NICE_REPLY_A -0.001 Looks like a legit reply (A) SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: Re: [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jan 2022 08:19:42 -0000 On 04.01.22 17:44, Roland wrote: >>> =C2=A0 +sub phase2_start_remote_cluster { >>> +=C2=A0=C2=A0=C2=A0 my ($self, $vmid, $params) =3D @_; >>> + >>> +=C2=A0=C2=A0=C2=A0 die "insecure migration to remote cluster not imp= lemented\n" >>> +=C2=A0=C2=A0=C2=A0 if $params->{migrate_opts}->{type} ne 'websocket'= ; >>> + >>> +=C2=A0=C2=A0=C2=A0 my $remote_vmid =3D $self->{opts}->{remote}->{vmi= d}; >>> + >>> +=C2=A0=C2=A0=C2=A0 my $res =3D PVE::Tunnel::write_tunnel($self->{tun= nel}, 10, >>> "start", $params); >> >> 10 seconds feels a bit short to me. @Fabian(s): Why not use the same as vm_start_nolock + some buffer? E.g., ($params->{timeout} // config_aware_timeout($conf, $resume)) + 10; That should be quite generous, as the worst case time for the start for incoming migration, which is always paused so not doing much, is normally= way lower than a cold-start. I mean, we're in an worker task and do not really care much if setup need= s a bit longer. >> > Please, administrators like tunables and knobs for changing default val= ues. >=20 @Roland sure, but exposing all hundreds+ of timeout and other parameters will jus= t overload most users and produce guides to extend timeouts for issues that= should be actually fixed in the setup, i.e., when the timeout is just a s= ymptom of a bad config/hw/... > Not only for being empowered to fix things themselves but also to be > able to dig into a problem and find the root cause... >=20 With PVE et. al being under AGPLv3 you're already empowered to just chang= e the code, so lets keep it simple(r) to most users, if a timeout is actually t= o short we can always change it to a better fitting value (at least if reports ab= out that reach us). > I remember that i had more then one occasion , where i grepped for > timeout or other limiting values in proxmox or other softwares source,=C2= =A0 > and often gave up in the end, because it was too deeply hidden or i got= > too many hits/findings. >=20 same would happen if we expose every possible parameter as knob, you'd ha= ve hundreds and get many hits on searches.. > Finding such without knowing the code can often be like searching for > the needle in a haystack and extremely frustrating. sure and I can imagine that it's frustrating, but you can always ask here= , on pve-user or other channels about the issue at hand and people more used t= o the code can often better guide you to the actual parameter location, or = give some other input. >=20 > I would be happy, if such important values would get defined with some > descriptive variable name at a suitable location, maybe even with some > comment what's it all about ( even if it's not meant to be changed/tune= d) >=20 everybody sees specific knobs as more or less important, so that's not a = a clear cut at all. For comments I'd also suggest using git blame to get the comm= it (message) that introduced it, often there's some info encoded in there to= o. Also, if you ever got some reasoning into a special timeout that had no i= nfo or comment whatsoever we're naturally happy to accept patches that add on= e. > just my 10 cents... thanks for your input. IMO this specific one can be improved without expo= sing a new tunable (see start of my reply), and in general I prefer to avoid t= o many tunables, we can still add them if it shows that there's quite some deman= d for a specific one, while removing would break compat.. cheers, Thomas