From: Fiona Ebner <f.ebner@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>, Daniel Kral <d.kral@proxmox.com>
Date: Mon, 28 Apr 2025 15:20:24 +0200
Message-ID: <28ca2817-2a17-4b67-b245-2b40462b776a@proxmox.com>
In-Reply-To: <20250325151254.193177-12-d.kral@proxmox.com>
References: <20250325151254.193177-1-d.kral@proxmox.com> <20250325151254.193177-12-d.kral@proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager 10/15] sim: resources: add option to limit start and migrate tries to node

On 25.03.25 at 16:12, Daniel Kral wrote:
> Add an option to the VirtFail's name to allow the start and migrate fail
> counts to only apply on a certain node number with a specific naming
> scheme.
>
> This allows a slightly more elaborate test type, e.g. where a service
> can start on one node (or any other in that case), but fails to start on
> a specific node, which it is expected to start on after a migration.
>
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>

With some nits:

Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>

> ---
>  src/PVE/HA/Sim/Resources/VirtFail.pm | 37 +++++++++++++++++++---------
>  1 file changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/src/PVE/HA/Sim/Resources/VirtFail.pm b/src/PVE/HA/Sim/Resources/VirtFail.pm
> index ce88391..fddecd6 100644
> --- a/src/PVE/HA/Sim/Resources/VirtFail.pm
> +++ b/src/PVE/HA/Sim/Resources/VirtFail.pm
> @@ -10,25 +10,36 @@ use base qw(PVE::HA::Sim::Resources);
>  # To make it more interesting we can encode some behavior in the VMID
>  # with the following format, where fa: is the type and a, b, c, ...
>  # are digits in base 10, i.e. the full service ID would be:
> -# fa:abcde
> +# fa:abcdef
>  # And the digits after the fa: type prefix would mean:
>  # - a: no meaning but can be used for differentiating similar resources
>  # - b: how many tries are needed to start correctly (0 is normal behavior) (should be set)
>  # - c: how many tries are needed to migrate correctly (0 is normal behavior) (should be set)
>  # - d: should shutdown be successful (0 = yes, anything else no) (optional)
>  # - e: return value of $plugin->exists() defaults to 1 if not set (optional)
> +# - f: limits the constraints of b and c to the nodeX (0 = apply to all nodes) (optional)

Requires us to have exactly this kind of node name for such tests, but
can be fine IMHO.
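
To make the extended ID scheme concrete, here is a minimal standalone sketch (not part of the patch) that mirrors the regex used by $decode_id. The sub name decode_id and the hash return value are assumptions made only for illustration; the patch itself returns a plain list.

```perl
use strict;
use warnings;

# Standalone sketch mirroring the patch's decoding regex. Returning a hash
# instead of a list is an illustrative choice, not what the patch does.
sub decode_id {
    my ($id) = @_;

    # First digit (a) has no meaning; b and c are required, d, e, f optional.
    my ($start, $migrate, $stop, $exists, $limit_to_node) =
        $id =~ /^\d(\d)(\d)(\d)?(\d)?(\d)?/;

    return {
        start => $start // 0,
        migrate => $migrate // 0,
        stop => $stop // 0,
        exists => $exists // 1,
        limit_to_node => $limit_to_node // 0,
    };
}

# fa:120013 -> b=2 start tries, c=0, d=0, e=1, f=3 (failures only on node3)
my $limited = decode_id('120013');
# fa:120 -> optional digits d, e, f fall back to their defaults
my $short = decode_id('120');

printf "start=%d limit_to_node=%d\n", $limited->{start}, $limited->{limit_to_node};
# prints "start=2 limit_to_node=3"
printf "start=%d limit_to_node=%d\n", $short->{start}, $short->{limit_to_node};
# prints "start=2 limit_to_node=0"
```

Since f is a single digit, only node1 to node9 can be addressed this way, which ties in with the node naming requirement mentioned above.
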
>
>  my $decode_id = sub {
>      my $id = shift;
>
> -    my ($start, $migrate, $stop, $exists) = $id =~ /^\d(\d)(\d)(\d)?(\d)?/g;
> +    my ($start, $migrate, $stop, $exists, $limit_to_node) = $id =~ /^\d(\d)(\d)(\d)?(\d)?(\d)?/g;
>
>      $start = 0 if !defined($start);
>      $migrate = 0 if !defined($migrate);
>      $stop = 0 if !defined($stop);
>      $exists = 1 if !defined($exists);
> +    $limit_to_node = 0 if !defined($limit_to_node);
>
> -    return ($start, $migrate, $stop, $exists)
> +    return ($start, $migrate, $stop, $exists, $limit_to_node);
> +};
> +
> +my $should_retry_action = sub {

"action" feels a bit too general to me. It does not apply to all
actions. Also it determines whether the action itself should fail.
Retrying is then just the consequence.

> +    my ($haenv, $limit_to_node) = @_;
> +
> +    my ($node) = $haenv->nodename() =~ /^node(\d)/g;

No need for a regex, you could just check $limit_to_node == 0 early and
then compare with the exactly known value.

> +    $node = 0 if !defined($node);
> +
> +    return $limit_to_node == 0 || $limit_to_node == $node;
>  };
>
>  my $tries = {
> @@ -53,12 +64,14 @@ sub exists {
>  sub start {
>      my ($class, $haenv, $id) = @_;
>
> -    my ($start_failure_count) = &$decode_id($id);
> +    my ($start_failure_count, $limit_to_node) = (&$decode_id($id))[0,4];

Style nit: pre-existing, but you can go for $decode_id->()

>
> -    $tries->{start}->{$id} = 0 if !$tries->{start}->{$id};
> -    $tries->{start}->{$id}++;
> +    if ($should_retry_action->($haenv, $limit_to_node)) {
> +        $tries->{start}->{$id} = 0 if !$tries->{start}->{$id};
> +        $tries->{start}->{$id}++;
>
> -    return if $start_failure_count >= $tries->{start}->{$id};
> +        return if $start_failure_count >= $tries->{start}->{$id};
> +    }
>
>      $tries->{start}->{$id} = 0; # reset counts
>
> @@ -79,12 +92,14 @@ sub shutdown {
>  sub migrate {
>      my ($class, $haenv, $id, $target, $online) = @_;
>
> -    my (undef, $migrate_failure_count) = &$decode_id($id);
> +    my ($migrate_failure_count, $limit_to_node) = (&$decode_id($id))[1,4];

Same style nit as above: $decode_id->() could be used here too.

>
> -    $tries->{migrate}->{$id} = 0 if !$tries->{migrate}->{$id};
> -    $tries->{migrate}->{$id}++;
> +    if ($should_retry_action->($haenv, $limit_to_node)) {
> +        $tries->{migrate}->{$id} = 0 if !$tries->{migrate}->{$id};
> +        $tries->{migrate}->{$id}++;
>
> -    return if $migrate_failure_count >= $tries->{migrate}->{$id};
> +        return if $migrate_failure_count >= $tries->{migrate}->{$id};
> +    }
>
>      $tries->{migrate}->{$id} = 0; # reset counts
>
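
To make the two nits on the helper concrete, here is a rough, untested-in-tree sketch of how it could look without the regex. The name $failures_apply_on_node is only a placeholder suggestion, and MockEnv is a hypothetical stand-in for the simulator environment so that the snippet runs on its own; only nodename() is taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the HA simulator environment; only nodename()
# is needed for this sketch.
{
    package MockEnv;
    sub new { my ($class, $node) = @_; return bless { node => $node }, $class; }
    sub nodename { my ($self) = @_; return $self->{node}; }
}

# Placeholder name: true if the configured start/migrate failures should
# apply on the node the environment is running on.
my $failures_apply_on_node = sub {
    my ($haenv, $limit_to_node) = @_;

    return 1 if $limit_to_node == 0; # 0 = not limited, applies on all nodes

    # Compare against the exactly known node name instead of parsing it.
    return $haenv->nodename() eq "node$limit_to_node" ? 1 : 0;
};

for my $node (qw(node1 node3)) {
    my $applies = $failures_apply_on_node->(MockEnv->new($node), 3);
    print "$node: ", ($applies ? "failures apply" : "normal behavior"), "\n";
}
```

With a limit of 3, only the environment reporting node3 would see the configured failures; all other nodes behave normally.
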