From: Uwe Sauter
Reply-To: uwe.sauter.de@gmail.com
To: pve-user@lists.proxmox.com
Date: Tue, 5 Jan 2021 20:10:19 +0100
In-Reply-To: <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de>
Subject: Re: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum

Hi Frank,

did you look at the MON and OSD logs?

Can you provide the list of installed packages for the affected host and for the rest of the cluster?

Is the output of "ceph status" the same on all hosts?

Regards,

    Uwe

On 05.01.21 at 20:01, Frank Thommen wrote:
>
> On 04.01.21 12:44, Frank Thommen wrote:
>>
>> Dear all,
>>
>> one of our three PVE hypervisors in the cluster crashed (it was fenced
>> successfully) and rebooted automatically. I took the chance to do a
>> complete dist-upgrade and rebooted again.
>>
>> The PVE Ceph dashboard now reports that
>>
>>    * the monitor on the host is down (out of quorum), and
>>    * "A newer version was installed but old version still running, please restart"
>>
>> The Ceph UI reports monitor version 14.2.11 while in fact 14.2.16 is
>> installed. The hypervisor has been rebooted twice since the upgrade, so
>> it should be basically impossible that the old version is still running.
>>
>> `systemctl restart ceph.target` and restarting the monitor through the
>> PVE Ceph UI didn't help. The hypervisor is running PVE 6.3-3 (the other
>> two are running 6.3-2 with monitor 14.2.15).
>>
>> What to do in this situation?
>>
>> I am happy with either UI or command-line instructions, but I have no
>> Ceph experience besides setting it up following the PVE instructions.
>>
>> Any help or hint is appreciated.
>> Cheers, Frank
>
> In an attempt to fix the issue I destroyed the monitor through the UI and
> recreated it. Unfortunately it still cannot be started. A popup tells me
> that the monitor has been started, but the overview still shows "stopped"
> and there is no version number any more.
>
> Then I stopped and started Ceph on the node (`pveceph stop; pveceph start`),
> which resulted in a degraded cluster (1 host down, 7 of 21 OSDs down).
> OSDs cannot be started through the UI either.
>
> I feel extremely uncomfortable with this situation and would appreciate
> any hint as to how I should proceed with the problem.
>
> Cheers, Frank
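
The three things asked for at the top of this reply (MON and OSD logs, installed package versions, and "ceph status" from each host) could be gathered on every node with something like the sketch below. It is only a rough outline, not an official PVE procedure: the helper names `run` and `collect` are made up for illustration, the limit of 100 log lines is arbitrary, and it assumes the monitor ID equals the short hostname, which is the usual PVE default but should be checked on the affected cluster.

```shell
#!/bin/sh
# Sketch: gather the requested diagnostics on one PVE/Ceph node.
# ASSUMPTION: the monitor ID equals the short hostname (usual PVE default).

node="${HOSTNAME:-$(uname -n)}"
node="${node%%.*}"   # strip any domain part

# Print a header, then run the command only if its binary exists,
# so the script still completes on hosts where a tool is missing.
run() {
    echo "== $* =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "skipped: $1 not found"
    fi
    echo
}

collect() {
    run pveversion -v      # installed PVE and Ceph package versions
    run ceph status        # compare this output between all hosts
    run ceph versions      # versions reported by the running daemons
    # Logs from the current boot for the monitor and all OSDs on this node.
    run journalctl -b --no-pager -n 100 -u "ceph-mon@${node}.service"
    run journalctl -b --no-pager -n 100 -u 'ceph-osd@*'
}

collect
```

Saved as a script and run on each of the three hosts with the output redirected to a file, the results can be compared side by side, for example to see whether the installed package versions and the versions of the running daemons really differ between hosts.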