From mboxrd@z Thu Jan  1 00:00:00 1970
From: Сергей Цаболов <tsabolov@t8.ru>
To: uwe.sauter.de@gmail.com, Proxmox VE user list <pve-user@lists.proxmox.com>
Date: Wed, 29 Dec 2021 17:06:16 +0300
Message-ID: <131ea5ec-89df-4c90-5808-451c33abbb05@t8.ru>
In-Reply-To: <550c21eb-5371-6f3e-f1f4-bccbc6b5384b@gmail.com>
References: <6f23d719-1931-cc81-899d-3202047c4a56@binovo.es> <101971ad-519a-9af2-249e-433df28b1f1a@t8.ru> <0dd27e4e-391d-6262-bbf5-db84229accad@t8.ru> <015106bc-726b-da07-c3cf-80b63197b2c7@gmail.com> <216fd781-c35a-6e99-2662-6fe6378adc23@t8.ru> <550c21eb-5371-6f3e-f1f4-bccbc6b5384b@gmail.com>
Subject: Re: [PVE-User] [ceph-users] Re: Ceph Usage web and terminal.
List-Id: Proxmox VE user list <pve-user@lists.proxmox.com>

Ok, I understand the case.

On 29.12.2021 16:13, Uwe Sauter wrote:
> On 29.12.21 at 13:51, Сергей Цаболов wrote:
>> Hi, Uwe
>>
>> On 29.12.2021 14:16, Uwe Sauter wrote:
>>> Just a feeling, but I'd say that the imbalance in OSDs (one host having many more disks than
>>> the rest) is your problem.
>>
>> Yes, the last node in the cluster has more disks than the rest, but one disk is 12 TB and all
>> the other 9 HDDs are 1 TB.
>>
>>> Assuming that your configuration keeps 3 copies of each VM image, the imbalance probably means
>>> that 2 of these 3 copies reside on pve-3111, and if this host is unavailable, all VM images
>>> with 2 copies on that host become unresponsive, too.
>> In the Proxmox web UI for the Ceph pool I set Size: 2, Min. Size: 2
>>
> So this means that you want to have 2 copies in the regular case (size) and also 2 copies in the
> failure case (min size) so that the VMs stay available.

Yes, that is what I thought before your answer, but it did not work that way.

> So you might solve your problem by decreasing min size to 1 (dangerous!!) or by increasing size
> to 3, which means that in the regular case you will have 3 copies but if only 2 are available, it
> will still work and re-sync the 3rd copy once it comes online again.

I understand that decreasing min_size to 1 is very dangerous.

If I increase size to 3, min_size stays at 2, the default. But I'm afraid that if I set 3/2 (the
good choice), the pool's MAX AVAIL will decrease by a factor of two or more, or am I wrong?

For now I have, with all disks:

--- RAW STORAGE ---
CLASS  SIZE     AVAIL   USED     RAW USED  %RAW USED
hdd    106 TiB  99 TiB  7.7 TiB  7.7 TiB   7.26
TOTAL  106 TiB  99 TiB  7.7 TiB  7.7 TiB   7.26

--- POOLS ---
POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1     1  8.3 MiB       22   17 MiB      0     44 TiB
vm.pool                 2  1024  3.0 TiB  864.55k  6.0 TiB   6.39     44 TiB
cephfs_data             3    32  874 GiB  223.76k  1.7 TiB   1.91     44 TiB
cephfs_metadata         4    32   25 MiB       27   51 MiB      0     44 TiB

(For vm.pool the terminal shows MAX AVAIL = 44 TiB = 48.37 TB; in the web UI I see 51.50 TB.)

Am I right in my reasoning?

Thank you!
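As a rough back-of-envelope check of the capacity question (a sketch, not Ceph's actual algorithm: the MAX AVAIL that `ceph df` reports is lower than this naive number because it also accounts for the full ratio and for the most-full OSD):

```python
# Naive estimate of usable capacity for a replicated Ceph pool: every
# logical byte is stored `size` times, so usable space is roughly the
# raw available space divided by the replication factor. The real
# MAX AVAIL from `ceph df` comes out lower (44 TiB here, not 49.5),
# since it also factors in full_ratio and the most-full OSD.
raw_avail_tib = 99.0  # AVAIL for class hdd, from the `ceph df` output above

def naive_max_avail(raw_avail_tib: float, size: int) -> float:
    """Rough usable capacity when each object is stored `size` times."""
    return raw_avail_tib / size

print(f"size=2: ~{naive_max_avail(raw_avail_tib, 2):.1f} TiB")  # ~49.5 TiB
print(f"size=3: ~{naive_max_avail(raw_avail_tib, 3):.1f} TiB")  # ~33.0 TiB
```

So going from size 2 to size 3 costs roughly a third of the usable capacity, not half.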
>
>> With `ceph osd map vm.pool <object-name>` (the VM ID) I see that for some VM objects one copy is
>> on osd.12, for example:
>>
>> osdmap e14321 pool 'vm.pool' (2) object '114' -> pg 2.10486407 (2.7) -> up ([12,8], p12) acting
>> ([12,8], p12)
>>
>> But in this example:
>>
>> osdmap e14321 pool 'vm.pool' (2) object '113' -> pg 2.8bd09f6d (2.36d) -> up ([10,7], p10)
>> acting ([10,7], p10)
>>
>> it is osd.10 and osd.7.
>>
>>> Check your failure domain for Ceph and possibly change it from OSD to host. This should prevent
>>> one host from holding multiple copies of a VM image.
>>
>> I didn't quite understand what to check. Can you explain it to me with an example?
>>
> I don't have an example but you can read about the concept at:
>
> https://docs.ceph.com/en/latest/rados/operations/crush-map/#crush-maps
>
>
> Regards,
>
> Uwe
>
>>>
>>> Regards,
>>>
>>>     Uwe
>>>
>>> On 29.12.21 at 09:36, Сергей Цаболов wrote:
>>>> Hello to all.
>>>>
>>>> In my case I have a 7-node Proxmox cluster and a working Ceph ("ceph version 15.2.15 octopus
>>>> (stable)": 7)
>>>>
>>>> Ceph HEALTH_OK
>>>>
>>>> ceph -s
>>>>   cluster:
>>>>     id:     9662e3fa-4ce6-41df-8d74-5deaa41a8dde
>>>>     health: HEALTH_OK
>>>>
>>>>   services:
>>>>     mon: 7 daemons, quorum pve-3105,pve-3107,pve-3108,pve-3103,pve-3101,pve-3111,pve-3109 (age 17h)
>>>>     mgr: pve-3107(active, since 41h), standbys: pve-3109, pve-3103, pve-3105, pve-3101,
>>>> pve-3111, pve-3108
>>>>     mds: cephfs:1 {0=pve-3105=up:active} 6 up:standby
>>>>     osd: 22 osds: 22 up (since 17h), 22 in (since 17h)
>>>>
>>>>   task status:
>>>>
>>>>   data:
>>>>     pools:   4 pools, 1089 pgs
>>>>     objects: 1.09M objects, 4.1 TiB
>>>>     usage:   7.7 TiB used, 99 TiB / 106 TiB avail
>>>>     pgs:     1089 active+clean
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> ceph osd tree
>>>>
>>>> ID   CLASS  WEIGHT     TYPE NAME          STATUS  REWEIGHT  PRI-AFF
>>>>  -1         106.43005  root default
>>>> -13          14.55478      host pve-3101
>>>>  10    hdd    7.27739          osd.10         up   1.00000  1.00000
>>>>  11    hdd    7.27739          osd.11         up   1.00000  1.00000
>>>> -11          14.55478      host pve-3103
>>>>   8    hdd    7.27739          osd.8          up   1.00000  1.00000
>>>>   9    hdd    7.27739          osd.9          up   1.00000  1.00000
>>>>  -3          14.55478      host pve-3105
>>>>   0    hdd    7.27739          osd.0          up   1.00000  1.00000
>>>>   1    hdd    7.27739          osd.1          up   1.00000  1.00000
>>>>  -5          14.55478      host pve-3107
>>>>   2    hdd    7.27739          osd.2          up   1.00000  1.00000
>>>>   3    hdd    7.27739          osd.3          up   1.00000  1.00000
>>>>  -9          14.55478      host pve-3108
>>>>   6    hdd    7.27739          osd.6          up   1.00000  1.00000
>>>>   7    hdd    7.27739          osd.7          up   1.00000  1.00000
>>>>  -7          14.55478      host pve-3109
>>>>   4    hdd    7.27739          osd.4          up   1.00000  1.00000
>>>>   5    hdd    7.27739          osd.5          up   1.00000  1.00000
>>>> -15          19.10138      host pve-3111
>>>>  12    hdd   10.91409          osd.12         up   1.00000  1.00000
>>>>  13    hdd    0.90970          osd.13         up   1.00000  1.00000
>>>>  14    hdd    0.90970          osd.14         up   1.00000  1.00000
>>>>  15    hdd    0.90970          osd.15         up   1.00000  1.00000
>>>>  16    hdd    0.90970          osd.16         up   1.00000  1.00000
>>>>  17    hdd    0.90970          osd.17         up   1.00000  1.00000
>>>>  18    hdd    0.90970          osd.18         up   1.00000  1.00000
>>>>  19    hdd    0.90970          osd.19         up   1.00000  1.00000
>>>>  20    hdd    0.90970          osd.20         up   1.00000  1.00000
>>>>  21    hdd    0.90970          osd.21         up   1.00000  1.00000
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> POOL     ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
>>>> vm.pool   2  1024  3.0 TiB  863.31k  6.0 TiB   6.38     44 TiB
>>>> (this pool has all the VM disks)
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> ceph osd map vm.pool vm.pool.object
>>>> osdmap e14319 pool 'vm.pool' (2) object 'vm.pool.object' -> pg 2.196f68d5 (2.d5) -> up ([2,4], p2)
>>>> acting ([2,4], p2)
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> pveversion -v
>>>> proxmox-ve: 6.4-1 (running kernel: 5.4.143-1-pve)
>>>> pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
>>>> pve-kernel-helper: 6.4-8
>>>> pve-kernel-5.4: 6.4-7
>>>> pve-kernel-5.4.143-1-pve: 5.4.143-1
>>>> pve-kernel-5.4.106-1-pve: 5.4.106-1
>>>> ceph: 15.2.15-pve1~bpo10
>>>> ceph-fuse: 15.2.15-pve1~bpo10
>>>> corosync: 3.1.2-pve1
>>>> criu: 3.11-3
>>>> glusterfs-client: 5.5-3
>>>> ifupdown: residual config
>>>> ifupdown2: 3.0.0-1+pve4~bpo10
>>>> ksm-control-daemon: 1.3-1
>>>> libjs-extjs: 6.0.1-10
>>>> libknet1: 1.22-pve1~bpo10+1
>>>> libproxmox-acme-perl: 1.1.0
>>>> libproxmox-backup-qemu0: 1.1.0-1
>>>> libpve-access-control: 6.4-3
>>>> libpve-apiclient-perl: 3.1-3
>>>> libpve-common-perl: 6.4-4
>>>> libpve-guest-common-perl: 3.1-5
>>>> libpve-http-server-perl: 3.2-3
>>>> libpve-storage-perl: 6.4-1
>>>> libqb0: 1.0.5-1
>>>> libspice-server1: 0.14.2-4~pve6+1
>>>> lvm2: 2.03.02-pve4
>>>> lxc-pve: 4.0.6-2
>>>> lxcfs: 4.0.6-pve1
>>>> novnc-pve: 1.1.0-1
>>>> proxmox-backup-client: 1.1.13-2
>>>> proxmox-mini-journalreader: 1.1-1
>>>> proxmox-widget-toolkit: 2.6-1
>>>> pve-cluster: 6.4-1
>>>> pve-container: 3.3-6
>>>> pve-docs: 6.4-2
>>>> pve-edk2-firmware: 2.20200531-1
>>>> pve-firewall: 4.1-4
>>>> pve-firmware: 3.3-2
>>>> pve-ha-manager: 3.1-1
>>>> pve-i18n: 2.3-1
>>>> pve-qemu-kvm: 5.2.0-6
>>>> pve-xtermjs: 4.7.0-3
>>>> qemu-server: 6.4-2
>>>> smartmontools: 7.2-pve2
>>>> spiceterm: 3.1-1
>>>> vncterm: 1.6-2
>>>> zfsutils-linux: 2.0.6-pve1~bpo10+1
>>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> And now my problem:
>>>>
>>>> All VMs have their disks in a single pool.
>>>>
>>>> When the node/host pve-3111 is shut down, on many of the other nodes/hosts (pve-3107, pve-3105)
>>>> the VMs do not shut down but become unreachable on the network.
>>>>
>>>> After the node/host is back up, Ceph returns to HEALTH_OK and all the VMs become reachable on
>>>> the network again (without a reboot).
>>>>
>>>> Can someone suggest what I should check in Ceph?
>>>>
>>>> Thanks.

-- 
-------------------------
Best regards,
Сергей Цаболов,
System administrator, OOO "T8"
Tel.: +74992716161, Mob.: +79850334875
tsabolov@t8.ru
OOO «Т8», 107076, Moscow, Krasnobogatyrskaya ul., 44, bld. 1
www.t8.ru
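P.S. To illustrate the failure-domain point discussed above: a toy sketch (hypothetical Python, not the real CRUSH algorithm) of picking replica targets on this cluster's topology. With failure domain "osd" both copies of a PG can land on two OSDs of the same host (likely pve-3111, which holds 10 of the 22 OSDs), while "host" forces the copies onto different hosts.

```python
import random

# Toy model of replica placement (NOT real CRUSH, just random choice with
# a distinctness constraint). Host -> OSD topology taken from the
# `ceph osd tree` output in this thread.
HOSTS = {
    "pve-3101": [10, 11], "pve-3103": [8, 9], "pve-3105": [0, 1],
    "pve-3107": [2, 3],   "pve-3108": [6, 7], "pve-3109": [4, 5],
    "pve-3111": [12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
}
OSD_TO_HOST = {osd: host for host, osds in HOSTS.items() for osd in osds}

def place(rng: random.Random, size: int, failure_domain: str) -> list:
    """Pick `size` replica OSDs. With failure_domain='host' no two
    replicas may share a host; with 'osd' they only need to be
    distinct OSDs."""
    chosen = []
    while len(chosen) < size:
        osd = rng.choice(list(OSD_TO_HOST))
        if osd in chosen:
            continue
        if failure_domain == "host" and any(
            OSD_TO_HOST[osd] == OSD_TO_HOST[c] for c in chosen
        ):
            continue
        chosen.append(osd)
    return chosen

rng = random.Random(42)
# With failure domain 'osd', both replicas can end up on pve-3111, so
# losing that single host takes both copies offline at once; with
# failure domain 'host', one copy always survives a single-host outage.
for _ in range(5):
    print("osd-domain :", place(rng, 2, "osd"))
    print("host-domain:", place(rng, 2, "host"))
```

This is only meant to show why size=2 plus an OSD-level failure domain makes a single host outage able to hide whole VM images, matching the behaviour seen when pve-3111 goes down.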