* [PVE-User] Ceph: sudden slow ops, freezes, and slow-downs
@ 2022-06-22 18:39 Branislav Viest
[not found] ` <mailman.54.1655965821.338.pve-user@lists.proxmox.com>
0 siblings, 1 reply; 4+ messages in thread
From: Branislav Viest @ 2022-06-22 18:39 UTC (permalink / raw)
To: pve-user
Hello,
I have experienced serious difficulties with our Ceph cluster, which is built from 4 nodes. I have created a Proxmox forum topic about this problem, so I will not retype all the information in this mailing thread. Please take a look, and if anyone has an idea of what could be causing the problem or how to solve it, I would be very grateful. Any idea or experience is valuable.
We have already begun migrating to another provider, but I still want to understand the root cause so that we can prevent failures or problems after the migration.
The topic is here: https://forum.proxmox.com/threads/ceph-sudden-slow-ops-freezes-and-slow-downs.111144/
Thank you all.
------------
Best Regards
Branislav Brian Viest
linux administrator
------------
Legal Disclaimer: This e-mail and any attached files are confidential and may be legally privileged. If you are not the addressee, any disclosure, reproduction, copying, distribution, or other dissemination or use of this communication is strictly prohibited. If you have received this transmission in error please notify the sender immediately and then delete this e-mail. The sender does not accept liability for the correct and complete transmission of the information, nor for any delay or interruption of the transmission, nor for damages arising from the use of or reliance on the information. All e-mail messages addressed to, received or sent by sender are deemed to be professional in nature. Accordingly, the sender or recipient of these messages agrees that they may be read by other sender employees than the official recipient or sender in order to ensure the continuity of work-related activities and allow supervision thereof.
* Re: [PVE-User] Ceph: sudden slow ops, freezes, and slow-downs
[not found] ` <mailman.54.1655965821.338.pve-user@lists.proxmox.com>
@ 2022-06-23 6:54 ` Branislav Viest
[not found] ` <4c507d5f-5ad9-9fb1-6eab-6ca1913784d8@binovo.es>
0 siblings, 1 reply; 4+ messages in thread
From: Branislav Viest @ 2022-06-23 6:54 UTC (permalink / raw)
To: Proxmox VE user list
Hello,
ID  CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         15.52213  root default
-3          5.18097      host node1
 0  ssd     1.72699          osd.0   up      1.00000   1.00000
 1  ssd     1.72699          osd.1   up      1.00000   1.00000
 2  ssd     1.72699          osd.2   up      1.00000   1.00000
-5          3.45398      host node2
 3  ssd     1.72699          osd.3   up      1.00000   1.00000
 5  ssd     1.72699          osd.5   up      1.00000   1.00000
-7          1.70740      host node3
 6  ssd     0.85370          osd.6   up      1.00000   1.00000
 7  ssd     0.85370          osd.7   up      1.00000   1.00000
-9          5.17978      host node4
 8  ssd     1.72659          osd.8   up      1.00000   1.00000
 9  ssd     1.72659          osd.9   up      1.00000   1.00000
10  ssd     1.72659          osd.10  up      1.00000   1.00000
Since slow ops are reported on multiple OSDs most of the time, I have not tried running tests with some OSDs marked out.
I have now checked the logs from the last 2-3 days, and slow ops are reported mostly on 5 of the 10 OSDs.
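For anyone who wants to repeat this kind of log check, here is a minimal sketch that counts how many log lines mentioning "slow ops" name each OSD. The sample lines are hypothetical placeholders, not taken from this cluster, and the function name `count_slow_ops` is an illustration only; the real log wording may differ, so the pattern matches nothing more than `osd.N` tokens.

```python
import re
from collections import Counter

# Matches daemon names such as "osd.3" and captures the numeric ID.
OSD_RE = re.compile(r"\bosd\.(\d+)\b")


def count_slow_ops(lines):
    """Count, per OSD ID, the lines that mention both 'slow ops' and that OSD."""
    counts = Counter()
    for line in lines:
        if "slow ops" not in line:
            continue
        # set() so an OSD named twice on one line is counted once for that line.
        for osd_id in set(OSD_RE.findall(line)):
            counts[int(osd_id)] += 1
    return counts


# Hypothetical sample lines, only to exercise the function.
sample = [
    "hypothetical: 3 slow ops, oldest one blocked for 35 sec, daemons [osd.1,osd.3] have slow ops",
    "hypothetical: 1 slow ops, oldest one blocked for 31 sec, osd.5 has slow ops",
    "hypothetical: osd.3 heartbeat ok",
]
counts = count_slow_ops(sample)
for osd_id, n in sorted(counts.items()):
    print(f"osd.{osd_id}: {n} line(s) with slow ops")
```

In practice the lines would come from reading the cluster or OSD log files rather than from an inline list.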
------------
Best Regards
Branislav Brian Viest
------------
----- Original Message -----
From: "Eneko Lacunza via pve-user" <pve-user@lists.proxmox.com>
To: "pve-user" <pve-user@lists.proxmox.com>
Cc: "Eneko Lacunza" <elacunza@binovo.es>
Sent: Thursday, 23 June 2022 8:29:40
Subject: Re: [PVE-User] Ceph: sudden slow ops, freezes, and slow-downs
_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
* Re: [PVE-User] Ceph: sudden slow ops, freezes, and slow-downs
[not found] ` <4c507d5f-5ad9-9fb1-6eab-6ca1913784d8@binovo.es>
@ 2022-06-23 8:24 ` Branislav Viest
[not found] ` <1cdc99da-0537-7533-2987-8678223e2971@binovo.es>
0 siblings, 1 reply; 4+ messages in thread
From: Branislav Viest @ 2022-06-23 8:24 UTC (permalink / raw)
To: Eneko Lacunza; +Cc: Proxmox VE user list
OSDs 1, 3, 5, 2, and 8.
All drives are Samsung NVMe.
Model Number: SAMSUNG MZQLB1T9HAJR-00007
According to the SMART "Percentage Used" value, all drives are at 10% or less. The SMART overall-health self-assessment test result is PASSED for all of them.
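To see how these five OSDs are spread across the hosts, the following minimal sketch maps them onto the OSD tree posted earlier in the thread (which looks like `ceph osd tree` output). The host-to-OSD layout and the list of slow OSDs are copied from the thread; the names `OSD_TREE`, `SLOW_OSDS`, and `slow_osds_by_host` are illustrative only.

```python
# Host -> OSD IDs, copied from the OSD tree posted earlier in the thread.
OSD_TREE = {
    "node1": [0, 1, 2],
    "node2": [3, 5],
    "node3": [6, 7],
    "node4": [8, 9, 10],
}

# OSDs reported in this message as showing most of the slow ops.
SLOW_OSDS = {1, 2, 3, 5, 8}


def slow_osds_by_host(tree, slow):
    """Return, for each host, the sorted list of its OSDs that are in `slow`."""
    return {host: sorted(set(osds) & slow) for host, osds in tree.items()}


result = slow_osds_by_host(OSD_TREE, SLOW_OSDS)
for host, osds in result.items():
    print(f"{host}: {len(osds)}/{len(OSD_TREE[host])} OSDs with slow ops -> {osds}")
```

With the numbers from the thread, this shows osd.1 and osd.2 on node1, both OSDs of node2, osd.8 on node4, and none on node3, so the affected OSDs are spread over three of the four hosts.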
------------
Best Regards
Branislav Brian Viest
------------
From: "Eneko Lacunza" <elacunza@binovo.es>
To: "Branislav Viest" <info@branoviest.com>, "Proxmox VE user list" <pve-user@lists.proxmox.com>
Sent: Thursday, 23 June 2022 9:22:08
Subject: Re: [PVE-User] Ceph: sudden slow ops, freezes, and slow-downs
Hi,
What numbers are those 5 OSDs?
Have you checked the SSD drive manufacturer and models?
Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project
Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/
* Re: [PVE-User] Ceph: sudden slow ops, freezes, and slow-downs
[not found] ` <1cdc99da-0537-7533-2987-8678223e2971@binovo.es>
@ 2022-06-23 13:54 ` Branislav Viest
0 siblings, 0 replies; 4+ messages in thread
From: Branislav Viest @ 2022-06-23 13:54 UTC (permalink / raw)
To: Eneko Lacunza; +Cc: Proxmox VE user list
Three nodes have the same firmware on their drives (EDA5402Q); the last (newest) node has a higher firmware version, EDA5702Q.
------------
Best Regards
Branislav Viest
------------
From: "Eneko Lacunza" <elacunza@binovo.es>
To: "Branislav Viest" <info@branoviest.com>
Cc: "Proxmox VE user list" <pve-user@lists.proxmox.com>
Sent: Thursday, 23 June 2022 13:12:23
Subject: Re: [PVE-User] Ceph: sudden slow ops, freezes, and slow-downs
The drives are datacenter (DC) class, so they should be good. Do they all have the same firmware version?