From: Benjamin Hofer
Date: Tue, 30 May 2023 12:00:02 +0200
To: pve-user@lists.proxmox.com
Subject: [PVE-User] Proxmox HCI Ceph: "osd_max_backfills" is overridden and set to 1000

Dear community,

We've set up a Proxmox hyper-converged Ceph cluster in production. After syncing in one new OSD using the "pveceph osd create" command, we got massive network performance issues and outages. We then found that "osd_max_backfills" is set to 1000 (the Ceph default is 1) and that this value (along with several others) has been overridden.

Does anyone know the root cause? I can't imagine that this is the Proxmox default behaviour, and I'm very sure that we didn't change anything (in fact, I didn't even know about this setting before researching it and talking to colleagues with deeper Ceph knowledge).

System:

PVE version output:
pve-manager/7.3-6/723bb6ec (running kernel: 5.15.102-1-pve)
ceph version 17.2.5 (e04241aa9b639588fa6c864845287d2824cb6b55) quincy (stable)

# ceph config get osd.1
WHO    MASK  LEVEL  OPTION                            VALUE         RO
osd.1         basic  osd_mclock_max_capacity_iops_ssd  17080.220753

# ceph config show osd.1
NAME                                              VALUE                             SOURCE    OVERRIDES  IGNORES
auth_client_required                              cephx                             file
auth_cluster_required                             cephx                             file
auth_service_required                             cephx                             file
cluster_network                                   10.0.18.0/24                      file
daemonize                                         false                             override
keyring                                           $osd_data/keyring                 default
leveldb_log                                                                         default
mon_allow_pool_delete                             true                              file
mon_host                                          10.0.18.30 10.0.18.10 10.0.18.20  file
ms_bind_ipv4                                      true                              file
ms_bind_ipv6                                      false                             file
no_config_file                                    false                             override
osd_delete_sleep                                  0.000000                          override
osd_delete_sleep_hdd                              0.000000                          override
osd_delete_sleep_hybrid                           0.000000                          override
osd_delete_sleep_ssd                              0.000000                          override
osd_max_backfills                                 1000                              override
osd_mclock_max_capacity_iops_ssd                  17080.220753                      mon
osd_mclock_scheduler_background_best_effort_lim   999999                            default
osd_mclock_scheduler_background_best_effort_res   534                               default
osd_mclock_scheduler_background_best_effort_wgt   2                                 default
osd_mclock_scheduler_background_recovery_lim      2135                              default
osd_mclock_scheduler_background_recovery_res      534                               default
osd_mclock_scheduler_background_recovery_wgt      1                                 default
osd_mclock_scheduler_client_lim                   999999                            default
osd_mclock_scheduler_client_res                   1068                              default
osd_mclock_scheduler_client_wgt                   2                                 default
osd_pool_default_min_size                         2                                 file
osd_pool_default_size                             3                                 file
osd_recovery_max_active                           1000                              override
osd_recovery_max_active_hdd                       1000                              override
osd_recovery_max_active_ssd                       1000                              override
osd_recovery_sleep                                0.000000                          override
osd_recovery_sleep_hdd                            0.000000                          override
osd_recovery_sleep_hybrid                         0.000000                          override
osd_recovery_sleep_ssd                            0.000000                          override
osd_scrub_sleep                                   0.000000                          override
osd_snap_trim_sleep                               0.000000                          override
osd_snap_trim_sleep_hdd                           0.000000                          override
osd_snap_trim_sleep_hybrid                        0.000000                          override
osd_snap_trim_sleep_ssd                           0.000000                          override
public_network                                    10.0.18.0/24                      file
rbd_default_features                              61                                default
rbd_qos_exclude_ops                               0                                 default
setgroup                                          ceph                              cmdline
setuser                                           ceph                              cmdline

Thanks a lot in advance.

Best
Benjamin
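P.S. In case it helps the discussion: as a stopgap we are considering pinning the value back to the Ceph default via the monitor config database and then checking what the running daemon reports. We haven't run these yet, and we're not sure whether whatever produced the "override" SOURCE above would still take precedence over the config database:

# ceph config set osd osd_max_backfills 1
# ceph config get osd osd_max_backfills
# ceph config show osd.1 osd_max_backfills

The first command sets the value for all OSDs in the config database, the second reads back what the database holds, and the third shows the value (and its SOURCE) as seen by the running osd.1 daemon.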