From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kefu Chai
To: pve-devel@lists.proxmox.com
Subject: [PATCH docs 1/1] pveceph: document FastEC (allow_ec_optimizations) on erasure coded pools
Date: Thu, 16 Apr 2026 12:08:29 +0800
Message-ID: <20260416040829.313252-2-k.chai@proxmox.com>
In-Reply-To: <20260416040829.313252-1-k.chai@proxmox.com>
References: <20260416040829.313252-1-k.chai@proxmox.com>
List-Id: Proxmox VE development discussion

Ceph Tentacle introduces a new erasure coding I/O path called FastEC,
enabled per pool via the allow_ec_optimizations flag.
Document the feature, its mon-enforced preconditions, the one-way-switch
caveat, and how to enable it via the ceph CLI.

FastEC can be enabled on both new and populated pools without data
migration. Document the operational caveats for enabling on a populated
pool: primary re-election when data shards 1..k-1 are marked non-primary,
scrub interaction during the transition, and the stripe unit difference
between pre-FastEC (4 KiB) and FastEC-native (16 KiB) pools.

pveceph does not expose allow_ec_optimizations as a parameter, consistent
with how other per-pool flags (compression_mode, noscrub, etc.) are
handled: operators use the ceph CLI directly.

Signed-off-by: Kefu Chai
---
 pveceph.adoc | 104 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

diff --git a/pveceph.adoc b/pveceph.adoc
index fc8a072..e7380e7 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -764,6 +764,110 @@ For example:
 pveceph pool create --erasure-coding profile=
 ----
+FastEC (`allow_ec_optimizations`)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Ceph Tentacle introduces a new erasure coding I/O path called FastEC,
+enabled per pool via the `allow_ec_optimizations` flag. FastEC adds partial
+reads and partial writes on EC pools, substantially improving small random
+I/O performance -- the dominant access pattern of virtual machine disk I/O.
+Upstream recommends it for RBD and CephFS backed storage
+footnote:[Ceph Erasure Code
+{cephdocs-url}/rados/operations/erasure-code/].
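[Editor's aside, not part of the patch.] To make the small-write claim concrete, here is a back-of-the-envelope sketch of the write cost for a single small overwrite. The k=4, m=2 geometry, the 4 KiB stripe unit, and the cost model itself are illustrative assumptions, not taken from the patch or from Ceph internals; in particular, real partial-write parity updates also read old chunk contents to compute deltas, which this sketch ignores.

```shell
#!/usr/bin/env bash
# Simplified cost model for one 4 KiB overwrite on a k=4, m=2 EC pool
# with a 4 KiB stripe unit. All numbers are illustrative assumptions.
k=4; m=2; chunk=4096; write_size=4096

# Full-stripe read-modify-write: read all k data chunks, then rewrite
# the k data chunks plus the m parity chunks.
rmw_read=$(( k * chunk ))
rmw_write=$(( (k + m) * chunk ))

# Partial write: rewrite only the one affected data chunk plus the
# m parity chunks (parity delta reads are ignored in this sketch).
partial_write=$(( write_size + m * chunk ))

echo "full-stripe RMW: read=${rmw_read}B write=${rmw_write}B"
echo "partial write:   write=${partial_write}B"
```

Under these assumptions a 4 KiB guest write costs 24576 bytes of writes plus a 16384-byte stripe read on the full-stripe path, versus 12288 bytes of writes on the partial-write path, which is the effect the intro paragraph describes for small random VM I/O.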
+
+FastEC has four requirements that the Ceph monitor enforces:
+
+* the pool must be erasure coded;
+* the cluster must be at `require_osd_release` tentacle or later;
+* the erasure code profile must use a plugin and technique combination
+  that supports the optimized EC path:
++
+[width="70%",options="header"]
+|===
+|Plugin |Technique |FastEC
+|`isa` |`reed_sol_van` |yes
+|`isa` |`cauchy` |no
+|`jerasure` |`reed_sol_van` |yes
+|`jerasure` |all others |no
+|`lrc` |(any) |no
+|`shec` |(any) |no
+|`clay` |(any) |no
+|===
++
+* the pool's stripe unit must be a multiple of 4096 bytes. The default
+  stripe unit satisfies this, so this is only relevant for pools created
+  with a custom `stripe_unit` in their erasure code profile.
+
+If any precondition is not met, `ceph osd pool set -data allow_ec_optimizations 1`
+fails with a descriptive error -- no silent degradation.
+
+`pveceph` does not set the plugin or technique explicitly when creating an
+EC profile -- both are seeded from the monitor's
+`osd_pool_default_erasure_code_profile` option (`reed_sol_van` by default
+on both Squid and Tentacle). The technique requirement is therefore
+satisfied out of the box unless the operator has overridden this setting.
+Verify with:
+
+[source,bash]
+----
+ceph config get mon osd_pool_default_erasure_code_profile
+----
+
+If the output shows a technique other than `reed_sol_van`, either update
+the cluster-wide default or create a compatible profile explicitly and pass
+it via the `profile` property when creating the pool (see above).
+
+Check the cluster release with:
+
+[source,bash]
+----
+ceph osd dump | grep require_osd_release
+----
+
+If this reports a release older than `tentacle`, run
+`ceph osd require-osd-release tentacle` after all OSDs have been upgraded.
+
+IMPORTANT: `allow_ec_optimizations` is a one-way switch. Once enabled on a
+pool, the Ceph monitor refuses to clear the flag again, so rolling back
+requires draining and recreating the pool.
+Enable it only after validating
+that FastEC works for your workload on a test pool.
+
+Enabling FastEC on a pool
++++++++++++++++++++++++++
+
+FastEC is enabled per pool via the `ceph` CLI. It can be set on both new
+and existing (populated) pools. No data migration or re-encoding takes
+place; only pool metadata changes.
+
+To enable FastEC on a pool, run the following command against the data pool
+(typically named `-data`):
+
+[source,bash]
+----
+ceph osd pool set -data allow_ec_optimizations 1
+----
+
+Verify the result with:
+
+[source,bash]
+----
+ceph osd pool get -data allow_ec_optimizations
+----
+
+The same command reports the current state for any EC pool, which is useful
+when auditing a cluster.
+
+When the flag is set on a populated pool, be aware of the following:
+
+* **Primary re-election.** FastEC marks data shards 1 through k-1 as
+  non-primary, so PGs whose primary was on one of those shards will
+  re-peer. This is a normal peering event and causes a brief I/O pause
+  per affected PG.
+
+* **Scrub interaction.** Re-peering cancels any in-flight scrub on
+  affected PGs. Those PGs will need to be re-scrubbed after peering
+  completes.
+
 Adding EC Pools as Storage
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
-- 
2.47.3