From: Kefu Chai <k.chai@proxmox.com>
To: pve-devel@lists.proxmox.com
Cc: Kefu Chai <tchaikov@gmail.com>
Subject: [pve-devel] [PATCH pve-cluster 15/15] pmxcfs-rs: add project documentation
Date: Tue, 6 Jan 2026 22:24:39 +0800
Message-ID: <20260106142440.2368585-16-k.chai@proxmox.com>
In-Reply-To: <20260106142440.2368585-1-k.chai@proxmox.com>
From: Kefu Chai <tchaikov@gmail.com>
---
src/pmxcfs-rs/ARCHITECTURE.txt | 350 +++++++++++++++++++++++++++++++++
src/pmxcfs-rs/README.md | 235 ++++++++++++++++++++++
2 files changed, 585 insertions(+)
create mode 100644 src/pmxcfs-rs/ARCHITECTURE.txt
create mode 100644 src/pmxcfs-rs/README.md
diff --git a/src/pmxcfs-rs/ARCHITECTURE.txt b/src/pmxcfs-rs/ARCHITECTURE.txt
new file mode 100644
index 00000000..2854520b
--- /dev/null
+++ b/src/pmxcfs-rs/ARCHITECTURE.txt
@@ -0,0 +1,350 @@
+================================================================================
+ pmxcfs-rs Architecture Overview
+================================================================================
+
+ Crate Dependency Graph
+================================================================================
+
+                       +-------------------+
+                       | pmxcfs-api-types  |
+                       |  (Shared Types)   |
+                       +-------------------+
+                                 ^
+                                 |
+          +----------------------+----------------------+
+          |                      |                      |
+          |                      |                      |
++---------+---------+  +---------+---------+  +---------+---------+
+|   pmxcfs-config   |  |   pmxcfs-memdb    |  |    pmxcfs-rrd     |
+|  (Configuration)  |  |    (SQLite DB)    |  |    (RRD Files)    |
++-------------------+  +-------------------+  +-------------------+
+          ^                      ^                      ^
+          |                      |                      |
+          |         +------------+------------+         |
+          |         |                         |         |
++---------+---------+                         +---------+---------+
+|    pmxcfs-ipc     |                         |   pmxcfs-status   |
+|  (libqb Server)   |                         | (VM/Node Status)  |
++-------------------+                         +-------------------+
+          ^                                             ^
+          |                                             |
+          |         +-----------------------------------+
+          |         |
++---------+---------+
+|   pmxcfs-logger   |
+|   (Cluster Log)   |
++-------------------+
+          ^
+          |
++---------+---------+              +-------------------+
+|    pmxcfs-dfsm    |              |  pmxcfs-services  |
+|  (State Machine)  |              |  (Service Mgmt)   |
++-------------------+              +-------------------+
+          ^                                  ^
+          |                                  |
+          +------------------+---------------+
+                             |
+                   +---------+---------+
+                   |      pmxcfs       |
+                   |   (Main Daemon)   |
+                   +-------------------+
+
+
+================================================================================
+ Component Descriptions
+================================================================================
+
+pmxcfs-api-types
+ Shared types, errors, and constants used across all crates
+ - Error types (PmxcfsError)
+ - Common data structures
+ - VmType enum (Qemu, Lxc)
+
+pmxcfs-config
+ Corosync configuration parsing and management
+ - Reads /etc/corosync/corosync.conf
+ - Extracts cluster configuration (nodes, quorum, etc.)
+ - Provides Config struct
+
+pmxcfs-memdb
+ In-memory database with SQLite persistence
+ - SQLite schema version 5 (C-compatible)
+ - FUSE plugin system (6 functional + 4 link plugins)
+ - Key-value storage
+ - Version tracking
+
+pmxcfs-rrd
+ Round-Robin Database file management
+ - RRD file creation and updates
+ - Schema definitions (CPU, memory, network, etc.)
+ - Format migration (v1/v2/v3)
+ - rrdcached integration
+
+pmxcfs-status
+ Cluster status tracking
+ - VM/CT registration and tracking
+ - Node online/offline status
+ - RRD data collection
+ - Cluster log storage
+
+pmxcfs-ipc
+ libqb-compatible IPC server
+ - Unix socket server (@pve2)
+ - Wire protocol compatibility with libqb clients
+ - QB_IPC_SOCKET implementation
+ - 13 IPC operations (version, get, set, mkdir, etc.)
+
+pmxcfs-logger
+ Cluster log with distributed synchronization
+ - Ring buffer storage (50,000 entries)
+ - Deduplication
+ - Binary message format (32-byte aligned)
+ - Multi-node synchronization
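The ring buffer and deduplication described for pmxcfs-logger can be pictured with a minimal sketch. `LogEntry`, `ClusterLog`, and the `(node, uid)` deduplication key are hypothetical names chosen for illustration; the crate's actual types and its binary entry format are not shown in this patch.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical log entry; the real binary format is not reproduced here.
#[derive(Debug, Clone, PartialEq)]
struct LogEntry {
    node: String,
    uid: u64,
    message: String,
}

/// Bounded log that evicts the oldest entry when full and ignores duplicates.
struct ClusterLog {
    capacity: usize,
    entries: VecDeque<LogEntry>,
    seen: HashSet<(String, u64)>,
}

impl ClusterLog {
    fn new(capacity: usize) -> ClusterLog {
        assert!(capacity > 0, "capacity must be non-zero");
        ClusterLog {
            capacity,
            entries: VecDeque::with_capacity(capacity),
            seen: HashSet::new(),
        }
    }

    /// Returns true if the entry was stored, false if it was a duplicate.
    fn push(&mut self, entry: LogEntry) -> bool {
        let key = (entry.node.clone(), entry.uid);
        if !self.seen.insert(key) {
            return false;
        }
        if self.entries.len() == self.capacity {
            // Buffer is full: drop the oldest entry and forget its key.
            if let Some(old) = self.entries.pop_front() {
                self.seen.remove(&(old.node, old.uid));
            }
        }
        self.entries.push_back(entry);
        true
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With the size stated above, such a buffer would be created as `ClusterLog::new(50_000)`. Merging entries received from other nodes then reduces to calling `push` for each of them.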
+
+pmxcfs-dfsm
+ Distributed Finite State Machine
+ - State synchronization via Corosync CPG
+ - Message ordering and queuing
+ - Leader-based updates
+ - Membership change handling
+ - Services:
+ * ClusterDatabaseService (MemDB sync)
+ * StatusSyncService (Status sync)
+
+pmxcfs-services
+ Service lifecycle management framework
+ - Automatic retry logic
+ - Service dependencies
+ - Graceful shutdown
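As an illustration of what "automatic retry logic" could look like, here is a small synchronous sketch with exponential backoff. `retry_with_backoff` and its parameters are hypothetical and use only the standard library; the real framework is described as tokio-based, and its retry policy is not documented in this patch.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` until it succeeds or `max_attempts` is reached,
/// doubling the delay after every failed attempt.
/// `op` receives the 1-based attempt number.
fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    assert!(max_attempts > 0, "at least one attempt is required");
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay *= 2;
                attempt += 1;
            }
        }
    }
}
```

A service start routine would pass its connect or initialize step as `op`, for example a closure that tries to join the CPG group and returns `Err` while Corosync is not yet reachable.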
+
+pmxcfs (main daemon)
+ Main binary that integrates all components
+ - FUSE filesystem operations
+ - Corosync/CPG integration
+ - IPC server lifecycle
+ - Plugin system
+ - Daemon process management
+
+
+================================================================================
+ Data Flow: Write Operation
+================================================================================
+
+User/API
+ |
+ | write to /etc/pve/nodes/node1/qemu-server/100.conf
+ |
+ v
+FUSE Layer (pmxcfs::fuse::filesystem)
+ |
+ | filesystem::write()
+ |
+ v
+MemDB (pmxcfs-memdb)
+ |
+ | memdb.set(path, data)
+ | Update SQLite database
+ |
+ v
+DFSM (pmxcfs-dfsm)
+ |
+ | dfsm.broadcast_update(FuseMessage::Write)
+ |
+ v
+Corosync CPG
+ |
+ | CPG multicast to all nodes
+ |
+ v
+All Cluster Nodes
+ |
+ | Receive CPG message
+ | Apply update to local MemDB
+ | Update FUSE filesystem
+
+
+================================================================================
+ Data Flow: Cluster Log Entry
+================================================================================
+
+Local Log Event
+ |
+ | cluster log write
+ |
+ v
+Logger (pmxcfs-logger)
+ |
+ | Add to ring buffer
+ | Check for duplicates
+ |
+ v
+Status (pmxcfs-status)
+ |
+ | Store in status subsystem
+ |
+ v
+DFSM (pmxcfs-dfsm)
+ |
+ | Broadcast via StatusSyncService
+ |
+ v
+Corosync CPG
+ |
+ | Multicast to cluster
+ |
+ v
+All Nodes
+ |
+ | Receive and merge log entries
+
+
+================================================================================
+ Data Flow: IPC Request
+================================================================================
+
+Perl Client (PVE::IPCC)
+ |
+ | libqb IPC request (e.g., get("/nodes/localhost/qemu-server/100.conf"))
+ |
+ v
+IPC Server (pmxcfs-ipc)
+ |
+ | Parse libqb wire protocol
+ | Route to appropriate handler
+ |
+ v
+MemDB (pmxcfs-memdb)
+ |
+ | memdb.get(path)
+ | Query SQLite or plugin
+ |
+ v
+IPC Server
+ |
+ | Format libqb response
+ |
+ v
+Perl Client
+ |
+ | Receive data
+
+
+================================================================================
+ Initialization Sequence
+================================================================================
+
+1. Parse command line arguments
+ - Debug mode, local mode, paths, etc.
+
+2. Set up logging (tracing)
+ - journald integration
+ - Environment filter
+ - .debug file toggle support
+
+3. Initialize MemDB
+ - Open/create SQLite database
+ - Initialize schema (version 5)
+ - Register plugins
+
+4. Load Corosync configuration
+ - Parse corosync.conf
+ - Extract node info, quorum settings
+
+5. Initialize Status subsystem
+ - Set up VM/CT tracking
+ - Initialize RRD storage
+ - Set up cluster log
+
+6. Create DFSM
+ - Initialize state machine
+ - Set up CPG handler
+ - Register callbacks (MemDbCallbacks, StatusCallbacks)
+
+7. Start Services
+ - ClusterDatabaseService (MemDB sync)
+ - StatusSyncService (Status sync)
+ - QuorumService (quorum monitoring)
+ - ClusterConfigService (config sync)
+
+8. Initialize IPC Server
+ - Create Unix socket (@pve2)
+ - Set up request handlers
+ - Start listening
+
+9. Mount FUSE Filesystem
+ - Create mount point (/etc/pve)
+ - Initialize FUSE operations
+ - Start FUSE event loop
+
+10. Enter main event loop
+ - Handle DFSM messages
+ - Process IPC requests
+ - Service FUSE operations
+ - Monitor quorum
+
+
+================================================================================
+ Key Design Patterns
+================================================================================
+
+Trait-Based Abstraction
+ - DFSM uses Callbacks trait for MemDB/Status updates
+ - Enables testing with mock implementations
+ - Clean separation of concerns
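The callback-trait pattern above can be sketched as follows. The trait name `Callbacks` comes from the text, but its method `apply_update`, the `StateMachine` stand-in, and `MockDb` are assumptions made for illustration and do not reflect the actual pmxcfs-dfsm API.

```rust
use std::collections::HashMap;

/// Hypothetical callback interface the state machine calls into.
trait Callbacks {
    fn apply_update(&mut self, path: &str, data: &[u8]);
}

/// Stand-in for the state machine: it only knows the trait, not the backend.
struct StateMachine<C: Callbacks> {
    callbacks: C,
}

impl<C: Callbacks> StateMachine<C> {
    /// Forwards a delivered message to whatever backend was plugged in.
    fn deliver(&mut self, path: &str, data: &[u8]) {
        self.callbacks.apply_update(path, data);
    }
}

/// Mock backend for tests: records updates in memory instead of SQLite.
#[derive(Default)]
struct MockDb {
    files: HashMap<String, Vec<u8>>,
}

impl Callbacks for MockDb {
    fn apply_update(&mut self, path: &str, data: &[u8]) {
        self.files.insert(path.to_string(), data.to_vec());
    }
}
```

A unit test can build `StateMachine { callbacks: MockDb::default() }`, deliver a message, and inspect `files` without touching a database or Corosync.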
+
+Service Framework
+ - pmxcfs-services provides retry logic
+ - Services can be started/stopped independently
+ - Automatic error recovery
+
+Plugin System
+ - MemDB supports dynamic plugins
+ - Functional plugins: Generate content on-the-fly
+ - Link plugins: Symlinks to other paths
+ - Examples: .version, .members, .vmlist, etc.
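The split between functional and link plugins could be modelled roughly like this. `Plugin`, `Resolved`, and `PluginRegistry` are hypothetical names for this sketch and are not taken from pmxcfs-memdb.

```rust
use std::collections::HashMap;

/// The two plugin kinds named above.
enum Plugin {
    /// Content is generated on every read.
    Functional(Box<dyn Fn() -> Vec<u8>>),
    /// The entry resolves to another path, like a symlink.
    Link(String),
}

/// Result of looking up a plugin by name.
#[derive(Debug, PartialEq)]
enum Resolved {
    Content(Vec<u8>),
    Target(String),
}

struct PluginRegistry {
    plugins: HashMap<String, Plugin>,
}

impl PluginRegistry {
    fn new() -> PluginRegistry {
        PluginRegistry { plugins: HashMap::new() }
    }

    fn register(&mut self, name: &str, plugin: Plugin) {
        self.plugins.insert(name.to_string(), plugin);
    }

    /// Returns generated content or a link target, or None for unknown names.
    fn resolve(&self, name: &str) -> Option<Resolved> {
        match self.plugins.get(name)? {
            Plugin::Functional(generate) => Some(Resolved::Content(generate())),
            Plugin::Link(target) => Some(Resolved::Target(target.clone())),
        }
    }
}
```

A functional entry such as `.version` would be registered with a closure that renders the current state, while a link entry would be registered with its target path.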
+
+Wire Protocol Compatibility
+ - IPC server implements libqb wire protocol
+ - Binary compatibility with C libqb clients
+ - Enables Perl tools (PVE::IPCC) to work unchanged
+
+Async Runtime
+ - tokio for async I/O
+ - Non-blocking operations
+ - Efficient resource usage
+
+
+================================================================================
+ Thread Model
+================================================================================
+
+Main Thread
+ - FUSE event loop (blocking)
+ - Handles filesystem operations
+
+Tokio Runtime
+ - IPC server (async)
+ - DFSM message handling (async)
+ - Service tasks (async)
+ - CPG message processing
+
+Background Threads
+ - SQLite I/O (blocking, offloaded)
+ - RRD file writes (blocking)
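Offloading blocking work to a background thread, as listed above, can be sketched with a channel-fed worker. This version uses `std::thread` and `std::sync::mpsc` as a stand-in so that it stays self-contained; the daemon itself is described as tokio-based, and `WriteJob` and `spawn_writer` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical write job handed to the background worker.
struct WriteJob {
    path: String,
    data: Vec<u8>,
}

/// Spawns a worker thread that handles jobs one by one and reports,
/// for each job, the path and the number of bytes it processed.
fn spawn_writer() -> (
    mpsc::Sender<WriteJob>,
    mpsc::Receiver<(String, usize)>,
    thread::JoinHandle<()>,
) {
    let (job_tx, job_rx) = mpsc::channel::<WriteJob>();
    let (done_tx, done_rx) = mpsc::channel::<(String, usize)>();
    let handle = thread::spawn(move || {
        // The loop ends once every sender has been dropped.
        for job in job_rx {
            // A real worker would perform the blocking SQLite or RRD write here.
            let written = job.data.len();
            if done_tx.send((job.path, written)).is_err() {
                break;
            }
        }
    });
    (job_tx, done_rx, handle)
}
```

The caller keeps the sender, submits jobs without blocking on disk I/O, and drops the sender to let the worker exit before joining it.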
+
+
+================================================================================
+ Testing
+================================================================================
+
+Unit Tests
+ - Per-crate unit tests with mock implementations
+ - Run with: cargo test --workspace
+
+Integration Tests
+ - Comprehensive test suite in integration-tests/ directory
+ - Single-node, multi-node, and mixed C/Rust cluster tests
+ - See integration-tests/README.md for full documentation
+
+
+================================================================================
diff --git a/src/pmxcfs-rs/README.md b/src/pmxcfs-rs/README.md
new file mode 100644
index 00000000..4ad846f3
--- /dev/null
+++ b/src/pmxcfs-rs/README.md
@@ -0,0 +1,235 @@
+# pmxcfs-rs
+
+## Executive Summary
+
+pmxcfs-rs is a complete rewrite of the Proxmox Cluster File System from C to Rust. It reaches functional parity with the C implementation, apart from the intentionally dropped OpenVZ support, and keeps wire-format compatibility with it. The implementation has passed comprehensive single-node and multi-node integration testing.
+
+**Overall Completion**: All subsystems implemented
+- All core subsystems implemented and tested
+- Wire protocol compatibility verified
+- Comprehensive test coverage (24 integration tests + extensive unit tests)
+- Production client compatibility confirmed
+- Multi-node cluster functionality validated
+
+---
+
+## Component Status
+
+### Workspace Structure
+
+pmxcfs-rs is organized as a Rust workspace with 10 crates:
+
+| Crate | Purpose |
+|-------|---------|
+| `pmxcfs` | Main daemon binary |
+| `pmxcfs-config` | Configuration management |
+| `pmxcfs-api-types` | Shared types and errors |
+| `pmxcfs-memdb` | Database with SQLite backend |
+| `pmxcfs-dfsm` | Distributed state machine + CPG |
+| `pmxcfs-rrd` | RRD file persistence |
+| `pmxcfs-status` | Status monitoring + RRD |
+| `pmxcfs-ipc` | libqb-compatible IPC server |
+| `pmxcfs-services` | Service lifecycle framework |
+| `pmxcfs-logger` | Cluster log + ring buffer |
+
+### Compatibility Matrix
+
+| Component | Notes |
+|-----------|-------|
+| **FUSE Filesystem** | All operations implemented |
+| **Database (MemDB)** | SQLite schema compatible |
+| **Cluster Communication** | CPG/Quorum via Corosync |
+| **DFSM State Machine** | Binary message format compatible |
+| **IPC Server** | Wire protocol verified with libqb clients |
+| **Plugin System** | All 10 plugins (6 func + 4 link) with write support |
+| **RRD Integration** | Format migration implemented |
+| **Status Subsystem** | VM list, config tracking, cluster log |
+
+---
+
+## Design Decisions and Notable Differences
+
+### 1. IPC Protocol: Partial libqb Implementation
+
+**Decision**: Implement libqb-compatible wire protocol without using libqb library directly.
+
+**C Implementation**:
+- Uses libqb library directly (`libqb0`, `libqb-dev`)
+- Full libqb feature set (SHM ring buffers, POSIX message queues, etc.)
+- IPC types: `QB_IPC_SOCKET`, `QB_IPC_SHM`, `QB_IPC_POSIX_MQ`
+
+**Rust Implementation**:
+- Custom implementation of libqb wire protocol
+- Only implements `QB_IPC_SOCKET` type (Unix datagram sockets + shared memory control files)
+- Compatible handshake, request/response structures
+- Verified with both libqb C clients and production Perl clients (PVE::IPCC)
+
+**Rationale**:
+- libqb has no Rust bindings and FFI would be complex
+- pmxcfs only uses `QB_IPC_SOCKET` type in production
+- Wire protocol compatibility is what matters for clients
+- Simpler implementation, easier to maintain
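To make "wire protocol compatibility" concrete, the sketch below encodes and decodes a fixed-size request header. It assumes a libqb-style header of two 32-bit fields (`id`, `size`), each padded to 8 bytes, in native byte order, with `size` counting the header itself. These layout details are assumptions to be checked against libqb's headers, and `RequestHeader` is a hypothetical type, not the one in pmxcfs-ipc.

```rust
/// Length of the assumed header: two i32 fields, each padded to 8 bytes.
const HEADER_LEN: usize = 16;

#[derive(Debug, PartialEq, Clone, Copy)]
struct RequestHeader {
    id: i32,   // operation identifier
    size: i32, // total message size, header included (assumption)
}

/// Reads a native-endian i32 at `offset`; the caller checks the bounds.
fn read_i32(buf: &[u8], offset: usize) -> i32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&buf[offset..offset + 4]);
    i32::from_ne_bytes(raw)
}

impl RequestHeader {
    /// Serializes the header, leaving the padding bytes zeroed.
    fn encode(&self) -> [u8; HEADER_LEN] {
        let mut buf = [0u8; HEADER_LEN];
        buf[0..4].copy_from_slice(&self.id.to_ne_bytes());
        buf[8..12].copy_from_slice(&self.size.to_ne_bytes());
        buf
    }

    /// Parses a header, rejecting short buffers and impossible sizes.
    fn decode(buf: &[u8]) -> Option<RequestHeader> {
        if buf.len() < HEADER_LEN {
            return None;
        }
        let header = RequestHeader {
            id: read_i32(buf, 0),
            size: read_i32(buf, 8),
        };
        if header.size < HEADER_LEN as i32 {
            return None;
        }
        Some(header)
    }
}
```

Validating the declared size before reading the payload is what lets a server reject malformed requests from arbitrary local clients.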
+
+**Compatibility Impact**: **None** - All production clients work identically
+
+**Reference**:
+- C: `src/pmxcfs/server.c` (uses libqb API)
+- Rust: `src/pmxcfs-rs/pmxcfs-ipc/src/server.rs` (custom implementation)
+- Verification: `pmxcfs-ipc/tests/qb_wire_compat.rs` (all tests passing)
+
+---
+
+### 2. Logging System: tracing vs qb_log
+
+**Decision**: Use Rust `tracing` ecosystem instead of libqb's `qb_log`.
+
+**C Implementation**:
+- Uses `qb_log` from libqb for all logging
+- Log levels: `QB_LOG_EMERG`, `QB_LOG_ALERT`, `QB_LOG_CRIT`, `QB_LOG_ERR`, `QB_LOG_WARNING`, `QB_LOG_NOTICE`, `QB_LOG_INFO`, `QB_LOG_DEBUG`
+- Output: syslog + stderr
+- Runtime control: Write to `/etc/pve/.debug` file (0 = info, 1 = debug)
+- Format: `[domain] LEVEL: message (file.c:line:function)`
+
+**Rust Implementation**:
+- Uses `tracing` crate with `tracing-subscriber`
+- Log levels: `ERROR`, `WARN`, `INFO`, `DEBUG`, `TRACE`
+- Output: journald (via `tracing-journald`) + stdout
+- Runtime control: Same mechanism - `.debug` plugin file (0 = info, 1 = debug)
+- Format: `[timestamp] LEVEL module::path: message`
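The `.debug` toggle semantics (0 = info, 1 = debug) amount to a small parsing step. The sketch below uses a hypothetical `LogLevel` enum and `level_from_debug_flag` function; how the daemon actually applies the result to its `tracing` filter is not shown in this patch.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum LogLevel {
    Info,
    Debug,
}

/// Interprets the text written to the `.debug` file.
/// "0" selects info, "1" selects debug; anything else is rejected.
fn level_from_debug_flag(content: &str) -> Option<LogLevel> {
    match content.trim() {
        "0" => Some(LogLevel::Info),
        "1" => Some(LogLevel::Debug),
        _ => None,
    }
}
```

Trimming the input means that `echo 1 > /etc/pve/.debug`, which writes a trailing newline, is accepted.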
+
+**Key Differences**:
+
+| Aspect | C (qb_log) | Rust (tracing) | Impact |
+|--------|-----------|----------------|--------|
+| **Log format** | `[domain] INFO: msg (file.c:123)` | `2025-11-14T10:30:45 INFO pmxcfs::module: msg` | Log parsers need update |
+| **Severity levels** | 8 levels (syslog standard) | 5 levels (standard Rust) | Mapping works fine |
+| **Destination** | syslog | journald (systemd) | Both are queryable with `journalctl` |
+| **Runtime toggle** | `/etc/pve/.debug` | Same | **No change** |
+| **CLI flag** | `-d` or `--debug` | Same | **No change** |
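One plausible way to fold the eight syslog severities into the five Rust levels is sketched below. The mapping is an assumption chosen for illustration, not taken from the implementation, and both enums are local stand-ins rather than types from libqb or `tracing`.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum SyslogSeverity {
    Emerg,
    Alert,
    Crit,
    Err,
    Warning,
    Notice,
    Info,
    Debug,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum RustLevel {
    Error,
    Warn,
    Info,
    Debug,
    /// No syslog counterpart; listed only to show all five levels.
    #[allow(dead_code)]
    Trace,
}

/// Collapses the four most severe syslog levels into Error and
/// treats Notice as Info.
fn map_severity(severity: SyslogSeverity) -> RustLevel {
    match severity {
        SyslogSeverity::Emerg
        | SyslogSeverity::Alert
        | SyslogSeverity::Crit
        | SyslogSeverity::Err => RustLevel::Error,
        SyslogSeverity::Warning => RustLevel::Warn,
        SyslogSeverity::Notice | SyslogSeverity::Info => RustLevel::Info,
        SyslogSeverity::Debug => RustLevel::Debug,
    }
}
```

Under such a mapping, monitoring rules that distinguish `CRIT` from `ERR` would lose that distinction.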
+
+**Rationale**:
+- `tracing` is the Rust ecosystem standard
+- Better async/structured logging support
+- No FFI to libqb needed
+- Integrates with systemd/journald natively
+- Same user-facing behavior (`.debug` file toggle)
+
+**Compatibility Impact**: **Minor** - Log monitoring scripts may need format updates
+
+**Migration**:
+```bash
+# The same command works for both the C daemon (syslog) and the Rust daemon (journald)
+journalctl -u pve-cluster | grep pmxcfs
+```
+
+**Reference**:
+- C: `src/pmxcfs/pmxcfs.c` (qb_log initialization)
+- Rust: `src/pmxcfs-rs/pmxcfs/src/main.rs` (tracing-subscriber setup)
+
+---
+
+### 3. OpenVZ Container Support: Intentionally Excluded
+
+**Decision**: No functional support for OpenVZ containers.
+
+**C Implementation**:
+- Includes OpenVZ VM type (`VMTYPE_OPENVZ = 2`)
+- Detects OpenVZ action scripts (`vps*.mount`, `*.start`, `*.stop`, etc.)
+- Sets executable permissions on OpenVZ scripts
+- Scans `nodes/*/openvz/` directories for containers
+- **All code marked**: `// FIXME: remove openvz stuff for 7.x`
+
+**Rust Implementation**:
+- VM types: `VmType::Qemu = 1`, `VmType::Lxc = 3` (no `VMTYPE_OPENVZ = 2`)
+- `/openvz` symlink exists (for backward compatibility) but no functional support
+- No OpenVZ script detection or VM scanning
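The effect of leaving out value 2 can be shown with a short sketch. The discriminants `Qemu = 1` and `Lxc = 3` come from the text above; the `u8` representation and the `from_raw` helper are assumptions made for illustration.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
#[repr(u8)]
enum VmType {
    Qemu = 1,
    Lxc = 3,
}

impl VmType {
    /// Converts a raw on-wire value; 2 (the former OpenVZ type) and any
    /// other unknown value yield None.
    fn from_raw(raw: u8) -> Option<VmType> {
        match raw {
            1 => Some(VmType::Qemu),
            3 => Some(VmType::Lxc),
            _ => None,
        }
    }
}
```

A node that still received a type-2 entry from a peer would therefore see `None` and could skip or log the entry instead of misclassifying it.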
+
+**Rationale**:
+- OpenVZ deprecated in Proxmox VE 4.0 (2015)
+- OpenVZ removed completely in Proxmox VE 7.0 (2021)
+- pmxcfs-rs ships with Proxmox VE 9.x (2 major versions after removal)
+- Last OpenVZ code change: October 2011 (14 years ago)
+- Mandatory LXC migration completed years ago
+
+**Compatibility Impact**: **None** - No PVE 9.x systems have OpenVZ containers
+
+**Reference**:
+- C: `src/pmxcfs/status.h:31-32`, `cfs-plug-memdb.c:46-93`, `memdb.c:455-460`
+- Rust: `pmxcfs-api-types/src/lib.rs:99-102` (VmType enum)
+
+---
+
+## Testing
+
+pmxcfs-rs has a test suite of 100+ tests, split into inline unit tests and separate integration tests.
+
+### Quick Start
+
+```bash
+# Run all tests
+cargo test --workspace
+
+# Run unit tests only (fast, inline tests)
+cargo test --lib
+
+# Run integration tests only
+cargo test --test '*'
+
+# Run specific package tests
+cargo test -p pmxcfs-memdb
+```
+
+### Multi-Node Integration Tests
+
+Complete integration test suite covering single-node, multi-node cluster, and C/Rust interoperability.
+
+```bash
+cd integration-tests
+./test --build # Build and run all tests
+./test --no-build # Quick iteration
+./test --list # Show available tests
+```
+
+See [integration-tests/README.md](integration-tests/README.md) for detailed documentation.
+
+---
+
+## Compatibility Summary
+
+### Wire-Compatible
+- IPC protocol (verified with libqb clients)
+- DFSM message format (binary compatible)
+- Database schema (SQLite version 5)
+- RRD file formats (all versions)
+- FUSE operations (all 12 ops)
+
+### Different but Compatible
+- Logging system (tracing vs qb_log) - format differs, functionality same
+- IPC implementation (custom vs libqb) - protocol identical, implementation differs
+- Event loop (tokio vs qb_loop) - both provide event-driven concurrency
+
+### Intentionally Different
+- OpenVZ support (removed, not needed)
+- Service priority levels (all run concurrently in Rust)
+
+---
+
+## References
+
+- **C Implementation**: `src/pmxcfs/`
+- **Rust Implementation**: `src/pmxcfs-rs/`
+ - `pmxcfs` - Main daemon binary
+ - `pmxcfs-config` - Configuration management
+ - `pmxcfs-api-types` - Shared types and error definitions
+ - `pmxcfs-memdb` - In-memory database with SQLite persistence
+ - `pmxcfs-dfsm` - Distributed Finite State Machine (CPG integration)
+ - `pmxcfs-rrd` - RRD persistence
+ - `pmxcfs-status` - Status monitoring and RRD data management
+ - `pmxcfs-ipc` - libqb-compatible IPC server
+ - `pmxcfs-services` - Service framework for lifecycle management
+ - `pmxcfs-logger` - Cluster log with ring buffer and deduplication
+- **Testing Guide**: `integration-tests/README.md`
+- **Test Runner**: `integration-tests/test` (unified test interface)
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Thread overview:
2026-01-06 14:24 [pve-devel] [PATCH pve-cluster 00/15 v1] Rewrite pmxcfs with Rust Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 01/15] pmxcfs-rs: add workspace and pmxcfs-api-types crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 02/15] pmxcfs-rs: add pmxcfs-config crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 03/15] pmxcfs-rs: add pmxcfs-logger crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 04/15] pmxcfs-rs: add pmxcfs-rrd crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 05/15] pmxcfs-rs: add pmxcfs-memdb crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 06/15] pmxcfs-rs: add pmxcfs-status crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 07/15] pmxcfs-rs: add pmxcfs-test-utils infrastructure crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 08/15] pmxcfs-rs: add pmxcfs-services crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 09/15] pmxcfs-rs: add pmxcfs-ipc crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 10/15] pmxcfs-rs: add pmxcfs-dfsm crate Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 11/15] pmxcfs-rs: vendor patched rust-corosync for CPG compatibility Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 13/15] pmxcfs-rs: add integration and workspace tests Kefu Chai
2026-01-06 14:24 ` [pve-devel] [PATCH pve-cluster 14/15] pmxcfs-rs: add Makefile for build automation Kefu Chai
2026-01-06 14:24 ` Kefu Chai [this message]