From: Lukas Wagner
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Date: Fri, 13 Oct 2023 15:33:26 +0200
Subject: [pve-devel] [RFC] towards automated integration testing

Hello,

I am currently doing the groundwork that should eventually enable us to
write automated integration tests for our products. Part of that endeavor
will be to write a custom test runner, which will

- set up a specified test environment
- execute test cases in that environment
- create some sort of test report

What follows is a rough description of how that test runner would work.
The main point is to get some feedback on the ideas/approaches before I
start with the actual implementation. Let me know what you think!

## Introduction

The goal is to establish a framework that allows us to write automated
integration tests for our products. These tests are intended to run in
the following situations:

- When new packages are uploaded to the staging repos (by triggering a
  test run from repoman, or similar)
- Later, these tests could also be run when patch series are posted to
  our mailing lists. This requires a mechanism to automatically discover,
  fetch and build patches, which will be a separate, follow-up project.
- Additionally, it should be easy to run these integration tests locally
  on a developer's workstation in order to write new test cases, as well
  as to troubleshoot and debug existing ones. The local test environment
  should match the one used for automated testing as closely as possible.

As the main mode of operation, the Systems under Test (SUTs) will be
virtualized on top of a Proxmox VE node. This has the following benefits:

- it is easy to create various test setups (fixtures), including but not
  limited to single Proxmox VE nodes, clusters, Backup servers and
  auxiliary services (e.g. an LDAP server for testing LDAP authentication)
- these test setups can easily be brought to a well-defined state:
  cloning from a template, restoring a backup, or rolling back to a
  snapshot
- it makes it easy to run the integration tests on a developer's
  workstation in an identical configuration

For the sake of completeness, some of the drawbacks of not running the
tests on bare metal:

- we might be unable to detect regressions that only occur on real
  hardware

In theory, the test runner would also be able to drive tests on real
hardware, but of course with some limitations (it is harder to provide a
predictable, reproducible environment, etc.).

## Terminology

- Template: a backup/VM template that can be instantiated by the test
  runner
- Test Case: a script/executable executed by the test runner; success is
  determined via its exit code
- Fixture: a description of a test setup (e.g. which templates are
  needed, additional setup steps to run, etc.)

## Approach

Test writers put template, fixture and test case definitions into
declarative configuration files (most likely TOML). Each test case
references a test executable/script, which performs the actual test. The
test script is executed by the test runner; the test outcome is
determined by the exit code of the script.

Test scripts could be written in any language, e.g. they could be Perl
scripts that use the official `libpve-apiclient-perl` to test-drive the
SUTs (see the example script further below). If we notice any emerging
patterns, we could write additional helper libs that reduce the amount of
boilerplate in test scripts.

In essence, the test runner would do the following (a rough sketch of
this loop follows at the end of this section):

- Group test cases by fixture
- For every fixture:
  - Instantiate the needed templates from their backup snapshots
  - Start the VMs
  - Run any specified `setup-hooks` (update system, deploy packages, etc.)
  - Take a snapshot, including RAM
  - For every test case using that fixture:
    - Run the test case (execute the test executable, check the exit code)
    - Roll back to the snapshot (iff `rollback = true` for that template)
  - Destroy the test instances (or at least those which are not needed by
    other fixtures)

In the beginning, the test scripts would primarily drive the Systems
under Test via their API. However, the system would also offer the
flexibility for us to venture into the realm of automated GUI testing at
some point (e.g. using Selenium) - without having to change the overall
test architecture.
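To make these steps a bit more concrete, here is a minimal sketch of what
the per-fixture loop could look like if the runner simply shelled out to
the existing `qmrestore`/`qm` tooling on the host node. The VMIDs, backup
file names, snapshot name and the hard-coded data structures are made up
for illustration only; the real runner would read all of this from the
configuration files described in the next section.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, hard-coded fixture definition - the real runner would
# parse this from the TOML files shown in the next section.
my $fixture = {
    templates => [
        { vmid => 9001, restore => 'pve-default.vma.zst', rollback => 1 },
        { vmid => 9002, restore => 'ldap-server.vma.zst', rollback => 0 },
    ],
    setup_hooks => ['./update.sh'],
};
my @testcases = ({ name => 'test-ldap-realms', cmd => './test-ldap-realms.pl' });

my $run = sub { system(@_) == 0 or die "command '@_' failed\n" };

# Instantiate the needed templates from their backup snapshots and boot them
# (storage selection, waiting for the guest to come up, etc. omitted here).
for my $tpl (@{$fixture->{templates}}) {
    $run->('qmrestore', $tpl->{restore}, $tpl->{vmid});
    $run->('qm', 'start', $tpl->{vmid});
}

# Run setup hooks (update system, deploy packages, ...), then take a
# snapshot including RAM as the baseline for all test cases.
$run->($_) for @{$fixture->{setup_hooks}};
$run->('qm', 'snapshot', $_->{vmid}, 'testbase', '--vmstate', '1')
    for @{$fixture->{templates}};

for my $tc (@testcases) {
    # The exit code of the test executable decides pass/fail.
    my $ok = system($tc->{cmd}) == 0;
    printf "%s: %s\n", $tc->{name}, $ok ? 'PASS' : 'FAIL';

    # Roll back templates that requested it before the next test case runs.
    $run->('qm', 'rollback', $_->{vmid}, 'testbase')
        for grep { $_->{rollback} } @{$fixture->{templates}};
}

# Tear down the fixture.
for my $tpl (@{$fixture->{templates}}) {
    $run->('qm', 'stop', $tpl->{vmid});
    $run->('qm', 'destroy', $tpl->{vmid}, '--purge');
}
```

Whether the runner calls the CLI tools like this or uses the API directly
is an open implementation detail; the sketch is only meant to show the
order of operations.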
## Mock Test Runner Config

Besides the actual test scripts, test writers would write test
configuration. Based on the current requirements and the approach I have
chosen, an example config *could* look like the following. It would
likely be split into multiple files/folders (e.g. to group a test case
definition and its test script logically).

```toml
[template.pve-default]
# Backup image to restore from, in this case this would be a previously
# set up PVE installation
restore = '...'
# To check if the node has booted successfully; also made available to hook
# scripts, in case they need to SSH in to set up things.
ip = "10.0.0.1"
# Define credentials in a separate file - most templates could use a
# default password/SSH key/API token etc.
credentials = "default"
# Update to latest packages, install test .debs
# credentials are passed via env var
# Maybe this could also be ansible playbooks, if the need arises.
setup-hooks = [
    "update.sh",
]
# Take a snapshot after the setup-hooks, roll back after each test case
rollback = true

[template.ldap-server]
# Backup image to restore from
restore = '...'
credentials = "default"
ip = "10.0.0.3"
# No need to roll back in between test cases, there won't be any changes
rollback = false

# Example fixture. Fixtures can be used by multiple testcases.
[fixture.pve-with-ldap-server]
# Maybe one could specify additional setup-hooks here as well, in case
# one wants a 'per-fixture' setup? So that we can reduce the number of
# base images?
templates = [
    'pve-default',
    'ldap-server',
]

# testcases.toml (might be split to multiple files/folders?)
[testcase.test-ldap-realms]
fixture = 'pve-with-ldap-server'
# - return code is checked to determine test case success
# - stderr/stdout is captured for the final test report
# - some data is passed via env var:
#   - name of the test case
#   - template configuration (IPs, credentials, etc.)
#   - ...
test-exec = './test-ldap-realms.pl'
# Consider the test as failed if the test script does not finish fast enough
test-timeout = 60
# Additional params for the test script, allowing for parameterized
# tests.
# Could also turn this into an array and loop over the values, in
# order to create multiple test cases from the same definition.
test-params = { foo = "bar" }

# Second test case, using the same fixture
[testcase.test-ldap-something-else]
fixture = 'pve-with-ldap-server'
test-exec = './test-ldap-something-else.pl'
```
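To illustrate the other half, the referenced `./test-ldap-realms.pl`
could be a small Perl script built on `libpve-apiclient-perl`, roughly
along the following lines. This is only a sketch: the environment
variable names, the realm parameters and the idea that the runner hands
an API token and host to the script are assumptions on my part, not
anything that is decided yet.

```perl
#!/usr/bin/perl
# test-ldap-realms.pl - sketch of a test script, assuming the runner passes
# connection details for the templates via environment variables (the
# variable names used here are made up).
use strict;
use warnings;

use PVE::APIClient::LWP;

my $host    = $ENV{PVE_TEST_HOST}     // '10.0.0.1';
my $token   = $ENV{PVE_TEST_APITOKEN} // die "no API token provided\n";
my $ldap_ip = $ENV{LDAP_SERVER_IP}    // '10.0.0.3';

my $conn = PVE::APIClient::LWP->new(
    host     => $host,
    apitoken => $token,
    # certificate/fingerprint verification is left out of this sketch
);

# Create an LDAP realm pointing at the 'ldap-server' test instance ...
$conn->post('/access/domains', {
    realm     => 'test-ldap',
    type      => 'ldap',
    server1   => $ldap_ip,
    base_dn   => 'dc=example,dc=com',
    user_attr => 'uid',
});

# ... and verify that it shows up in the realm list.
my $domains = $conn->get('/access/domains', {});
my ($found) = grep { $_->{realm} eq 'test-ldap' } @$domains;

if ($found) {
    print "LDAP realm was created successfully\n";
    exit 0;    # the runner treats exit code 0 as a passed test case
}

print STDERR "LDAP realm 'test-ldap' not found after creation\n";
exit 1;
```

If a pattern like "build an API client from the data the runner passes
in" shows up in many scripts, that would be a natural candidate for the
helper libs mentioned above.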
credentials = "default" # Update to latest packages, install test .debs # credentials are passed via env var # Maybe this could also be ansible playbooks, if the need arises. setup-hooks = [ "update.sh", ] # Take snapshot after setup-hook, roll back after each test case rollback = true [template.ldap-server] # Backup image to restore from restore = '...' credentials = "default" ip = "10.0.0.3" # No need to roll back in between test cases, there won't be any changes rollback = false # Example fixture. They can be used by multiple testcases. [fixture.pve-with-ldap-server] # Maybe one could specify additional setup-hooks here as well, in case # one wants a 'per-fixture' setup? So that we can reduce the number of # base images? templates = [ 'pve-default', 'ldap-server', ] # testcases.toml (might be split to multiple files/folders?) [testcase.test-ldap-realms] fixture = 'pve-with-ldap-server' # - return code is check to determine test case success # - stderr/stdout is captured for the final test report # - some data is passed via env var: # - name of the test case # - template configuration (IPs, credentials, etc.) # - ... test-exec = './test-ldap-realms.pl' # Consider test as failed if test script does not finish fast enough test-timeout = 60 # Additional params for the test script, allowing for parameterized # tests. # Could also turn this into an array and loop over the values, in # order to create multiple test cases from the same definition. test-params = { foo = "bar" } # Second test case, using the same fixture [testcase.test-ldap-something-else] fixture = 'pve-with-ldap-server' test-exec = './test-ldap-something-else.pl' ``` -- - Lukas