Subject: Re: [pve-devel] [RFC] towards automated integration testing
From: Lukas Wagner
To: Stefan Hanreich, Proxmox VE development discussion
Date: Mon, 16 Oct 2023 17:18:54 +0200
Message-ID: <9780afc5-6bf9-40da-8a2f-0c5e02ded605@proxmox.com>
In-Reply-To: <44f89ce6-043f-7b05-75ad-ac66550eb3e8@proxmox.com>

Thank you for the feedback!

On 10/16/23 13:20, Stefan Hanreich wrote:
> On 10/13/23 15:33, Lukas Wagner wrote:
>
>> - Additionally, it should be easy to run these integration tests
>>   locally on a developer's workstation in order to write new test
>>   cases, as well as troubleshooting and debugging existing test
>>   cases. The local test environment should match the one being used
>>   for automated testing as closely as possible
>
> This would also include sharing those fixture templates somewhere, do
> you already have an idea on how to accomplish this? PBS sounds like a
> good option for this if I'm not missing something.

Yes, these templates could be stored on some shared storage, e.g. a PBS
instance, or they could also be distributed via one or more .debs (not
sure if that is a good idea, since these would become huge pretty
quickly).

It could also be a two-step process: use one command to fetch the
latest test templates, restoring them from a remote backup and
converting them to local VM templates. When executing tests, the test
runner could then use linked clones, speeding up the test setup time
quite a bit.
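To illustrate what I mean by the two-step process, here is a very rough
sketch in Python, wrapping the existing qmrestore/qm CLI. The VMIDs,
storage and backup volume names are made up for the example and not
part of the actual proposal:

#!/usr/bin/env python3
# Sketch only: fetch the latest test template from a shared backup store
# and prepare a linked clone of it for a test run. The VMIDs, storage
# and backup volume names below are placeholders.
import subprocess

TEMPLATE_VMID = "9000"   # local VMID reserved for the test template
TEST_VMID = "9100"       # VMID used by the actual test run
BACKUP_VOLUME = "backup-store:backup/vzdump-qemu-9000-latest.vma.zst"

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def refresh_template() -> None:
    # Step 1 (only when the template should be updated): restore the
    # backup into a local VM and mark it as a template.
    run("qmrestore", BACKUP_VOLUME, TEMPLATE_VMID)
    run("qm", "template", TEMPLATE_VMID)

def clone_for_test() -> None:
    # Step 2 (per test run): clone the template. For templates this is
    # a linked clone by default on storages that support it, which is
    # much faster than a full restore.
    run("qm", "clone", TEMPLATE_VMID, TEST_VMID, "--name", "sut-test")

if __name__ == "__main__":
    refresh_template()
    clone_for_test()

Whether a linked clone is actually possible of course depends on the
underlying storage; on storages without snapshot support the runner
would have to fall back to full clones.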
All in all, the templates that can be used in test fixtures should be:
 - easily obtainable for developers, in order to have a fully
   functional test setup on their workstation
 - easily updateable (e.g. installing the latest packages, so that the
   setup-hook does not need to fetch a boatload of packages every time)

>> As a main mode of operation, the Systems under Test (SUTs)
>> will be virtualized on top of a Proxmox VE node.
>>
>> This has the following benefits:
>> - it is easy to create various test setups (fixtures), including but
>>   not limited to single Proxmox VE nodes, clusters, Backup servers
>>   and auxiliary services (e.g. an LDAP server for testing LDAP
>>   authentication)
>
> I can imagine having to setup VMs inside the Test Setup as well for
> doing various tests. Doing this manually every time could be quite
> cumbersome / hard to automate. Do you have a mechanism in mind to
> deploy VMs inside the test system as well? Again, PBS could be an
> interesting option for this imo.

Several options come to mind. We could use a virtualized PBS instance
with a datastore containing the VM backup as part of the fixture. We
could use some external backup store (i.e. the same 'source' as for the
templates themselves) - however, that means the systems under test must
have network access to it. We could also think about using iPXE to boot
test VMs, with the boot image either being provided by some template
from the fixture or by some external server.
In both cases, the 'as part of the fixture' variant seems a bit nicer,
as it is more self-contained.

Also, the vmbd2 thingy that Thomas mentioned might be interesting for
this - I've only glanced at it so far though.

As of now it seems that this question will not influence the design of
the test runner much, so it can probably be postponed to a later stage.

>> In theory, the test runner would also be able to drive tests on real
>> hardware, but of course with some limitations (harder to have a
>> predictable, reproducible environment, etc.)
>
> Maybe utilizing Aaron's installer for setting up those test systems
> could at least produce somewhat identical setups? Although it is
> really hard managing systems with different storage types, network
> cards, ... .

In general, my biggest concern with 'bare-metal' tests - and to be
precise, this does not really have anything to do with being
'bare-metal', but rather with testing on something that is harder to
roll back into a clean state for the next test execution - is that such
a setup could become quite brittle and a maintenance burden.
At some point, a test execution might leave something in an unclean
state (e.g. due to a crashed test or something missed during cleanup),
tripping up the following test job.
As an example from personal experience: one test run might test new
packages which introduce a new flag in a configuration file. If that
flag is not cleaned up afterwards, another test job testing other
packages might fail because it now has to deal with an 'unknown'
configuration key.

Maybe ZFS snapshots could help with that, but I'm not sure how that
would work in practice (e.g. due to the kernel being stored on the EFI
partition).

The automated installer *could* certainly help here - however, right
now I don't want to extend the scope of this project too much. There is
also the question of whether the installation should be refreshed after
every single test run, which would increase test cycle time and
resource consumption quite a bit, or only when 'something' breaks.
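Just to make the ZFS snapshot idea above a bit more concrete, a very
rough sketch of a reset between test jobs - dataset and snapshot names
are made up, and this deliberately ignores the EFI partition problem
mentioned above:

#!/usr/bin/env python3
# Sketch only: reset a bare-metal test host to a known-good baseline
# via ZFS snapshots between test jobs. Dataset and snapshot names are
# placeholders; the EFI partition (kernel/bootloader) is not covered.
import subprocess

ROOT_DATASET = "rpool/ROOT/pve-1"   # hypothetical root dataset
BASELINE = "clean-baseline"

def zfs(*args: str) -> None:
    subprocess.run(("zfs",) + args, check=True)

def create_baseline() -> None:
    # taken once, right after a clean installation
    zfs("snapshot", f"{ROOT_DATASET}@{BASELINE}")

def rollback_to_baseline() -> None:
    # run between test jobs; -r also discards snapshots newer than
    # the baseline
    zfs("rollback", "-r", f"{ROOT_DATASET}@{BASELINE}")

if __name__ == "__main__":
    rollback_to_baseline()

In practice, rolling back the root dataset of a running system would
probably have to happen from outside of it (e.g. from an initrd or a
rescue environment), so this really only covers the ZFS side of it.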
That being said, it might also make sense to be able to run the tests
(or more likely a subset of them, since some will inherently require a
fixture) against an arbitrary PVE instance that is under the full
control of a developer (e.g. a development VM or, if feeling
adventurous, the workstation itself). If this is possible, then these
tests could be the fastest way to get feedback while developing, since
there is no need to instantiate a template, update, deploy, etc.
In this case, the test runner's job would only be to run the test
scripts, without managing fixtures etc., and then to report the results
back to the developer.
Essentially, as Thomas already mentioned, one approach to do this would
be to decouple the 'fixture setup' and 'test case execution' parts as
much as possible. How that will look in practice will be part of
further research.

> I've seen GitLab using tags for runners that specify certain
> capabilities of systems. Maybe we could also introduce something like
> that here for different bare-metal systems? E.g. a test case specifies
> it needs a system with tag `ZFS` and then you can run / skip the
> respective test case on that system. Managing those tags can introduce
> quite a lot of churn though, so I'm not sure if this would be a good
> idea.

I have thought about a tag system as well - not necessarily for test
runners, but for test cases. E.g. you could tag tests for the
authentication system with 'auth', because at least for the local
development cycle it might not make much sense to run tests for
clusters, Ceph, etc. while working on the authentication system.
The tags to be executed could then simply be passed to the test runner.
These tags could also be used to mark the subset of 'simple' test cases
that don't need a special test fixture, as described above.
This could also be extended to a full 'predicate-like' system, as
Thomas described.

>> The test script is executed by the test runner; the test outcome is
>> determined by the exit code of the script. Test scripts could be
>> written
>
> Are you considering capturing output as well? That would make sense
> when using assertions at least, so in case of failures developers have
> a starting point for debugging.

Yup, I'd capture stdout/stderr from all test executables/scripts and
include it in the final test report. Test output is indeed very useful
when determining *why* something went wrong.

> Would it make sense to allow specifying an expected exit code for
> tests that actually should fail - or do you consider this something
> that should be handled by the test script?

I guess that's a matter of taste. Personally, I'd keep the contract
between test runner and test script simple and say 0 == success,
everything else is a failure. If there are any test cases that expect a
failure of some API call, then the script should 'translate' the exit
code.
If we discover that specifying an expected exit code actually makes
things easier for us, then adding it should be rather trivial - and
easier than ripping it out the other way round.

> I've refrained from talking about the toml files too much since it's
> probably too early to say something about that, but they look good so
> far from my pov.
>
> In general this sounds like quite the exciting feature and the RFC
> looks very promising already.

Thanks for your feedback!

> Kind Regards
> Stefan

--
- Lukas