* [pve-devel] [RFC] towards automated integration testing
From: Lukas Wagner @ 2023-10-13 13:33 UTC (permalink / raw)
To: Proxmox VE development discussion
Hello,
I am currently doing the groundwork that should eventually enable us
to write automated integration tests for our products.
Part of that endeavor will be to write a custom test runner, which will
- set up a specified test environment
- execute test cases in that environment
- create some sort of test report
What follows is a rough description of how that test runner would
work. The main goal is to get feedback on some of the ideas and
approaches before I start with the actual implementation.
Let me know what you think!
## Introduction
The goal is to establish a framework that allows us to write
automated integration tests for our products.
These tests are intended to run in the following situations:
- When new packages are uploaded to the staging repos (by triggering
a test run from repoman, or similar)
- Later, these tests could also be run when patch series are posted to
our mailing lists. This requires a mechanism to automatically
discover, fetch and build patches, which will be a separate,
follow-up project.
- Additionally, it should be easy to run these integration tests locally
on a developer's workstation in order to write new test cases, as well
as to troubleshoot and debug existing test cases. The local
test environment should match the one being used for automated testing
as closely as possible.
As a main mode of operation, the Systems under Test (SUTs)
will be virtualized on top of a Proxmox VE node.
This has the following benefits:
- it is easy to create various test setups (fixtures), including but not
limited to single Proxmox VE nodes, clusters, Backup servers and
auxiliary services (e.g. an LDAP server for testing LDAP
authentication)
- these test setups can easily be brought to a well-defined state:
cloning from a template, restoring a backup, or rolling back to a snapshot
- it makes it easy to run the integration tests on a developer's
workstation in an identical configuration
For the sake of completeness, some drawbacks of not running the
tests on bare metal:
- Might be unable to detect regressions that only occur on real hardware
In theory, the test runner would also be able to drive tests on real
hardware, but of course with some limitations (harder to have a
predictable, reproducible environment, etc.)
## Terminology
- Template: A backup/VM template that can be instantiated by the test
runner
- Test Case: Some script/executable executed by the test runner, success
is determined via exit code.
- Fixture: Description of a test setup (e.g. which templates are needed,
additional setup steps to run, etc.)
## Approach
Test writers write template, fixture and test case definitions in
declarative configuration files (most likely TOML). The test case
references a test executable/script, which performs the actual test.
The test script is executed by the test runner; the test outcome is
determined by the exit code of the script. Test scripts could be written
in any language, e.g. they could be Perl scripts that use the official
`libpve-apiclient-perl` to test-drive the SUTs.
If we notice any emerging patterns, we could write additional helper
libs that reduce the amount of boilerplate in test scripts.
In essence, the test runner would do the following:
- Group testcases by fixture
- For every fixture:
- Instantiate needed templates from their backup snapshot
- Start VMs
- Run any specified `setup-hooks` (update system, deploy packages,
etc.)
- Take a snapshot, including RAM
- For every testcase using that fixture:
- Run testcase (execute test executable, check exit code)
- Roll back to the snapshot (iff `rollback = true` for that template)
- Destroy test instances (or at least those which are not needed by
other fixtures)
In the beginning, the test scripts would primarily drive the Systems
under Test (SUTs) via their API. However, the system would also offer
the flexibility for us to venture into the realm of automated GUI
testing at some point (e.g. using selenium) - without having to
change the overall test architecture.
## Mock Test Runner Config
Besides the actual test scripts, test writers would write test
configuration. Based on the current requirements and the approach that
I have chosen, an example config *could* look like the following.
These would likely be split into multiple files/folders
(e.g. to group test case definition and the test script logically).
```toml
[template.pve-default]
# Backup image to restore from, in this case this would be a previously
# set up PVE installation
restore = '...'
# Used to check if the node booted successfully; also made available to hook
# scripts, in case they need to SSH in to set up things.
ip = "10.0.0.1"
# Define credentials in a separate file - most templates could use a
# default password/SSH key/API token etc.
credentials = "default"
# Update to latest packages, install test .debs
# credentials are passed via env var
# Maybe this could also be ansible playbooks, if the need arises.
setup-hooks = [
"update.sh",
]
# Take snapshot after setup-hook, roll back after each test case
rollback = true
[template.ldap-server]
# Backup image to restore from
restore = '...'
credentials = "default"
ip = "10.0.0.3"
# No need to roll back in between test cases, there won't be any changes
rollback = false
# Example fixture. They can be used by multiple testcases.
[fixture.pve-with-ldap-server]
# Maybe one could specify additional setup-hooks here as well, in case
# one wants a 'per-fixture' setup? So that we can reduce the number of
# base images?
templates = [
'pve-default',
'ldap-server',
]
# testcases.toml (might be split to multiple files/folders?)
[testcase.test-ldap-realms]
fixture = 'pve-with-ldap-server'
# - return code is checked to determine test case success
# - stderr/stdout is captured for the final test report
# - some data is passed via env var:
# - name of the test case
# - template configuration (IPs, credentials, etc.)
# - ...
test-exec = './test-ldap-realms.pl'
# Consider test as failed if test script does not finish fast enough
test-timeout = 60
# Additional params for the test script, allowing for parameterized
# tests.
# Could also turn this into an array and loop over the values, in
# order to create multiple test cases from the same definition.
test-params = { foo = "bar" }
# Second test case, using the same fixture
[testcase.test-ldap-something-else]
fixture = 'pve-with-ldap-server'
test-exec = './test-ldap-something-else.pl'
```
--
- Lukas
* Re: [pve-devel] [RFC] towards automated integration testing
From: Stefan Hanreich @ 2023-10-16 11:20 UTC (permalink / raw)
To: Proxmox VE development discussion, Lukas Wagner
On 10/13/23 15:33, Lukas Wagner wrote:
> - Additionally, it should be easy to run these integration tests locally
> on a developer's workstation in order to write new test cases, as well
> as troubleshooting and debugging existing test cases. The local
> test environment should match the one being used for automated testing
> as closely as possible
This would also include sharing those fixture templates somewhere; do
you already have an idea on how to accomplish this? PBS sounds like a
good option for this if I'm not missing something.
> As a main mode of operation, the Systems under Test (SUTs)
> will be virtualized on top of a Proxmox VE node.
>
> This has the following benefits:
> - it is easy to create various test setups (fixtures), including but not
> limited to single Proxmox VE nodes, clusters, Backup servers and
> auxiliary services (e.g. an LDAP server for testing LDAP
> authentication)
I can imagine having to set up VMs inside the test setup as well for
doing various tests. Doing this manually every time could be quite
cumbersome / hard to automate. Do you have a mechanism in mind to deploy
VMs inside the test system as well? Again, PBS could be an interesting
option for this imo.
> In theory, the test runner would also be able to drive tests on real
> hardware, but of course with some limitations (harder to have a
> predictable, reproducible environment, etc.)
Maybe utilizing Aaron's installer for setting up those test systems
could at least produce somewhat identical setups? Although it is really
hard managing systems with different storage types, network cards, ... .
I've seen GitLab using tags for runners that specify certain
capabilities of systems. Maybe we could also introduce something like
that here for different bare-metal systems? E.g. a test case specifies
it needs a system with tag `ZFS` and then you can run / skip the
respective test case on that system. Managing those tags can introduce
quite a lot of churn though, so I'm not sure if this would be a good idea.
> The test script is executed by the test runner; the test outcome is
> determined by the exit code of the script. Test scripts could be written
Are you considering capturing output as well? That would make sense when
using assertions at least, so in case of failures developers have a
starting point for debugging.
Would it make sense to allow specifying an expected exit code for tests
that actually should fail - or do you consider this something that
should be handled by the test script?
I've refrained from talking about the TOML files too much since it's
probably too early to say something about that, but they look good so
far from my pov.
In general this sounds like quite the exciting feature and the RFC looks
very promising already.
Kind Regards
Stefan
* Re: [pve-devel] [RFC] towards automated integration testing
From: Thomas Lamprecht @ 2023-10-16 13:57 UTC (permalink / raw)
To: Proxmox VE development discussion, Lukas Wagner
A few things, most of which we already talked about off-list anyway.
We should eye if we can integrate existing regression testing in there
too, i.e.:
- The QEMU autotest that Stefan Reiter started and Fiona still uses.
Here we should drop the in-git-tracked backup that the test VM is
restored from (replace it with something like a vmdb2-managed [0] Debian
image that gets generated on demand), replace some hard-coded
configs with a simple config, and make it public.
[0]: https://vmdb2.liw.fi/
- The Selenium-based end-to-end tests, which we also use to generate most
screenshots (they can run headless too). Here we also need a few
clean-ups, but not that many, and we need to make the repo public.
On 13/10/2023 at 15:33, Lukas Wagner wrote:
> I am currently doing the groundwork that should eventually enable us
> to write automated integration tests for our products.
>
> Part of that endeavor will be to write a custom test runner, which will
> - setup a specified test environment
> - execute test cases in that environment
This should be decoupled from all else, so that I can run it on any
existing installation, bare-metal or not. This allows devs to use it in
their existing setups with almost no change required.
We can then also add it to our existing Buildbot instance
relatively easily, so it would be worth doing even if we might
deprecate Buildbot in the future (for what little it can do, it could
be simpler).
> - create some sort of test report
As Stefan mentioned, test-output can be good to have. Our buildbot
instance provides that, and while I don't look at it in 99% of the
builds, when I need to, it's worth *a lot*.
>
> ## Introduction
>
> The goal is to establish a framework that allows us to write
> automated integration tests for our products.
> These tests are intended to run in the following situations:
> - When new packages are uploaded to the staging repos (by triggering
> a test run from repoman, or similar)
*Debian repos, as we could also trigger some when git commits are
pushed, just like we do now through Buildbot. Doing so is IMO nice as it
will catch issues before a package is bumped, but it is still quite a bit
simpler to implement than an "apply patch from list to git repos" thing
from the next point, but could still act as a preparation for that.
> - Later, this tests could also be run when patch series are posted to
> our mailing lists. This requires a mechanism to automatically
> discover, fetch and build patches, which will be a separate,
> follow-up project.
>
> As a main mode of operation, the Systems under Test (SUTs)
> will be virtualized on top of a Proxmox VE node.
For the fully-automated test system this can be OK as primary mode, as
it indeed makes things like going back to an older software state much
easier.
But, if we decouple the test harness and running it from that more
automated system, we can also run the harness periodically on our
bare-metal test servers.
> ## Terminology
> - Template: A backup/VM template that can be instantiated by the test
> runner
I.e., the base of the test host? I'd call this test-host; template is a
bit too overloaded/generic and might focus too much on the virtual test
environment.
Or is this some part that takes place in the test, i.e., a
generalization of product to test and supplementary tool/app that helps
on that test?
Hmm, could work out ok, and we should be able to specialize stuff
relatively easily later too, if wanted.
> - Test Case: Some script/executable executed by the test runner, success
> is determined via exit code.
> - Fixture: Description of a test setup (e.g. which templates are needed,
> additional setup steps to run, etc.)
>
> ## Approach
> Test writers write template, fixture, test case definition in
> declarative configuration files (most likely TOML). The test case
> references a test executable/script, which performs the actual test.
>
> The test script is executed by the test runner; the test outcome is
> determined by the exit code of the script. Test scripts could be written
> in any language, e.g. they could be Perl scripts that use the official
> `libpve-apiclient-perl` to test-drive the SUTs.
> If we notice any emerging patterns, we could write additional helper
> libs that reduce the amount of boilerplate in test scripts.
>
> In essence, the test runner would do the following:
> - Group testcases by fixture
> - For every fixture:
> - Instantiate needed templates from their backup snapshot
Should be optional, possibly a default-on boolean option that conveys this.
> - Start VMs
Same.
> - Run any specified `setup-hooks` (update system, deploy packages,
> etc.)
Should be as idempotent as possible.
> - Take a snapshot, including RAM
Should be optional (as in, don't care if it cannot be done, e.g., on
bare metal).
> - For every testcase using that fixture:
> - Run testcase (execute test executable, check exit code)
> - Rollback to snapshot (iff `rollback = true` for that template)
> - destroy test instances (or at least those which are not needed by
> other fixtures)
Might be optional for L1 hosts; L2 test VMs might be a separate switch.
> In the beginning, the test scripts would primarily drive the Systems
> under Test (SUTs) via their API. However, the system would also offer
> the flexibility for us to venture into the realm of automated GUI
> testing at some point (e.g. using selenium) - without having to
> change the overall test architecture.
Our existing Selenium-based UI tests simply use the API to create the stuff
they need, if it's not already there, and sometimes also remove some of it.
It uses some special ranges or values to avoid most conflicts with real
systems, allowing one to point it at existing (production) systems
without problems.
IMO this has a big value, and I actually added a bit of resiliency, as I
find having to set up clean states a bit annoying and, for one of
the main use cases of that tooling, creating screenshots, too sterile.
But always starting out from a very clean state is IMO not only "ugly"
for screenshots, but can also sometimes mask issues that tests can run
into on systems with a longer uptime and the "organic mess" that comes
from long-term maintenance.
In practice one naturally wants both, starting from a clean state and
from an existing one; both have their advantages and disadvantages. For
example, messy systems might also produce more false positives in
regression tracking.
>
> ## Mock Test Runner Config
>
> Beside the actual test scripts, test writers would write test
> configuration. Based on the current requirements and approach that
> I have chose, a example config *could* look like the one following.
> These would likely be split into multiple files/folders
> (e.g. to group test case definition and the test script logically).
>
> ```toml
> [template.pve-default]
> # Backup image to restore from, in this case this would be a previously
> # set up PVE installation
> restore = '...'
> # To check if node is booted successfully, also made available to hook
> # scripts, in case they need to SSH in to setup things.
> ip = "10.0.0.1"
> # Define credentials in separate file - most template could use a
> # default password/SSH key/API token etc.
> credentials = "default"
> # Update to latest packages, install test .debs
> # credentials are passed via env var
> # Maybe this could also be ansible playbooks, if the need arises.
fwiw, one could also define a config-deployment-system, like
- none (already set up)
- cloudinit
- QGA
but that can be added later on too.
> setup-hooks = [
> "update.sh",
> ]
> # Take snapshot after setup-hook, roll back after each test case
> rollback = true
>
>
> [template.ldap-server]
> # Backup image to restore from
> restore = '...'
> credentials = "default"
> ip = "10.0.0.3"
> # No need to roll back in between test cases, there won't be any changes
> rollback = false
>
>
>
> # Example fixture. They can be used by multiple testcases.
> [fixture.pve-with-ldap-server]
> # Maybe one could specify additional setup-hooks here as well, in case
> # one wants a 'per-fixture' setup? So that we can reduce the number of
> # base images?
> templates = [
> 'pve-default',
> 'ldap-server',
> ]
>
>
> # testcases.toml (might be split to multiple files/folders?)
maybe some sort of predicates could also be nice (even if not there from
the start), i.e., a condition under which a test is skipped if it is
not met, like the existence of a ZFS storage or something like that.
While those seem like details, having a general (simple) dependency and,
so to speak, anti-dependency system might influence the overall design more.
> [testcase.test-ldap-realms]
> fixture = 'pve-with-ldap-server'
>
> # - return code is check to determine test case success
> # - stderr/stdout is captured for the final test report
> # - some data is passed via env var:
> # - name of the test case
> # - template configuration (IPs, credentials, etc.)
> # - ...
> test-exec = './test-ldap-realms.pl'
> # Consider test as failed if test script does not finish fast enough
> test-timeout = 60
> # Additional params for the test script, allowing for parameterized
> # tests.
> # Could also turn this into an array and loop over the values, in
> # order to create multiple test cases from the same definition.
> test-params = { foo = "bar" }
>
> # Second test case, using the same fixture
> [testcase.test-ldap-something-else]
> fixture = 'pve-with-ldap-server'
> test-exec = './test-ldap-something-else.pl'
>
> ```
>
Is the order of test-cases guaranteed by TOML parsing, or how are intra-
fixture dependencies ensured?
Anyway, the most important thing is to start out here, so I don't
want to block anything based on minorish stuff.
The most important thing for me is that the following parts are decoupled
and ideally shippable as a separate Debian package each:
- parts that manage automated testing, including how the test host
base system is set up (the latter could even be its own thing)
- running the tests themselves, including some helper modules/scripts
- the test definitions
As then we can run them anywhere easily and extend, or possibly even
rework some parts independently, if ever needed.
- Thomas
* Re: [pve-devel] [RFC] towards automated integration testing
From: Lukas Wagner @ 2023-10-16 15:18 UTC (permalink / raw)
To: Stefan Hanreich, Proxmox VE development discussion
Thank you for the feedback!
On 10/16/23 13:20, Stefan Hanreich wrote:
> On 10/13/23 15:33, Lukas Wagner wrote:
>
>> - Additionally, it should be easy to run these integration tests locally
>> on a developer's workstation in order to write new test cases, as well
>> as troubleshooting and debugging existing test cases. The local
>> test environment should match the one being used for automated testing
>> as closely as possible
> This would also include sharing those fixture templates somewhere, do
> you already have an idea on how to accomplish this? PBS sounds like a
> good option for this if I'm not missing something.
>
Yes, these templates could be stored on some shared storage, e.g. a PBS
instance, or they could also be distributed via a .deb/multiple .debs (not
sure if that is a good idea, since these would become huge pretty
quickly).
It could also be a two-step process: Use one command to get the
latest test templates, restoring them from a remote backup and converting
them to local VM templates. When executing tests, the test runner
could then use linked clones, speeding up the test setup time
quite a bit.
All in all, these templates that can be used in test fixtures should be:
- easily obtainable for developers, in order to have a fully
functional test setup on their workstation
- easily updateable (e.g. installing the latest packages, so that
the setup-hook does not need to fetch a boatload of packages every
time)
>> As a main mode of operation, the Systems under Test (SUTs)
>> will be virtualized on top of a Proxmox VE node.
>>
>> This has the following benefits:
>> - it is easy to create various test setups (fixtures), including but not
>> limited to single Proxmox VE nodes, clusters, Backup servers and
>> auxiliary services (e.g. an LDAP server for testing LDAP
>> authentication)
> I can imagine having to setup VMs inside the Test Setup as well for
> doing various tests. Doing this manually every time could be quite
> cumbersome / hard to automate. Do you have a mechanism in mind to deploy
> VMs inside the test system as well? Again, PBS could be an interesting
> option for this imo.
>
Several options come to mind. We could use a virtualized PBS instance
with a datastore containing the VM backup as part of the fixture.
We could use some external backup store (so the same 'source' as for the
templates themselves) - however that means that the systems under test
must have network access to that.
We could also think about using iPXE to boot test VMs, with the
boot image either provided by some template from the fixture, or by
some external server.
For both approaches, the 'as part of the fixture' variants seem a bit
nicer, as they are more self-contained.
Also, the vmdb2 thingy that Thomas mentioned might be interesting for
this - I've only glanced at it so far though.
As of now it seems that this question will not influence the design
of the test runner much, so it can probably be postponed to a later
stage.
>> In theory, the test runner would also be able to drive tests on real
>> hardware, but of course with some limitations (harder to have a
>> predictable, reproducible environment, etc.)
>
> Maybe utilizing Aaron's installer for setting up those test systems
> could at least produce somewhat identical setups? Although it is really
> hard managing systems with different storage types, network cards, ... .
In general my biggest concern with 'bare-metal' tests - and to be
precise, that does not really have anything to do with being
'bare-metal', but more with testing on something that is harder to roll
back into
a clean state that can be used for the next test execution, is that
I'm afraid that a setup like this could become quite brittle and a
maintenance burden. At some point, a test execution might leave
something in an unclean state (e.g. due to a crashed test or missing
something during cleanup), tripping up the following test job.
As an example from personal experience: One test run
might test new packages which introduce a new flag in a configuration
file. If that flag is not cleaned up afterwards, another test job
testing other packages might fail because it now has to
deal with an 'unknown' configuration key.
Maybe ZFS snapshots could help with that, but I'm not sure how that
would work in practice (e.g. due to the kernel being stored on
the EFI partition).
The automated installer *could* certainly help here - however,
right now I don't want to extend the scope of this project too much.
Also, there is the question of whether the installation should be refreshed
after every single test run, increasing the test cycle time/resource
consumption quite a bit? Or only if 'something' breaks?
That being said, it might also make sense to be able to run the tests
(or more likely, a subset of them, since some will inherently
require a fixture) against an arbitrary PVE instance that is under full
control of a developer (e.g. a development VM, or, if feeling
adventurous, the workstation itself). If this is possible, then these
tests could be the fastest way to get feedback while developing, since
there is no need to instantiate a template, update, deploy, etc.
In this case, the test runner's job would only be to run the test
scripts, without managing fixtures/etc, and then reporting the results
back to the developer.
Essentially, as Thomas already mentioned, one approach to do this would
be to decouple the 'fixture setup' and 'test case execution' parts as
much as possible. How that will look in practice will be part of
further research.
> I've seen GitLab using tags for runners that specify certain
> capabilities of systems. Maybe we could also introduce something like
> that here for different bare-metal systems? E.g. a test case specifies
> it needs a system with tag `ZFS` and then you can run / skip the
> respective test case on that system. Managing those tags can introduce
> quite a lot of churn though, so I'm not sure if this would be a good idea.
>
I have thought about a tag system as well - not necessarily for test
runners, but for test cases. E.g. you could tag tests for the
authentication system with 'auth' - because at least for the local
development cycle it might not make much sense to run tests for
clusters, ceph, etc. while working on the authentication system.
The 'tags' to be executed might then be simply passed to the test
runner.
These tags could also be used to mark the subset of 'simple'
test cases that don't need a special test fixture, as described above...
This could also be extended to a full 'predicate-like' system as Thomas
described.
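For illustration, a small sketch in the mock config format of how such
tags - and, later, predicate-like conditions - could be attached to test
cases; the `tags` and `requires` keys, the fixture names and the runner
flag are hypothetical:
```toml
# Hypothetical `tags` key: used to select a subset of tests, e.g. by
# passing something like `--tags auth` to the test runner.
[testcase.test-ldap-realms]
fixture = 'pve-with-ldap-server'
test-exec = './test-ldap-realms.pl'
tags = ['auth', 'ldap']

# Hypothetical `requires` key: a predicate that lets the runner skip
# the test if the target system does not fulfill it.
[testcase.test-zfs-snapshot-rollback]
fixture = 'pve-basic'
test-exec = './test-zfs-snapshot-rollback.pl'
tags = ['storage']
requires = ['storage-type=zfs']
```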
>> The test script is executed by the test runner; the test outcome is
>> determined by the exit code of the script. Test scripts could be written
> Are you considering capturing output as well? That would make sense when
> using assertions at least, so in case of failures developers have a
> starting point for debugging.
Yup, I'd capture stdout/stderr from all test executables/scripts and
include it in the final test report.
Test output is indeed very useful when determining *why* something went
wrong.
>
> Would it make sense to allow specifying a expected exit code for tests
> that actually should fail - or do you consider this something that
> should be handled by the test script?
I guess that's a matter of taste. Personally I'd keep the contract
between test runner and test script simple and say 0 == success,
everything else is a failure. If there are any test cases that
expect a failure of some API call, then the script should 'translate'
the exit code.
If we discover that specifying an expected exit code actually makes things
easier for us, then adding it should be rather trivial - and easier
than ripping it out the other way round.
> I've refrained from talking about the toml files too much since it's
> probably too early to say something about that, but they look good so
> far from my pov.
>
> In general this sounds like quite the exciting feature and the RFC looks
> very promising already.
Thanks for your feedback!
>
> Kind Regards
> Stefan
--
- Lukas
* Re: [pve-devel] [RFC] towards automated integration testing
From: Lukas Wagner @ 2023-10-16 15:33 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
Thanks for the summary of our discussion and the additional feedback!
On 10/16/23 15:57, Thomas Lamprecht wrote:
>> - create some sort of test report
>
> As Stefan mentioned, test-output can be good to have. Our buildbot
> instance provides that, and while I don't look at them in 99% of the
> builds, when I need to its worth *a lot*.
>
Agreed, test output is always valuable and will definitely be captured.
>>
>> ## Introduction
>>
>> The goal is to establish a framework that allows us to write
>> automated integration tests for our products.
>> These tests are intended to run in the following situations:
>> - When new packages are uploaded to the staging repos (by triggering
>> a test run from repoman, or similar)
>
> *debian repos, as we could also trigger some when git commits are
> pushed, just like we do now through Buildbot. Doing so is IMO nice as it
> will catch issues before a package was bumped, but is still quite a bit
> simpler to implement than an "apply patch from list to git repos" thing
> from the next point, but could still act as a preparation for that.
>
>> - Later, this tests could also be run when patch series are posted to
>> our mailing lists. This requires a mechanism to automatically
>> discover, fetch and build patches, which will be a separate,
>> follow-up project.
>
>>
>> As a main mode of operation, the Systems under Test (SUTs)
>> will be virtualized on top of a Proxmox VE node.
>
> For the fully-automated test system this can be OK as primary mode, as
> it indeed makes things like going back to an older software state much
> easier.
>
> But, if we decouple the test harness and running them from that more
> automated system, we can also run the harness periodically on our
> bare-metal test servers.
>
>> ## Terminology
>> - Template: A backup/VM template that can be instantiated by the test
>> runner
>
> I.e., the base of the test host? I'd call this test-host, template is a
> bit to overloaded/generic and might focus too much on the virtual test
> environment.
True, 'template' is a bit overloaded.
>
> Or is this some part that takes place in the test, i.e., a
> generalization of product to test and supplementary tool/app that helps
> on that test?
It was intended to be a 'general VM/CT base thingy' that can be
instantiated and managed by the test runner, so either a PVE/PBS/PMG
base installation, or some auxiliary resource, e.g. a Debian VM with
an already-set-up LDAP server.
I'll see if I can find good terms with the newly added focus on
bare-metal testing / the decoupling between environment setup and test
execution.
> Is the order of test-cases guaranteed by toml parsing, or how are intra-
> fixture dependencies ensured?
>
Good point. With rollbacks in between test cases it probably does not
matter much, but on 'real hardware' with no rollback this could
definitely be a concern.
A super simple thing that could just work fine is ordering test
execution by testcase-names, sorted alphabetically. Ideally you'd write
test cases that do not depend on each other in any way, and *if* you ever
find yourself in the situation where you *need* some ordering, you could
just encode the order in the test-case name by adding an integer prefix
- similar to how you would name config files in /etc/sysctl.d/*, for
instance.
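For illustration, such an encoded ordering could look like this (the
test case names are made up; the runner would simply sort them
alphabetically before execution):
```toml
# Runs first: creates the LDAP realm.
[testcase.01-create-ldap-realm]
fixture = 'pve-with-ldap-server'
test-exec = './create-ldap-realm.pl'

# Runs second: syncs users and groups from the realm created above.
[testcase.02-sync-ldap-realm]
fixture = 'pve-with-ldap-server'
test-exec = './sync-ldap-realm.pl'
```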
--
- Lukas
* Re: [pve-devel] [RFC] towards automated integration testing
From: Thomas Lamprecht @ 2023-10-17 6:35 UTC (permalink / raw)
To: Lukas Wagner, Proxmox VE development discussion
On 16/10/2023 at 17:33, Lukas Wagner wrote:
>> Or is this some part that takes place in the test, i.e., a
>> generalization of product to test and supplementary tool/app that helps
>> on that test?
>
> It was intended to be a 'general VM/CT base thingy' that can be
> instantiated and managed by the test runner, so either a PVE/PBS/PMG
> base installation, or some auxiliary resource, e.g. a Debian VM with
> an already-set-up LDAP server.
>
> I'll see if I can find good terms with the newly added focus on
> bare-metal testing / the decoupling between environment setup and test
> execution.
Hmm, yeah OK, having some additional info on top of "template", like
e.g. "system-template" or "app-template", could already be slightly
better then.
While slightly a detail, it is IMO still important for the overall future
direction: I'd possibly split "restore" into "source-type" and "source",
where the "source-type" can be e.g., "disk-image" for a qcow2 or the
like to work directly on, or "backup-image" for your backup restore
process, or some type for bootstrap tools like debootstrap or the VM
specific vmdb2.
Also, having re-use configurable, i.e., whether the app-template instance
is destroyed after some test run is done. For that, write some simple
info mapping instantiated templates to other identifiers (VMID,
IP, ...) to e.g. /var/cache/ (or some XDG_ directory to also cater to
any users running this as non-root).
Again, can be classified as details, but IMO important for the
direction this is going, and not too much work, so should be at least
on the radar.
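A rough sketch of how such a split and the re-use switch could look in
the config format from the RFC - all keys shown here are tentative
suggestions, not part of the original mock config:
```toml
[template.pve-default]
# "backup-image" restores from a backup, "disk-image" works directly on
# a qcow2, and further types could wrap bootstrap tools such as
# debootstrap or vmdb2.
source-type = "backup-image"
source = '...'
credentials = "default"
ip = "10.0.0.1"
# Whether the instantiated template is destroyed after the test run or
# kept (and recorded in e.g. /var/cache/) for re-use by later runs.
reuse-instance = true
```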
>> Is the order of test-cases guaranteed by toml parsing, or how are intra-
>> fixture dependencies ensured?
>>
>
> Good point. With rollbacks in between test cases it probably does not
> matter much, but on 'real hardware' with no rollback this could
> definitely be a concern.
> A super simple thing that could just work fine is ordering test
> execution by testcase-names, sorted alphabetically. Ideally you'd write
> test cases that do not depend on each other any way, and *if* you ever
> find yourself in the situation where you *need* some ordering, you
> could> just encode the order in the test-case name by adding an integer
> prefix> - similar how you would name config files in /etc/sysctl.d/*,
> for instance.
While it can be OK to leave that for later, encoding such things
in names is IMO brittle and hard to manage with more than a handful
of tests, and we hopefully get lots more ;-)
Off the top of my head, I'd rather do some attribute-based dependency
annotation, so that one can make single tests, or a whole fixture, depend
on other single tests or a whole fixture.
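For illustration, such an annotation could look roughly like this - the
`depends-on` key, the fixture and the test case names are invented, and
the exact semantics would still need to be defined:
```toml
[testcase.test-guest-migration]
fixture = 'pve-cluster'
test-exec = './test-guest-migration.pl'
# Hypothetical: only run this test if the cluster-join test (or,
# alternatively, a whole referenced fixture) completed successfully.
depends-on = ['testcase.test-cluster-join']
```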
* Re: [pve-devel] [RFC] towards automated integration testing
From: Thomas Lamprecht @ 2023-10-17 7:34 UTC (permalink / raw)
To: Proxmox VE development discussion, Lukas Wagner, Stefan Hanreich
On 16/10/2023 at 17:18, Lukas Wagner wrote:
> On 10/16/23 13:20, Stefan Hanreich wrote:
>> I can imagine having to setup VMs inside the Test Setup as well for
>> doing various tests. Doing this manually every time could be quite
>> cumbersome / hard to automate. Do you have a mechanism in mind to
>> deploy VMs inside the test system as well? Again, PBS could be an
>> interesting option for this imo.
>>
> Several options come to mind. We could use a virtualized PBS instance
> with a datastore containing the VM backup as part of the fixture. We
> could use some external backup store (so the same 'source' as for the
> templates themselves) - however that means that the systems under test
> must have network access to that. We could also think about using
> iPXE to boot test VMs, with the boot image either be provided by some
> template from the fixture, or by some external server. For both
> approaches, the 'as part of the fixture' approaches seem a bit nicer,
> as they are more self-contained.
What about the following approach:
The tests state that they need one or more VMs with certain properties,
i.e., something like "none" (don't care), "ostype=win*", "memory>=10G"
or the like (we can start out easy w.r.t. supported comparison features;
as long as the base system is there it can be extended relatively easily
later on).
Then, on a test run, first all those asset-dependencies are
collected. Then, depending on further config, they can either get newly
created or be selected from existing candidates on the target test-host
system.
In general the test system can add a specific tag (like "test-asset") to
such virtual guests by default, and also add that as an implicit property
condition (if no explicit tag-condition is already present) when
searching for existing assets. This way one can re-use guests, be it
because they exist anyway on a bare-metal system that won't get rolled
back, or in some virtual system that gets rolled back to a state that
already has the virtual-guest test-assets configured, which can also
reduce the time required to set up a clean environment by a lot,
benefiting both use cases.
Extra config and/or command line knobs can then force re-creation of
all, or some assets of, a test, or change the base search path for
images. Here it's probably enough to have some simpler, definitely
wanted knobs that provide the core infra for how to add other, maybe
more complex knobs in the future more easily (creating new things is IMO
always harder than extending existing ones, at least if non-trivial).
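A minimal sketch of how a test could declare such guest-asset
dependencies in the config - the table name and the property syntax are
invented for illustration:
```toml
[testcase.test-windows-backup]
fixture = 'pve-basic'
test-exec = './test-windows-backup.pl'

# Hypothetical: guest assets this test needs; the runner either selects
# matching existing guests (implicitly filtered by the "test-asset"
# tag) or creates new ones.
[[testcase.test-windows-backup.requires-guest]]
ostype = "win*"
memory = ">=10G"
```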
> Also, the vmbd2 thingy that thomas mentioned might be interesting for
Because I stumbled upon it today: systemd's mkosi tool could also be
interesting here:
https://github.com/systemd/mkosi
https://github.com/systemd/mkosi/blob/main/mkosi/resources/mkosi.md
> this - i've only glanced at it so far though.
>
> As of now it seems that this question will not influence the design of
> the test runner much, so it can probably be postponed to a later
> stage.
Not of the runner itself, but of all the setup stuff for it, so I'd at
least try to keep it in mind – the above features might not be that much
work, but would create lots of flexibility, allowing devs to use it more
easily for declarative reproduction attempts of bugs too. At least I see
it as a big
mental roadblock if I have to set up specific environments for using
such tools, and cannot just re-use my existing ones 1:1.
>
>>> In theory, the test runner would also be able to drive tests on real
>>> hardware, but of course with some limitations (harder to have a
>>> predictable, reproducible environment, etc.)
>>
>> Maybe utilizing Aaron's installer for setting up those test systems
>> could at least produce somewhat identical setups? Although it is
>> really hard managing systems with different storage types, network
>> cards, ... .
>
> In general my biggest concern with 'bare-metal' tests - and to
> precise, that does not really have anything to do with being
> 'bare-metal', more about testing on something that is harder roll back
> into a clean state that can be used for the next test execution, is
> that I'm afraid that a setup like this could become quite brittle and
> a maintenance burden
I don't see that as an issue, just as two separate things: one is
regression testing in clean states, where we can turn up reporting of
test failures to the max, and the other is integration testing, where we
don't report widely but only allow some way to see a list of issues for
admins to decide.
Bugs in the test system or configuration issues breaking idempotency
assumptions can then be fixed, and other issues that are not visible in
those clean-room tests can become visible. I see no reason why both
cannot co-exist and have equivalent priority/focus.
New tests can be checked for basic idempotency by running them twice,
with the second run not doing any rollback.
>> I've seen GitLab using tags for runners that specify certain
>> capabilities of systems. Maybe we could also introduce something like
>> that here for different bare-metal systems? E.g. a test case
>> specifies it needs a system with tag `ZFS` and then you can run /
>> skip the respective test case on that system. Managing those tags can
>> introduce quite a lot of churn though, so I'm not sure if this would
>> be a good idea.
>
> I have thought about a tag system as well - not necessarily for test
> runners, but for test cases. E.g. you could tag tests for the
> authentication system with 'auth' - because at least for the local
> development cycle it might not make much sense to run tests for
> clusters, ceph, etc. while working on the authentication system.
Yes, I thought about something like that too, a known set of tags (i.e.,
a centrally managed set; bail, or at least warn, if a test uses an
unknown one) – having test runs be filtered by their use classes, like
"migration" or "windows" or your "auth" example, would definitely be
nice.
>>> The test script is executed by the test runner; the test outcome is
>>> determined by the exit code of the script. Test scripts could be
>>> written
>> Are you considering capturing output as well? That would make sense
>> when using assertions at least, so in case of failures developers
>> have a starting point for debugging.
> Yup, I'd capture stdout/stderr from all test executables/scripts and
> include it in the final test report.
I guess there would be an (optional) notification to a set of addresses,
passed to the test system via CLI/Config by the tester (human on manual
tests or derived from changes and maintainers for automated tests), and
that would only have a summary and link/point to the full report that
provides the longer outputs of the test harness and possibly system logs.
> Test output is indeed very useful when determining *why* something
> went wrong.
The journal (journalctl) of all nodes that took part in a test might be useful too.
>> Would it make sense to allow specifying a expected exit code for
>> tests that actually should fail - or do you consider this something
>> that should be handled by the test script?
>
> I guess that's a matter of taste. Personally I'd keep the contract
> between test runner and test script simple and say 0 == success,
> everything else is a failure. If there are any test cases that expect
> a failure of some API call, then the script should 'translate' the
> exit code.
W.r.t. the exit code I find that fine, but maybe we want to allow
passing a more formal result text back. We can always extend this in the
future by using some special files that the test script writes to, or
something like that; here, starting out with simply checking the exit
code seems fine enough to me.
* Re: [pve-devel] [RFC] towards automated integration testing
From: Lukas Wagner @ 2023-10-17 12:33 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
On 10/17/23 08:35, Thomas Lamprecht wrote:
>>> Is the order of test-cases guaranteed by toml parsing, or how are intra-
>>> fixture dependencies ensured?
>>>
>>
>> Good point. With rollbacks in between test cases it probably does not
>> matter much, but on 'real hardware' with no rollback this could
>> definitely be a concern.
>> A super simple thing that could just work fine is ordering test
>> execution by testcase-names, sorted alphabetically. Ideally you'd write
>> test cases that do not depend on each other any way, and *if* you ever
>> find yourself in the situation where you *need* some ordering, you
>> could> just encode the order in the test-case name by adding an integer
>> prefix> - similar how you would name config files in /etc/sysctl.d/*,
>> for instance.
>
>
> While it can be OK to leave that for later, encoding such things
> in names is IMO brittle and hard to manage if more than a handful
> of tests, and we hopefully got lots more ;-)
>
>
> From top of my head I'd rather do some attribute based dependency
> annotation, so that one can depend on single tests, or whole fixture
> on others single tests or whole fixture.
>
The more thought I spend on it, the more I believe that inter-testcase
deps should be avoided as much as possible. In unit testing, (hidden)
dependencies between tests are in my experience the no. 1 cause of
flaky tests, and I see no reason why this would not also apply to
end-to-end integration testing.
I'd suggest to only allow test cases to depend on fixtures. The fixtures
themselves could have setup/teardown hooks that allow setting up and
cleaning up a test scenario. If needed, we could also have something
like 'fixture inheritance', where a fixture can 'extend' another,
supplying additional setup/teardown.
Example: the 'outermost' or 'parent' fixture might define that we
want a 'basic PVE installation' with the latest .debs deployed,
while another fixture that inherits from that one might set up a
storage of a certain type, useful for all tests that require that
specific type of storage.
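A minimal sketch of how that could look, assuming a hypothetical
`extends` key and per-fixture hooks (none of which are part of the mock
config yet):
```toml
[fixture.pve-base]
templates = ['pve-default']
# Hypothetical per-fixture hooks, run once before/after the test cases
# that use this fixture.
setup-hooks = ['deploy-latest-debs.sh']
teardown-hooks = ['collect-logs.sh']

[fixture.pve-with-zfs-storage]
# Hypothetical: inherit templates and hooks from 'pve-base' and add
# storage-specific setup on top.
extends = 'pve-base'
setup-hooks = ['create-zfs-storage.sh']
```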
On the other hand, instead of inheritance, a 'role/trait'-based system
might also work (composition >>> inheritance, after all) - and
maybe that also aligns better with the 'properties' mentioned in
your other mail (I mean this here: "ostype=win*", "memory>=10G").
This is essentially a very similar pattern to the one used in numerous other testing
frameworks (xUnit, pytest, etc.); I think it makes sense to
build upon this battle-proven approach.
Regarding execution order, I'd now even suggest the polar opposite of my
prior idea. Instead of enforcing some execution order, we could also
actively shuffle execution order from run to run, at least for tests
using the same fixture.
The seed used for the RNG should be put into the test
report and could also be provided via a flag to the test runner, in case
we need to repeat a specific test sequence.
In that way, the runner would actively help us to hunt down
hidden inter-TC deps, making our test suite hopefully less brittle and
more robust in the long term.
Anyway, lots of details to figure out. Thanks again for your input.
--
- Lukas
* Re: [pve-devel] [RFC] towards automated integration testing
From: Thomas Lamprecht @ 2023-10-17 16:28 UTC (permalink / raw)
To: Lukas Wagner, Proxmox VE development discussion
On 17/10/2023 at 14:33, Lukas Wagner wrote:
> On 10/17/23 08:35, Thomas Lamprecht wrote:
>> From top of my head I'd rather do some attribute based dependency
>> annotation, so that one can depend on single tests, or whole fixture
>> on others single tests or whole fixture.
>>
>
> The more thought I spend on it, the more I believe that inter-testcase
> deps should be avoided as much as possible. In unit testing, (hidden)
We don't plan unit testing here though, and the dependencies I proposed
are the contrary of hidden - rather, explicitly annotated ones.
> dependencies between tests are in my experience the no. 1 cause of
> flaky tests, and I see no reason why this would not also apply for
> end-to-end integration testing.
Any source on that being the no. 1 source of flaky tests? IMO that
should not make any difference, in the end you just allow better
reuse through composition of other tests (e.g., migration builds
upon clustering *set up*, not tests, if I just want to run
migration I can do clustering setup without executing its tests).
Not providing that could also mean that one has to move all logic
into the test script, resulting in a single test per "fixture", reducing
granularity and parallelism of some running tests.
I also think that
> I'd suggest to only allow test cases to depend on fixtures. The fixtures
> themselves could have setup/teardown hooks that allow setting up and
> cleaning up a test scenario. If needed, we could also have something
> like 'fixture inheritance', where a fixture can 'extend' another,
> supplying additional setup/teardown.
> Example: the 'outermost' or 'parent' fixture might define that we
> want a 'basic PVE installation' with the latest .debs deployed,
> while another fixture that inherits from that one might set up a
> storage of a certain type, useful for all tests that require specific
> that type of storage.
Maybe our disagreement stems mostly from different design pictures in
our heads. I am probably a bit less fixed (heh) on the fixtures, or at
least on the naming of that term, and might use test system, or intra-test
system, where for your design plan fixture would be the better word.
> On the other hand, instead of inheritance, a 'role/trait'-based system
> might also work (composition >>> inheritance, after all) - and
> maybe that also aligns better with the 'properties' mentioned in
> your other mail (I mean this here: "ostype=win*", "memory>=10G").
>
> This is essentially a very similar pattern as in numerous other testing
> frameworks (xUnit, pytest, etc.); I think it makes sense to
> build upon this battle-proven approach.
Those are all unit testing tools though, which we already use in the
sources, and IIRC they do not really provide what we need here.
While starting out simple(r) and avoiding too much complexity
certainly has its merits, I don't think we should try to draw/align
too many parallels with those tools here for us.
>
> Regarding execution order, I'd now even suggest the polar opposite of my
> prior idea. Instead of enforcing some execution order, we could also
> actively shuffle execution order from run to run, at least for tests
> using the same fixture.
> The seed used for the RNG should be put into the test
> report and could also be provided via a flag to the test runner, in case
> we need to repeat a specific test sequence .
Hmm, this also has a chance to make tests flaky and get a bit annoying,
like Perl's hash scrambling, but it's not a bad idea. I'd just not do that
by default on the "armed" test system that builds on package/git/patch
updates, but possibly in addition with reporting turned off, like the
double tests for idempotency-checking I wrote about in my previous
message.
> In that way, the runner would actively help us to hunt down
> hidden inter-TC deps, making our test suite hopefully less brittle and
> more robust in the long term.
Agree, but as mentioned above I'd not enable it by default on the dev
facing automated systems, but possibly for manual runs from devs and
a separate "test-test-system" ^^
In summary, the most important point for me is a test system decoupled
from the automation system that can manage it, ideally such that I can
decide relatively flexibly on manual runs. IMO that should not be too
much work, and it guarantees clean-cut APIs from which future
development, or integration, surely will benefit too.
The rest is possibly hard to determine clearly at this stage, as it's easy
(at least for me) to get lost in different understandings of terms and
design perception, but hard to convey those very clearly about "pipe dreams",
so at this stage I'll cease adding discussion churn until there's something
more concrete that I can grasp on my terms (through reading/writing code),
but that should not deter others from still giving input at this stage.
Thanks for your work on this.
- Thomas
* Re: [pve-devel] [RFC] towards automated integration testing
From: Lukas Wagner @ 2023-10-18 8:43 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
On 10/17/23 18:28, Thomas Lamprecht wrote:
> On 17/10/2023 at 14:33, Lukas Wagner wrote:
>> On 10/17/23 08:35, Thomas Lamprecht wrote:
>>> From top of my head I'd rather do some attribute based dependency
>>> annotation, so that one can depend on single tests, or whole fixture
>>> on others single tests or whole fixture.
>>>
>>
>> The more thought I spend on it, the more I believe that inter-testcase
>> deps should be avoided as much as possible. In unit testing, (hidden)
>
> We don't plan unit testing here though and the dependencies I proposed
> are the contrary from hidden, rather explicit annotated ones.
>
>> dependencies between tests are in my experience the no. 1 cause of
>> flaky tests, and I see no reason why this would not also apply for
>> end-to-end integration testing.
>
> Any source on that being the no 1 source of flaky tests? IMO that
> should not make any difference, in the end you just allow better
Of course I don't have bullet-proof evidence for the 'no. 1' claim;
it's just my personal experience, which comes partly from a former job
(where I was coincidentally also responsible for setting up automated
testing ;) - there it was for a firmware project), partly from the work
I did for my master's thesis (which was also in the broader area of
software testing).
I would say it's just the consequence of having multiple test cases
manipulating a shared, stateful entity, be it directly or indirectly
via side effects. Things get of course even more difficult and messy if
concurrent test execution enters the picture ;)
> reuse through composition of other tests (e.g., migration builds
> upon clustering *set up*, not tests, if I just want to run
> migration I can do clustering setup without executing its tests).
> > Not providing that could also mean that one has to move all logic
> in the test-script, resulting in a single test per "fixture", reducing
> granularity and parallelity of some running tests.
>
> I also think that
>
>> I'd suggest to only allow test cases to depend on fixtures. The fixtures
>> themselves could have setup/teardown hooks that allow setting up and
>> cleaning up a test scenario. If needed, we could also have something
>> like 'fixture inheritance', where a fixture can 'extend' another,
>> supplying additional setup/teardown.
>> Example: the 'outermost' or 'parent' fixture might define that we
>> want a 'basic PVE installation' with the latest .debs deployed,
>> while another fixture that inherits from that one might set up a
>> storage of a certain type, useful for all tests that require specific
>> that type of storage.
>
> Maybe our disagreement stems mostly from different design pictures in
> our head, I probably am a bit less fixed (heh) on the fixtures, or at
> least the naming of that term and might use test system, or intra test
> system when for your design plan fixture would be the better word.
I think it's mostly a terminology problem. In my previous definition of
'fixture' I was maybe too fixated (heh) on it being 'the test
infrastructure/VMs that must be set up/instantiated'. Maybe it helps
to think about it more generally as 'common setup/cleanup steps for a
set of test cases', which *might* include setting up test infra (although
I have not figured out a good way how that would be modeled with the
desired decoupling between test runner and test-VM-setup-thingy).
>
>> On the other hand, instead of inheritance, a 'role/trait'-based system
>> might also work (composition >>> inheritance, after all) - and
>> maybe that also aligns better with the 'properties' mentioned in
>> your other mail (I mean this here: "ostype=win*", "memory>=10G").
>>
>> This is essentially a very similar pattern as in numerous other testing
>> frameworks (xUnit, pytest, etc.); I think it makes sense to
>> build upon this battle-proven approach.
>
> Those are all unit testing tools though that we do already in the
> sources and IIRC those do not really provide what we need here.
> While starting out simple(r) and avoiding too much complexity has
> certainly it's merits, I don't think we should try to draw/align
> too many parallels with those tools here for us.
> >
> In summary, the most important points for me is a decoupled test-system
> from the automation system that can manage it, ideally such that I can
> decide relatively flexible on manual runs, IMO that should not be to much
> work and it guarantees for clean cut APIs from which future development,
> or integration surely will benefit too.
>
> The rest is possibly hard to determine clearly on this stage, as it's easy
> (at least for me) to get lost in different understandings of terms and
> design perception, but hard to convey those very clearly about "pipe dreams",
> so at this stage I'll cede to add discussion churn until there's something
> more concrete that I can grasp on my terms (through reading/writing code),
> but that should not deter others from giving input still while at this stage.
Agreed.
I think we agree on the most important requirements/aspects of this
project and that's a good foundation for my upcoming efforts.
At this point, the best move forward for me is to start experimenting
with some ideas and begin the actual implementation.
When I have something concrete to show, be it a prototype or some
sort of minimum viable product, it will be much easier to discuss
any further details and design aspects.
Thanks!
--
- Lukas