* [pve-devel] proxmox French days conference feedback
From: DERUMIER, Alexandre @ 2022-06-23  8:43 UTC
To: pve-devel

Hi,

Two weeks ago we organised two days of Proxmox VE/Ceph conferences at
Clermont-Ferrand University in France.

It was organised by the university and the CNRS (the French national
centre for scientific research).

Proxmox is widely used in these scientific departments in France, and
people came from all over the country: 70 people on site and 300 on
the stream.

The purpose of the conference was to exchange Proxmox experience and to
show the French government and other public departments that Proxmox VE
is a viable, working solution for virtualisation.
(There is still a lot of VMware lobbying, but with the upcoming Broadcom
acquisition and lower budgets, a lot of migrations are planned.)

Video replays are available here (in French only, sorry):

https://indico.mathrice.fr/event/327/

So the conference was a success.
The overall experience with Proxmox VE/PBS is really good.
Nobody had a serious problem with PVE/PBS.

Some are coming from OpenStack (too big, too complex to manage).
Some others are coming from VMware.
Some others have been using Proxmox since 0.9 ;)
There was some feedback on experience with Ceph too (with some problems).

The most requested missing features are probably:

- cross-cluster management + replication/disaster recovery. (They have
  a lot of dual-room / dual-site setups, but 3 sites to keep quorum are
  not always possible.)

- a DRS feature like VMware's for VM balancing
  (I'm still working on it; I'll try to have a working version after
  this summer. I still need the pending VM pressure stats patches to be
  applied first ;)

- Pool quota (restrict the total mem/CPU/disk allocated to all VMs in a
  pool), as they have some clusters shared between different
  departments / students / ...

A big thanks to Daniela for the T-shirts and other goodies!

Also, another recurring question:

"Why is the Proxmox team not more present at events/conferences
like FOSDEM, ...?"

Yes, we would like to drink some beers with you guys ;)

Regards,

Alexandre

* Re: [pve-devel] proxmox French days conference feedback
From: Thomas Lamprecht @ 2022-06-23 10:59 UTC
To: Proxmox VE development discussion, DERUMIER, Alexandre

Hi,

On 23/06/2022 at 10:43, DERUMIER, Alexandre wrote:
> So the conference was a success.
> The overall experience with Proxmox VE/PBS is really good.
> Nobody had a serious problem with PVE/PBS.
>
> Some are coming from OpenStack (too big, too complex to manage).
> Some others are coming from VMware.
> Some others have been using Proxmox since 0.9 😉
> There was some feedback on experience with Ceph too (with some problems).
>

Thanks for the feedback and talking about Proxmox projects!

> The most requested missing features are probably:
>
> - cross-cluster management + replication/disaster recovery. (They have
>   a lot of dual-room / dual-site setups, but 3 sites to keep quorum are
>   not always possible.)

That's in the pipeline. Cross-cluster migration is not that far off, and
Fabian should soon be able to pick that up (he is currently working on
some infrastructure projects to make air-gapped offline updates possible
in a relatively easy and integrated way); that's one of the biggest
prerequisites left.

>
> - a DRS feature like VMware's for VM balancing
>   (I'm still working on it; I'll try to have a working version after
>   this summer. I still need the pending VM pressure stats patches to be
>   applied first 😉

I'd really like to work on this more actively from our side too. I
thought about 7.3 feature planning a bit yesterday and wrote down a few
(rough) key points for tackling this, using the algorithm and rough
direction you already worked on in your proofs of concept (thx!):

- [ ] Static (and later Dynamic) Resource Scheduling (S/DRS)
    - [ ] Coordinate with Alexandre as he's working partly on that too, but
          we may want to use a bit of a different design and/or feature set
          (at least initially) and integration timeline
    - [ ] Check out TOPSIS more closely and implement the relevant parts in
          Rust to expose via perlmod; that's then fast, and correctness is
          much easier to reason about in a static, safety-focused language
          like Rust.
    - [ ] Make basic static resource capacity like CPU (# of sockets, cores
          and hyper-threads) and memory available to other cluster nodes
          (for example, via kv_broadcast after (re)start of pve-cluster)
    - [ ] Add infrastructure to use that static (!) information for
          balancing out the cluster.
        - [ ] spit out a list of actions that would result in a balanced
              cluster: migrate guest A to node X, migrate guest B to node Y
        - [ ] use that for creating a simulation and regression testing
              system in the spirit of the ha-manager's simulation and
              regression test system, but as an independent test &
              executable
    - [ ] integrate in HA; due to the static-ness and a safe-and-slow
          integration, it should at first only be done on recovery, for
          better balancing out.
    - [ ] add API support for creating a CT/VM on the best-fitting node,
          i.e., the least used one
    - [ ] make the balancing algo available for non-HA too, allow a
          cluster-wide manual re-balance (e.g., with an action proposal
          shown to the user for confirmation)
    - [ ] Extend with dynamic information like IO/memory/CPU pressure
    - [ ] Finally: allow opting in to periodic auto-balancing for HA managed

IMO the semi-static resource availability and usage would be nice in
general as a first step; that could then also allow one to relatively
easily pre-error/warn out on VM start if there won't be enough memory
available (maybe overridable for odd zram/KSM cases, or when one just
doesn't want to care about that and likes OOMs ;)

>
> - Pool quota (restrict the total mem/CPU/disk allocated to all VMs in a
>   pool), as they have some clusters shared between different
>   departments / students / ...
>

It'd need a bit of new infrastructure, but it wouldn't be /that/ hard to
implement.

>
> A big thanks to Daniela for the T-shirts and other goodies!
>
> Also, another recurring question:
>
> "Why is the Proxmox team not more present at events/conferences
> like FOSDEM, ...?"
>

Some colleagues frequent FOSDEM, but in the last two and a half years
holding or attending in-person conferences was a bit difficult.

> Yes, we would like to drink some beers with you guys 😉

Would be nice! :-)

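The TOPSIS ranking mentioned in the S/DRS list above could look roughly like
the following. This is only a minimal sketch of the textbook method (vector
normalisation, weighting, distance to the ideal best and worst points), not
the proof of concept referred to in the thread and not Proxmox code. The two
criteria (free memory, average CPU usage), the weights, and the node names
are hypothetical and chosen purely for illustration.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one closeness score in [0, 1] per row; higher means better.

    matrix:  one row per alternative (here: node), one column per criterion
    weights: relative importance of each criterion
    benefit: True if larger is better for that criterion, False if smaller is
    """
    n_cols = len(weights)
    # 1. Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
                for row in matrix]
    # 2. Ideal best and ideal worst value per criterion.
    cols = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # 3. Closeness = distance to worst / (distance to best + distance to worst).
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.5)
    return scores


def pick_best(candidates, weights, benefit):
    """Return the name of the candidate with the highest TOPSIS score."""
    names = list(candidates)
    scores = dict(zip(names, topsis([candidates[n] for n in names],
                                    weights, benefit)))
    return max(names, key=scores.get)


# Hypothetical static data: [free memory in GiB, average CPU usage 0..1].
CANDIDATES = {
    "node1": [64.0, 0.70],
    "node2": [96.0, 0.30],
    "node3": [32.0, 0.90],
}
WEIGHTS = [0.5, 0.5]       # assumption: both criteria count equally
BENEFIT = [True, False]    # more free memory is better, more CPU usage is worse

if __name__ == "__main__":
    # node2 has the most free memory and the lowest CPU usage, so it wins.
    print(pick_best(CANDIDATES, WEIGHTS, BENEFIT))
```

A placement or re-balance step would then pick the highest-scoring node as
the migration or creation target; adding further criteria only means adding
columns, weights and benefit flags.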
* Re: [pve-devel] proxmox French days conference feedback
From: DERUMIER, Alexandre @ 2022-08-24 12:59 UTC
To: Thomas Lamprecht, Proxmox VE development discussion

Hi Thomas,

Sorry, I totally missed your response!

>> - a DRS feature like VMware's for VM balancing
>>   (I'm still working on it; I'll try to have a working version after
>>   this summer. I still need the pending VM pressure stats patches to be
>>   applied first 😉
>
> I'd really like to work on this more actively from our side too. I
> thought about 7.3 feature planning a bit yesterday and wrote down a few
> (rough) key points for tackling this, using the algorithm and rough
> direction you already worked on in your proofs of concept (thx!):
>
> - [ ] Static (and later Dynamic) Resource Scheduling (S/DRS)
>     - [ ] Coordinate with Alexandre as he's working partly on that too, but
>           we may want to use a bit of a different design and/or feature set
>           (at least initially) and integration timeline

I have free time to work/help on this in the coming months; just tell me
if we can sync our work.

>     - [ ] Check out TOPSIS more closely and implement the relevant parts in
>           Rust to expose via perlmod; that's then fast, and correctness is
>           much easier to reason about in a static, safety-focused language
>           like Rust.

I can help if you have questions about reimplementing TOPSIS in Rust
(I have done it from scratch in Perl following the YouTube math tutorial;
it's not too difficult).

>     - [ ] Make basic static resource capacity like CPU (# of sockets, cores
>           and hyper-threads) and memory available to other cluster nodes
>           (for example, via kv_broadcast after (re)start of pve-cluster)
>     - [ ] Add infrastructure to use that static (!) information for
>           balancing out the cluster.
>         - [ ] spit out a list of actions that would result in a balanced
>               cluster: migrate guest A to node X, migrate guest B to node Y
>         - [ ] use that for creating a simulation and regression testing
>               system in the spirit of the ha-manager's simulation and
>               regression test system, but as an independent test &
>               executable

I think giving the user a manual balancing feature, with a static/preview
list of migrations and manual approval, could be great too.

>     - [ ] integrate in HA; due to the static-ness and a safe-and-slow
>           integration, it should at first only be done on recovery, for
>           better balancing out.
>     - [ ] add API support for creating a CT/VM on the best-fitting node,
>           i.e., the least used one

Yes, needed. (And also maybe start on the best-fitting node.)

>     - [ ] make the balancing algo available for non-HA too, allow a
>           cluster-wide manual re-balance (e.g., with an action proposal
>           shown to the user for confirmation)

Yes, some users have already asked me about non-HA VMs.

>     - [ ] Extend with dynamic information like IO/memory/CPU pressure

CPU pressure is really the most important here, because you can't trust
CPU usage. (I have some servers at 60% CPU usage that are totally
overloaded, and other servers at 80% CPU usage that are not overloaded.)

The main problem is that we have an average value across all cores, and
with a lot of cores, some cores can be stuck at 100% and others at 10%,
which gives you a low CPU usage. But if some VMs need to use a lot of
cores, they will be overloaded.

That's why in my code I'm only looking at CPU pressure on the source
node, and on the target node I'm looking for low CPU pressure + CPU usage
under 80%.
(We can only trust CPU usage if CPU pressure is low.)

>     - [ ] Finally: allow opting in to periodic auto-balancing for HA managed
>
> IMO the semi-static resource availability and usage would be nice in
> general as a first step; that could then also allow one to relatively
> easily pre-error/warn out on VM start if there won't be enough memory
> available (maybe overridable for odd zram/KSM cases, or when one just
> doesn't want to care about that and likes OOMs ;)
>

I have a lot of customers wanting to migrate from VMware because of the
Broadcom acquisition, but the missing DRS feature is really a blocker for
them.

If needed, we could also help finance this feature if you need an extra
developer. Just ask me!

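The source/target rule described in the message above (pressure alone decides
which node to unload; a target needs low pressure and CPU usage under 80%)
can be sketched as below. Only the 80% usage limit comes from the message.
The two pressure thresholds, the `NodeStats` type and the example nodes are
assumptions made for illustration, and how the pressure value is collected
is left open; this is not the code the message refers to.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    cpu_usage: float     # average utilisation across all cores, 0.0 to 1.0
    cpu_pressure: float  # share of time tasks were stalled waiting for CPU, 0.0 to 1.0


HIGH_PRESSURE = 0.20     # assumption: above this, a node counts as overloaded
LOW_PRESSURE = 0.05      # assumption: below this, a node counts as relaxed
MAX_TARGET_USAGE = 0.80  # from the message: target CPU usage must be under 80%


def is_migration_source(node: NodeStats) -> bool:
    """Only pressure matters on the source; average usage can hide hot cores."""
    return node.cpu_pressure >= HIGH_PRESSURE


def is_migration_target(node: NodeStats) -> bool:
    """Usage is only trusted when pressure is low, so both must hold."""
    return node.cpu_pressure <= LOW_PRESSURE and node.cpu_usage < MAX_TARGET_USAGE


def classify(nodes):
    """Return (source names, target names) for a list of NodeStats."""
    sources = [n.name for n in nodes if is_migration_source(n)]
    targets = [n.name for n in nodes if is_migration_target(n)]
    return sources, targets


# Hypothetical cluster mirroring the situation in the message: "a" sits at
# only 60% usage but is overloaded, "b" sits at 80% usage without pressure.
EXAMPLE = [
    NodeStats("a", cpu_usage=0.60, cpu_pressure=0.35),
    NodeStats("b", cpu_usage=0.80, cpu_pressure=0.02),
    NodeStats("c", cpu_usage=0.40, cpu_pressure=0.01),
]

if __name__ == "__main__":
    print(classify(EXAMPLE))
```

With these numbers, node "a" is the only source despite its moderate average
usage, node "c" is the only target, and node "b" is left alone because its
usage is not under the 80% limit.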