public inbox for pve-devel@lists.proxmox.com
* [pve-devel] Plan for (invasive) shrink of pve-manager git repository
@ 2023-05-26  9:45 Thomas Lamprecht
  2023-05-28 18:38 ` Thomas Lamprecht
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Lamprecht @ 2023-05-26  9:45 UTC (permalink / raw)
  To: PVE development discussion

Hi all!

This is a heads-up about a plan to make our pve-manager git repository easier
to work with by rewriting its history to filter out huge artefacts.

This will only affect developers, nothing in the current pve-manager Debian
package will change.


# Background

Our current pve-manager git repository is huge (> 500 MB), mostly because its
git history contains various huge copies of ExtJS, both as ZIP archives and in
extracted form.

Those huge artefacts have not been used since Q1 of 2017 (before Proxmox VE 5),
as we split what is still in use, like the ExtJS GPL source code, out into its
own repo, without any ZIP archives.  But git, being git and providing a full
history of every change, still needs to hold copies of those artefacts in its
content-addressable (CAS) object store; one cannot really mask them in any way
that is ergonomic for development.
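
To see which blobs dominate the object store, something like the following
sketch could be used. It is only an illustration; the helper name
`largest_blobs` is made up here and it was not part of the actual clean-up.

```shell
# Hypothetical helper: list the N largest blobs reachable from any ref,
# as "<size in bytes><TAB><path>", largest first.
# Usage: largest_blobs <repo-path> [count]
largest_blobs() {
    git -C "$1" rev-list --objects --all |
        git -C "$1" cat-file \
            --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob" { size = $2; sub(/^blob [0-9]+ /, ""); print size "\t" $0 }' |
        sort -rn |
        head -n "${2:-10}"
}

# Example (path is a placeholder):
#   largest_blobs ~/src/pve-manager 20
```

The sizes shown are uncompressed blob sizes, so they only hint at what each
path costs inside the packed repository.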


# Proposed Solution

I'll use the git filter-repo [0] tool, a replacement for filter-branch with
better UX and less potential for getting it wrong, to rewrite the history,
filtering out any problematic artefact or directory.

For this I'll use the following file-list

www/ext6
www/ext5
www/ext4
www/touch
po
glob:*.zip

used as inverted match via the following command:

git filter-repo --invert-paths --paths-from-file ~/pve-manager-inverted-filter-paths
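
As a minimal sketch of how the file list and the command fit together: the
helper names and any paths passed to them are placeholders, not the ones used
for the real rewrite. Note that git filter-repo refuses by default to run on
anything but a fresh clone, so this assumes one.

```shell
# Hypothetical helper: write the inverted-match path list to the given file.
# Plain lines match a path (or directory) literally, "glob:" lines match globs.
write_filter_paths() {
    cat > "$1" <<'EOF'
www/ext6
www/ext5
www/ext4
www/touch
po
glob:*.zip
EOF
}

# Hypothetical helper: run the rewrite inside a fresh clone.
# Usage: shrink_history <fresh-clone-dir> <absolute-path-to-paths-file>
shrink_history() {
    git -C "$1" filter-repo --invert-paths --paths-from-file "$2"
}
```

With `--invert-paths`, everything matching the list is dropped from history and
everything else is kept.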

Then, I'd rename the current "pve-manager.git" hosted at git.proxmox.com to
"pve-manager-legacy.git", so it will still be available as a reference for
ancient history, providing the possibility to build pre-PVE 5 pve-manager
packages (for whatever reason one would want or need to do that).

A new repo, with the same name "pve-manager.git", would then get created and
the now cleaned up git repo pushed to it.


# Result

The result of the above command, measured by .git disk usage:

Before:  551 MB
After:    26 MB

So a huge reduction.
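
For reference, a small sketch of how such numbers can be measured and compared.
Both helper names are made up for this example, and `du` on the git directory
is an assumption about how the figures above were obtained.

```shell
# Hypothetical helper: disk usage of a repository's git directory in KiB.
# Usage: git_dir_kib <repo-path>
git_dir_kib() {
    du -sk "$(git -C "$1" rev-parse --absolute-git-dir)" | cut -f1
}

# Hypothetical helper: percentage saved going from <before> to <after>,
# as an integer rounded down.
reduction_percent() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Example with the figures from above:
#   reduction_percent 551 26    # prints 95
```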


# Fallout

This naturally has some fallout for developers currently working on patch
series, similar to any force-push (which we normally avoid at all costs).

Rebasing won't work IIUC, but as the source file layout won't change, you can
simply use "git cherry-pick <rev-range>" if you have the before filter and
after filter remotes & branches in the same git repo.  Otherwise, one can
always use "git format-patch -o ~/patches/ <rev-range>" in the old repo to
export patches cleanly, and then use "git am -3 ~/patches/*.patch" in the new
repo.
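
The format-patch/am route could be wrapped up roughly like this. It is a
sketch under the assumption that the old and the new clone sit side by side;
the helper name `port_series` is hypothetical.

```shell
# Hypothetical helper: export a series from the old clone and apply it in
# the new one, falling back to a three-way merge where needed.
# Usage: port_series <old-repo> <new-repo> <rev-range>
port_series() {
    patch_dir=$(mktemp -d) || return 1
    git -C "$1" format-patch -q -o "$patch_dir" "$3" || return 1
    git -C "$2" am -3 "$patch_dir"/*.patch
}

# Example (paths and range are placeholders):
#   port_series ~/src/pve-manager-old ~/src/pve-manager origin/master..my-feature
```

The rev-range is resolved in the old repo, so it can use the old branch names
unchanged.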

Note that git commit hash references inside commit messages of pve-manager will
get rewritten, so one won't notice anything there.  Commit references from other
repos are naturally untouched, but pve-manager being a leaf package means that
it won't have that many in other repos.

I'll save a copy of the old -> new commit reference map that git filter-repo
produces, ensuring we have full transparency.


# Date of Change

I'll probably carry out the above tomorrow, Saturday 2023-05-27, sometime
between 10:00 CEST and the end of the day; I'm writing today to give at least a
short heads-up.

For the record: this plan was discussed with Dietmar Maurer and Dominik, and as
said, this "only" affects developers.  And yes, it is a bit of a nuisance and
generates some churn, but we have talked about doing this every other year, and
it won't get better on its own, so let's just finally go for it.

cheers
 Thomas





* Re: [pve-devel] Plan for (invasive) shrink of pve-manager git repository
  2023-05-26  9:45 [pve-devel] Plan for (invasive) shrink of pve-manager git repository Thomas Lamprecht
@ 2023-05-28 18:38 ` Thomas Lamprecht
  2023-05-28 18:51   ` Thomas Lamprecht
  2023-05-30  8:36   ` Fiona Ebner
  0 siblings, 2 replies; 4+ messages in thread
From: Thomas Lamprecht @ 2023-05-28 18:38 UTC (permalink / raw)
  To: PVE development discussion

On 26/05/2023 at 11:45, Thomas Lamprecht wrote:
> I'll use the git filter-repo [0] tool, a replacement for filter-branch with
> better UX and less potential for getting it wrong, to rewrite the history,
> filtering out any problematic artefact or directory.
> 
> For this I'll use the following file-list
> 
> www/ext6
> www/ext5
> www/ext4
> www/touch
> po
> glob:*.zip
> 
> used as inverted match via the following command:
> 
> git filter-repo --invert-paths --paths-from-file
> ~/pve-manager-inverted-filter-paths
> 
> Then, I'd rename the current "pve-manager.git" hosted at git.proxmox.com to
> "pve-manager-legacy.git", so it will still be available as a reference for
> ancient history, providing the possibility to build pre-PVE 5 pve-manager
> packages (for whatever reason one would want or need to do that).
> 
> A new repo, with the same name "pve-manager.git", would then get created and
> the now cleaned up git repo pushed to it.

The above has now been carried out.

The old repo is still available here:
https://git.proxmox.com/?p=pve-manager-legacy.git;a=summary


If you fetch in an existing pve-manager.git repository you'll see something like:
From git://git.proxmox.com/git/pve-manager
 + f548e4fca...4a8501a8b master     -> origin/master  (forced update)
 + 40ccc11c4...d26a7c43e stable-3   -> origin/stable-3  (forced update)
 + 08ba4d2dd...789b4067b stable-4   -> origin/stable-4  (forced update)
 + d0ec33c69...b80838a2f stable-5   -> origin/stable-5  (forced update)
 + 6ba2c0bcf...b31a318d0 stable-6   -> origin/stable-6  (forced update)

To re-align your local master branch you can do a hard reset, BUT check first
whether you have any local commits on it.  If so, move them over to another
branch beforehand, e.g., with `git checkout -b feature-to-re-apply-on-master`.

git checkout master
git reset --hard origin/master

Then re-create your active development branches freshly from the master
and cherry-pick the relevant patches from the old branch.

After that you can delete the old branches.
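
Put together, the per-branch migration could look roughly like the sketch
below. The helper name `migrate_branch` is hypothetical, and it assumes it is
run inside your clone after fetching. The old upstream commit is the left-hand
hash from the "forced update" line, e.g. f548e4fca for master.

```shell
# Hypothetical helper: re-create <branch> on top of the rewritten upstream
# and cherry-pick its commits over from the old history.
# Usage: migrate_branch <branch> <old-upstream> [new-upstream]
migrate_branch() {
    branch=$1
    old_upstream=$2
    new_upstream=${3:-origin/master}
    # Keep the old commits reachable under a different name for now.
    git branch -m "$branch" "$branch-old" &&
    # Start afresh from the rewritten history.
    git checkout -q -b "$branch" "$new_upstream" &&
    # Re-apply everything the old branch had on top of the old upstream.
    git cherry-pick "$old_upstream..$branch-old"
}

# Example (branch name is a placeholder):
#   migrate_branch my-feature f548e4fca
#   git branch -D my-feature-old    # once everything looks fine
```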

> # Fallout
> 
> This naturally has some fallout for developers currently working on patch
> series, similar to any force-push (which we normally avoid at all costs).
> 
> Rebasing won't work IIUC, but as the source file layout won't change, you can
> simply use "git cherry-pick <rev-range>" if you have the before filter and
> after filter remotes & branches in the same git repo.  Otherwise, one can
> always use "git format-patch -o ~/patches/ <rev-range>" in the old repo to
> export patches cleanly, and then use "git am -3 ~/patches/*.patch" in the new
> repo.

FWIW, I migrated over my branches, and cherry-picking worked well.

> 
> Note that git commit hash references inside commit messages of pve-manager will
> get rewritten, so one won't notice anything there.  Commit references from other
> repos are naturally untouched, but pve-manager being a leaf package means that
> it won't have that many in other repos.

Note that the above is wrong; my test for that was misguided.  But filter-repo
does check for this and outputs such references in a "suboptimal-issues" file
(see below).  Luckily there aren't that many, as we only (relatively) recently
began to track things like "Fixes" in commit messages.

> I'll save a copy of the old -> new commit reference map that git filter-repo
> produces, ensuring we have full transparency.

This is publicly available here:

https://pve.proxmox.com/pve-manager-filter-repo-result/

Most interesting will be the "commit-map" file.
In the "ref-map" I marked those branches which I did not copy over, mostly some
ancient hot fix branches; OTOH, all stable-X branches *got* copied over.
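
To translate an old commit hash into its new one, a lookup along these lines
should do. It is a sketch that assumes the "commit-map" file has the layout
git filter-repo writes, i.e. a header line followed by one "old new" pair of
full hashes per line; the helper name `map_commit` is made up.

```shell
# Hypothetical helper: print the new hash for an old (possibly abbreviated)
# commit hash. The header line is skipped implicitly, as it never starts
# with a hex prefix.
# Usage: map_commit <commit-map-file> <old-hash>
map_commit() {
    awk -v old="$2" 'index($1, old) == 1 { print $2 }' "$1"
}

# Example (file location is a placeholder):
#   map_commit ./commit-map f548e4fca
```

An abbreviated hash that is ambiguous simply prints more than one line.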

Again, sorry for any trouble and headache this may cause.  If you have specific
questions (or see something that is off), ask me, e.g., by replying to this mail
on the list.

cheers,
 Thomas





* Re: [pve-devel] Plan for (invasive) shrink of pve-manager git repository
  2023-05-28 18:38 ` Thomas Lamprecht
@ 2023-05-28 18:51   ` Thomas Lamprecht
  2023-05-30  8:36   ` Fiona Ebner
  1 sibling, 0 replies; 4+ messages in thread
From: Thomas Lamprecht @ 2023-05-28 18:51 UTC (permalink / raw)
  To: PVE development discussion

On 28/05/2023 at 20:38, Thomas Lamprecht wrote:
> To re-align your local master branch you can do a hard reset, BUT check first
> whether you have any local commits on it.  If so, move them over to another
> branch beforehand, e.g., with `git checkout -b feature-to-re-apply-on-master`.
> 
> git checkout master
> git reset --hard origin/master
> 
> Then re-create your active development branches freshly from the master
> and cherry-pick the relevant patches from the old branch.
> 
> After that you can delete the old branches.
> 

Two things I forgot to mention.  First, after the above, and after ensuring
that no remote or branch refers to the old git repo anymore, you can use the
following to shrink your local repository:

git gc --aggressive --prune=now

But moving the current pve-manager directory to a backup location and just
cloning afresh is way faster.
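
If the gc alone does not shrink much, a likely reason is that the reflog still
references the old commits, and git gc keeps anything reachable from there. A
minimal sketch that expires the reflog first follows; the helper name
`shrink_repo` is made up, and note that dropping the reflog also removes that
safety net for recovering old local state.

```shell
# Hypothetical helper: drop all reflog entries, then repack and prune
# everything that is no longer reachable.
# Usage: shrink_repo <repo-path>
shrink_repo() {
    git -C "$1" reflog expire --expire=now --all &&
    git -C "$1" gc --quiet --aggressive --prune=now
}

# Example (path is a placeholder):
#   shrink_repo ~/src/pve-manager
```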

The other thing is that I had to split out the sencha-touch ZIP into its own
repo before the filter-repo clean-up; it now lives in a libjs-sencha-touch
package and its source can be found here:
https://git.proxmox.com/?p=sencha-touch.git;a=summary

(And just for completeness' sake, note that this was done purely for compat
reasons; the mobile UI in PVE that uses it is pretty bare-bones and doesn't get
much love.  We should replace it with something slightly more future-proof some
day.)

cheers,
 Thomas





* Re: [pve-devel] Plan for (invasive) shrink of pve-manager git repository
  2023-05-28 18:38 ` Thomas Lamprecht
  2023-05-28 18:51   ` Thomas Lamprecht
@ 2023-05-30  8:36   ` Fiona Ebner
  1 sibling, 0 replies; 4+ messages in thread
From: Fiona Ebner @ 2023-05-30  8:36 UTC (permalink / raw)
  To: Proxmox VE development discussion, Thomas Lamprecht,
	PVE development discussion

On 28.05.23 at 20:38, Thomas Lamprecht wrote:
> If you fetch in an existing pve-manager.git repository you'll see something like:
> From git://git.proxmox.com/git/pve-manager
>  + f548e4fca...4a8501a8b master     -> origin/master  (forced update)
>  + 40ccc11c4...d26a7c43e stable-3   -> origin/stable-3  (forced update)
>  + 08ba4d2dd...789b4067b stable-4   -> origin/stable-4  (forced update)
>  + d0ec33c69...b80838a2f stable-5   -> origin/stable-5  (forced update)
>  + 6ba2c0bcf...b31a318d0 stable-6   -> origin/stable-6  (forced update)
> 
> To re-align your local master branch you can do a hard reset, BUT check first
> whether you have any local commits on it.  If so, move them over to another
> branch beforehand, e.g., with `git checkout -b feature-to-re-apply-on-master`.
> 
> git checkout master
> git reset --hard origin/master
> 
> Then re-create your active development branches freshly from the master
> and cherry-pick the relevant patches from the old branch.
> 
> After that you can delete the old branches.
> 

Just a small addendum, because my repository was still pretty large
after the above: I had to remove stale remote branches, which can be
done with e.g. 'git fetch --all --prune', and I had to run 'git stash
clear'. Only then did my repository shrink below 260 MiB. You also might
want to check for tags that could still be referencing old stuff.
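
For the tag check, a sketch like the following might help. The helper name
`stale_tags` is made up, and "not contained in any remote-tracking branch" is
just one possible heuristic for spotting tags that still pin old history.

```shell
# Hypothetical helper: list tags whose commit is not contained in any
# remote-tracking branch; these may still reference the old history.
# Usage: stale_tags <repo-path>
stale_tags() {
    git -C "$1" for-each-ref --format='%(refname:short)' refs/tags |
    while read -r tag; do
        if [ -z "$(git -C "$1" branch -r --contains "$tag^{commit}")" ]; then
            printf '%s\n' "$tag"
        fi
    done
}

# Possible order of steps inside the clone:
#   git fetch --all --prune    # drop stale remote-tracking branches
#   git stash clear            # stashes can reference old history too
#   stale_tags .               # inspect, then e.g.: git tag -d <tag>
```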



