[dev] Git splitting

Michael M Slusarz slusarz at horde.org
Thu Jun 20 16:06:58 UTC 2013


Quoting Thomas Jarosch <thomas.jarosch at intra2net.com>:

> On Wednesday, 19. June 2013 23:25:56 Michael M Slusarz wrote:
>> Now that the main portion of the x.1 releases are out the door, time
>> to talk about splitting the Git repo.
>>
>> (If there was any doubt we need to split, try cloning the Git repo
>> from GitHub.  I did it today and it took 3 1/2 minutes.  Ugh)
>
> Forgive my ignorance, won't cloning of all the (small) repositories
> also take 3 1/2 minutes or even more?

As Ralf pointed out - most people don't need all repos.  I only work  
on a specific subset of Horde myself, so I don't need all the  
components.

The only reason to download all components would be to do something  
like changing copyright dates in all files.  But that's it.

> We split up some git repos at work and I made notes
> of certain complex git commands. Maybe they will come in handy:
>
> ------------------------------------------------
> Filter directory based on a whitelist:
> git filter-branch --tag-name-filter cat --index-filter \
>     'git ls-files -s |grep -P "\t(DIR1|DIR2)" \
>     |GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
>     mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' -- --all

Not useful for us - we aren't looking to filter the indexes.

> Rename directories / remove prefixes (in this case source/):
> git filter-branch --tag-name-filter cat --index-filter \
>     'git ls-files -s | sed "s-\tsource/-\t-" \
>     |GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
>     mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' -- --all

See above.

> Delete single files or directories:
> git filter-branch --tag-name-filter cat --index-filter \
>     'git rm --cached --ignore-unmatch -r -f CVSROOT Attic packages/Attic source/Attic' \
>     -- --all

Not for us.  We don't want to lose history.

> Remove empty commits:
> git filter-branch --tag-name-filter cat --commit-filter \
>     'if [ z$1 = z`git rev-parse $3^{tree}` ]; then skip_commit "$@";
>     else git commit-tree "$@"; fi' "$@" -- --all

filter-branch already has a command-line flag that does this
(--prune-empty).
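
For reference, a minimal sketch of that flag in action. It builds a
throwaway repository, so the file names, the "lib" directory and the
committer identity are all made up for illustration; this is not the
command we will run on the Horde repo. It assumes a git version that
supports 'git rev-list --count'.

```shell
#!/bin/sh
# Sketch: drop commits that become empty during a rewrite via --prune-empty.
# Everything below operates on a hypothetical throwaway repository.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the notice newer git versions print

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"

# First commit touches lib/, second commit touches only other.txt.
mkdir lib
echo one > lib/a.txt
git add lib/a.txt
git commit -q -m "add lib/a.txt"
echo two > other.txt
git add other.txt
git commit -q -m "add other.txt"

# Remove other.txt from every commit. The second commit then changes
# nothing relative to its parent, so --prune-empty drops it.
git filter-branch --prune-empty --index-filter \
    'git rm --cached --ignore-unmatch -q other.txt' -- --all >/dev/null 2>&1

commit_count=$(git rev-list --count HEAD)
echo "commits after rewrite: $commit_count"
```

Without --prune-empty the second commit would survive as an empty
commit, which is what the commit-filter above works around by hand.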

> Free up space for real:
> git for-each-ref --format='%(refname)' refs/original | \
>     xargs -i git update-ref -d {}
> git reflog expire --expire=0 --all
> git repack -a -d

This is not needed.

> git prune

Or this.  Both of these have been replaced by 'git gc --aggressive  
--prune=now' in recent git versions.

As mentioned in the wiki page, this is not enough.  Even if you do
this, you won't see any change in the on-disk size of your pack
file (at least I didn't).  Only when you clone the repository does
git write a correctly pruned pack file, and only in the cloned
copy.
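
A rough sketch of that sequence, again on a throwaway repository. The
directory names "rewritten" and "compact" are hypothetical, and the
file:// URL is my assumption for the local case: cloning from a plain
local path hardlinks the existing object files, while a file:// URL
goes through the normal transport and writes a new pack. It assumes a
git version that supports 'git -C'.

```shell
#!/bin/sh
# Sketch: clean up a rewritten repository, then clone it to get a freshly
# packed copy. "rewritten" and "compact" are hypothetical directory names.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the repository that has just been rewritten.
git init -q rewritten
git -C rewritten config user.email "dev@example.com"   # placeholder identity
git -C rewritten config user.name "Example Dev"
echo data > rewritten/file.txt
git -C rewritten add file.txt
git -C rewritten commit -q -m "initial commit"

# Expire reflogs and collect garbage in place.
git -C rewritten reflog expire --expire=now --all
git -C rewritten gc --quiet --aggressive --prune=now

# Clone through file:// so the objects are transferred as a new pack
# rather than hardlinked from the original object directory.
git clone -q "file://$work/rewritten" compact

# Compare on-disk sizes; the numbers depend entirely on the repository.
du -sh rewritten/.git compact/.git

orig_head=$(git -C rewritten rev-parse HEAD)
clone_head=$(git -C compact rev-parse HEAD)
```

On a repository this small the two sizes will be about the same; the
difference only shows up when the original still carries a lot of
unreachable history.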

michael

___________________________________
Michael Slusarz [slusarz at horde.org]


