[dev] Split Git

Jan Schneider jan at horde.org
Wed Nov 19 14:57:49 UTC 2014


Quoting Michael M Slusarz <slusarz at horde.org>:

> Quoting Michael J Rubinsky <mrubinsk at horde.org>:
>
>> Quoting Jan Schneider <jan at horde.org>:
>>
>>> Quoting Michael J Rubinsky <mrubinsk at horde.org>:
>>>
>>>> I've been slowly looking into options for development workflows  
>>>> after splitting our repo. I'm convinced we should use  
>>>> git-subtree, but not in the way I think others have suggested.
>>>>
>>>> Gunnar has written an article detailing the use of git-subtree as  
>>>> a way to keep BOTH the monolithic and split repos. There was a  
>>>> short mailing list discussion to go along with this. I don't  
>>>> think that this method will work. First of all, this will lead to  
>>>> two different canonical Horde repositories for the same code.  
>>>> That's confusing. Secondly, it is resource intensive. Gunnar's  
>>>> approach utilizes an interim repository that will do the actual  
>>>> split filtering and pushing, but it is s l o w. Third, and  
>>>> perhaps most important, is that published topic branches won't be  
>>>> portable between the two.
>>>>
>>>> Assume the monolithic repo is being used for development - it is  
>>>> impossible to check out a topic branch of one of the subtrees  
>>>> without git rm'ing the folder, then re-adding the subtree's topic  
>>>> branch via git-subtree add. Even with squashing all the commits  
>>>> along the way, this is *messy*. The monolithic repo's history will be  
>>>> polluted with at least 2 merge commits every time a branch is  
>>>> changed. Not to mention that the state of the upstream monolithic  
>>>> repository will be inconsistent. There is no telling what branch  
>>>> any of the subtrees are currently on. E.g., if I replace  
>>>> framework/component with a topic branch and then push the  
>>>> monolithic repository without replacing it with master again, the  
>>>> next person who pulls will get a monolithic repository with  
>>>> framework/component's code from the topic branch.
>>>>
>>>> I propose that we only provide our individual repositories  
>>>> publicly. Locally, however, we can utilize git-subtree to build a  
>>>> monolithic repository we can develop against. This repository  
>>>> remains local and is not expected to match the state of any other  
>>>> local repository. This allows us to continue developing with  
>>>> more-or-less the same workflow we utilize now, and utilizing  
>>>> things like our framework_install script (mostly) unchanged. The  
>>>> components script will probably need to be tweaked, since we will  
>>>> obviously be releasing from the discrete repositories.  Dealing  
>>>> with branch changes will still be messy, but the mess will be  
>>>> confined to the local repository and not pushed up to any public  
>>>> monolithic repository.
>>>>
>>>> Of course, helper scripts can be used to lessen the burden of  
>>>> things like changing branches, and split/pushing back to the  
>>>> upstream repositories. I've been cobbling together some ideas in  
>>>> a utility script locally that, among other things, can be used to  
>>>> set up the initial local repository.
>>>>
>>>> Thoughts? I'm sure I'm not the only one who wants to get this  
>>>> moving. Among other reasons, I don't want to work on any BC  
>>>> breaking code until we have this sorted and working.
>>>
>>> What's your reasoning to still have a monolith repo with  
>>> sub-trees, even if only privately, locally? I understand that you want  
>>> to keep the workflow as close to the current workflow as possible.  
>>> Though as you already mentioned, this won't work without scripts  
>>> and tools for that anyway.
>>
>> It's not that this won't work without the tools, but rather that  
>> they make it easier, by combining a few commands. I do see the point you are  
>> making though (see below).
>>
>>> My question is, what benefit do we have, managing a monolith, or  
>>> container repository locally, as opposed to having those tools  
>>> and scripts just manage the individual repositories in a local  
>>> container *directory*. I see that being able to git-commit from  
>>> the base directory is a good thing. But if this only works by  
>>> later splitting this commit to the individual repos through tools,  
>>> why not have this tool make the "base" commit right from the  
>>> start? Or is it possible to use the split tools automatically from  
>>> git hooks?
>>
>> Yes, mostly it's to take advantage of git's functionality and our  
>> existing tool set. Not only for git-commit from the base directory,  
>> but things like diff|status|log as well. The latter may be of  
>> dubious value if you choose to always squash the subtree updates,  
>> though on the other hand, this would result in only changes made  
>> locally showing in the log.
>
> I just went through a whole from-the-ground-up installation of a dev  
> system while preparing the vagrant image.  These are the lessons I  
> learned:
>
> 1. Our current dev installation process is absolutely terrible.  It  
> took me the better part of **4** hours to try to configure the  
> system to a point where it was usable.  And I'm supposedly a person  
> (as a developer) who has pre-existing knowledge of the process.
>
> It was a humbling/embarrassing experience.
>
> In short: nothing about the current development process, including  
> tools, workflow, etc., should be a factor in deciding how to change  
> the system.
>
> 2. Monolithic repositories are a terrible idea.
>
> And git subtree is not the answer, for a variety of reasons.
>
> (My horror story: it took 6 hours to clone the current repo to a  
> memory card so I can do work on a single library during a flight.   
> There's no need to clone 25,000 files when you are only working on  
> 25).
>
> 3. We have a bunch of different installation utilities living in  
> several different locations.  Very confusing knowing what to use and  
> how to use it.
>
> 4. Composer is going to make this all easier.  Warming up to it  
> quickly (especially after trying to deal with an automated process  
> to install all required PEAR libraries for a given set of  
> applications/libraries).
>
>
> The problem is the idea of trying to structure the git repo in a way  
> where the structure itself defines the development  
> process/environment.  This is the mistake we have made in the past  
> and we can't repeat it.
>
> VCS is nothing more than a way to store code (and revisions).  *How*  
> the various components interact is something that needs to be done  
> at the environment level.  We can provide tools that create this  
> environment automatically, but that is only one interpretation and  
> an installation is free to do with the components as they wish.
>
>
> My proposed solution:
>
> - All apps and libraries live in separate git repos.  They are all  
> entirely independent of each other.

I think we all agree on this one.

>   - Requires maintenance of a list of git repos, but that is a minor  
> hassle.  (This can be automated via a script on www.horde.org, for  
> example).

Agreed.

> - Installation of code from git repos can be facilitated by a script.
>   - This script can create/clone the git repos as needed.

And update, and diff, and commit, and probably all the major git  
stuff. Wait, why did I argue against sub-tree again? :)
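
To make the idea concrete, here is a minimal sketch of such a wrapper, assuming every repo is cloned as a direct child of one base folder. The function names (find_repos, git_command, run_everywhere) are made up for illustration and are not an existing Horde tool; the only git feature relied on is the standard "git -C <dir>" option.

```python
import subprocess
import sys
from pathlib import Path


def find_repos(base):
    """Return every direct child of `base` that looks like a git checkout."""
    base = Path(base)
    return sorted(p for p in base.iterdir() if (p / ".git").exists())


def git_command(repo, args):
    """Build the argv that runs `git <args>` inside `repo`."""
    return ["git", "-C", str(repo), *args]


def run_everywhere(base, args):
    """Run the same git subcommand in each repository; collect exit codes."""
    results = {}
    for repo in find_repos(base):
        print(f"== {repo.name}")
        proc = subprocess.run(git_command(repo, args))
        results[repo.name] = proc.returncode
    return results


if __name__ == "__main__":
    # Hypothetical usage: python multigit.py /path/to/base status --short
    if len(sys.argv) < 3:
        sys.exit("usage: multigit.py BASE_DIR GIT_ARGS...")
    codes = run_everywhere(sys.argv[1], sys.argv[2:])
    sys.exit(max(codes.values(), default=0))
```

This covers status, diff, pull and log well enough. A commit spanning several repos would still end up as one commit per repo, which is the real difference from a monolithic checkout.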

Some extra sugar would be if you could check out one repo, and have all  
dependent packages' repos checked out too.
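
Roughly what I have in mind, as a sketch only: given a map of package to direct dependencies, walk it and return everything that needs to be checked out, dependencies first. The package names and the DEPENDENCIES map below are placeholders, not our real dependency data; in practice the map would have to be read from each package's own metadata.

```python
def repos_to_clone(package, dependencies):
    """Return `package` plus all transitive dependencies, dependencies first."""
    ordered = []
    seen = set()

    def visit(name):
        # Marking before recursing also keeps circular dependencies from looping.
        if name in seen:
            return
        seen.add(name)
        for dep in dependencies.get(name, []):
            visit(dep)
        ordered.append(name)

    visit(package)
    return ordered


# Placeholder data for illustration only.
DEPENDENCIES = {
    "app": ["lib-core", "lib-mail"],
    "lib-mail": ["lib-core", "lib-util"],
    "lib-core": ["lib-util"],
    "lib-util": [],
}

if __name__ == "__main__":
    for name in repos_to_clone("app", DEPENDENCIES):
        print(name)
```

The install script would then clone each name in that order into the base folder, skipping the ones that are already there.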

>   - All repos will be stored in a base folder
>   - Option to create a separate, web-accessible directory.

Not sure what you mean. Something like what install_dev does today?

> - We combine all installation code that currently exists into a  
> single script.
>   - Includes installation script described above.
>   - Includes stuff in framework/bin
>   - Includes stuff in horde-support/maintenance-tools
>   - Includes the groupware install code (in fact, the  
> Horde_Core_Bundle code is probably a good place to start in terms of  
> creating the install script).
>   - This script can be packaged via PHAR
>   - Benefit: development install script work can be leveraged to  
> make end-user installs better also.
>   - From a technical perspective: the goal would be to create test  
> installations and a developer installation using Vagrant where the  
> provisioning file contains nothing but 'horde-install' commands.

Initially I was like "why mix and match end user and developer  
needs?", but you are slowly convincing me of that.

> Given the fact that we need to be using Composer ASAP, and that  
> Travis is currently broken/unusable, the priority on this is high.   
> My schedule looks a bit more clear the next week or two, so I can  
> hopefully provide support to help get this done.



-- 
Jan Schneider
The Horde Project
http://www.horde.org/
https://www.facebook.com/hordeproject
