[dev] Split Git
Jan Schneider
jan at horde.org
Wed Nov 19 14:57:49 UTC 2014
Quoting Michael M Slusarz <slusarz at horde.org>:
> Quoting Michael J Rubinsky <mrubinsk at horde.org>:
>
>> Quoting Jan Schneider <jan at horde.org>:
>>
>>> Quoting Michael J Rubinsky <mrubinsk at horde.org>:
>>>
>>>> I've been slowly looking into options for development workflows
>>>> after splitting our repo. I'm convinced we should use
>>>> git-subtree, but not in the way I think others have suggested.
>>>>
>>>> Gunnar has written an article detailing the use of git-subtree as
>>>> a way to keep BOTH the monolithic and split repos. There was a
>>>> short mailing list discussion to go along with this. I don't
>>>> think that this method will work. First of all, this will lead to
>>>> two different canonical Horde repositories for the same code.
>>>> That's confusing. Secondly, it is resource intensive. Gunnar's
>>>> approach utilizes an interim repository that will do the actual
>>>> split filtering and pushing, but it is s l o w. Third, and
>>>> perhaps most important, is that published topic branches won't be
>>>> portable between the two.
>>>>
>>>> Assume the monolithic repo is being used for development - it is
>>>> impossible to check out a topic branch of one of the subtrees
>>>> without git rm'ing the folder, then re-adding the subtree's topic
>>>> branch via git-subtree add. Even with squashing all the commits
>>>> along the way, this is *messy*. The monolithic's history will be
>>>> polluted with at least 2 merge commits every time a branch is
>>>> changed. Not to mention that the state of the upstream monolithic
>>>> repository will be inconsistent. There is no telling what branch
>>>> any of the subtrees are currently on. E.g., if I replace
>>>> framework/component with a topic branch and then push the
>>>> monolithic repository without replacing it with master again, the
>>>> next person who pulls will get a monolithic repository with
>>>> framework/component's code from the topic branch.
>>>>
>>>> I propose that we only provide our individual repositories
>>>> publicly. Locally, however, we can utilize git-subtree to build a
>>>> monolithic repository we can develop against. This repository
>>>> remains local and is not expected to match the state of any other
>>>> local repository. This allows us to continue developing with
>>>> more-or-less the same workflow we utilize now, and utilizing
>>>> things like our framework_install script (mostly) unchanged. The
>>>> components script will probably need to be tweaked, since we will
>>>> obviously be releasing from the discrete repositories. Dealing
>>>> with branch changes will still be messy, but the mess will be
>>>> confined to the local repository and not pushed up to any public
>>>> monolithic repository.
>>>>
>>>> Of course, helper scripts can be used to lessen the burden of
>>>> things like changing branches, and split/pushing back to the
>>>> upstream repositories. I've been cobbling together some ideas in
>>>> a utility script locally that, among other things, can be used to
>>>> setup the initial local repository.
>>>>
>>>> Thoughts? I'm sure I'm not the only one who wants to get this
>>>> moving. Among other reasons, I don't want to work on any BC
>>>> breaking code until we have this sorted and working.
>>>
>>> What's your reasoning to still have a monolith repo with
>>> sub-trees, even if only privately, locally? I understand that you
>>> want to keep the workflow as close to the current workflow as possible.
>>> Though as you already mentioned, this won't work without scripts
>>> and tools for that anyway.
>>
>> It's not that this won't work without the tools, but rather that
>> they make it easier by combining a few commands. I do see the point you are
>> making though (see below).
>>
>>> My question is, what benefit do we have managing a monolith or
>>> container repository locally, as opposed to having those tools
>>> and scripts just manage the individual repositories in a local
>>> container *directory*. I see that being able to git-commit from
>>> the base directory is a good thing. But if this only works by
>>> later splitting this commit to the individual repos through tools,
>>> why not have this tool make the "base" commit right from the
>>> start? Or is it possible to use the split tools automatically from
>>> git hooks?
>>
>> Yes, mostly it's to take advantage of git's functionality and our
>> existing tool set. Not only for git-commit from the base directory,
>> but things like diff|status|log as well. The latter may be of
>> dubious value if you choose to always squash the subtree updates,
>> though on the other hand, this would result in only changes made
>> locally showing in the log.
>
> I just went through a whole from-the-ground up installation of a dev
> system while preparing the vagrant image. These are the lessons I
> learned:
>
> 1. Our current dev installation process is absolutely terrible. It
> took me the better part of **4** hours to try to configure the
> system to a point where it was usable. And I'm supposedly a person
> (as a developer) who has pre-existing knowledge of the process.
>
> It was a humbling/embarrassing experience.
>
> In short: nothing about the current development process, including
> tools, workflow, etc., should be a factor in deciding how to change
> the system.
>
> 2. Monolithic repositories are a terrible idea.
>
> And git subtree is not the answer, for a variety of reasons.
>
> (My horror story: it took 6 hours to clone the current repo to a
> memory card so I can do work on a single library during a flight.
> There's no need to clone 25,000 files when you are only working on
> 25).
>
> 3. We have a bunch of different installation utilities living in
> several different locations. Very confusing knowing what to use and
> how to use it.
>
> 4. Composer is going to make this all easier. Warming up to it
> quickly (especially after trying to deal with an automated process
> to install all required PEAR libraries for a given set of
> applications/libraries).
>
>
> The problem is the idea of trying to structure the git repo in a way
> where the structure itself defines the development
> process/environment. This is the mistake we made in the past,
> and we can't repeat it.
>
> VCS is nothing more than a way to store code (and revisions). *How*
> the various components interact is something that needs to be done
> at the environment level. We can provide tools that create this
> environment automatically, but that is only one interpretation and
> an installation is free to do with the components as it wishes.
>
>
> My proposed solution:
>
> - All apps and libraries live in separate git repos. They are all
> entirely independent of each other.
I think we all agree on this one.
> - Requires maintenance of a list of git repos, but that is a minor
> hassle. (This can be automated via a script on www.horde.org, for
> example).
Agreed.
> - Installation of code from git repos can be facilitated by a script.
> - This script can create/clone the git repos as needed.
And update, and diff, and commit, and probably all the major git
stuff. Wait, why did I argue against sub-tree again? :)
Some extra sugar would be if you could check out one repo, and have all
dependent packages' repos checked out too.
> - All repos will be stored in a base folder
> - Option to create a separate, web-accessible directory.
Not sure what you mean. Something like what install_dev does today?
> - We combine all installation code that currently exists into a
> single script.
> - Includes installation script described above.
> - Includes stuff in framework/bin
> - Includes stuff in horde-support/maintenance-tools
> - Includes the groupware install code (in fact, the
> Horde_Core_Bundle code is probably a good place to start in terms of
> creating the install script).
> - This script can be packaged via PHAR
> - Benefit: development install script work can be leveraged to
> make end-user installs better also.
> - From a technical perspective: the goal would be to create test
> installations and a developer installation using Vagrant where the
> provisioning file contains nothing but 'horde-install' commands.
Initially I was like "why mix and match end user and developer
needs?", but you are slowly convincing me of that.
> Given the fact that we need to be using Composer ASAP, and that
> Travis is currently broken/unusable, the priority on this is high.
> My schedule looks a bit more clear the next week or two, so I can
> hopefully provide support to help get this done.
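To make the script idea a bit more concrete, here is a rough sketch of the
"clone or update everything into a base folder" part, including the extra
sugar of pulling in dependent repos. It is in Python purely for
illustration. The base URL, the repo names and the dependency map are all
made up; none of this exists yet, and the real tool would presumably read
the dependencies from the package metadata instead.

```python
import subprocess
from pathlib import Path

# Hypothetical: the real repo list and URL layout are not decided yet.
BASE_URL = "https://git.example.org/horde"


def with_dependencies(repo, deps):
    """Return repo plus everything it depends on, dependencies first."""
    ordered, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        ordered.append(name)

    visit(repo)
    return ordered


def plan_commands(repos, base_dir):
    """Build one git command per repo: clone if missing, pull otherwise."""
    base = Path(base_dir)
    commands = []
    for name in repos:
        target = base / name
        if (target / ".git").is_dir():
            # Existing checkout: only fast-forward, never create merge commits.
            commands.append(["git", "-C", str(target), "pull", "--ff-only"])
        else:
            commands.append(["git", "clone", f"{BASE_URL}/{name}.git", str(target)])
    return commands


def run_all(commands):
    """Execute the planned commands, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Something like run_all(plan_commands(with_dependencies("imp", deps), "horde"))
would then check out one application and everything it needs. Keeping the
planning separate from the execution means the same code could back a
dry-run option.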
--
Jan Schneider
The Horde Project
http://www.horde.org/
https://www.facebook.com/hordeproject