Saturday, September 30, 2017

Day-to-day use of Convention over Git

This page explains how to use Convention over Git on a daily basis,
that is, how to live with remote Git repositories that are automatically synchronized by Convention over Git.

How to work with the Convention over Git

Use Git as you always do, except for the following.
  • Name branches and tags with the prefix of your side (for example, "company/some_branch_name").
  • Make plain commits to the prefixed branches of your side.
  • Delete only the branches and tags of your side. Refs of other sides that you delete will be recreated automatically.
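
The rules above can be sketched as a few shell helpers. This is only an illustration: the prefix "company/", the remote name "origin" and the function names are assumptions, not part of any published implementation.

```shell
# Hypothetical prefix of your side; the trailing slash is part of the convention.
SIDE_PREFIX="company/"

# Succeeds when the given short ref name belongs to your side.
is_own_ref() {
  case "$1" in
    "$SIDE_PREFIX"?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Create and publish a branch of your side (assumes a remote named "origin").
start_branch() {
  is_own_ref "$1" || { echo "not a ref of your side: $1" >&2; return 1; }
  git checkout -b "$1" && git push -u origin "$1"
}

# Delete a remote branch, but only if it belongs to your side.
delete_branch() {
  is_own_ref "$1" || { echo "refusing to delete a foreign ref: $1" >&2; return 1; }
  git push origin --delete "$1"
}
```

For example, start_branch "company/some_branch_name" creates and pushes the branch, while delete_branch "other/develop" refuses to run any Git command.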

How to migrate your work to another side refs

All refs are updated on all sides automatically.
When you want to migrate your work to another side, do one of the following.
By a merge or cherry-pick from your side
  1. Do a plain merge from your prefixed branch into a prefixed branch of the other side.
  2. After a synchronization interval, run "git fetch" and check that your merge commit is still there.
  3. If it is not, just repeat everything.
This is not really a problem. Your local Git repository always preserves all local branches (and the reflog keeps even commits that are no longer on any branch).

Commits made across sides may disappear if somebody else pushes to the same branch before you.
The automatic conflict resolution of Convention over Git deletes your merge commit only when the branches differ in a non-fast-forward way. This does not happen often.
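
The check from step 2 can be automated. Below is a minimal sketch, assuming the remote is named "origin" and using hypothetical branch names; it relies on "git merge-base --is-ancestor" to test whether your commit is still part of the branch.

```shell
# Succeeds if commit $1 is still reachable from the remote-tracking branch origin/$2.
# Returns 2 when the fetch itself fails. The remote name "origin" is an assumption.
merge_survived() {
  git fetch --quiet origin || return 2
  git merge-base --is-ancestor "$1" "origin/$2"
}

# Usage, after merging into partner/develop and pushing:
#   sha=$(git rev-parse HEAD)
#   ...wait for a synchronization interval...
#   merge_survived "$sha" partner/develop || echo "merge was dropped, repeat it"
```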

By a merge or cherry-pick at another side
Nobody does this :-)
It works if you can connect to the remote Git repository of the other side.
Wait for a synchronization interval, and your commits will appear in the remote repository of the other side.
Do the merge between the branches there.
This way is safe. No checks, no worries. Done.

By a plain commit on your side to a ref of another side
Junior developers do this. Prevent it.
If you feel lucky, you can make a plain commit directly to a branch of the other side.
Again, you must recheck that your commit is still there after a synchronization interval.
If you are unlucky, you will end up searching for your commit in the Git reflog. That is a waste of time. I warned you.
So be careful, or inform the people on the other side.
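
If it does come to a reflog search, the recovery itself is short. This sketch assumes you have already found the sha-1 of the lost commit with "git reflog"; the function name and the branch name in the usage line are hypothetical.

```shell
# Re-attach a commit found in the reflog to a new branch.
# $1 is the sha-1 taken from `git reflog`, $2 is the name of the new branch.
recover_commit() {
  # Verify that the object exists and is a commit before creating the branch.
  git cat-file -e "$1^{commit}" || { echo "no such commit: $1" >&2; return 1; }
  git branch "$2" "$1"
}

# Usage:
#   git reflog -n 20                                  # find the sha-1 of the lost commit
#   recover_commit <sha-1> company/recovered_commit   # put it on a branch of your side
```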

Out of convention refs

In the provided implementation of Convention over Git, the sides see only conventional (prefixed) refs.
Unprefixed branches and tags stay hidden from the other sides. They can even have the same names in different remote Git repositories.

Convention over Git

Automated synchronization of Git remote repositories.

Convention over Git is a straightforward approach for an automatic synchronization between Git remote repositories.
Completely separate remote repositories begin to behave as a single remote repository.

Proven and tested solutions were created based on this approach.

This approach uses well-known Git tools. It is because Git has an innate ability to do this. All we need to add is some convention.

This article explains the main idea and implementation of Convention over Git. If you would rather not read any further, you can follow the links below:

(draft, being actively edited)

The situation

Say we have teams in separate companies. Each team owns its own repository, hosted on its own separate Git server. This often happens in vendor-client scenarios.

A warning in advance:

  • The solution is applied per repository (not per server).
  • The description here is simplified to two remote Git repositories for clarity, but the approach works with many.

Explanatory Glossary

  • fast-forward - safe, non-overwriting git fetch or push.
  • non-fast-forward - unsafe, forced & overwriting git fetch or push.
  • side - one remote Git repository and the many local Git repositories directly related to it. They are all located on one side, i.e. in one company.
  • owner - owner of the side. A team that owns and uses their remote repository directly.
  • refs - git references, mostly such as branches and tags.
  • conventional refs - refs strictly separated by a convention and divided into different sides. Each side owns and manages its refs but can affect refs of others.
  • prefixed refs - the naming convention: an implementation of conventional refs by prefixes with a trailing slash.
    As an example: "company1/some_feature", "company2/develop"
    where the "company1/" and "company2/" are conventional prefixes.
  • refspec - specifies source and destination refs; a leading plus sign (+) makes the fetch or push non-fast-forward.
  • synchronization interval - the interval between two synchronizations of the remote Git repositories. Usually it is from one to three minutes.
  • reflog - Git's local log of where HEAD and branch tips have pointed. It keeps otherwise unreachable commits recoverable until its entries expire and garbage collection removes them.
  • agent, sync-agent - synchronization agent. An implementation that does all the work.
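
To make "prefixed refs" and "refspec" concrete, here is a small sketch that builds the refspecs for one side. The function name is hypothetical; the prefixes "bcd1" and "jhi2" in the usage lines are only sample values.

```shell
# Print the branch and tag refspecs for the prefix $1, one per line.
# Pass "+" as $2 to get forced (non-fast-forward) refspecs.
side_refspecs() {
  prefix=$1
  force=${2:-}
  printf '%s\n' \
    "${force}refs/heads/${prefix}/*:refs/heads/${prefix}/*" \
    "${force}refs/tags/${prefix}/*:refs/tags/${prefix}/*"
}

# side_refspecs bcd1    prints:
#   refs/heads/bcd1/*:refs/heads/bcd1/*
#   refs/tags/bcd1/*:refs/tags/bcd1/*
# side_refspecs jhi2 +  prints the same pair for "jhi2", each with a leading "+".
```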

Real Crucial Breaking Troubles

Any such system has the following stumbling blocks (sorted by difficulty):
  • Automated deletion of old branches and tags, i.e. the conflict of deleted branches being constantly recreated.
  • Automatic conflict resolution (non-fast-forward branch conflicts).
  • Accidental deletion of an entire repository.
  • Failover and automatic recovery of synchronization, especially after network troubles.

The solution

The solution uses
  • Naming convention
  • Automated deletion of conventional refs
  • Automated fast-forward synchronization
  • Automated non-fast-forward conflict resolving
  • Synchronization agent
  • Bare repositories

Naming convention
Any prefixed name will do. Each side owns its prefix, and non-fast-forward conflicts on prefixed Git refs are resolved in favor of the owner.
Examples
company1/develop, company2/develop
company1/JIRA-123, company2/JIRA-321

Automated deletion of conventional refs
A deleted ref is deleted everywhere. The deletion must be initiated from the side of the ref's owner, i.e. a ref "team_1/jira-123" can only be deleted from the "team_1" side.

Automated fast-forward synchronization
It is a fast-forward Git synchronization of all branches and tags at once, from everywhere to everywhere.
I have restricted it here to conventional refs only, and you had better do the same (it's a long story).

Automated non-fast-forward conflict resolving
Non-fast-forward conflicts are resolved by the naming convention: the owner of the prefix wins.

Synchronization agent
It is some infrastructure that invokes all the logic once per synchronization interval. You need repeatable invocations. Use any job scheduler or an automation server like Jenkins.
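
If you have no job scheduler at hand, a plain loop is enough to try the idea. This is a minimal sketch under stated assumptions: SYNC_INTERVAL, sync_cycle and run_agent are hypothetical names, and sync_cycle is only a placeholder for the real synchronization logic.

```shell
# Seconds between two cycles; one to three minutes is the usual range.
SYNC_INTERVAL=${SYNC_INTERVAL:-120}

# Placeholder for one pass of the synchronization logic.
sync_cycle() {
  echo "synchronizing..."
}

# Run sync_cycle repeatedly. $1 limits the number of cycles;
# 0 or no argument means "run until stopped".
run_agent() {
  max_cycles=${1:-0}
  cycle=0
  while :; do
    # A failed cycle must not stop the agent; the next cycle simply retries.
    sync_cycle || echo "cycle failed, retrying after the interval" >&2
    cycle=$((cycle + 1))
    if [ "$max_cycles" -gt 0 ] && [ "$cycle" -ge "$max_cycles" ]; then
      break
    fi
    sleep "$SYNC_INTERVAL"
  done
}
```

Retrying on the next interval instead of exiting is a simple form of automatic recovery from network troubles.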

Bare repositories
Bare repositories allow the agent to do things more efficiently, because they have no working tree to check out.

The targets

  • Create a single force, a single team.
  • Eliminate the severe and constant errors that result from disunity.
  • Eliminate significant wasted time.
  • Continuous Integration. Commits are verified immediately by an automated build, so problems are detected early.
  • Continuous deployment. Merges to special branches are deployed automatically.
  • Continuous delivery. No single repository, no heart beating.

Conclusion

Given the amount of detail, I even call this approach a Conventional Distributed Version Control System over Git.

Thursday, September 28, 2017

Cheat-sheet of implementation for Convention over Git

This is a cheat-sheet for the simplest implementation of the Convention over Git approach.
It is used for an automated synchronization between remote Git repositories.

If you already understand the idea of Convention over Git, this cheat-sheet will be easy to follow. Otherwise, read the Convention over Git idea article first.

Let's start.
Say two development teams in the companies "BCD-1" and "JHI-2" want to auto-synchronize a Git repository named "repo-name" with each other.

We have some preconditions:
  • The teams want to synchronize only conventionally named branches and tags, not the entire repository (which is also possible).
  • The logic below uses bare repositories hosted on your synchronization-agent machine.
  • There are a huge number of variations on what I describe here.
  • Some details of why it is done this way are not mentioned.

In every synchronization cycle we do the following:

Creating the bare repositories of the synchronization agent, if they do not exist. One for each side (team).
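
A minimal sketch of this step. The directory layout (one "<prefix>.git" repository per side under a work directory) and the function names are assumptions.

```shell
# Create a bare repository at path $1 unless one already exists there.
ensure_bare_repo() {
  [ -f "$1/HEAD" ] || git init --quiet --bare "$1"
}

# Hypothetical layout: one agent repository per side under the work directory $1.
ensure_agent_repos() {
  ensure_bare_repo "$1/bcd1.git" && ensure_bare_repo "$1/jhi2.git"
}
```

Calling ensure_agent_repos again on the same directory changes nothing, so it is safe to run at the start of every cycle.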


Checking if there are changes in the remote Git repositories. Stop the cycle if there are none.

git ls-remote --heads --tags https://git.bcd1.com/repo-name
git ls-remote --heads --tags https://jhi2.com/git/repo-name
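
One way to turn this output into a "changed or not" decision is to compare it with a snapshot saved during the previous cycle. This is a sketch, not the published implementation; the function name and the snapshot file are assumptions.

```shell
# Succeeds (exit 0) when the branches and tags of remote $1 differ from the
# snapshot stored in file $2, and refreshes the snapshot.
# Returns 1 when nothing changed and 2 when the remote cannot be read.
remote_changed() {
  current=$(git ls-remote --heads --tags "$1") || return 2
  if [ -f "$2" ] && [ "$current" = "$(cat "$2")" ]; then
    return 1
  fi
  printf '%s\n' "$current" > "$2"
  return 0
}

# Usage (the snapshot path is hypothetical):
#   remote_changed https://git.bcd1.com/repo-name "$HOME/sync-agent/bcd1.refs"
```

The cycle can stop early when this returns 1 for every remote.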


Updating repositories of synchronization-agent.

git fetch --prune https://git.bcd1.com/repo-name "+refs/heads/*:refs/heads/*" "+refs/tags/*:refs/tags/*"
git fetch --prune https://jhi2.com/git/repo-name "+refs/heads/*:refs/heads/*" "+refs/tags/*:refs/tags/*"


Checking if the Deletion & Recovering is allowable. Use git show-ref and git show to do this.
Each repository must have at least one branch with the same sha-1 and some fixed commit. There should be a difference in ref names for the Deletion & Recovering to be allowed.
If it is allowable, add the "--prune" option to the git pushes of the following step.


Deletion & Recovering of refs. And a fast-forward synchronization of refs.
(the "--prune" option below is commented out by a bash inline comment)

git push `#--prune` https://jhi2.com/git/repo-name "refs/heads/bcd1/*:refs/heads/bcd1/*" "refs/tags/bcd1/*:refs/tags/bcd1/*"
git push `#--prune` https://git.bcd1.com/repo-name "refs/heads/jhi2/*:refs/heads/jhi2/*" "refs/tags/jhi2/*:refs/tags/jhi2/*"


Updating repositories of synchronization-agent.

git fetch --prune https://git.bcd1.com/repo-name "+refs/heads/*:refs/heads/*" "+refs/tags/*:refs/tags/*"
git fetch --prune https://jhi2.com/git/repo-name "+refs/heads/*:refs/heads/*" "+refs/tags/*:refs/tags/*"


Resolving conflicts of conventionally named refs. Do non-fast-forward pushes.

git push https://jhi2.com/git/repo-name "+refs/heads/bcd1/*:refs/heads/bcd1/*" "+refs/tags/bcd1/*:refs/tags/bcd1/*"
git push https://git.bcd1.com/repo-name "+refs/heads/jhi2/*:refs/heads/jhi2/*" "+refs/tags/jhi2/*:refs/tags/jhi2/*"
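
The two kinds of push (fast-forward and forced) differ only in the leading plus sign of the refspecs, so both can share one helper. This is a sketch under assumptions: push_side is a hypothetical name, and the path of the agent's bare repository is passed in rather than taken from the original implementation.

```shell
# Push the refs under prefix $3 from the agent's bare repository $1 to remote $2.
# Pass "+" as $4 to force the push; omit it for a fast-forward-only push.
push_side() {
  force=${4:-}
  git -C "$1" push --quiet "$2" \
    "${force}refs/heads/$3/*:refs/heads/$3/*" \
    "${force}refs/tags/$3/*:refs/tags/$3/*"
}

# Usage (the agent repository path is hypothetical):
#   push_side "$HOME/sync-agent/bcd1.git" https://jhi2.com/git/repo-name bcd1      # fast-forward
#   push_side "$HOME/sync-agent/bcd1.git" https://jhi2.com/git/repo-name bcd1 +    # owner wins
```

The fast-forward call is rejected when the histories have diverged; the forced call then overwrites the remote ref, which is what resolving the conflict in favor of the owner means.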

All-in-one working implementation

You can test an all-in-one implementation that I published on GitHub.
It creates local and remote repositories in your folder, so you can easily emulate and play with Convention over Git on your machine.