Saturday, September 30, 2017

Convention over Git

Automated synchronization of Git remote repositories.

Announcement: An update with some bug fixes is coming. Later I will also publish an additional project focused on performance.

Convention over Git is a straightforward approach to automatic synchronization between Git remote repositories.
Completely separate remote repositories begin to behave as a single remote repository.

Proven and tested solutions were created based on this approach.

This approach uses well-known Git tools, because Git has an innate ability to do this. All we need to add is a convention.

This article explains the main idea and the implementation of Convention over Git. If you would rather not read further, you can follow the links below:

(draft, under active editing)

The situation

Say we have teams in separate companies. Each team owns its own repository, hosted on its own separate Git server. This often happens in vendor-client scenarios.

A warning in advance:

  • The solution is applied per repository (not per server).
  • It is simplified here to two remote Git repositories for clarity, but it can work with many.

Explanatory Glossary

  • fast-forward - a safe, non-overwriting git fetch or push: the ref only moves forward to a descendant commit.
  • non-fast-forward - an unsafe, forced, overwriting git fetch or push.
  • side - one remote Git repository plus the many local repositories directly related to it. They are located on one side, i.e. in one company.
  • owner - the owner of a side: the team that owns and directly uses its remote repository.
  • refs - Git references, mostly branches and tags.
  • conventional refs - refs strictly separated by a convention and divided among the sides. Each side owns and manages its refs but can affect the refs of others.
  • prefixed refs - a naming convention: the implementation of conventional refs by prefixes with a trailing slash.
    For example: "company1/some_feature", "company2/develop",
    where "company1/" and "company2/" are the conventional prefixes.
  • refspec - specifies the source and destination refs; a plus sign (+) at the beginning permits a non-fast-forward fetch or push.
  • synchronization interval - the interval between two synchronizations of the remote Git repositories, usually one to three minutes.
  • reflog - Git's record of ref updates. It keeps overwritten commits recoverable until its entries expire and Git's garbage collection removes them.
  • agent, sync-agent - the synchronization agent: whatever implementation does all the work.
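
To make the refspec and prefixed-refs entries concrete, here is a minimal Python sketch. The helper names (Refspec, parse_refspec, prefixed_branch_refspec) are made up for illustration, and the parser only handles the explicit "[+]src:dst" form, which is an assumption of this sketch rather than the full Git syntax.

```python
from typing import NamedTuple


class Refspec(NamedTuple):
    force: bool  # True when the refspec starts with "+" (non-fast-forward allowed)
    src: str     # source ref or pattern
    dst: str     # destination ref or pattern


def parse_refspec(text: str) -> Refspec:
    """Split '[+]<src>:<dst>' into its three parts."""
    force = text.startswith("+")
    body = text[1:] if force else text
    src, sep, dst = body.partition(":")
    if not sep:
        # Simplification: only the explicit src:dst form is supported here.
        raise ValueError(f"expected '<src>:<dst>', got {text!r}")
    return Refspec(force, src, dst)


def prefixed_branch_refspec(prefix: str, force: bool = False) -> str:
    """Refspec covering every branch under one conventional prefix."""
    pattern = f"refs/heads/{prefix}/*"
    return f"{'+' if force else ''}{pattern}:{pattern}"
```

For the prefix "company1", prefixed_branch_refspec("company1") yields a fast-forward-only refspec, while passing force=True adds the leading plus sign.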

Real Crucial Breaking Troubles

Any such system has the following stumbling blocks (sorted by difficulty):
  • Automated deletion of old branches and tags, i.e. the conflict where deleted branches are constantly recreated by the synchronization.
  • Automatic conflict resolution (conflicts on non-fast-forward branches).
  • Accidental deletion of an entire repository.
  • Failover and automatic recovery of the synchronization, especially after network troubles.

The solution

The solution uses
  • Naming convention
  • Automated deletion of conventional refs
  • Automated fast-forward synchronization
  • Automated non-fast-forward conflict resolving
  • Synchronization agent
  • Bare repositories

Naming convention
It is any prefixed name. Each side owns its prefix, and non-fast-forward conflicts on prefixed Git refs are resolved in favor of the owner.
Examples:
company1/develop, company2/develop
company1/JIRA-123, company2/JIRA-321
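
As a rough illustration of how an agent could read the owner from a name, here is a small Python sketch. The PREFIX_OWNERS mapping, the side names "side1" and "side2", and the owner_of function are hypothetical; the article only fixes the "prefix/" convention itself.

```python
from typing import Dict, Optional

# Hypothetical mapping: conventional prefix -> side that owns it.
PREFIX_OWNERS: Dict[str, str] = {"company1": "side1", "company2": "side2"}


def owner_of(ref_name: str, owners: Dict[str, str] = PREFIX_OWNERS) -> Optional[str]:
    """Return the side owning a prefixed ref, or None if it is not conventional."""
    prefix, slash, rest = ref_name.partition("/")
    if not slash or not rest:
        # No "prefix/" part, so the convention does not apply.
        return None
    # Unknown prefixes are treated as non-conventional as well.
    return owners.get(prefix)
```

With this mapping, "company1/develop" belongs to side1 and "company2/JIRA-321" to side2, while a name such as "master" has no owner and would be left out of the convention.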

Automated deletion of conventional refs
It deletes refs everywhere. The deletion must be initiated from the side of the ref's owner, i.e. the ref "team_1/jira-123" can only be deleted from the "team_1" side.
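
One way to picture this rule is the Python sketch below. It assumes (this is my simplification, not something stated above) that a branch carrying the owner's prefix which is missing on the owner's remote but still present on another remote was deleted by the owner. The function names are hypothetical. The ":<dst>" push refspec with an empty source is standard Git syntax for deleting the destination ref.

```python
from typing import Iterable, List, Set


def plan_deletions(owner_prefix: str, owner_refs: Set[str], other_refs: Iterable[str]) -> List[str]:
    """Branch names to delete on a non-owner remote.

    Only refs under the owner's prefix are considered, so one side can
    never trigger the deletion of another side's refs.
    """
    marker = owner_prefix + "/"
    return sorted(
        name for name in other_refs
        if name.startswith(marker) and name not in owner_refs
    )


def deletion_refspecs(branch_names: Iterable[str]) -> List[str]:
    """Push refspecs with an empty source, which delete the destination branch."""
    return [f":refs/heads/{name}" for name in branch_names]
```

For example, if team_1's remote no longer has "team_1/jira-123" but the other remote still does, the plan contains exactly that branch, and any "team_2/..." branches are ignored.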

Automated fast-forward synchronization
It is the fast-forward Git synchronization of all branches and tags at once, from everywhere to everywhere.
I have restricted it here to conventional refs only, and you had better do the same (it is a long story).
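
A possible shape of this step, sketched in Python as a list of Git commands. It assumes one local repository that has every remote configured under a name such as "side1" or "side2", and it covers branches only; these names and the layout are assumptions for illustration. The push refspecs carry no leading plus sign, so Git rejects any branch update that is not a fast-forward and that ref is left for the conflict-resolving step.

```python
from itertools import permutations
from typing import List, Sequence


def fast_forward_sync_commands(remotes: Sequence[str], prefixes: Sequence[str]) -> List[List[str]]:
    """Build git argv lists for one fast-forward-only synchronization round."""
    commands: List[List[str]] = []
    for remote in remotes:
        # Refresh the agent's private tracking refs; "+" here only
        # overwrites the agent's own copy, never a team's remote.
        commands.append(
            ["git", "fetch", remote, f"+refs/heads/*:refs/remotes/{remote}/*"]
        )
    for source, target in permutations(remotes, 2):
        # No "+" prefix: only fast-forward updates are accepted by the target.
        refspecs = [
            f"refs/remotes/{source}/{prefix}/*:refs/heads/{prefix}/*"
            for prefix in prefixes
        ]
        commands.append(["git", "push", target, *refspecs])
    return commands
```

Each inner list can be handed to subprocess.run with the agent's repository as the working directory. A rejected push for a branch that is behind or has diverged is expected in this sketch and should not stop the round.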

Automated non-fast-forward conflict resolving
Non-fast-forward conflicts are resolved by the naming convention: the owner of the ref's prefix wins.
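
A minimal Python sketch of that rule, assuming the agent already knows which commit each side has for the diverged branch. The function names, the side names and the placeholder commit ids are hypothetical. The forced push uses the "+<commit>:<dst>" refspec form and assumes the agent's repository has already fetched the winning commit.

```python
from typing import Dict, List, Tuple


def resolve_conflict(
    ref_name: str,
    sha_by_side: Dict[str, str],
    prefix_owner: Dict[str, str],
) -> Tuple[str, str, List[str]]:
    """Return (owner side, winning commit, sides that must be overwritten)."""
    prefix = ref_name.split("/", 1)[0]
    # A KeyError here means the ref is not conventional and has no owner.
    owner = prefix_owner[prefix]
    winner = sha_by_side[owner]
    losers = sorted(
        side for side, sha in sha_by_side.items()
        if side != owner and sha != winner
    )
    return owner, winner, losers


def forced_push_command(target: str, sha: str, ref_name: str) -> List[str]:
    """Argv for overwriting the branch on a losing side with the owner's commit."""
    # The leading "+" makes this a non-fast-forward (overwriting) push.
    return ["git", "push", target, f"+{sha}:refs/heads/{ref_name}"]
```

If "company2/develop" has diverged, side2 owns the prefix, so side2's commit wins and side1 receives the forced push.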

Synchronization agent
It is some infrastructure that invokes all the logic repeatedly, once per synchronization interval. You need repeatable invocations, so use any job scheduler or an automation server such as Jenkins.
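
If no scheduler is at hand, the same repeated invocation can be sketched as a plain loop. This is only an illustration in Python: run_agent, sync_once and max_rounds are hypothetical names, and a failed round is simply logged and retried on the next interval, which is one simple way to get recovery after network troubles.

```python
import time
from typing import Callable, Optional, Tuple


def run_agent(
    sync_once: Callable[[], None],
    interval_seconds: float,
    max_rounds: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, int]:
    """Call sync_once repeatedly; return (rounds run, rounds that failed)."""
    rounds = 0
    failures = 0
    while max_rounds is None or rounds < max_rounds:
        try:
            sync_once()
        except Exception as error:
            # Do not stop the agent: the next round retries from scratch.
            failures += 1
            print(f"sync failed, retrying next round: {error}")
        rounds += 1
        sleep(interval_seconds)
    return rounds, failures
```

With max_rounds left as None the loop runs until the process is stopped. An interval of 60 to 180 seconds matches the one to three minutes mentioned in the glossary.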

Bare repositories
Bare repositories allow the agent to do things more efficiently, since a bare repository has no working tree to check out.
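
As a sketch of how the agent's own repository might be prepared, the Python helper below builds the commands for a bare repository with one configured remote per side. The function name, the directory and the URLs are placeholders. "git init --bare" and "git -C <dir> remote add" are standard Git commands.

```python
from typing import Dict, List


def bare_agent_setup_commands(repo_dir: str, remote_urls: Dict[str, str]) -> List[List[str]]:
    """Argv lists that create a bare repository and register each side's remote."""
    # A bare repository stores only Git data, with no checked-out files.
    commands = [["git", "init", "--bare", repo_dir]]
    for name, url in remote_urls.items():
        # -C runs the command as if started inside repo_dir.
        commands.append(["git", "-C", repo_dir, "remote", "add", name, url])
    return commands
```

Each command can be executed once with subprocess.run(command, check=True) before the first synchronization round.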

The targets

  • Create a single force, a single team.
  • Eliminate severe and constant errors as a result of disunity.
  • Eliminate significant time wasting.
  • Continuous integration. Commits are verified immediately by an automated build, so problems are detected early.
  • Continuous deployment. Merges to special branches are deployed automatically.
  • Continuous delivery. Without a single repository there is no heartbeat.

Conclusion

Given the number of details, I even call this approach a Conventional Distributed Version Control System over Git.
