We run a deployment pipeline where we build a versioned binary, tag the commit it was built from with the same version as the binary, and then can deploy the binary into arbitrary environments (typically qa, then live).
I'd like to keep a record of which commit is currently deployed in an environment as a git tag or branch on the canonical remote repository.
I'm imagining a branch or a tag called `live` (or `prod` or whatever) and a deployment process that on successful deployment moves that tag / hard resets that branch to the commit that is deployed, so in principle you can just do a `git pull && git checkout live` and (race conditions aside) you're looking at the code that is currently deployed in live.
However, neither tags nor branches quite measure up...
Branches might feel more correct in that they are pointers that are meant to move between commits. However, in a world where people can roll back, a branch with its assumption of always moving forward doesn't quite fit. Resetting the remote `live` branch to an earlier commit will mean that a dev checking out `live` may be ahead of the remote branch and will need to `git reset --hard origin/live` to get back to what is actually in live. I can also imagine a situation where pulling might present you with a nasty merge conflict. We'd also need to protect the branch as it would not be intended for developers to commit and push to it.
On the other hand, tags aren't really designed to move; if you move a tag on the remote repo, a `git pull --tags` will fail like so:
! [rejected] live -> live (would clobber existing tag)
unless you do a `git pull -f --tags`.
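For reference, a minimal sketch of what the moving-tag variant could look like on both sides. It assumes the remote is called `origin` and the tag `live`; the function names `move_live_tag` and `refresh_live_tag` are made up for illustration and are not part of any existing tooling.

```shell
# Deployment side: point the `live` tag at the given commit locally, then
# overwrite the tag on the remote. Both steps need --force because the tag
# already exists after the first deployment.
move_live_tag() {
    git tag --force live "$1"
    git push --quiet --force origin refs/tags/live
}

# Developer side: fetch only the `live` tag and let it replace the local copy
# instead of being rejected with "would clobber existing tag".
refresh_live_tag() {
    git fetch --quiet --force origin refs/tags/live:refs/tags/live
}
```

The deployment job would call something like `move_live_tag "$DEPLOYED_COMMIT"` (a hypothetical variable) after a successful deploy, and a developer would run `refresh_live_tag` before `git checkout live`.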
Still feels like tags are slightly the better option, as they convey the idea that it's just a marker, not a work in progress.
Does anyone have a view either way? Or is there a Third Way of some kind? Or is trying to do this just a bad idea? We bake the git hash into the binary and allow easy reading of it, so it's not a huge hardship to go to the environment and find out which commit is deployed there. It would just be convenient to be able to see it in the git log.
This is not a problem that version control was meant to solve. Recording which commit is deployed to which environment is an artifact of the build process. If using an automated tool for build and deployment, check the features it supports. Most tools support a naming convention for builds. Many of them integrate with Git and other version control systems to associate a build or release with a commit identifier. Typically a new tag is created for each deployment, something like `1.0.3` if using semantic versioning.
Part of the build process could record the commit Id as part of some config setting or version number that becomes visible through the application. For a web application, this could be an HTML comment or some static text pulled from a config file. Web APIs could have an endpoint that returns the current version information, including a git commit Id or tag name.
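As a rough sketch of that build step: the function below writes the version string and the full commit Id into a small file that the application could ship and expose. The function name `write_version_file` and the `key=value` layout are assumptions for illustration only; your build tool may already provide an equivalent.

```shell
# write_version_file OUT_FILE
# Record the human-readable version and the exact commit of the checkout
# being built. Falls back to "unknown" when building outside a git checkout.
write_version_file() {
    version=$(git describe --tags --always --dirty 2>/dev/null || echo "unknown")
    commit=$(git rev-parse --verify HEAD 2>/dev/null || echo "unknown")
    printf 'version=%s\ncommit=%s\n' "$version" "$commit" > "$1"
}
```

A build script might call `write_version_file build/version.properties` (a hypothetical path) and have the application print that file from a version endpoint or an HTML comment.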
Version control is meant to record changes to source code. The commit Id deployed to a particular environment is transient metadata about the larger system, and that is out of scope for version control.
Note: you don't need to tie yourself to only deploying on a tag. A command I use on one project of mine to create the version string is `git describe --tags --always --dirty=-dirty-"$(date +%s)"`. When not on a tag, but there is a tagged ancestor, it gives output like v0.2.1-5-g1606aab (last tag, number of commits since that tag, and hash prefix), and if there is no tag in history, it gives just the hash prefix. It also adds a suffix if the working directory isn't clean. Actually, I follow that with ` || echo "<unknown version>"`, so that it works even if building without git. – Frax, May 3, 2022 at 20:09
Fully agree with this answer. The crux of the matter is that whatever the git repo says is "live" can't 100% be trusted to actually match what's in production, and source control shouldn't be depended on to know this. Basically you can't do any better than what tags already do, unless you track release metadata in a different repo, as someone else suggested. – Christopher Hunter, May 5, 2022 at 3:53
Don't record this kind of metainformation in the same git repository as your project. Record it in a separate git repository whose history is the history of deployed revisions. In its simplest form, the only thing in this repository needs to be a single file containing the commit hash from the main repository, but you could also keep scripts related to building your project and updating the metarepository in this metarepository.
Reverting then is not a matter of reverting anything in the metarepository, but rather making a new commit to the metarepository changing the hash back to an older hash from the project repository.
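A minimal sketch of the simplest form described above, assuming the metarepository keeps one file per environment whose only content is the deployed hash. The function name `record_deployment` and the one-file-per-environment layout are illustrative assumptions, not an established convention.

```shell
# record_deployment META_REPO_DIR ENVIRONMENT HASH
# Write the deployed hash into the environment's file and commit it.
# A rollback is just another call with an older hash, so the metarepository
# history only ever moves forward.
record_deployment() {
    meta=$1
    environment=$2
    hash=$3
    printf '%s\n' "$hash" > "$meta/$environment"
    git -C "$meta" add -- "$environment"
    # --allow-empty so that redeploying the same hash is still recorded.
    git -C "$meta" commit --quiet --allow-empty -m "Deploy $hash to $environment"
}
```

With this layout, `cat live` in the metarepository answers "what is deployed now", and `git log -p -- live` shows every deployment and rollback in order.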
These same principles apply to continuous integration (CI). Don't store the CI configuration in your project repository, because it should be versioned independently of your project and should allow building different versions of the project using the same up-to-date (e.g. with regard to which CI service you use, etc.) CI configuration.
This. I was going to suggest a repo with submodules (e.g. `git submodule`). – Klik, May 6, 2022 at 2:21
@Klik: Yep. That's actually how `git submodule` works under the hood, and submodule is a completely reasonable way to do this. Depending on your needs though, doing it "manually" may offer a little more flexibility. – R.. GitHub STOP HELPING ICE, May 6, 2022 at 2:35
Given that branches are more easily movable than tags, I think this is best addressed with branches. As you say, they need to be arranged so that deployment can force-push whatever it thinks the current branch is, but nobody else is allowed to.
The pull issue can maybe be addressed with a script/alias:
git fetch origin
git update-ref refs/heads/live refs/remotes/origin/live
git switch live
Obviously this obliterates any local changes to "live"! But that's intrinsic to your requirement that "live" track what's actually deployed.
(idea from https://stackoverflow.com/questions/1591107/reset-other-branch-to-current-without-a-checkout )
You can also delete the `live` branch and make a new one based on current actual live, or you can call the branch something other than `live` so it doesn't confuse you (e.g. `hotfix-2022-02-22`). – Stack Exchange Broke The Law, May 3, 2022 at 11:21
`git checkout -b hotfix-2022-02-22 origin/live` will take `origin/live`, create a new branch called `hotfix-2022-02-22` at that point, and switch to that branch. – Stack Exchange Broke The Law, May 4, 2022 at 12:15
Use another repository, and use submodules
Git's submodule system allows a "master" repository to pull in the contents of another repository at a specific version. This sounds ideal for your purposes.
Set up a second repository, which will be your "deployment" repository. From this, include the "source" repository as a submodule. Each deployment location (QA, beta-testing, production, whatever) has its own branch on the "deployment" repository, and each references a different version of the "source" repository. Whenever some bundle of work passes the appropriate process step, you update the submodule version for that deployment location's branch.
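The update step could be sketched roughly as follows. This assumes the "deployment" repository already contains the submodule and one branch per deployment location; the function name `pin_deployment` and its argument order are hypothetical.

```shell
# pin_deployment DEPLOY_REPO_DIR BRANCH SUBMODULE_PATH COMMIT
# On the deployment location's branch, point the submodule at COMMIT and
# commit the new pin in the "deployment" repository.
pin_deployment() {
    deploy_repo=$1
    branch=$2
    sub=$3
    commit=$4
    git -C "$deploy_repo" checkout --quiet "$branch"
    # Make sure the submodule clone knows about the commit, then check it out.
    git -C "$deploy_repo/$sub" fetch --quiet origin
    git -C "$deploy_repo/$sub" checkout --quiet --detach "$commit"
    # Staging the submodule path records the commit it currently points at.
    git -C "$deploy_repo" add -- "$sub"
    git -C "$deploy_repo" commit --quiet -m "Deploy $commit to $branch"
}
```

Note that the final commit fails with "nothing to commit" if the branch already pins that commit, which is arguably the right behaviour for a no-op deployment.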
Now not only can anyone get the current latest source corresponding to a deployment location, but they can also track how and when that changed. Independently, the "source" can move on in whatever direction it likes, secure in the knowledge that "deployment" is referring to a specific version and can't be affected.
Git submodules certainly have their limitations, but I don't think those limitations are going to affect you here.
Use a branch that you never roll back.
See how the linux-rolling-stable branch is handled? This is a branch in the "stable" Linux repo that tracks the current "rolling stable" kernel release:
$ cd ~/src/linux
$ git -P log --oneline --first-parent -5 linux-rolling-stable
2d01c3611156 (linux-rolling-stable) Merge v5.17.5
d8b78dc2f582 Merge v5.17.4
aeeb1f66846d Merge v5.17.3
f91ccd0d5059 Merge v5.16.20
45e4558f7300 Merge v5.16.19
Each commit along this lineage is a merge of two parents:
- the first parent is the previous commit of the same branch (the previous "rolling stable" release)
- the second parent is the release that "rolling stable" is tracking.
The tree in the commit (i.e. the recorded project snapshot) is the same as the tree of the second parent. In other words, the merge was done using the hypothetical "theirs" strategy: the contents of the first parent are completely ignored.
You can use this technique to track your deployments. Since the branch is never rolled back, you will never have to `--force` anything. Added benefit: not only do you know which commit is currently deployed, you also have a full history of those deployments (after all, that's what git is meant for).
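Since git has no built-in "theirs" merge strategy, one way to create such a commit is with the plumbing command `git commit-tree`. The sketch below is only one possible approach, not necessarily how linux-rolling-stable itself is maintained; the function name `deploy_merge` and the default branch name `live` are assumptions.

```shell
# deploy_merge COMMIT [BRANCH]
# Append a commit to BRANCH (default: live) whose tree is exactly COMMIT's
# tree, with the previous branch tip as first parent and COMMIT as second.
deploy_merge() {
    commit=$(git rev-parse --verify "$1^{commit}") || return 1
    branch=${2:-live}
    tree=$(git rev-parse "$commit^{tree}")
    if prev=$(git rev-parse --verify --quiet "refs/heads/$branch"); then
        new=$(git commit-tree -p "$prev" -p "$commit" -m "Deploy commit $commit" "$tree")
    else
        # First deployment: there is no previous tip yet, so the deployed
        # commit is the only parent.
        new=$(git commit-tree -p "$commit" -m "Deploy commit $commit" "$tree")
    fi
    # Moves the branch ref only; run this where BRANCH is not checked out.
    git update-ref "refs/heads/$branch" "$new"
}
```

After `deploy_merge <commit>` the deployment job can push `live` with a plain, non-forced `git push`, because the branch only ever gains commits.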
Note that this doesn't prevent you from rolling back your deployments. You could well have a history like this:
$ git -P log --oneline --first-parent live
11d41be Roll back to commit bar
7aaf1a5 Deploy commit baz
c784aa6 Deploy commit bar
700d035 Deploy commit foo
matching a commit graph like this:
     700d035---c784aa6---7aaf1a5---11d41be (live)
    /         /         /         /
   /         /,------- / --------'
foo---X---bar---Y---baz (master)
Note: If you `git checkout live`, you get to see the code that is currently deployed. If you want to see the actual commit, you can `git show live^2`, where `^2` means "second parent". Same code, different commit. You can also check out `live^2`, although that puts you in the "detached HEAD" state.
Use branches, but be careful with them.
Your problem seems to be that the developer confuses the branch on their machine called `live` with the branch on the server called `live` (which is called e.g. `origin/live` on the developer's machine).
If you do `git checkout live` and there is not already a branch called `live`, git will create one based on the current value of `origin/live`. If `origin/live` changes, this has nothing to do with the developer's `live` branch, which could be confusing.
However, the developer doesn't have to create a branch called `live` on their machine. They could just as well call it `hotfix-12345`, for example, using the command `git checkout -b hotfix-12345 origin/live`, which creates a new branch and switches to it. If `origin/live` changes tomorrow, you don't get confused and expect `hotfix-12345` to update.
Another option is to do `git checkout origin/live`. Since `origin/live` refers to a branch on a different machine, git on the developer's machine won't update it. It will warn them that they are in "detached HEAD state", meaning they are not currently working on any branch, and no branch will get updated when they commit (slightly dangerous because it's easy to lose commits if they don't have a branch name). They can then create a branch if they want one. Again, it is clear that the developer's branch is not the same as the server's branch.
Note: remote references like `origin/live` are only updated when doing `git fetch` (or `git pull`). Git doesn't check the server every time you write `origin/live`.
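Both options could be wrapped in small helpers, sketched here under the assumption that the remote is called `origin`. The function names `view_live` and `branch_from_live` are made up for illustration, and `--no-track` is my own addition so that the new branch is not configured to follow `origin/live`.

```shell
# Look at what is deployed without creating a local `live` branch.
# Fetching first matters, because origin/live only changes on fetch.
view_live() {
    git fetch --quiet origin
    git checkout --quiet --detach origin/live
}

# branch_from_live BRANCH_NAME
# Start work from what is deployed, under a name that cannot be mistaken
# for the server's branch.
branch_from_live() {
    git fetch --quiet origin
    git checkout --quiet --no-track -b "$1" origin/live
}
```

For example, `branch_from_live hotfix-12345` gives the developer a normal local branch starting at the deployed commit, while `view_live` is enough for just reading the code.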
Frame Challenge: Keep the History instead
You should seek not to see which version is currently deployed in a given environment, but which version was deployed at a given time in a given environment -- including now.
History is Necessary
The reason for this larger goal is that when a user reports that last Friday around 3 PM they hit that issue for the first time, it is valuable information to know which version was deployed at that time, and which version was previously deployed that seemingly didn't have the issue.
This immediately invalidates the idea of keeping this information as a "tag" or "branch" because the history of those is not tracked.
Instead, I would suggest using either a database or a file in a repository to keep this information. It could be as basic as a tuple (timestamp, environment, version, hash-tag).
I do note that in some cases deploying a version can take time: in a previous company, deployments would happen over 24h, as each server was progressively taken down, upgraded, and brought up again. In this case, you may want finer grained information than just "environment": the good news is that both a database and a file are flexible enough to encode that.
I would personally recommend a database -- even as committed SQL scripts -- due to the ease of querying them: finding which versions were active last week, which was the previous version, which environment ever had a given version (or range of versions) deployed, etc... are all fairly trivial over SQL.
History is Live
And if you have the history of deployments, then you do NOT want to duplicate the "current" version information anywhere else, because then you'll end up with discrepancies, and in that case which do you trust?
Instead, you want to compute the "live" information based on the history. If you want to display that as a pseudo-tag, it may be possible in git. Otherwise, a simple dashboard would do -- even a static one, recalculated whenever an upgrade/downgrade occurs.
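To make this concrete, here is a small sketch of the file-based variant (not the database I would recommend, just the simplest thing that shows the idea). It assumes a tab-separated log with the columns timestamp, environment, version, hash, and timestamps written in UTC ISO 8601 so that plain string comparison matches chronological order. The file name and all function names are hypothetical.

```shell
# Path of the append-only deployment log (assumed name).
: "${DEPLOY_LOG:=deployments.tsv}"

# log_deployment TIMESTAMP ENVIRONMENT VERSION HASH
# Upgrades and rollbacks are both recorded as new lines; nothing is edited.
log_deployment() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$DEPLOY_LOG"
}

# deployed_at ENVIRONMENT TIMESTAMP
# Print "version<TAB>hash" for the most recent deployment to ENVIRONMENT
# at or before TIMESTAMP; prints nothing if there was none.
deployed_at() {
    awk -F '\t' -v environment="$1" -v at="$2" '
        $2 == environment && $1 <= at && $1 >= best { best = $1; out = $3 "\t" $4 }
        END { if (out != "") print out }
    ' "$DEPLOY_LOG"
}

# current_deployment ENVIRONMENT
# "Live" is computed from the history rather than stored a second time.
current_deployment() {
    deployed_at "$1" "9999-12-31T23:59:59Z"
}
```

A question like "what was in live last Friday around 3 PM" then becomes `deployed_at live 2022-05-06T15:00:00Z`, and the dashboard or pseudo-tag only needs `current_deployment live`.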
`git checkout live` checks out the developer's branch `live`, which is not necessarily the same as the server's branch. By default, if it does not exist, it will create one based on the server's `live` branch. But I think if you use e.g. `git checkout origin/live`, then it will always check out what's on the server, and warn the developer that the branch won't be updated if they commit because it is not their machine's branch.