What is the best way to record in git which commit is currently deployed to which environment?

Question 1

We run a deployment pipeline where we build a versioned binary, tag the commit it was built from with the same version as the binary, and then can deploy the binary into arbitrary environments (typically qa, then live).

I'd like to keep a record of which commit is currently deployed in an environment as a git tag or branch on the canonical remote repository.

I'm imagining a branch or a tag called live (or prod or whatever) and a deployment process that on successful deployment moves that tag / hard resets that branch to the commit that is deployed, so in principle you can just do a git pull && git checkout live and (race conditions aside) you're looking at the code that is currently deployed in live.

However, neither tags nor branches quite measure up...

Branches might feel more correct in that they are pointers that are meant to move between commits. However, in a world where people can roll back, a branch with its assumption of always moving forward doesn't quite fit. Resetting the remote live branch to an earlier commit will mean that a dev checking out live may be ahead of the remote branch and will need to git reset --hard origin/live to get back to what is actually in live. I can also imagine a situation where pulling might present you with a nasty merge conflict. We'd also need to protect the branch as it would not be intended for developers to commit and push to it.

On the other hand, tags aren't really designed to move; if you move a tag on the remote repo, a git pull --tags will fail like so:

! [rejected] live -> live (would clobber existing tag)

unless you do a git pull -f --tags.

Still feels like tags are slightly the better option, as they convey the idea that it's just a marker, not a work in progress.

Does anyone have a view either way? Or is there a Third Way of some kind? Or is trying to do this just a bad idea? We bake the git hash into the binary and allow easy reading of it, so it's not a huge hardship to go to the environment and find out which commit is deployed there. It would just be convenient to be able to see it in the git log.

Question 2

I don't understand the problem you have with branches. If the live environment rolls back, why shouldn't the live branch also roll back?

Question 3

Note that git checkout live checks out the developer's local branch live, which is not necessarily the same as the server's branch. If the local branch does not exist, git creates one based on the server's live branch by default. If you use git checkout origin/live instead, you always get what was last fetched from the server, and git warns the developer that no branch will be updated if they commit, because origin/live is not a branch on their machine.

Question 4

I don't understand your problem... Say you deploy your code as Docker containers. You would build a Docker image whose name ends with the commit hash. Then you can just look at the name of the running live image and check out that commit hash. This works even if an update breaks partway and some containers run one version while others run a different one: you can easily distinguish the two and check out both of them to troubleshoot deployment problems. What you suggest only works if deployment always works.

Question 5

Or more generally, it's far easier to record in production which git commit the deployment is from than the other way around.

Question 6

Your focus on solving this problem with git is pushing you towards sub-optimal solutions. Try to rephrase your question to omit the specific technology, e.g. "What is the best way to record which commit is currently deployed to which environment?" Such a question could help you find better solutions :)

score 77 · Answer 1 · 2022-05-03 11:50:48Z

This is not a problem that version control was meant to solve. Recording which commit is deployed to which environment is an artifact of the build process. If using an automated tool for build and deployment, check the features it supports. Most tools support a naming convention for builds. Many of them integrate with Git and other version control systems to associate a build or release with a commit identifier. Typically a new tag is created for each deployment, something like 1.0.3 if using semantic versioning.

Part of the build process could record the commit ID as part of some config setting or version number that becomes visible through the application. For a web application, this could be an HTML comment or some static text pulled from a config file. Web APIs could have an endpoint that returns the current version information, including a git commit ID or tag name.

Version control is meant to record changes to source code. The commit ID deployed to a particular environment is transient metadata about the larger system, and is out of scope for version control.
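A minimal sketch of the build-time stamping described in this answer, assuming a POSIX shell and a build that runs inside a git checkout. The throwaway repository, the tag 1.0.3 and the file name build-info.txt are illustrative assumptions; a real pipeline would write whatever config file or endpoint payload its application actually reads.

```shell
#!/bin/sh
# Sketch only: stamp a build with the commit it was built from.
set -eu

# Throwaway repository so the example runs anywhere (assumption);
# a real build would run in the project checkout instead.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
echo 'hello' > app.txt
git add app.txt
git commit -q -m "Initial commit"
git tag 1.0.3

# Full hash of the commit being built.
commit=$(git rev-parse HEAD)
# Human-readable version: the tag if HEAD is tagged, otherwise tag-N-gHASH,
# otherwise the bare hash; falls back to "unknown" when git cannot answer.
version=$(git describe --tags --always --dirty 2>/dev/null || echo "unknown")

# build-info.txt is a hypothetical file that the application would expose,
# for example through a version endpoint or an HTML comment.
printf 'version=%s\ncommit=%s\n' "$version" "$commit" > build-info.txt
cat build-info.txt
```

Going to the environment and reading this file (or the endpoint backed by it) then answers "which commit is deployed here" without any moving tag or branch.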

Note: you don't need to tie yourself to only deploying on a tag. A command I use on one of my projects to create the version string is git describe --tags --always --dirty=-dirty-"$(date +%s)". When not on a tag but with a tagged ancestor, it gives output like v0.2.1-5-g1606aab (last tag + number of commits since that tag + hash prefix); if there is no tag in the history, it gives just the hash prefix. It also adds a suffix if the working directory isn't clean. Actually, I follow that with ` || echo "<unknown version>"`, so that it works even when building without git.

Fully agree with this answer. The crux of the matter is that whatever the git repo says is "live" can't 100% be trusted to actually match what's in production, and source control shouldn't be depended on to know this. Basically you can't do any better than what tags already do, unless you track release metadata in a different repo, as someone else suggested.

score 12 · Answer 2 · 2022-05-04 14:06:39Z

Don't record this kind of metainformation in the same git repository as your project. Record it in a separate git repository whose history is the history of deployed revisions. In its simplest form, the only thing in this repository needs to be a single file containing the commit hash from the main repository, but you could also keep scripts related to building your project and updating the metarepository in this metarepository.

Reverting then is not a matter of reverting anything in the metarepository, but rather making a new commit to the metarepository changing the hash back to an older hash from the project repository.
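A rough sketch of such a metarepository, assuming one small file per environment (the answer itself only asks for a single file). The repository name deployments, the helper record and the two hashes are hypothetical placeholders, not real commits.

```shell
#!/bin/sh
# Sketch only: a separate repository whose history is the deployment history.
set -eu

work=$(mktemp -d)
# Placeholder hashes standing in for commits of the project repository (assumption).
old_hash=1111111111111111111111111111111111111111
new_hash=2222222222222222222222222222222222222222

git init -q "$work/deployments"
cd "$work/deployments"
git config user.name "Deploy Bot"
git config user.email "deploy@example.invalid"

# record <environment> <hash> <message>: overwrite the environment's file and commit it.
record() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

record live "$old_hash" "live: deploy $old_hash"
record live "$new_hash" "live: deploy $new_hash"
# A rollback is just another commit that writes the older hash again.
record live "$old_hash" "live: roll back to $old_hash"

cat live                                   # what is deployed now
git --no-pager log --format='%s' -- live   # full deployment history, newest first
```

Because the history only ever moves forward, nobody has to force-push or hard-reset to follow it.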

These same principles apply to continuous integration (CI). Don't store the CI configuration in your project repository: it should be versioned independently of your project, so that different versions of the project can be built with the same up-to-date CI configuration (up to date with regard to which CI service you use, for example).

This. I was going to suggest a repo with submodules (e.g. git submodule).
@Klik: Yep. That's actually how git submodule works under the hood, and submodule is a completely reasonable way to do this. Depending on your needs though, doing it "manually" may offer a little more flexibility.

pjc50 · Answer 3 · 2022-05-03 11:06:41Z

Given that branches are more easily movable than tags, I think this is best addressed with branches. As you say, they need to be arranged so that deployment can force-push whatever it thinks the current branch is, but nobody else is allowed to.

The pull issue can maybe be addressed with a script/alias:

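git fetch origin                                         # refresh origin/live from the server
git update-ref refs/heads/live refs/remotes/origin/live  # repoint local live; run this while on another branch
git switch live                                          # check out the repointed branch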

Obviously this obliterates any local changes to "live"! But that's intrinsic to your requirement that "live" track what's actually deployed.

(idea from https://stackoverflow.com/questions/1591107/reset-other-branch-to-current-without-a-checkout )

You can also delete the live branch and make a new one based on the current actual live, or you can call the branch something other than live so it doesn't confuse you (e.g. hotfix-2022-02-22).

git checkout -b hotfix-2022-02-22 origin/live takes origin/live, creates a new branch called hotfix-2022-02-22 at that point, and switches to that branch.

Graham · Answer 4 · 2022-05-04 14:15:09Z

Use another repository, and use submodules

Git's submodule system allows a "master" repository to pull in the contents of another repository at a specific version. This sounds ideal for your purposes.

Set up a second repository, which will be your "deployment" repository. From this, include the "source" repository as a submodule. Each deployment location (QA, beta-testing, production, whatever) has its own branch on the "deployment" repository, and each references a different version of the "source" repository. Whenever some bundle of work passes the appropriate process step, you update the submodule version for that deployment location's branch.

Now not only can anyone get the current latest source corresponding to a deployment location, but they can also track how and when that changed. Independently, the "source" can move on in whatever direction it likes, secure in the knowledge that "deployment" is referring to a specific version and can't be affected.

Git submodules certainly have their limitations, but I don't think those limitations are going to affect you here.
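A sketch of how the "deployment" repository could be wired up, assuming local throwaway repositories in place of real remotes. The helper pin, the branch names qa and live, and the two sample releases are illustrative assumptions. The -c protocol.file.allow=always setting is only there because recent git versions refuse to clone submodules from local paths by default; it should not be needed with a network remote.

```shell
#!/bin/sh
# Sketch only: one deployment repository, one branch per environment,
# each branch pinning the source repository at a specific commit.
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

# Stand-in for the real source repository, with two releases (assumption).
git init -q "$work/source"
cd "$work/source"
echo v1 > app.txt
git add app.txt
git commit -q -m "Release 1"
v1=$(git rev-parse HEAD)
echo v2 > app.txt
git commit -q -am "Release 2"
v2=$(git rev-parse HEAD)

# The deployment repository includes the source as a submodule, pinned at release 1.
git init -q "$work/deployment"
cd "$work/deployment"
git -c protocol.file.allow=always submodule --quiet add "$work/source" source
git -C source checkout -q "$v1"
git add source
git commit -q -m "Track source as a submodule, pinned at release 1"
git branch qa
git branch live

# pin <environment-branch> <source-commit>: record a deployment on that branch.
pin() {
    git checkout -q "$1"
    git -C source checkout -q "$2"
    git add source
    git commit -q -m "$1: deploy $2"
}

pin qa "$v2"    # release 2 reaches qa; live still points at release 1

# Read what each environment pins without checking anything out.
git rev-parse qa:source live:source
```

git log on the qa or live branch of the deployment repository then shows how and when each environment changed.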

Edgar Bonet · Answer 5 · 2022-05-04 14:20:23Z

Use a branch that you never roll back.

See how the linux-rolling-stable branch is handled? This is a branch in the "stable" Linux repo that tracks the current "rolling stable" kernel release:

$ cd ~/src/linux
$ git -P log --oneline --first-parent -5 linux-rolling-stable 
2d01c3611156 (linux-rolling-stable) Merge v5.17.5
d8b78dc2f582 Merge v5.17.4
aeeb1f66846d Merge v5.17.3
f91ccd0d5059 Merge v5.16.20
45e4558f7300 Merge v5.16.19

Each commit along this lineage is a merge of two parents:

the first parent is the previous commit of the same branch (the previous "rolling stable" release);
the second parent is the release that "rolling stable" is tracking.

The tree in the commit (i.e. the recorded project snapshot) is the same as the tree of the second parent. In other words, the merge was done using a hypothetical "theirs" strategy (git ships an "ours" merge strategy but no "theirs" counterpart): the contents of the first parent are completely ignored.

You can use this technique to track your deployments. Since the branch is never rolled back, you will never have to --force anything. Added benefit: not only do you know which commit is currently deployed, you also have a full history of those deployments (after all, that's what git is meant for).

Note that this doesn't prevent you from rolling back your deployments. You could well have a history like this:

$ git -P log --oneline --first-parent live
11d41be Roll back to commit bar
7aaf1a5 Deploy commit baz
c784aa6 Deploy commit bar
700d035 Deploy commit foo

matching a commit graph like this:

  700d035---c784aa6---7aaf1a5---11d41be (live)
  /         /         /         /
 /         /,------- / --------'
foo---X---bar---Y---baz (master)

Note: If you git checkout live, you get to see the code that is currently deployed. If you want to see the actual commit, you can

git show live^2

where ^2 means "second parent". Same code, different commit. You can also checkout live^2, although that puts you in the "detached HEAD" state.

score 1 · Answer 6 · 2022-05-04 12:21:33Z

Use branches, but be careful with them.

Your problem seems to be that the developer confuses the branch on their machine called live, with the branch on the server called live (which is called e.g. origin/live on the developer's machine).

If you do git checkout live and there is not already a branch called live, git will create one based on the current value of origin/live. If origin/live changes, this has nothing to do with the developer's live branch which could be confusing.

However, the developer doesn't have to create a branch called live on their machine. They could just as well call it hotfix-12345 for example, using the command git checkout -b hotfix-12345 origin/live which creates a new branch and switches to it. If origin/live changes tomorrow, you don't get confused and expect hotfix-12345 to update.

Another option is to do git checkout origin/live. Since origin/live refers to a branch on a different machine, git on the developer's machine won't update it. It will warn them that they are in "detached HEAD state", meaning they are not currently working on any branch, and no branch will get updated when they commit (slightly dangerous because it's easy to lose commits if they don't have a branch name). They can then create a branch if they want one. Again, it is clear that the developer's branch is not the same as the server's branch.

Note: remote references like origin/live are only updated when doing git fetch (or git pull). Git doesn't check the server every time you write origin/live.
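The distinction between the server's branch, origin/live and a local branch can be tried out with a sketch like the following. It assumes local throwaway repositories standing in for the server, the deployment process and a developer machine; all paths and the two sample releases are illustrative.

```shell
#!/bin/sh
# Sketch only: origin/live changes on fetch, a local hotfix branch does not.
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

# "Server" repository plus a deployer that is allowed to move live (assumption).
git init -q --bare "$work/server.git"
git init -q "$work/deployer"
cd "$work/deployer"
git remote add origin "$work/server.git"
echo v1 > app.txt
git add app.txt
git commit -q -m "Release 1"
v1=$(git rev-parse HEAD)
echo v2 > app.txt
git commit -q -am "Release 2"
v2=$(git rev-parse HEAD)
git push -q origin "$v2:refs/heads/live"          # release 2 goes live

# Developer machine: fetch, then branch off origin/live under a different name.
git init -q "$work/dev"
cd "$work/dev"
git remote add origin "$work/server.git"
git fetch -q origin
git checkout -q -b hotfix-12345 origin/live

# The deployer rolls live back to release 1 on the server.
git -C "$work/deployer" push -q --force origin "$v1:refs/heads/live"

before=$(git rev-parse origin/live)   # stale: still release 2 until the next fetch
git fetch -q origin
after=$(git rev-parse origin/live)    # now release 1
hotfix=$(git rev-parse hotfix-12345)  # unaffected: still release 2

printf 'before fetch: %s\nafter fetch:  %s\nhotfix:       %s\n' "$before" "$after" "$hotfix"
```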

Matthieu M. · Answer 7 · 2022-05-05 12:35:28Z

Frame Challenge: Keep the History instead

You should seek not to see which version is currently deployed in a given environment, but which version was deployed at a given time in a given environment -- including now.

History is Necessary

The reason for this larger goal is that when a user reports that last Friday around 3 PM they hit that issue for the first time, it is valuable information to know which version was deployed at that time, and which version was previously deployed that seemingly didn't have the issue.

This immediately invalidates the idea of keeping this information as a "tag" or "branch" because the history of those is not tracked.

Instead, I would suggest using either a database or a file in a repository to keep this information. It could be as basic as a tuple (timestamp, environment, version, hash-tag).

I do note that in some cases deploying a version can take time: at a previous company, deployments would happen over 24h, as each server was progressively taken down, upgraded, and brought up again. In this case, you may want finer-grained information than just "environment": the good news is that both a database and a file are flexible enough to encode that.

I would personally recommend a database -- even as committed SQL scripts -- due to the ease of querying it: finding which versions were active last week, which was the previous version, or which environment ever had a given version (or range of versions) deployed are all fairly trivial in SQL.
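As a rough illustration of the history idea, here is a flat-file stand-in for the database: an append-only, tab-separated log of (timestamp, environment, version, hash) tuples queried with awk. The helpers record and deployed_at and every row in the log are hypothetical sample data; with a real database the same lookup would be a single SQL query.

```shell
#!/bin/sh
# Sketch only: append-only deployment log plus an "as of" query.
set -eu

log=$(mktemp)

# record <timestamp> <environment> <version> <hash>: append one deployment.
# Timestamps are UTC in ISO 8601 form so that string order equals time order.
record() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$log"
}

# Hypothetical sample history, oldest first.
record 2022-04-25T09:00:00Z qa   1.0.2 aaaaaaa
record 2022-04-26T10:00:00Z live 1.0.2 aaaaaaa
record 2022-04-28T09:30:00Z qa   1.0.3 bbbbbbb
record 2022-04-29T14:00:00Z live 1.0.3 bbbbbbb
record 2022-04-29T16:30:00Z live 1.0.2 aaaaaaa   # rollback, recorded as a new row

# deployed_at <environment> <timestamp>: version and hash active at that time,
# i.e. the last row for the environment at or before the timestamp.
deployed_at() {
    awk -F '\t' -v env="$1" -v at="$2" '
        $2 == env && ($1 "") <= (at "") { row = $3 "\t" $4 }
        END { if (row != "") print row }
    ' "$log"
}

deployed_at live 2022-04-29T15:00:00Z   # what a user hit on that Friday around 3 PM
deployed_at live 9999-12-31T23:59:59Z   # "now" is just the latest entry
```

The "current" version is derived from the log rather than stored separately, so there is no second copy that could drift out of sync.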

History is Live

And if you have the history of deployments, then you do NOT want to duplicate the "current" version information anywhere else, because then you'll end up with discrepancies, and in that case which do you trust?

Instead, you want to compute the "live" information based on the history. If you want to display that as a pseudo-tag, it may be possible in git. Otherwise, a simple dashboard would do -- even a static one, recalculated whenever an upgrade/downgrade occurs.
