How to automate version bumping, when version bumping involves changes in source code and a git tag?
I have a small open source package and I'm trying to automate parts of the release life cycle. I am very confused about how to automate version bumps. Here's the relevant information about my version bump procedure.
- I'm working in Python. I define the version hardcoded as the __version__ attribute in __init__.py in the source directory. I've configured pyproject.toml and the Sphinx docs conf.py to dynamically read this attribute, so updating __version__ in __init__.py is the only place I need to change it in the source code.
- I make feature branches and merge them into the main branch using GitHub pull requests.
- My current version bump procedure looks like this: After completing work on a feature branch (but before merging the PR) I manually update __version__ in __init__.py and update CHANGELOG.rst with the new version and information about the version. I commit these changes.
- Then when all work on the feature is complete (tested, etc.) I merge the PR into main.
- Next I locally pull main and then tag the merge commit on main with the version number and push the tag back to GitHub.
This concludes my version bumping workflow. Pushing the tag triggers a build of my documentation on Read the Docs, and I manually build the code and then publish it to PyPI. These final steps conclude my full release workflow, but this question is only about the version tagging workflow.
I am trying to automate the last step in the version bump workflow, where after the merge I create a tag on the merge commit. One way I could do it is by writing shell commands that read __version__ out of __init__.py and create a git tag based on that. I could then embed these commands in a GitHub action that triggers on successful PR merges (e.g. pushes to main).
This will get my job done and I'll start work on it.
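A minimal sketch of that tagging step, written in Python rather than raw shell. It assumes __version__ is assigned a plain string literal in __init__.py and that tags carry a "v" prefix; the function names and the prefix are illustrative choices, not something the workflow above prescribes.

```python
import re
import subprocess
from pathlib import Path

# Matches a line such as: __version__ = "1.2.3"
VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def read_version(source: str) -> str:
    """Return the string assigned to __version__ in the given source text."""
    match = VERSION_RE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def tag_commands(version: str, prefix: str = "v") -> list[list[str]]:
    """Build the git commands that create and push an annotated tag."""
    tag = f"{prefix}{version}"
    return [
        ["git", "tag", "-a", tag, "-m", f"Release {tag}"],
        ["git", "push", "origin", tag],
    ]


def tag_release(init_path: Path) -> None:
    """Read the version from init_path and tag the current commit with it."""
    version = read_version(init_path.read_text(encoding="utf-8"))
    for command in tag_commands(version):
        # check=True aborts on failure, e.g. when the tag already exists.
        subprocess.run(command, check=True)
```

A workflow triggered by pushes to main could call tag_release(Path("src/mypackage/__init__.py")), where the path is a placeholder. One gotcha: a merge that does not change __version__ would try to recreate an existing tag, and git refuses to do that, so the job either has to tolerate that failure or check for the tag first.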
However, I'm trying to implement the best packaging/release workflows in this package. I haven't seen any online tutorials/articles etc. advocate for this exact approach. And I haven't been able to find any obvious tools that address this exact workflow*. These two things are major red flags for me.
My questions:
- Am I trying to implement a bad strategy here?
- Should I be concerned about these red flags?
- Is there an obvious, well-used tool that does this that I should be using? This would alleviate my concerns.
- If I do spin the shell commands myself, is what I described above a good strategy? Any better suggestions or gotchas to look out for? (I haven't tried this yet, I'll have more info once I give it a shot).
*I have found a lot of packages dedicated to automating version bumping, but none of the ones I've found bump the version based on discovering the version string out of the source code. Many of them scrape the git logs or PR name to determine if the bump is major/minor/patch and then automatically bump the version in source code and create tags automatically. Some even generate a changelog. I have two problems with this approach. (1) I still want to pick the new version and write the changelog manually until I get more comfortable with the automation (or until never) and (2) All of this action would happen before the merge to main. But in my workflow the tags need to happen on the main branch. So the tag can't happen in the same "action" as the source code commit.
3 Answers
Am I trying to implement a bad strategy here?
If the strategy works for you, then it is not a bad strategy.
On the other hand, changing the same line of __init__.py in every feature branch is a strategy that doesn't scale well to working in a team. That doesn't make it a bad strategy, but just an unsuitable one for a large number of development settings.
Should I be concerned about these red flags?
No. I believe it is far more important that you get something working that meets your immediate needs than that you keep searching for that one ultimate best workflow.
What you say are red flags are to me just indicators that your workflow is a bit unusual and that you should keep an open mind towards changing it when you start hitting the boundaries where that workflow works for you.
Is there an obvious, well-used tool that does this that I should be using? This would alleviate my concerns.
As you indicated in your question already, there is a myriad of tools available. That alone should already be a strong hint that people like to solve this problem in different ways and that there is no single go-to solution.
If I do spin the shell commands myself, is what I described above a good strategy? Any better suggestions or gotchas to look out for? (I haven't tried this yet, I'll have more info once I give it a shot).
What you described is a perfectly good solution.
When you have version information both in your git metadata (like in tags) and in your released artifacts, there are essentially two ways to automatically synchronize that.
- You store the version information in your source code and use tooling to put the correct tags in git. The source code is the source of truth here.
- The git tags (or other metadata) are your source of truth and tooling makes sure that the released packages report the correct version number.
Both options have advantages and drawbacks. Option 2 is currently the more popular choice, because it is perceived as giving less hassle to the developers. You are now going for option 1, because it fits better to your current workflow.
- I think you've understood the situation well. Could you please say more about why changing __init__.py in every feature branch doesn't scale well? And explain a strategy that does scale better? At least in enough detail that I can research it further on my own? – Jagerber48, Aug 21, 2023 at 13:47
- @Jagerber48: when you have 20 devs working on 20 different features in parallel, do you expect them all to set the version number in __init__.py correctly? That's the scaling problem Bart talks about. My approach to resolve this would be the following: why not increase the version number in the main branch, for example, from 2.8 to 2.9 in __init__.py immediately after you tagged and released version 2.8? It's a way of telling everyone "2.8 is released, next features will go into 2.9". – Doc Brown, Aug 21, 2023 at 14:05
- ... the changelog entries for 2.9 can still be done per feature, and will be merged into main together with the code changes for the specific feature. – Doc Brown, Aug 21, 2023 at 14:07
- @DocBrown In your suggestion would the version bump in the main branch involve a commit directly to main (probably by an automated system)? In workflows I've seen, the version gets resolved near the end of the PR, after main is merged back into the feature branch and before the feature branch is merged into main. At this time the dev has to manually decide what the version should be, which was one of the requirements I set in the OP. But obviously that might be a bad requirement. It's just all I've seen so far. – Jagerber48, Aug 21, 2023 at 14:33
- @Jagerber48: "commit directly to main" - usually yes, starting a "new feature branch" just for bumping the version number seems unnecessary. "by an automated system" - that is up to you. For a system I am working on from time to time, we created a tool which allows doing this semi-automatically, in a dialog: it pulls the old version number from source and shows it in a text box. There is a button to increase the number in the text box (but we can also write something else there), and another button which saves the new version number back to the source code. – Doc Brown, Aug 21, 2023 at 14:50
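The bump-right-after-release idea from the comments (for example 2.8 to 2.9) could be sketched in Python as below. This assumes a purely numeric MAJOR.MINOR or MAJOR.MINOR.PATCH scheme and a string-literal __version__ assignment; the function names are hypothetical and this is not the dialog tool described in the last comment.

```python
import re

# Captures the text before and after the version so only the number is replaced.
VERSION_LINE_RE = re.compile(
    r"""^(__version__\s*=\s*["'])([^"']+)(["'])""", re.MULTILINE
)


def bump_minor(version: str) -> str:
    """Increase the minor component and reset any later components to zero."""
    parts = [int(part) for part in version.split(".")]
    if len(parts) < 2:
        raise ValueError(f"expected at least MAJOR.MINOR, got {version!r}")
    parts[1] += 1
    parts[2:] = [0] * (len(parts) - 2)
    return ".".join(str(part) for part in parts)


def rewrite_version(source: str, new_version: str) -> str:
    """Return source with the __version__ assignment set to new_version."""
    updated, count = VERSION_LINE_RE.subn(
        lambda m: f"{m.group(1)}{new_version}{m.group(3)}", source, count=1
    )
    if count == 0:
        raise ValueError("no __version__ assignment found")
    return updated
```

Keeping the bump and the file rewrite as separate functions leaves room for a human to type a different version, as in the semi-automatic tool described above.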
Answering my own question here at a later date.
Am I trying to implement a bad strategy here?
Yes, I think this is a bad strategy. Specifically, using a package-global __version__ attribute to specify versions of Python source code is not a great idea and is what leads to a lot of the issues. See e.g. this comment and surrounding discussion. The __version__ attribute provides a way for users of a package to dynamically access the version number, but in modern Python this can already be done using importlib.metadata.version. This call allows users to determine the version number, as recorded in the built package's metadata, at runtime. A step better for the situation described in the OP would be to (1) forget the __version__ attribute in __init__.py entirely, (2) specify the version in pyproject.toml and (3) read the version in conf.py using importlib.metadata.version. However, this still has the problem that the version needs to be duplicated in the git tag.
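As a sketch of step (3), conf.py could look the version up from the installed package metadata. The distribution name "mypackage" is a placeholder, and the fallback value is an assumption for the case where the package is not installed in the docs build environment.

```python
from importlib import metadata


def installed_version(dist_name: str, fallback: str = "0+unknown") -> str:
    """Return the installed distribution's version, or fallback if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return fallback


# In conf.py, Sphinx reads the full version string from `release`.
release = installed_version("mypackage")  # "mypackage" is a placeholder name
```

This only works if the package is installed in the environment that builds the docs, since the lookup reads installed metadata rather than the source tree.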
Is there an obvious, well-used tool that does this that I should be using? This would alleviate my concerns.
An even better strategy is to use setuptools_scm. This is a tool that, at build time (e.g. when a package is built into a wheel before being uploaded to PyPI, or when a package is installed locally as an editable package using pip), resolves the source code version by looking at git tags. Basically it finds the nearest ancestor commit which has a tag that looks like a version number and uses that tag to derive a version. The version equals that tag if the current commit is the tagged commit; otherwise a temporary development version string is generated that records which commit the build came from in relation to the tagged commit. pyproject.toml is made compatible with setuptools_scm by including dynamic=["version"] under [project] and adding a [tool.setuptools_scm] section. With this configuration the version number will be resolved at build time and written into the package metadata. The version can then be extracted from the metadata at runtime using importlib.metadata.version.
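To make the "nearest tagged ancestor" idea concrete, here is a simplified illustration that derives a version from the output of git describe --tags --long, which has the form TAG-DISTANCE-gHASH. This is not setuptools_scm's actual implementation; its real version scheme is configurable and handles more cases, such as uncommitted changes. The "v" tag prefix and the "next patch release" development scheme are assumptions made for the example.

```python
import re

# Output of `git describe --tags --long`, e.g. "v1.2.0-3-gabc1234".
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<distance>\d+)-g(?P<node>[0-9a-f]+)$")


def version_from_describe(describe: str, prefix: str = "v") -> str:
    """Turn `git describe --tags --long` output into a version string."""
    match = DESCRIBE_RE.match(describe.strip())
    if match is None:
        raise ValueError(f"unexpected git describe output: {describe!r}")
    base = match.group("tag").removeprefix(prefix)
    distance = int(match.group("distance"))
    if distance == 0:
        # The current commit is the tagged commit: use the tag as-is.
        return base
    # Otherwise label the build as a development version of the next patch release.
    major, minor, patch = (int(part) for part in base.split("."))
    return f"{major}.{minor}.{patch + 1}.dev{distance}+g{match.group('node')}"
```

With this sketch, a checkout exactly on tag v1.2.0 yields 1.2.0, while a checkout three commits later yields a development version that embeds the distance and the commit hash.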
Now all that remains is to come up with a way to add git tags with the appropriate version numbers. Here's the strategy I have taken:
- Have an "unreleased changes" section of the changelog.
- Work on code changes in feature branches. In each branch, update the unreleased section of the changelog before merging into main using a GitHub PR.
- After a few features have been merged into main and it is time for a release, create a release branch. In the release branch, update the changelog so that the collected changes reflect the new upcoming version number and clear the "unreleased changes" section.
- Merge in the release branch using a GitHub PR.
- Create a GitHub release that generates a new tag on the repo with the name of the new version.
- Run a GitHub action on release to build the package and publish to PyPI.
- Read the Docs automatically detects the new tag and rebuilds the docs.
One big downside of this strategy is that it requires writing the version number in two places: first in the changelog, and second in the release tag. This approach also requires some manual juggling of the changelog that could be error prone. The changelog juggling could possibly be helped by using changelog management tools like towncrier or conventional commits, but I'm happy with the manual workflow for now. A GitHub action could also possibly be written to detect when release PRs are merged into main; it could then scrape the changelog for the version number and use that to generate the GitHub release with the right new tag.
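The changelog-scraping part of that idea might look like the sketch below. The changelog layout is an assumption: reStructuredText section titles that start with a version number (such as "1.4.0 (2023-08-21)"), underlined as RST requires, listed newest first below an "Unreleased" section. Creating the GitHub release itself is not shown.

```python
import re

# A section title that starts with a version number, followed by an RST underline.
RELEASE_HEADING_RE = re.compile(
    r"^v?(?P<version>\d+\.\d+\.\d+)\b[^\n]*\n[-=~^]{3,}[ \t]*$", re.MULTILINE
)


def latest_release(changelog: str) -> str:
    """Return the version in the first release heading of the changelog text."""
    match = RELEASE_HEADING_RE.search(changelog)
    if match is None:
        raise ValueError("no release heading found in changelog")
    return match.group("version")
```

Because the "Unreleased" title does not start with a version number, it is skipped and the first real release heading wins. A workflow could compare this value against existing tags before creating a release, so that an ordinary feature merge does not trigger one.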
Should I be concerned about these red flags?
I think there was reason to be concerned about these red flags. The key breakthrough for me was learning that the __version__ Python idiom is an outdated practice. Once I learned to let go of that, the rest of the workflow fell into place.
Here are my thoughts on the questions you asked; hopefully they are helpful.
- Am I trying to implement a bad strategy here?
Not at all! Automating version bumps can be a fantastic strategy to make your release process more efficient and reduce manual errors. It's a common practice in software development to automate repetitive tasks, so you're on the right track.
- Should I be concerned about these red flags?
It's completely understandable to have concerns when you don't find many online tutorials or articles advocating for your exact approach. However, it doesn't necessarily mean that your strategy is bad. It could simply be that your specific workflow is less common or that the available resources haven't covered it extensively. Trust your instincts and keep exploring!
- Is there an obvious, well-used tool that does this that I should be using?
While there might not be a tool that matches your workflow exactly, there are tools out there that can help automate versioning and release tasks. One such tool is "release-it". It provides a CLI to automate versioning, Git commits, tags, and package publishing. You can customize it to fit your needs, which is pretty cool! I've used it, and it works pretty well.
- If I do spin the shell commands myself, is what I described above a good strategy?
In my experience, writing shell commands to read the __version__ attribute from __init__.py and create a Git tag based on that is a valid approach. It allows you to automate the version tagging step after merging the feature branch into the main branch. Just keep in mind that maintaining and updating these shell commands may require additional effort as your project evolves.