My company has about 200 repos for microservices and shared libraries. It's largely been great, but it's hard to keep things DRY at the organizational level. We often have a large amount of package.json, webpack.config.js, .babelrc, .github/workflows, and similar infrastructure-level code that gets copied in through templates or copy/pasted from other GitHub repos.
Obviously a great way of minimizing duplication in source code is breaking code out to a library and installing it as a separate package. What's the best way to keep infrastructure code DRY?
I can think of a few options:
Copy/paste
When creating a repo, just copy/paste another repo's infrastructure code over. Simple.
Pros
- Simplest approach when writing
- Explicit dependencies
- Effectively freezing code
Cons
- Lots of manual maintenance
- Code surface increases, abstraction decreases
- Requires developers to parse through lots of (possibly) unfamiliar code
GitHub templates
Use GitHub templates to create new repos. Keep templates up to date. Possibly update repos via a script to pull template updates.
Pros
- Slightly more structured than copy/pasting
- Provides an ideal standard for other repositories
Cons
- Still requires lots of manual upkeep
- Functionally similar cons to copy/pasting
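To make the "script to pull template updates" idea concrete, here is a minimal sketch in Node.js. It assumes the template repo and the target repo are both checked out locally, and that a fixed list of files is owned by the template. The MANAGED_FILES list and the function names are hypothetical, not part of any existing tool.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical list of files owned by the template; adjust per organization.
const MANAGED_FILES = [".babelrc", "webpack.config.js"];

// Returns the file contents as a string, or null if the file does not exist.
function readIfExists(file) {
  return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : null;
}

// Compares each managed file in repoDir against templateDir.
// Returns [{ file, status }] where status is "missing" or "drifted".
function findDrift(templateDir, repoDir, files = MANAGED_FILES) {
  const report = [];
  for (const file of files) {
    const expected = readIfExists(path.join(templateDir, file));
    if (expected === null) continue; // the template does not ship this file
    const actual = readIfExists(path.join(repoDir, file));
    if (actual === null) {
      report.push({ file, status: "missing" });
    } else if (actual !== expected) {
      report.push({ file, status: "drifted" });
    }
  }
  return report;
}

// Overwrites every missing or drifted file with the template's version
// and returns the drift report that was acted on.
function syncFromTemplate(templateDir, repoDir, files = MANAGED_FILES) {
  const drift = findDrift(templateDir, repoDir, files);
  for (const { file } of drift) {
    const target = path.join(repoDir, file);
    fs.mkdirSync(path.dirname(target), { recursive: true });
    fs.copyFileSync(path.join(templateDir, file), target);
  }
  return drift;
}
```

Running findDrift in CI would only report divergence, while syncFromTemplate discards local changes to managed files, so this sketch only suits files that repos are never expected to customize.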
Include from parent repository
Add your new repo as a submodule of some other parent repo that contains infrastructure code. Either reference parent infrastructure or automatically copy parent code over (e.g. at build time).
Pros
- Single source of truth for infrastructure code
- Explicit dependency tracking
Cons
- Requires strong dependent relationship between parent and child repo
- Change to parent repo could break multiple children easily
Include as submodules
All infrastructure dependencies reference external repos that contain the deps. Potentially version-pinned to commit hashes.
Pros
- Single source of truth for infrastructure code
- Explicit versioned dependency tracking
Cons
- (afaik) No readily available tools to manage this
- Dealing with submodules is not fun
- (afaik) Not possible to grab one file from one revision of a separate repo
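To illustrate what "version-pinned to commit hashes" could look like in practice, here is a small sketch of a manifest plus a check that every entry is pinned to a full commit hash rather than a moving ref such as a branch name. The manifest shape, the repo names, and the hashes are all made up for illustration, and the check assumes 40-character SHA-1 hashes.

```javascript
// Hypothetical manifest: each infrastructure file names the external repo
// it comes from and the ref it is pinned to. Repo names and hashes are
// placeholders, not real repositories or commits.
const manifest = [
  {
    path: "webpack.config.js",
    repo: "example-org/infra-webpack",
    ref: "0123456789abcdef0123456789abcdef01234567",
  },
  {
    path: ".babelrc",
    repo: "example-org/infra-babel",
    ref: "main", // a branch name, so this entry is not actually pinned
  },
];

// Assumes SHA-1 object names: exactly 40 lowercase hex characters.
const FULL_SHA = /^[0-9a-f]{40}$/;

// Returns the manifest entries whose ref is not a full commit hash.
function unpinnedEntries(entries) {
  return entries.filter((entry) => !FULL_SHA.test(entry.ref));
}

// Throws if any entry could silently change underneath its consumers.
function assertAllPinned(entries) {
  const unpinned = unpinnedEntries(entries);
  if (unpinned.length > 0) {
    const names = unpinned.map((e) => `${e.path} (${e.ref})`).join(", ");
    throw new Error(`Unpinned infrastructure dependencies: ${names}`);
  }
}
```

A check like assertAllPinned(manifest) could run in CI so that a change in an infrastructure repo only reaches a service when someone deliberately bumps the hash.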
Have I missed any major pros/cons? What other ways would you minimize code surface of scaffolding/infrastructure code?
A
may be an input to the infrastructure build pipeline). Though this seems to be at odds with the microservice methodology, it may be worthwhile to separate the build pipeline from the code if the majority of microservices are deployed the same way.