A no-magic build system providing only the reasoning and execution engine to run a build.

A Generic Build System

I finally gave in and "rewrote" this in bash. OH, YES, I know what you think, and I think the same. But still, the usability is actually better: bashbuilder. I must say, though, that it helped to design the concepts in Java first. Repeating them in bash was easy. Without the Java version, the bash implementation would not have been so clean.

No Magic: you are actually required and will be able to understand what is going on: an empty build file does nothing 😀 and as you add targets, it starts to do things.

No new specification syntax: plain Java.

How to use?

To use it in a project, you may write any Java file, for example src/buma/Buma.java using the classes Build or Machine (see api documentation above) and run it like

java -cp buma-<version>.jar src/buma/Buma.java

The build script, Buma.java, should be a classic Java main program, which makes use of functionality provided by this software to create targets with dependencies and tasks to create them. When using Build, a minimal skeleton looks like:

Build build = new Build(...);
Build.Cmdline cmdline = build.parseCmdline(argv);
// create targets and add those to build which shall be callable from
// the command line
build.updateAll(cmdline.requestedTargets());

If you don't like Build and rather parse the command line differently, set up targets with their dependencies and ultimately just run

new Machine().update(someTarget)

How to build?

Since this is supposed to be a build system, it would be weird to use yet another build system to build it. There is currently only a bash script. Use

./mach

to create the jar build/build/buma-*.jar. The script compiles the initial classes and then uses src/buma/Buma.java to do the rest. It may serve as a first example build script. To see all targets with their dependencies, use

./mach dependencytree | dot -T pdf >dependencies.pdf

given you have graphviz installed (for the dot command).
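For readers who have not seen the input format that dot expects, here is a minimal sketch of rendering a dependency graph as a Graphviz digraph. It is not the code behind the dependencytree target; the class DotSketch, the method toDot and the target names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: not the code behind the dependencytree target. */
public class DotSketch {

  /** Renders "target -> dependency" edges as a Graphviz digraph. */
  static String toDot(Map<String, List<String>> deps) {
    StringBuilder sb = new StringBuilder("digraph dependencies {\n");
    deps.forEach((target, ds) -> {
      for (String d : ds) {
        // Names are quoted but not escaped, which is enough for simple names.
        sb.append("  \"").append(target).append("\" -> \"").append(d).append("\";\n");
      }
    });
    return sb.append("}\n").toString();
  }

  public static void main(String[] args) {
    // Hypothetical targets, in insertion order.
    Map<String, List<String>> deps = new LinkedHashMap<>();
    deps.put("jar", List.of("classes"));
    deps.put("classes", List.of("src"));
    System.out.print(toDot(deps));
  }
}
```

Piping the output of such a program into dot -T pdf yields a picture of the graph, one arrow per dependency.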

Open Ends

  • Best way to add functionality for downloading libraries with their transitive dependencies. Maybe the best way is not to add this at all, because given the simplicity of BuildMachine, just add some resolver library or tool and use it. For example: jresolve-cli

Theory

There are not enough build systems in the world yet: make, cmake, ant, maven, npm, gradle are only those that I used at some time and there are even more. But I thought, hey, let's implement one more :-)

I always wondered what the core functionality of a build system is after having removed all the language specifics that crept in because the inventors wanted to support their specific ecosystem, in particular the programming language of choice.

Many build systems these days have built-in functionality to resolve and download the directed acyclic graph of library dependencies. But this is not what I consider should be part of a build system core. Because, read on!

What is the requirement?

  • Given some artifacts, typically files containing source code (think C, Java, Python, Typescript) plus additional files containing more source (think properties files, HTML files),
  • the build system shall perform tasks like getting required additional files (think libraries), compilation, wrapping, mixing, munching, zipping, dissecting, uploading and whatnot to create a final product.

On the most abstract level this boils down to performing several tasks in some order, which is what a linear script could do. But this tends to repeat tasks unnecessarily.

Some theory

What exactly does it mean for a task to be unnecessary? Let's have some nomenclature and definitions.

  • Call a target something that needs to be created by a build. Most of the time this means that a file or a set of files shall appear or be updated in some directory, local or remote. It could be a row in a database too, which typically also means that a file is updated, the database storage file.
  • Call the computation to create or update the target a task. Typically, it is one or more commands run on the operating system, but may of course include computations built into the build system.
  • The task to create a target often needs other targets as input. We call these the dependencies of the target.
  • Call a target without a task to build it an initial target (think source code edited by the developer).

For brevity, if A is a target, let T_A be the task that creates it and let d(A) be the set of dependencies. The corner case of an empty set d(A) shall be included. Then we can write A = T_A(d(A)).

Now we can formulate the following definition:

  • Updating a target A involves
    1. checking whether it is out-of-date and
    2. if yes, running T_A(d(A)).
  • A target is out-of-date, if, after updating all targets in d(A), running T_A(d(A)) would change it.

Note how the definition recurses into the dependencies. What does it mean, though, for the ground case of an initial target? No task, nothing to do.

Now consider target A with a single dependency B, an initial target. T_A must be run if it would change A. There is no way to check whether running T_A would change A other than actually running it. But this is exactly what we want to avoid doing unnecessarily. So we approximate as follows:

  • We assume that the output T_A(d(A)) = T_A({B}) depends on B alone and is deterministic, meaning the result can only change if B changed since we last ran T_A.
  • We accept that we might run T_A on a changed B even if it produces the exact same output as from the original B.

So for an initial target B, we must know whether it was changed since we last looked, because that would be our trigger to run T_A.

Interestingly, the same is true for all targets. If we ran T_A to create A and then A was changed independently of the build system, either by manual tampering or by deleting it, then it is not enough to check for changes of the dependencies. We must be able to detect that A changed "spuriously" since we last ran the build.

BuildMachine works with a StateProvider and asks it about some bytes which describe the state of a target at some point in time. This could be, for example

  • the whole target content (think file or database query result),
  • a hash of the whole target content,
  • the last modification time of the target content.

For the latter two, FileContentState is provided.

Under normal circumstances (non-degenerate hash, no time stamp tampering), all of these change if the content of a target is changed.
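To make the last two kinds of state concrete, here is a minimal sketch of how such state bytes could be computed with the plain JDK. It does not show StateProvider or FileContentState; the class StateSketch and its methods are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: not BuildMachine's StateProvider or FileContentState. */
public class StateSketch {

  /** A hash of the whole content as state. */
  static byte[] hashState(byte[] content) {
    try {
      return MessageDigest.getInstance("SHA-256").digest(content);
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 has to be supported by every Java platform implementation.
      throw new IllegalStateException(e);
    }
  }

  /** The last modification time in milliseconds, encoded as 8 bytes, as state. */
  static byte[] mtimeState(Path file) {
    try {
      long millis = Files.getLastModifiedTime(file).toMillis();
      return ByteBuffer.allocate(Long.BYTES).putLong(millis).array();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("state", ".txt");
    Files.writeString(file, "hello");
    byte[] before = hashState(Files.readAllBytes(file));
    Files.writeString(file, "hello world");
    byte[] after = hashState(Files.readAllBytes(file));
    System.out.println("hash changed: " + !MessageDigest.isEqual(before, after));
    System.out.println("mtime state length: " + mtimeState(file).length);
    Files.delete(file);
  }
}
```

The hash costs a full read of the content but survives a touch without modification; the time stamp is cheap but changes whenever the file is rewritten, even with identical content.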

Implementation overview

That said, BuildMachine performs a rather simple algorithm to update a target A with dependencies d(A) which can be re-created with task T_A. (See source code).

  1. Recursively do the following for all dependencies in d(A).
  2. Compute the combined state of the dependencies and the current state of A and check if it is the same as was saved at the last visit.
  3. If it is not the same:
    1. Run T_A.
    2. Compute the combined state like in 2. again and save it. Note that this does not require recomputing the states of the dependencies, as these are kept, of course. Only the state of A individually is recomputed.
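The steps above can be sketched in a few lines of Java. This is not the actual Machine implementation: the class UpdateSketch, its Target type and the in-memory map are hypothetical, strings stand in for the state bytes, and the saved states live in memory instead of a directory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative only: the names here are hypothetical, not BuildMachine's API. */
public class UpdateSketch {

  /** A target with a way to read its state; task == null marks an initial target. */
  static final class Target {
    final String name;
    final List<Target> deps;
    final Supplier<String> state; // stands in for the state bytes
    final Runnable task;

    Target(String name, List<Target> deps, Supplier<String> state, Runnable task) {
      this.name = name; this.deps = deps; this.state = state; this.task = task;
    }
  }

  /** Combined states saved at the last visit, keyed by target name. */
  final Map<String, String> saved = new HashMap<>();

  /** Updates the target and returns its current individual state. */
  String update(Target a) {
    // 1. Recurse into the dependencies and keep their states.
    List<String> depStates = new ArrayList<>();
    for (Target d : a.deps) depStates.add(update(d));
    // 2. Combine the dependency states with the current state of a and compare.
    String own = a.state.get();
    String combined = depStates + "|" + own;
    if (!combined.equals(saved.get(a.name))) {
      // 3.1 Run the task, if there is one.
      if (a.task != null) { a.task.run(); own = a.state.get(); }
      // 3.2 Only the state of a itself was recomputed; dependency states are reused.
      saved.put(a.name, depStates + "|" + own);
    }
    return own;
  }

  public static void main(String[] args) {
    Map<String, String> files = new HashMap<>(Map.of("B", "source v1"));
    int[] runs = {0};
    Target b = new Target("B", List.of(), () -> files.get("B"), null);
    Target a = new Target("A", List.of(b), () -> files.getOrDefault("A", ""),
        () -> { runs[0]++; files.put("A", "compiled " + files.get("B")); });
    UpdateSketch machine = new UpdateSketch();
    machine.update(a);            // first visit: task runs
    machine.update(a);            // nothing changed: task is skipped
    files.put("B", "source v2");
    machine.update(a);            // dependency changed: task runs again
    files.remove("A");
    machine.update(a);            // A deleted behind our back: task runs again
    System.out.println("task runs: " + runs[0]); // prints "task runs: 3"
  }
}
```

Because the state of A itself is part of the combined state, the last call detects the "spurious" deletion of A even though no dependency changed.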

Target states are saved in a directory provided to either the Machine constructor or to the Build constructor.

How is this a generic build system?

Nothing of the above relates to any specific programming language. In fact the algorithm can be implemented in every programming language to build artifacts of any other programming language or perform tasks not even related to compilation.

Where is the dependency management that maven, gradle or npm provide? Well, right there: a library needed to create a program is a target and determining the correct version and fetching it whichever way is a task. Implementations are left as an exercise :-)

Questions Never Asked

I could call this FAQ, but BuildMachine is not famous enough to talk about "frequently" :-)

How can I have a task which is always run, unconditionally?

Check out FixedStateTarget.runAlways().

How can I parametrize a task to run differently for a development or production build?

We said above that a task must be deterministic and, given identical dependencies, its result, the target, must come out identical. This means that if you use, say, a system property as a parameter, it must, at the same time, be declared as a dependency of the target being generated. Use StringTarget, which allows creating dependencies from properties and environment variables.
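The idea behind a string-valued dependency can be sketched without the library: the value of the parameter is itself the state, so a changed value looks exactly like a changed source file. The following is not StringTarget's API; the class ParamSketch, its methods and the property name mode are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: shows the idea of a string-valued dependency, not StringTarget itself. */
public class ParamSketch {

  /** State of a parameter: the bytes of its value, with a fallback if it is unset. */
  static byte[] propertyState(String key, String fallback) {
    return System.getProperty(key, fallback).getBytes(StandardCharsets.UTF_8);
  }

  /** The dependent task must rerun if the state differs from the one seen last time. */
  static boolean mustRerun(byte[] savedState, byte[] currentState) {
    return !Arrays.equals(savedState, currentState);
  }

  public static void main(String[] args) {
    // "mode" is a hypothetical property name.
    byte[] saved = propertyState("mode", "development");
    System.setProperty("mode", "production"); // same effect as java -Dmode=production
    byte[] current = propertyState("mode", "development");
    System.out.println("rerun needed: " + mustRerun(saved, current));
  }
}
```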

Can I have a task for a dependency run differently depending on the ultimate target?

It is difficult to put the question into one sentence. Consider two targets, package and build, both depending on junit. The latter contains some long running tests which shall only be run on demand or when package is called, but not when build is called during development. So when target package is considered, it should set a flag such that its dependency junit is built differently from when it is called via target build. Can this be done?

The answer is no, and intentionally. The number one reason is, to be honest, that when I tried to implement this, the simple and straightforward recursive build algorithm started to look weird and bloated.

Yet, there is an easy way out: look at your code which creates the junit target and parametrize it such that you can create, say, a junit_package and a junit_build target and use these as the dependencies as needed. Alternatively, use methods like FileTarget.buildWith() to "clone" one target from another with different parametrization to get the junit targets you need.
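The first suggestion amounts to a small factory method. The sketch below uses a hypothetical Target record and placeholder commands (run-tests, --include-slow); it shows only the shape of the idea and uses neither BuildMachine classes nor FileTarget.buildWith().

```java
import java.util.List;

/** Illustrative only: Target here is a hypothetical stand-in, not a BuildMachine class. */
public class JunitVariantsSketch {

  record Target(String name, List<String> command, List<Target> deps) {}

  /** One factory, two targets: the flag is fixed when the target is created. */
  static Target junit(String suffix, boolean includeSlowTests) {
    List<String> command = includeSlowTests
        ? List.of("run-tests", "--include-slow") // placeholder command line
        : List.of("run-tests");
    return new Target("junit_" + suffix, command, List.of());
  }

  public static void main(String[] args) {
    Target junitBuild = junit("build", false);
    Target junitPackage = junit("package", true);
    Target build = new Target("build", List.of("compile"), List.of(junitBuild));
    Target pkg = new Target("package", List.of("jar"), List.of(junitPackage));
    for (Target t : List.of(build, pkg)) {
      Target dep = t.deps().get(0);
      System.out.println(t.name() + " -> " + dep.name() + " " + dep.command());
    }
  }
}
```

Each ultimate target then depends on its own variant, and the recursive update algorithm stays untouched.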