skip to main | skip to sidebar
Showing posts with label buildsystems. Show all posts
Showing posts with label buildsystems. Show all posts

Sunday, November 15, 2020

Data types for build system dependencies

Summary: Monadic and early cut-off? Use a sequence of sets.

In the Build Systems a la Carte paper we talk about the expressive power of various types of build systems. We deliberately simplify away parallelism and implementation concerns, but those details matter. In this post I'm going to discuss some of those details, specifically the representation of dependencies.

Applicative build systems

In an applicative build system like Make, all dependencies for a target are known before you start executing the associated action. That means the dependencies have no ordering, so are best represented as a set. However, because they can be calculated from the target, they don't usually need to be stored separately. The dependencies can also be evaluated in parallel. To build a target you evaluate the dependencies to values, then evaluate the action.

Early cut-off is when an action is skipped because none of its dependencies have changed value, even if some dependencies might have required recomputing. This optimisation can be incredibly important for build systems with generated code - potentially seconds vs hours of build time. To obtain early cut-off in applicative systems, after evaluating the dependencies you compare them to the previous results, and only run the action if there were changes.

Monadic build systems

In monadic build systems like Shake, the representation of dependencies is more complex. If you have an alternative mechanism of detecting whether a rule is dirty (e.g. reverse dependencies) you don't need to record the dependencies at all. If the key is dirty, you start executing the action, and that will request the dependencies it needs. The action can then suspend, calculate the dependencies, and continue.

If you want early cut-off in a monadic build system, you need to rerun the dependencies in advance, and if they all have the same result, skip rerunning the action. Importantly, you probably want to rerun the dependencies in the same order that the action originally requested them -- otherwise you might pay a severe and unnecessary time penalty. As an example, let's consider an action:

opt <- need "is_optimised"
object <- if opt then need "foo.optimised" else need "foo.unoptimised"
link object

This rule is monadic, as whether you need the optimised or unoptimised dependency depends on the result of calculating some is_optimised property. If on the first run is_optimised is True, then we build foo.optimised. On the second run, if is_optimised is False, it is important we don't build foo.optimised as that might take a seriously long time and be entirely redundant. Therefore, it's important when checking for early cut-off we build in the order that the previous action requested the dependencies, and stop on the first difference we encounter.

(If you have unlimited resources, e.g. remote execution, it might be profitable to evaluate everything in parallel - but we're assuming that isn't the case here.)

Provided a rule performs identically between runs (i.e. is deterministic and hasn't been changed), everything that we request to check for early cut-off will still be needed for real, and we won't have wasted any work. For all these reasons, it is important to store dependencies as a sequence (e.g. a list/vector).

Monadic build systems plus parallelism

Applicative build systems naturally request all their dependencies in parallel, but monadic build systems are naturally one dependency at a time. To regain parallelism, in build systems like Shake the primitive dependency requesting mechanism takes a set of dependencies that are computed in parallel. While requesting dependencies individually or in bulk gives the same result, in bulk gives significantly more parallelism. (In Shake we use lists to track correspondence between requests and results, but it's morally a set.)

As we saw previously, it is still important that for early cut-off you reproduce the dependencies much like they were in the action. That means you request dependencies in the order they were requested, and when they were requested in bulk, they are also checked in bulk. Now we have a sequence of sets to represent dependencies, where the elements of the sets can be checked in parallel, but the sequence must be checked in order.

Monadic build systems plus explicit parallelism

What if we add an explicit parallelism operator to a monadic build system, something like parallel :: [Action a] -> IO [a] to run arbitrary actions in parallel (which is what Shake provides). Now, instead of a sequence of sets, we have a tree of parallelism. As before it's important when replaying that the dependencies are requested in order, but also that as much is requested in parallel as possible.

What Shake does

Shake is a monadic build system with early cut-off, parallelism and explicit parallelism. When building up dependencies it uses a tree representation. The full data type is:

data Traces
 = None
 | One Trace
 | Sequence Traces Traces
 | Parallel [Traces]

Sequenced dependencies are represented with Sequence and the traces captured by parallelism use Parallel. Importantly, constructing Traces values is nicely O(1) in all cases. (Shake v0.19.1 used a different representation and repeatedly normalised it, which could have awful time complexity - potentially O(2^n) in pathological cases.)

While these traces store complete information, actually evaluating that trace when checking for rebuilds would be complicated. Instead, we flatten that representation to [[Trace]] for writing to the Shake database. The outer list is a sequence, the inner list is morally a set. We have the invariant that no Trace value will occur multiple times, since if you depend on something once, and then again, the second dependency was irrelevant. To flatten Parallel computations we take the first required dependency in each parallel action, merge them together, and then repeat for the subsequent actions. If you run code like:

parallel [
 need ["a"] >> parallel [need ["b"], need ["c"]]
 need ["d"]
]

It will get flattened to appear as though you wrote need ["a","d"] >> need ["b","c"]. When checking, it will delay the evaluation of b and c until after d completes, even though that is unnecessary. But simplifying traces at the cost of marginally less rebuild parallelism for those who use explicit parallelism (which is not many) seems like the right trade-off for Shake.

Conclusion

Applicative build systems should use sets for their dependencies. Monadic build systems should use sets, but if they support early cut-off, should use sequences of sets.

Friday, May 15, 2020

File Access Tracing

Summary: It is useful to trace files accessed by a command. Shake and FSATrace provide some tools to do that.

When writing a build system, it's useful to see which files a command accesses. In the Shake build system, we use that information for linting, an auto-deps feature and a forward build mode. What we'd like is a primitive which when applied to a command execution:

  1. Reports which files are read/written.
  2. Reports the start and end time for when the files were accessed.
  3. Reports what file metadata is accessed, e.g. modification times and directory listing.
  4. Lets us pause a file access (so the dependencies can be built) or deny a file access (so dependency violations can be rejected early).
  5. Is computationally cheap.
  6. Doesn't require us to write/maintain too much low-level code.
  7. Works on all major OSs (Linux, Mac, Windows).
  8. Doesn't require sudo or elevated privilege levels.

While there are lots of approaches to tracing that get some of those features, it is currently impossible to get them all. Therefore, Shake has to make compromises. The first fours bullet points are about features -- we give up on 2 (timestamps) and 4 (pause/deny); 1 (read/writes) is essential, and we make 3 (metadata) optional, using the imperfect information when its available and tolerating its absence. The last four bullet points are about how it works -- we demand 7 (compatibility) and 8 (no sudo) because Shake must be easily available to its users. We strive for 5 (cheap) and 6 (easy), but are willing to compromise a bit on both.

Shake abstracts the result behind the cmd function with the FSATrace return type. As an example I ran in GHCi:

traced :: [FSATrace] <- cmd "gcc -c main.c"
print traced

Which compiles main.c with gcc, and on my machine prints 71 entries, including:

[ FSARead "C:\\ghc\\ghc-8.6.3\\mingw\\bin\\gcc.exe"
, FSARead "C:\\Neil\\temp\\main.c"
, FSAWrite "C:\\Users\\ndmit_000\\AppData\\Local\\Temp\\ccAadCiR.s"
, FSARead "C:\\ghc\\ghc-8.6.3\\mingw\\bin\\as.exe"
, FSARead "C:\\Users\\ndmit_000\\AppData\\Local\\Temp\\ccAadCiR.s"
, FSAWrite "C:\\Neil\\temp\\main.o"
, ...
]

Most of the remaining entries are dlls that gcc.exe uses, typically from the Windows directory. I've reordered the list to show the flow more clearly. First the process reads gcc.exe (so it can execute it), which reads main.c and writes a temporary file ccAadCiR.s. It then reads as.exe (the assembler) so it can run it, which in turn reads ccAadCiR.s and writes main.o.

Under the hood, Shake currently uses FSATrace, but that is an implementation detail -- in particular the BigBro library might one day also be supported. In order to understand the limitations of the above API, it's useful to understand the different approaches to file system tracing, and which ones FSATrace uses.

Syscall tracing On Linux, ptrace allows tracing every system call made, examining the arguments, and thus recording the files accessed. Moreover, by tracing the stat system call even file queries can be recorded. The syscall tracking approach can be made complete, but because every syscall must be hooked, can end up imposing high overhead. This approach is used by BigBro as well as numerous other debugging and instrumentation tools.

Library preload On both Linux and Mac most programs use a dynamically linked C library to make file accesses. By using LD_LIBRARY_PRELOAD it is possible to inject a different library into the program memory which intercepts the relevant C library calls, recording which files are read and written. This approach is simpler than hooking syscalls, but only works if all syscall access is made through the C library. While normally true, that isn't the case for Go programs (syscalls are invoked directly) or statically linked programs (the C library cannot be replaced).

While the technique works on a Mac, from Mac OS X 1.10 onwards system binaries can't be traced due to System Integrity Protection. As an example, the C compiler is typically installed as a system binary. It is possible to disable System Integrity Protection (but not recommended by Apple); or to use non-system binaries (e.g. those supplied by Nix); or to copy the system binary to a temporary directory (which works provided the binary does not afterwards invoke another system binary). The library preload mechanism is implemented by FSATrace and the copying system binaries trick on Mac is implemented in Shake.

File system tracing An alternative approach is to implement a custom file system and have that report which files are accessed. One such implementation for Linux is TracedFS, which is unfortunately not yet complete. Such an approach can track all accesses, but may require administrator privileges to mount a file system.

Custom Linux tracing On Linux, thanks to the open-source nature of the kernel, there are many custom file systems (e.g FUSE) and tracing mechanisms (e.g. eBPF), many of which can be used/configured/extended to perform some kind of system tracing. Unfortunately, most of these are restricted to Linux only.

Custom Mac tracing BuildXL uses a Mac sandbox based on KAuth combined with TrustedBSD Mandatory Access Control (MAC) to both detect which files are accessed and also block access to specific files. The approach is based on internal Mac OS X details which have been reversed engineered, some of which are deprecated and scheduled for removal.

Windows Kernel API hooking On Windows it is possible to hook the Kernel API, which can be used to detect when any files are accessed. Implementing such a hook is difficult, particularly around 32bit v 64bit differences, as custom assembly language trampolines must be used. Furthermore, some antivirus products (incorrectly) detect such programs as viruses. Windows kernel hooking is available in both FSATrace and BigBro (sharing the same source code), although without support for 32bit processes that spawn 64bit processes.

Current State

Shake currently uses FSATrace, meaning it uses library preloading on Linux/Mac and kernel hooking on Windows. The biggest practical limitations vary by OS:

  • On Linux it can't trace into Go programs (or other programs that use system calls directly) and statically linked binaries. Integrating BigBro as an alternative would address these issues.
  • On Mac it can't trace into system binaries called from other system binaries, most commonly the system C/C++ compiler. Using your own C/C++ installation, via Homebrew or Nix, is a workaround.
  • On Windows it can't trace 64bit programs spawned by 32bit programs. In most cases the 32bit binaries can easily be replaced by 64bit binaries. The only problem I've seen was caused by a five year-old version of sh hiding out in my C:\bin directory, which was easily remedied with a newer version. The code to fix this issue is available, but scares me too much to try integrating.

Overall, the tracing available in Shake has a simple API, is very useful for Shake, and has been repurposed in other build systems. But I do dearly wish such functionality could be both powerful and standardised!

Sunday, October 13, 2019

Monads as Graphs

Summary: You can describe type classes like monads by the graphs they allow.

In the Build Systems a la Carte paper we described build systems in terms of the type class their dependencies could take. This post takes the other view point - trying to describe type classes (e.g. Functor, Applicative, Monad) by the graphs they permit.

Functor

The Functor class has one operation: given Functor m, we have fmap :: (a -> b) -> m a -> m b. Consequently, if we want to end up with an m b, we need to start with an m a and apply fmap to it, and can repeatedly apply multiple fmap calls. The kind of graph that produces looks like:

We've used circles for the values m a/m b etc and lines to represent the fmap that connects them. Functor supplies no operations to "merge" two circles, so our dependencies form a linear tree. Thinking as a build system, this represents Docker, where base images can be extended to form new images (ignoring the newer multi-stage builds).

Applicative

The Applicative class has two fundamental operations - pure :: a -> m a (which we ignore because its pretty simple) and liftA2 :: (a -> b -> c) -> m a -> m b -> m c (most people think of <*> as the other fundamental operation, but liftA2 is equivalent in power). Thinking from a graph perspective, we now have the ability to create a graph node that points at two children, and uses the function argument to liftA2 to merge them. Since Applicative is a superset of Functor, we still have the ability to point at one child if we want. Children can also be pointed at by multiple parents, which just corresponds to reusing a value. We can visualise that with:

The structure of an Applicative graph can be calculated before any values on the graph have been calculated, which can be more efficient for tasks like parsing or build systems. When viewed as a build system, this represents build systems like Make (ignoring dependencies on generated Makefiles) or Buck, where all dependencies are given up front.

Selective

The next type class we look at is Selective, which can be characterised by the operation ifS :: m Bool -> m a -> m a -> m a. From a graph perspective, Selective interrogates the value of the first node, and then selects either the second or third node. We can visualise that as:

We use two arrows with arrow heads to indicate that we must point at one of the nodes, but don't know which. Unlike before, we don't know exactly what the final graph structure will be until we have computed the value on the first node of ifS. However, we can statically over-approximate the graph by assuming both branches will be taken. In build system terms, this graph corresponds to something like Dune.

Monad

The final type class is Monad which can be characterised with the operation (>>=) :: m a -> (a -> m b) -> m b. From a graph perspective, Monad interrogates the value of the first node, and then does whatever it likes to produce a second node. It can point at some existing node, or create a brand new node using the information from the first. We can visualise that as:

The use of an arrow pointing nowhere seems a bit odd, but it represents the unlimited options that the Monad provides. Before we always knew all the possible structures of the graph in advance. Now we can't know anything beyond a monad-node at all. As a build system, this graph represents a system like Shake.

Sunday, July 08, 2018

Inside the paper: Build Systems a la Carte

Summary: How we went about writing a build systems comparison paper, how we learnt what we learnt, and why the end result surprised us. A glimpse inside writing a paper.

The final version of the Build Systems a la Carte paper has just been sent to ICFP 2018 - see an overview from one of my co-authors. The paper is a collaboration between Andrey Mokhov at Newcastle University, Simon Peyton Jones at Microsoft Research and me (working in industry). Below is the tale of how we discovered what we discovered, hopefully slightly demystifying the research process. While the paper is a collaboration, this blog post is my view and mine alone.

The paper started with the idea of comparing and contrasting build systems. There were two motivating factors, I wanted a blueprint for writing Cloud Shake, while Simon wanted to compare build systems (Andrey wanted a bit of both). The thing we all agreed on was that Haskell is a great executable specification for describing build systems, and that refactoring is a powerful tool. Armed with that approach, we went off to try and implement various build systems, chosen based on our familiarity with them and the perceived differences between them. You can see our progress in the git repo, starting 20th Feb (less than a month before the ICFP deadline!).

All of us came to the table with some predefined notions of what should and shouldn't be in the model. Andrey brought the Store abstraction. I brought the ideas of monadic vs applicative dependencies. We iterated and quickly made our first "breakthrough", a task abstraction which nicely modelled user rules, including the difference between monadic and applicative dependencies:

type Tasks c k v = forall f . c f => (k -> f v) -> (k -> f v)

Essentially, given a way to build dependencies, I can give you a way to any key. By parameterising the Tasks by c (of type Constraint) we can produce Tasks Monad and Tasks Applicative, nicely capturing the differences in power. It was only later when preparing an artefact for evaluation that we noticed Docker is a Tasks Functor build system. We made a number of iterations on this Tasks type (adding and removing newtype, how to represent input files, where to put the second k etc) - but fundamentally had a model to work with.

The next step was to start writing build systems. We picked Make, Shake, Excel, Ninja and Bazel as our first set to get working. Implementing these systems effectively became a four-dimensional optimisation problem:

  • Closeness of the model to the underlying system it was based on.
  • Simplicity of code for each individual system.
  • Reuse of code across build systems.
  • Reuse of abstractions across build systems.

The first versions were separate monoliths of code, reusing a handful of helper functions, with a fairly arbitrary set of what to model and what to exclude. Since we had executable specifications, with tests, we came up with possible improvements, tried them, and decided whether to keep them or discard them. We iterated, as individuals, as pairs (all three possible pairs) and as a group - making improvements along various dimensions. For a good few weeks Andrey and myself had competing directories in the repo, with different underlying ideas but stealing refinements from each other. I think there were about 25 separate "breakthroughs" to move the code to where we ended up. As the code became clearer, we started to understand the commonalities behind build systems, which helped the code become clearer - a virtuous cycle. Simon's role was to say "We have to make this simpler" or "I don’t understand that". Some of the time it couldn't be simpler; and we had to make sure the explanations were really understandable. But most of the time we really did make it simpler and the exposition is much better as a result.

The most academically interesting breakthrough was to realise that build systems can be split into something that decides what to rebuild, and something that orders the rebuilding, putting build systems in a two-dimensional table. While the result feels natural (if you carefully structure your description of a build system it even falls out grammatically!), it was entirely non-obvious beforehand, and emerged naturally by following the abstraction opportunities presented by the code.

By the end we were able to faithfully model details of Make/Excel/Shake that initially eluded us, with each build system being just two functions, where all functions could be combined to produce working build systems. As an example, Shake is:

shake = suspending vtRebuilder

The suspending is also used by Nix, and the vtRebuilder is also used by Ninja. Shake is just putting two existing things together, so we have great reuse of code and abstractions between build systems. In some places the code is more complex than I'd like, but you can't have everything (or maybe you can - we may well improve the models further).

After submitting the paper to ICFP 2018, we also put a draft online, which led to a deluge of comments from experts in many of the systems we talked about - the acknowledgements in the paper start to show how much excellent feedback we got. The most interesting feedback was that we'd misclassified Bazel - it's actually more like Excel than we realised. What was particularly nice is that our framework was able to describe what we thought Bazel was in enough detail that people involved with Bazel could correct us - a clear sign we were modelling interesting details.

Now that the paper is done, I hope the abstractions can start proving their value. In the context of Shake, I would like it can serve as a design document. Ever since the earliest days of Shake, I've used a two-timestamp approach to recording what happened with a key, as described in S2.3.1 of the original paper. Unfortunately, whenever I've tried to explain this trick to someone in person, their eyes glaze over. Fortunately, given a clear model, I now know that what I've really implemented is an optimisation over vtRebuilder. Furthermore, I now have the framework to talk about the benefits and costs of the optimisation, making it much easier to understand and justify.

My goal before writing the paper was to turn Shake into Cloud Shake, and the desire to do that in a principled way. Now the paper is finished I can resume that quest, with a fairly good understanding of how to do it. One thing the paper sensibly abstracts over is all the technical details (parallelism, network traffic etc) - armed with the right abstractions those technical details are what I'll be focusing on for Cloud Shake.

Thinking more broadly, the things this paper taught me (or that I already thought but it confirmed):

  • Follow the Simon Peyton Jones how to write a paper guidelines, of which number 1 is most important. "Writing papers is a primary mechanism for doing research (not just for reporting it)".
  • Innovation isn't thinking in isolation, it's thinking of a process that gives you the right problems, the right tools, and the right direction. With those things in place, the chances of ending up somewhere interesting increase dramatically.
  • Deadlines spur writing papers. It feels like we should be better, and not need the external deadlines, but it seems to help in practice.
  • Simplicity is valuable in its own right. The main research contribution of this paper sounds obvious a minute after explaining it, which makes me very happy.
  • Co-authors matter. As a set of co-authors we agree on some things (e.g. Haskell), but disagree strongly on others (e.g. two parallel branches of development, 6 rejected pull requests). I am sure the paper would have been significantly worse with anyone of us removed (these are my conclusions, I don't guarantee my co-authors agree!).
  • Feedback is super valuable, whether it comes from peer reviewers or Reddit comments. The feedback improved the paper, and also motivated us.

Hopefully this post lets people in on the secret that writing academic papers isn't magic, that papers don't emerge fully formed, and that it involves a lot of work and refinement.

Subscribe to: Comments (Atom)
 

AltStyle によって変換されたページ (->オリジナル) /