Report from the 2016 Reproducible Builds Summit
byon December 26, 2016
A couple of weeks ago I was at the Reproducible Builds Summit in Berlin. Over sixty representatives from all kinds of projects came together for three days to share information and ideas, plan solutions, and even squeeze in a little time to hack. It was my first real opportunity to dive into this work. I learned a ton, even enough to chip in a little, and I’m looking forward to working more on reproducible builds from here on out.
When we talk about reproducible builds, what we mean is a build process that produces the exact same binary every time you run it with the exact same inputs (like source code versions and compiler settings). If you’re interested in the details, check out the definition on the Reproducible Builds site—a bunch of folks hammered that out during the Summit.
You might think most build processes would be reproducible most of the time, but often the binaries include small inputs that are hard to reproduce, such as timestamps or build paths. Much of the work toward reproducible builds so far has focused on improving the inputs: removing inputs that aren’t really necessary to the final product, and better recording the ones that are. Once that’s done, most build processes are as reproducible as you’d expect. There’s still more to do there, but there’s enough of a foundation that we can start seeing some benefits from reproducible builds. Many of the discussions at the Summit were about planning those next steps.
Conservancy is really excited to help reproducible builds. Having a clear and trusted link from source code to binary helps the community in many different ways:
- The most obvious is security. When builds are reproducible, everyone can check for themselves that binaries they download actually come from the expected source code. We can demonstrate that unwanted code isn’t being added to distributors’ binaries, either accidentally or maliciously.
- A reproducible build is a documented build. When everyone can see exactly what inputs and build steps generated a binary, everyone can review and comment on that build process. It becomes easier to find binaries with “bad” inputs (like a version of a library with a critical bug) and plan an upgrade process for them.
- Reproducible builds can make license compliance easier for binary distributors. When a free software license requires distributors to provide source code, sometimes it can take a little work for them to figure out exactly what the right source code is. For example, if they have three versions of a development library installed on their build system, how do they know for sure which one went into the binary and should be included in the source code release? Reproducible builds record the answer unambiguously, in a format that can make it simple to put all the source code together.
We’ll reap the most benefits if there’s support at every level of the stack. Debian kickstarted the reproducible builds effort, and at the Summit there was a lot of great discussion about reaching out to other communities. Right now the focus is on other package distributors, so it was great to see representatives from Fedora, openSUSE, F-Droid, and Nix there. But our discussions also recognized the need for outreach to other projects that can play a role in this work, like build tools and other software that generates binaries that get shipped to users (such as filesystems or bytecode compilers). If you’re involved in a project like that, I encourage you to join us on the general mailing list for reproducible builds and introduce yourself. The more people working on this, the merrier!
Many thanks to all the Summit organizers for planning and running a productive working space. I’m already looking forward to the next reproducible builds meeting.
Please email any comments on this entry to firstname.lastname@example.org.