Insights on the reproducibility and future of free software with Chris Lamb

by Vladimir Bejdo on December 21, 2020

The Reproducible Builds project seeks to integrate into software a set of development practices that emphasize build reproducibility: the ability to ensure that a given build process produces binaries that verifiably correspond to their source code. Reproducibility is especially important in software that is used for sensitive applications, or by users living under repressive regimes and in mortal danger – repressive governments, for example, may choose to introduce vulnerabilities into software used by dissidents to connect to the Internet by targeting pre-compiled binaries and build processes rather than source code. The project is working towards making many widely used pieces of free software reproducible, from (at the very least the packages of) several widely used GNU/Linux distributions to individual pieces of critical software like Tor and Tails.

Participants at the 2019 Reproducible Builds Summit. Photo © intrigeri, licensed CC BY-SA 4.0

The Reproducible Builds project has been a Conservancy member project since 2018. Chris Lamb, one of the project's core team members, took part in a remote interview with Vladimir Bejdo, a Conservancy intern, to talk about the Reproducible Builds project, his own participation in software freedom, the importance of reproducibility in software development practices, and the issues facing free software as a whole today, including which of them the free software community needs to focus on going into the future.

CL: Chris Lamb; VB: Vladimir Bejdo

VB: To start off with, it might be useful to first ask you this – how would you relate the importance of reproducibility to a user who is non-technical?

CL: I sometimes use the analogy of the food ‘supply chain’ to quickly relate our work to non-technical audiences. The multiple stages of how our food reaches our plates today (such as seeding, harvesting, picking, transportation, packaging, etc.) can loosely translate to how software actually ends up on our computers, particularly in the way that if any of the steps in the multi-stage food supply chain has an issue then it quickly becomes a serious problem.

For example, even if we could guarantee that only the most wholesome apples were picked in our orchards, if they became tainted on the way to the supermarket it would be a real problem for us at the end of the day. We may not even be able to tell by simply inspecting our Pink Ladies or Honeycrisps, and washing them thoroughly under the tap may not be enough either.

In an ideal world, we would be able to personally inspect the provenance of our food at all of the stages of manufacturing and transportation. But at some point, we must place our trust in the process and in brands, as well as various regulatory bodies to ensure that potential problems in our food are minimized, possibly even paying a time/effort premium by growing our own or buying direct from local markets in order to minimize the number of steps, etc.

However, when we use free software we can do better: ‘Reproducible builds’ are a set of software development practices, ideas and tools that create an independently-verifiable path all the way from the original source code to what actually runs on our machines. Reproducible builds can reveal the injection of back-doors introduced by the hacking of developers’ own computers, build servers and package repositories, and also expose where volunteers or companies have been coerced into making changes via blackmail, court order, and so on.

With reproducible builds, there is no longer any need to trust any particular source of authority. In the same way that, say, a Mr Smith might check that his calculator is giving him the right answer to “2+2” by asking enough of his friends to check theirs too, users and developers of a reproducible build can verify the software they are using by creating a collective consensus instead.
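As an editorial illustration of the consensus idea described above, here is a minimal Python sketch. It is not the project's actual tooling: the builder names, the byte strings standing in for binaries, and the `consensus` helper with its `threshold` parameter are all hypothetical. It only assumes that each independent party builds the same source, publishes a SHA-256 digest of the result, and that a digest is accepted when enough builders agree on it.

```python
import hashlib
from collections import Counter
from typing import Dict, Optional


def sha256_hex(artifact: bytes) -> str:
    """Return the SHA-256 digest of a build artifact as a hex string."""
    return hashlib.sha256(artifact).hexdigest()


def consensus(digests: Dict[str, str], threshold: int) -> Optional[str]:
    """Return the digest reported by at least `threshold` builders, else None."""
    if not digests:
        return None
    digest, votes = Counter(digests.values()).most_common(1)[0]
    return digest if votes >= threshold else None


# Hypothetical byte strings standing in for binaries that independent
# parties built from the same source code.
builds = {
    "alice": b"hello-1.0 binary",
    "bob": b"hello-1.0 binary",
    "carol": b"hello-1.0 binary",
    "mallory": b"hello-1.0 binary + backdoor",
}

# Each builder publishes only the digest of what they built.
reported = {name: sha256_hex(data) for name, data in builds.items()}

# Accept a digest only if at least three builders independently agree on it.
agreed = consensus(reported, threshold=3)

# Any builder whose digest differs from the agreed one deserves scrutiny.
dissenters = sorted(name for name, d in reported.items() if d != agreed)
print("agreed digest:", agreed)
print("dissenting builders:", dissenters)
```

In this sketch a tampered binary stands out because its digest disagrees with the majority, and no single builder has to be trusted. The approach only works when the build is reproducible in the first place, that is, when honest builders really do obtain bit-for-bit identical output.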

VB: Tell me a bit about how you got into free software to begin with – was there a particular moment or experience you could relate back to which makes free software important to you and informed this project’s libre status?

CL: I was playing with various Linux distributions throughout my teens, but it was only much later when I got my first permanent internet connection that I seriously got into free software, intrigued by its collaborative development style, charmed by its international community and finally won over by the feelings of mastery and autonomy it gave me over my own computers. Like many others, this was only enabled by the privilege of excessive free time at a state-subsidized university. However, I first learned about ‘reproducible builds’ many years later via some friends who had attended FOSDEM in 2015.

In many ways, reproducible builds cannot be anything other than a free/libre project. As you cannot even view the source code of almost all proprietary software, the end-user benefits of having a transparent software ‘supply chain’ outside of a free software context are consequently limited. The Reproducible Builds project also brings together a broad mix of communities, philosophies and competing motivations, making it a true entrepôt of software development – it is difficult to imagine such a diverse cross-section of interests collaborating and sharing knowledge in a proprietary software context.

VB: Were there any specific grievances or moments which drove the Reproducible Builds project’s creation?

CL: Yes and no. The idea of reproducible builds has been continually rediscovered across many eras of computing, so there have actually been a number of important moments depending on your individual perspective and biases. For example, it was implemented for various GNU tools in the early 1990s and was a property in countless systems that existed before this. None of these earlier instances resulted in mainstream developer consciousness, and all the arguments tended to foreground technical, rather than security, concerns.

However, the recent surge in interest in reproducible builds can probably be attributed to the Bitcoin project around 2011, as users of the cryptocurrency needed a way to trust that they were not downloading corrupted software. This coincided with the “Snowden” disclosures of global surveillance in 2013 and the Tor browser began serious work in this area as a direct or indirect result. These successes and a growing wider concern around software integrity prompted Debian Developer Lunar Bobbio to cultivate a sub-project within the Debian project that quickly gained popular and — crucially — technical momentum.

VB: The Reproducible Builds project works specifically on making free software work securely for sensitive targets like dissidents living in repressive states, but its work obviously also helps secure projects that are used by other at-risk populations as well. How do you feel that free software philosophies align with the social good your project tries to help foster?

CL: The Reproducible Builds project aligns with a great many of the philosophies of the free software movement. Take, for example, freedom 2 from the FSF’s “Free Software Definition”, which demands the right “to redistribute copies so you can help your neighbor”. This is admittedly not a literal application of the text, but it is difficult to reconcile the underlying intent of “helping your neighbor” if you are unwittingly distributing software that contains back-doors, and the practices and ideas of reproducible builds are intended to dramatically minimize the risk of this occurring.

In a wider sense, the concept of Reproducible Builds aligns with the general desire for autonomy and transparency that is present throughout the free software community. This is particularly apparent in the way that it does not require people to place their trust in centralized authorities; they are instead empowered to come to decisions either by themselves or collectively in a bottom-up and consensus-driven manner.

VB: How has the free software community taken up reproducibility and worked to integrate it into their own development practices? Can you share with us a short overview of how the project has grown over time, or some notable implementations of the practices behind reproducible builds?

CL: There have been countless integrations of the practices of reproducible builds across the free software community, from high-level tools such as photo editors, server components such as databases, all the way down to low-level system components such as the Linux kernel (spearheaded by Ben Hutchings). Thanks to the F-Droid project, we have gained some of the benefits of reproducible builds on mobile devices as well, and in 2020 we are seeing a number of independently developed Covid-19 tracing applications that support being built reproducibly too.

Another prominent success story is Tails. Tails is a security-focused Linux distribution aimed at preserving privacy and anonymity. For example, it uses Tor by default and leaves no digital footprint on the internet or on the machine itself, so it is ideally suited for high-risk users who face targeted or aggressive surveillance. As a result, all the systems (and engineers) that contribute to Tails’ development and release are high-priority targets for compromise, as a successful attack would provide access to a large number of vulnerable and high-value users.

After considerable effort, Tails now offers fully reproducible and verifiable images, helping to protect the users of Tails but also the developers that volunteer their time to the project.

VB: What do you feel are the greatest areas of need for free software projects to focus on today overall?

CL: One thing we are lacking in free software is a more robust and critical analysis of the free software movement itself.

Without greater self-reflection, we are likely to be ineffective in our approaches to real-life problem solving, and may not be able to fully realize our shared vision of a better world. For example, we might fall short by only solving problems for people we can directly relate to: everywhere on Earth, there are countless moral and ethical decisions being made around technology today, but our solutions can easily exclude others, such as those that lack technical expertise as well as those with different priorities, cultures and economic backgrounds.

As part of this critical analysis, free software projects should also not be afraid to ask what the limits or the negative externalities of developing free software might be. Even the ‘ethical consumerism’ of open source software will be inherently constrained by its very nature, yet we rarely discuss what these constraints could be. Any potential issues within our collective movement can find themselves sidelined too. For example, the potential exploitation of an unpaid, volunteer labor force of open source maintainers is not widely addressed. We also see this in the conversations around the perceived unethical co-option of free software, where the general discourse seldom rises above unserious trading over definitions. My point is not to provide my own position here, or to say that the free software movement should even hold any position, but given these issues’ potential impact it seems strange and possibly even dangerous to not be widely discussing these topics, if only to assure ourselves that we are on the right path.

Likewise, if we could refine our culture of robust critique we would also be able to improve our responses to the acute and systemic problems in our society as well. For example, we might be able to comprehensively and confidently address the many harmful effects of social networks, the consolidation of power in centralized content platforms, a pervasive surveillance culture, the relegation of human agency by artificial intelligence, the role of information technology in our healthcare, the erosion of our democracy and individual freedoms, not to mention ameliorating or even reversing the effects of catastrophic climate change. The assertion that free software can help all possible situations is inspiring to me, but this optimistic hypothesis remains mostly unsubstantiated. Indeed, to all of the urgent concerns listed above, the free software movement is not yet collectively articulating a coherent and clear answer, and we can often still come across as having the same conversations regarding the name of our movement and other embarrassingly unimportant matters. That said, the discourse in this area has definitely been maturing in the past year or so, particularly with regard to diversity, and I am also looking forward to reading a number of pending publications in this wider area.

Given that we don’t find some topics particularly comfortable, it is only natural that we don’t tend to discuss them widely. But it is of cardinal importance that we overcome this habit: without robust and forthright self-critique, free software may actually start to contribute to society’s problems instead of diminishing them. Indeed, the twentieth century has repeatedly demonstrated that techno-utopian and accelerationist visions of the future can just as easily lead to dystopian outcomes as to positive ones, however well-meaning they were when they started.

To be clear, there is absolutely nothing inherently wrong with having more application sandboxes, discussions on the finer points of funding models or even more printer drivers, but prioritizing these discussions over others may be preventing the free software movement from being taken seriously in a wider context, as well as dramatically reducing our effectiveness in solving the very real problems in our real world.

VB: Closing thoughts – how could someone begin to get involved with the project? Any resources you would like to direct readers to, if they are interested in learning more about the project and about the driving ideas behind it?

CL: If you are interested in contributing to the Reproducible Builds project, the first thing to do is connect with our community, either via our IRC channel (#reproducible-builds on irc.oftc.net) or on our mailing list.

You can also visit the Contribute page and discover more technical details on the rest of our website, particularly via our many presentations and monthly news reports. You can also follow us on Twitter via @ReproBuilds.

Software Freedom Conservancy is in the middle of its annual fundraiser. Please help us continue our work by becoming a Supporter. Donate now and have your donation matched by a group of generous individuals who care deeply about software freedom.

Please email any comments on this entry to info@sfconservancy.org.
