Conservancy Blog

Displaying posts tagged licensing

If Software is My Copilot, Who Programmed My Software?

by Bradley M. Kuhn on February 3, 2022

Software freedom is our goal. Copyleft is a strategy to reach that goal. That tenet is oft forgotten by activists. Copyleft is even abused to advance proprietary goals. We too often see concern about the future of copyleft overshadow the necessary fundamental question: does a particular behavior or trend — and the inevitable outcomes of those behaviors and trends — increase or decrease users’ rights to copy, share, modify, and reinstall modified versions of their software? That question remains paramount as we face new challenges.

Introduced first by Microsoft’s GitHub in their Copilot product, computer-assisted software authorship by way of machine learning models presents a formidable challenge to software freedom’s future. Yet, we can, in fact, imagine a software freedom utopia that embodies this technology. Imagine that all software authors have access to the global archive of machine learning models — and they are fully reproducible. Everyone has equal rights to fork these models and train them further with their own datasets, provided that they release new models (and the input code) freely in the global archive. All code produced by these models is also made freely available under copyleft. All code that builds the models, all historical input sets, and all trained models are also made available to everyone under copyleft licenses.

While activists might quibble about minor details to optimize this imagined utopia, this thought experiment shows computer-assisted software authorship does not inherently negate software freedom. Rather, the rules, requirements, and policies that apply will determine whether software freedom is respected. To paraphrase Hamlet: there is nothing either good or bad, but the policy makes it so.

What’s the Worst That Could Happen?

[They are] not a good [person] who, without a protest, allows wrong to be committed … with the means which [they] help to supply.

John Stuart Mill, University of St. Andrews, 1 February 1867

Obviously, ignoring machine learning for computer-assisted software authorship will not usher in this software freedom utopia. Copyleft activists cannot stand idly by in this situation, but we must temper our attention by considering the likelihood of dystopian and problematic outcomes, and the options available to prevent them.

In response to Copilot’s announcement, pundits voiced, without evidence, a prevailing feeling of “Free Software had a good run, but I guess that’s over now”. Such predictions seem consistent with the well-documented overoptimism about the success of artificial intelligence. Rapid replacement of traditional software development methodologies seems unlikely. As such, we should not overestimate the likelihood that these new systems will accelerate proprietary software development while we simultaneously fail to prevent copylefted software from enabling that activity. The former may not come to pass, so we should not unduly fret about the latter, lest we misdirect resources. In short, AI is usually slow-moving, and produces incremental change far more often than it produces radical change. The problem is thus not imminent, nor is the damage irreversible. However, we must respond deliberately with all due celerity — and begin that work immediately.

Currently, there are two factors that influence the timing of our response. First, if GitHub’s Copilot becomes a non-beta product available to the programming public, that would indicate the necessity of an urgent response. Microsoft and GitHub are unlikely to share their product plans, so we cannot know for sure when this will occur. However, in the seven months since the first beta was made available, we’ve consistently heard anecdotally that more and more developers (particularly, FOSS developers!) have received beta invitations. Based on these (admittedly incomplete) facts, we must assume that a move from private beta to public deployment is imminent in 2022. This indicates some urgency of the problem.

Second, we already know that some of our worst fears are definitely true. Namely, that Microsoft and GitHub used copylefted software as part of Copilot’s training set.

Copilot was trained on “billions of lines of public code … written by others”. While GitHub has refused requests to release even a list of repositories included in the training set, the use of the word “public” indicates that only software with source-available licenses (even if not FOSS licenses) was input into Copilot. Furthermore, GitHub admits that during training, the system encountered a copy of the GPL more than 700,000 times. This effectively confirms that copylefted public code appears in the training set.

When questioned, former GNOME developer and GitHub CEO[0], Nat Friedman, declared publicly “(1) training ML systems on public data is fair use (2) the output belongs to the operator”. Friedman himself, as well as Microsoft and GitHub’s other executives and lawyers, have ignored Software Freedom Conservancy’s requests for clarification and/or evidence supporting these statements.

Meanwhile, GitHub continues to improve this system, trained only on publicly source-available software, and seeks to market it to new users, including those who otherwise use FOSS development tools. Users continue to report gaining access to the beta and are noticing improvements. Microsoft and GitHub’s public position is meanwhile clear: they claim to have no copyleft obligations for training the model, the model itself, and deploying the service. They also believe there are no licensing obligations for the output.

While Friedman ignored the community’s requests publicly, we inquired privately with Friedman[0] and other Microsoft and GitHub representatives in June 2021, asking for solid legal references for GitHub’s public legal positions of (1) and (2) above. They provided none, and reiterated, without evidence, that they believed the model does not contain copies of the software, and output produced by Copilot can be licensed under any license. We further asked: if there are no licensing concerns on either side, why did Microsoft not also train the system on their large proprietary codebases such as Office? They had no immediate answer. Microsoft and GitHub promised to get back to us, but have not.

This secrecy and non-cooperativeness is expected from a proprietary software company and its subsidiary, but leaves us only with speculative conclusions to inform a strategy for copyleft here. We can reliably guess that the companies will claim “fair use” as their primary justification for creating the model and offering the service, and will argue that both the output and the trained model are not “work[s] based on the Program” (GPLv2) nor do they “copy from or adapt all or part of the work[s] in a fashion requiring copyright permission” (GPLv3/AGPLv3). Furthermore, we can reliably conclude, given the continuing product promotion, that the companies have at least a medium-term commitment to Copilot.

In short, they have already hunkered down for a protracted disagreement. Their positions are now incumbent — using their resources and power to successfully charge copyleft activists to “prove them wrong”. But we do not have to accept their unsubstantiated arguments at face value. In fact, these areas are so substantially novel that almost every issue has no definitive answers, but we must nevertheless begin to formulate our position and our response to Microsoft and GitHub’s assault on copyleft.

Consider GitHub’s claim that “training ML systems on public data is fair use”. We have not found any case of note — at least in the USA — that truly contemplates that question. The only legal case in the USA to look near this question is Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015). The Supreme Court denied certiorari on this case; it is not legal precedent in all jurisdictions where Microsoft and GitHub operate.

Even more, that case considered a fact pattern centered around search, not authorship of new/derived works. Google had made copies of entire copyrighted books, not for the purpose of displaying them, but so users could (1) run search queries, and (2) see a “snippet” of the search hits (i.e., to see the search hit in context). The Second Circuit held Google’s copying of the books was “fair use” because searching and providing context added value exceeding what a user could obtain from their own copies, and Google’s product did not substitute the market for the books.

The analogous fact pattern for code is obvious: GitHub could offer a search tool that assists users in finding key public repositories (and specific lines of code within those repositories) that seemed to solve tasks of interest. Developers could then easily utilize those codebases in the usual, license-compliant ways. The actual Copilot fact pattern is not this one.

Meanwhile, the Authors Guild case begins and ends the list of major cases regarding machine learning systems and “fair use”. We should simply ignore GitHub’s risible claim that the “fair use question” on machine learning is settled.

Perhaps most importantly, in the USA, “fair use” is an affirmative defense to answer copyright infringement. In concrete terms, that means — particularly in cases where the circumstances are novel — a copyright holder brings an infringement lawsuit and then the alleged infringer shows in court that their actions met the relevant factors for “fair use” sufficiently. Frankly, we refuse to do these companies’ job for them. Copyleft activists need not tell Microsoft and GitHub why this isn’t “fair use”; rather, they need to tell us why training the model with copylefted code is “fair use” and prove that the trained model itself is not a “work based on” the GPL’d software.

GitHub has meanwhile artfully avoided the question of whether the trained model is a “work based on” the input. We contend that it probably is. However, given that “fair use” is an affirmative defense to copyright infringement, they are obviously anticipating a claim that the trained model is, in fact, a “work based on” the inputs to the model. Why else would they even bring up “fair use”, rather than simply say their use is fully non-infringing? Anyway, we have no way to even explore these questions authoritatively without examining the model, fully affixed in its tangible medium. We don’t expect GitHub to produce that unless compelled by a third party.

Indeed, discussion of these questions outside of a courtroom is moot. For this novel and contentious fact pattern, only a court decision can settle the matter adequately. As a strategic matter, copyleft activists should keep their own counsel about what we anticipate in the opposition’s “fair use” and/or non-infringement defenses, and the counter-arguments that we plan.

Copilot Users Should Worry

GitHub’s position does a great disservice to Copilot users. Their claim that “the output belongs to the operator” creates a false sense of legal justification. Users have already shown that Copilot can generate a substantial amount of unique, GPL’d code, and then (rather ironically, given GitHub’s claim that they removed the text of the GPL from the training set) also suggest a license that is non-copyleft. Friedman’s statement surely does not qualify as an indemnity for Copilot users who might face GPL enforcement actions. Users almost surely must construct their own “fair use” or “not copyrightable” defenses for Copilot’s output.

The length and detail of what Copilot can generate for users seems unbounded. The glaring example above appears prima facie to be copyright infringement; we expect further such problems. Consider the sheer amount that a fully functional and successful Copilot would generate. Surely, AI researchers seek the ability for Copilot to “figure out” that you are trying to solve some specific task when programming. The better Copilot gets at handing ready-made solutions to its users, the more likely it becomes that its output may offer the user copylefted software.

Copilot leaves copyleft compliance as an exercise for the user. Users likely face growing liability that only increases as Copilot improves. Users currently have no methods besides serendipity and educated guesses to know whether Copilot’s output is copyrighted by someone else. Proprietary software companies such as Synopsys provide so-called “scanning tools” that can search your proprietary codebase and find hidden copylefted software. However, the FOSS tools for that job are in their infancy and unlikely to develop quickly, since historically those who want those tools are companies that primarily develop proprietary software and seek to avoid copylefted software.

We recommend users who wish to avoid infringing the copyrights of others simply avoid Copilot.

On Copyleft Maximalism and Unilateral Capitulation

Draconian copyright law generally horrifies software freedom activists for good reason. Nearly all copyleft activists would prefer a true, multilateral rewriting of copyright rules that prioritized the interest of the general public and software rights. Copyleft exists primarily because of the long-standing political non-viability of a copyright law reboot. Nothing has changed in this regard; if anything, changing legislation has become an even more expensive lobbying proposition than it was at copyleft’s advent. Copyleft activists should expect proprietary software companies and media oligarchs to control copyright legislation indefinitely.

Fortunately, copyleft was designed specifically for this eventuality. Activists have called copyleft the “judo move” of software freedom, since copyleft uses the powerful copyright force (invented primarily by our opposition) against itself. That realization leads to a painful, but pragmatically necessary, awkwardness.

The issues herein — from training of machine learning models, to the copyright questions about those models, to the derivation questions about their output — are novel copyright questions. As software freedom activists, we are uniquely qualified to invent an ideal copyright structure for these technologies. But, without a path to promulgate such replacement copyright rules into the incumbent system, that exercise is futile. Furthermore, systems outside of copyright — including but not limited to EULAs, business agreements and patents — have long been used to proprietarize software without the need of copyright. The reality of facts on the ground dictates that we not concede the only wedge we have to compel software freedom; that wedge is copyleft.

Meanwhile, proprietary software companies regularly exploit any unilateral concessions on weakening of copyleft that FOSS projects make, while continuing to pursue copyright maximalism for their works. Particularly in novel areas, we must assume a copyleft maximalist approach — until courts or the legislature disarm all mechanisms to control users’ rights with regard to software. That adversarial process will frustrate us, but ultimately by choosing copyright as our primary tool, we already chose the courts as our battleground for contentious issues.

We all surely have our opinions about how copyleft should operate in these novel situations. We have even expressed some such opinions herein. But, ultimately, strong copyleft licenses do not defer the “what’s covered?” question to one individual or organization. The “judo” power comes from strong copyleft reaching to all of what copyright governs. When those issues are novel — and companies flaunt that novel manipulation of copylefted works — only a court can answer definitively.

A Community-Led Response

While these companies will likely not succeed in their efforts to disarm copyleft, they have nevertheless attacked the entire copyleft infrastructure. We must mount an effective response.

Software Freedom Conservancy has spent the last six months in deep internal discussions about this novel threat to the very efficacy of copyleft. We have a few ideas — a mix of short-term, medium-term and long-term strategies to address the problem. However, we recognize that a community (rather than the traditional BDFL) approach is needed — at least for this problem. Thus, putting first things first, we realized that we should gather the best minds in the software freedom community with direct experience in copyleft theory and practice. We will convene these individuals as a committee specifically chartered by Software Freedom Conservancy to publish, as quickly as reasonably possible, a series of recommendations to the community on how we should respond to the immediate threat to copyleft found in Copilot, and to analyze (long-term) the more general threat that AI-assisted programming techniques pose to the strategy of copyleft.

While we are not actively seeking applications for this committee, we do welcome anyone whom we have not yet solicited to participate to contact us and inquire. We will surely be unable to include everyone who is interested on the committee — either due to Conflicts of Interest or due to simple logistics of creating too large a committee. However, we will carefully consider anyone who expresses bona fide interest to participate.

Finally, as far as the pandemic and the available FOSS tools allow, we will convene public discussions. We will contemporaneously publish the committee’s minutes. If you’d like to get involved today in public discussions about this issue, please join the mailing list we launched today for this topic.

[0] In November 2021, Nat Friedman was replaced by Thomas Dohmke as GitHub’s CEO. However, to our knowledge, Dohmke has not retracted or clarified Friedman's comments, and at the time of writing, no one from GitHub or Microsoft that we spoke to had responded to our requests for clarification.

Tags: conservancy, law, licensing

First Update on the Vizio lawsuit

by Bradley M. Kuhn and Karen M. Sandler on November 30, 2021

Yesterday, we received from Vizio their first official response in our pending litigation against Vizio for their copyleft license violations. So, what was their response?

Did Vizio release the source code — as the GPL and LGPL require — for the modified versions of Linux, alsa-utils, GNU bash, GNU awk, BusyBox, dmesg, findutils, dmsetup, GNU tar, mount and selinux found in their TV’s firmwares? No.

Did Vizio propose a complete, corresponding source (CCS) candidate for us to review and give feedback on, so that we could help them get consumers who bought their TVs the source code they deserve? Nope.

Did Vizio argue that we had erred, and in fact, none of those programs we list above appear in their firmware? Not that either. (Unlikely though — after all, they surely know those programs are in their firmware!)

Instead, Vizio filed a request to “remove” the case from California State Court (into US federal court), which indicates Vizio's belief that consumers have no third-party beneficiary rights under copyleft! In other words, Vizio’s answer to this complaint is not to comply with the copyleft licenses, but instead to imply that Software Freedom Conservancy — and all other purchasers of the devices who might want to assert their right under GPL and LGPL to complete, corresponding source — have no right to even ask for that source code.

That’s right: Vizio’s filing implies that only copyright holders, and no one else, have a right to ask for source code under the GPL and LGPL. While we expected Vizio held this position (since they ultimately ignored us during our discussions with them in years past), Vizio has gone a disturbing step further and asked the federal United States District Court for the Central District of California to agree to the idea that not only do you as a consumer have no right to ask for source code, but that Californians have no right to even ask their state courts to consider the question!

Vizio’s strategy is to deny consumers their rights under copyleft licenses, and we intend to fight back.

We believe in complete transparency of the copyleft compliance process, and so encourage everyone to read the filings. We’ve even paid the Pacer fees and used the Recap browser plugin, so that all the documents in the case are freely available via the Recap project archives.

Software Freedom Conservancy’s annual fundraiser is happening right now! Please help us continue our work by becoming a Sustainer. Donate now and have your donation matched by a group of generous individuals who care deeply about software freedom.

Tags: conservancy, law, licensing

Trump's Social Media Platform and the Affero General Public License (of Mastodon)

by Bradley M. Kuhn on October 21, 2021

An analysis: Trump's Group has 30 days to remedy the violation, or their rights in the software are permanently terminated

In 2002, we used phrases like “Web 2.0” and “AJAX” to describe the revolution that was happening in web technology for average consumers. This was just before names like Twitter and Facebook became famous worldwide. Web 2.0 was the groundwork infrastructure of the “social media” to come.

As software policy folks, my colleagues and I knew that these technologies were catalysts for change. Software applications, traditionally purchased on media and installed explicitly, were now implicitly installed through web browsers — delivered automatically, or even sometimes run on the user's behalf on someone else's computer. As copyleft activists specifically, we knew that copyleft licensing would have to adjust, too.

In late 2001, I sat and read and reread section 2(c) of the GPLv2. After much thought, I saw how it could be adapted, using the geeky computer science concept called a quine — a program that has a feature to print its own source code for the user. A similar section to GPLv2§2(c) could be written that would assure that every user of a copylefted program on the Internet would be guaranteed the rights and freedoms to copy, modify, redistribute and/or reinstall their software — which was done by offering a source-code provision feature to every user on the network. The key concept behind the Affero GPL (AGPL) version 1 was born. Others drafted and released AGPLv1 based on my idea. Five years later, I was proudly in the “room where it happened” when Affero GPL version 3 was drafted. Some of the words in that section are ones I suggested.

We were imagining a lot about the future in those days; the task of copyleft licensing drafting requires trying to foresee how others might attempt to curtail the software rights and freedoms of others. Predicting the future is difficult and error-prone. Today, a piece of Affero GPLv3's future came to pass that I would not have predicted back in November 2007 at its release.

I invented that network source code disclosure provision of the AGPL — the copyleft license later applied to the Mastodon software — in 2002 in light of that very problem: parties who don't share our values might use (or even contribute to) software written by the FOSS community. The license purposefully treats everyone equally (even people we don't like or agree with), but they must operate under the same rules of the copyleft licenses that apply to everyone else.

Today, we saw the Trump Media and Technology Group ignoring those important rules — which were designed for the social good. Once caught in the act, Trump's Group scrambled and took the site down.

Early evidence strongly supports that Trump's Group publicly launched a so-called “test site” of their “Truth Social” product, based on the AGPLv3'd Mastodon software platform. Many users were able to create accounts and use it — briefly. However, when you put any site on the Internet licensed under AGPLv3, the AGPLv3 requires that you provide (to every user) an opportunity to receive the entire Corresponding Source for the website based on that code. These early users did not receive that source code, and Trump's Group is currently ignoring their very public requests for it. To comply with this important FOSS license, Trump's Group needs to immediately make that Corresponding Source available to all who used the site today while it was live. If they fail to do this within 30 days, their rights and permissions in the software are automatically and permanently terminated. That's how AGPLv3's cure provision works — no exceptions — even if you're a real estate mogul, reality television star, or even a former POTUS.

I and my colleagues at Software Freedom Conservancy are experts at investigating non-compliance with copyleft licenses and enforcing those licenses once we confirm the violations. We will be following this issue very closely and insisting that Trump's Group give the Corresponding Source to all who use the site.

Finally, it's worth noting that we could find no evidence that someone illegally broke into the website. All the evidence available on the Internet (as of 2021-10-22) indicates that the site was simply deployed live early as a test, and without proper configuration (such as pre-reserving some account names). Once discovered, people merely used the site legitimately to register accounts and use its features.

Update (2021-10-22): Some have asked us how this situation relates to our Principles of Community-Oriented GPL Enforcement, since we are publicly analyzing a copyleft violation. Historically, we did similarly with the Canonical, Ltd., Cambium, Ubiquiti, and Tesla (twice!) violations. We do believe that “confidentiality can increase receptiveness and responsiveness”, but once a story is already made widely known to the public by a third party, confidentiality is no longer possible, since the public already knows the details. At that moment, the need to educate the public supersedes any value in non-disclosure.

Tags: conservancy, GPL, licensing

"Tivoization" and Your Right to Install Under Copyleft

by Bradley M. Kuhn on July 23, 2021

Two schools of thought about the purpose of copyleft have been at odds for some time. Simply put, the question is: are copyleft licenses designed primarily to protect the rights of large companies that produce electronics and software products, or is copyleft designed primarily to protect individual users' rights to improve, modify, repair, and reinstall their software?

This debate quickly gets deep into complex policy questions. In the last few years, that general debate has slowly but surely focused almost entirely on the issue of users' ability to make effective use of FOSS on their own hardware by reinstalling their modified versions.

Historically, these nuanced policy questions about copyleft requirements have generally been discussed only in semi-public venues, and often fall prey to the tactic du jour: post-fact politics. I have realized in recent months that the failure to properly document and explain key historical narratives in copyleft history leaves software freedom activism at a disadvantage: well-resourced copyleft violators and their lawyers can use the ambiguity and confusion in the scant public record to spin false narratives and draw legal conclusions. While such legal conclusions should not be drawn (absent a Court ruling), companies have nevertheless pushed their views forward quite loudly recently. To use Herman and Chomsky's insightful phrasing, the incumbent power structures manufacture consent to their worldview to serve their interests, merely by being the loudest and most commonly heard voices.

Specifically on the issue of protections for the right to repair and reinstallation under copyleft licenses such as GPL, I am fortunate to have been a direct observer of many of the events that now serve as the connective tissue to build these false narratives about the GPL. However, I admit I have failed to write down and impart that knowledge to the general public in adequate measure, which has, in turn, inadvertently aided in promulgation of these false narratives. So, at least on the issue of “scripts used to control compilation and installation of the executable”, I hope this essay will serve as remedy. I, and everyone at Conservancy, all believe in intellectual transparency, and we strive to provide it wherever possible. The truth will out.

Installation Requirement Under GPLv2

Recent debates on this issue focus on the question of what is required to comply with the first two sentences of GPLv2§3¶2, which reads:

The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable.

Before explaining the historical understanding of these terms, I will, first of all, point out that any company or lawyer that seeks to do the bare minimum for compliance is likely not prioritizing users' rights to repair their software. In all compliance-related systems, bad actors seek a “race to the bottom”. Rules like GPL are similar to environmental regulations, workplace safety requirements, and the like. The more minimalistic the interpretation of the requirements, the more companies can profit from only doing the bare minimum.

Nevertheless, it has been the goal of organizations that advocate for software freedom — such as Conservancy ourselves or FSF — to state clearly our view about the minimum requirements as best we can. I often wonder if this strategy has been beneficial to software freedom. Sadly, the answer from the industry has primarily been to hear us clearly about the minimum requirements, and then work over time to lower the GPL compliance bar — even if it requires inaccurately quoting FOSS leaders and misleading the public about history. Most recently, industry has engaged in this bar-lowering process with GPLv2§3¶2 and installation information under GPLv2 generally. My hope herein is to fully explain the history of interpretation of GPLv2§3¶2 by pro-copyleft advocates, and explore the misdirection of arguments of those who seek to curtail users' rights to install modified versions of their GPL'd software.

FSF CLE Classes, 2003-2004

I began volunteering for licensing and GPL enforcement work for FSF in 1997, and officially worked on my first GPL enforcement action in 1999. I became an FSF employee that year, and worked there until 2005. I thereafter remained affiliated with the organization in various roles until my final affiliations ended with FSF in October 2019. Most notably, I was the Executive Director of FSF from 2001-2005. During that time, I led FSF's GPL enforcement and copyleft education measures, including the CLE classes (first taught in 2003-2004).

In preparation for teaching those courses, I began to write the tutorial which later became the Copyleft Guide. To begin that effort, I collected, curated, and verified interpretations and intent of the GPLv2 with Richard Stallman, Bob Chassell (a key but oft-forgotten leader of FSF during the 1980s and 1990s), and FSF's legal counsels. One of the many outcomes of that endeavor was that I wrote these words on 2003-05-09:

GPLv2§3 requires that the source code include “meta-material” like scripts, interface definitions, and other material that is used to “control compilation and installation” of the binaries.

In GPL enforcement actions at the time, during our “complete, corresponding source (CCS) checks”, we verified that the source code was not only complete, but that it corresponded to the binaries on the vendors' devices, and that we could install modified versions of the software. This was a standard part of any check to verify GPLv2 compliance. Passing this check was required, then and now, by FSF and Conservancy before distribution rights are restored after a violation.

That position was not controversial when I, along with then FSF counsel (Daniel Ravicher), taught it to lawyers in 2003 and 2004 on FSF's behalf. Nevertheless, today, many act as if this interpretation and intent of GPLv2§3¶2 is a recent and novel phenomenon, rather than a long-standing position held by all copyleft activists for at least 18 years. Today, most companies and lawyers argue (incorrectly, IMO) that users have no rights to reinstall their GPLv2'd software.

The 2003 TiVo GPL Enforcement Action

Even before teaching those CLE classes, as (then) FSF's Executive Director, I led the GPLv2 enforcement effort against TiVo. I've often seen those with only a passing familiarity with the subject jump to inaccurate conclusions about that enforcement action that tend to conveniently fit their policy agenda. I herein recount the entire history regarding the TiVo GPLv2 violation and how it led to the “tivoization” rhetoric. Since that rhetoric is often treated as dispositive truthiness that GPLv2 does not ensure the users' rights to repair by reinstalling their modified GPLv2'd software, we should examine the actual facts that back the rhetoric, and examine the conclusions that others make about GPLv2 based on it.

First and foremost, TiVo's GPL violation initially had nothing specific to do with GPLv2§3¶2. TiVo never raised any intention to not comply with that section. In fact, to my recollection, TiVo never disputed nor disagreed with FSF's interpretation on that section. The initial violation was a standard GPLv2§3(b) violation, wherein some distributions of the TiVo device had an offer for source that could not be successfully exercised. (At the time) acting on behalf of FSF, I contacted TiVo on 2002-06-11 to raise this issue, and TiVo responded favorably and indicated they wanted to resolve the matter. As is usual practice in all GPL enforcement matters, I and my (then) team did our due diligence to verify full compliance, including any other potential issues under GPLv2. Eventually, my FSF colleague (David Turner) and I did a CCS check of TiVo's software. The procedures, criteria, and interpretations that Turner and I used then are exactly the same as the ones that Denver and I use today at Conservancy. To my knowledge (based on recent personal conversations with FSF staff), FSF still uses these same procedures, criteria, and interpretations when FSF has the rare occasion to do GPLv2 enforcement these days.

Once the GPLv2§3(b) violation was resolved, Turner and I discovered — as has been true in nearly every one of the hundreds of GPL compliance matters that I've worked — that “the scripts used to control compilation and installation of the executable” were incomplete. When this was identified, TiVo's solution was to, in fact, agree with the interpretation that such instructions are mandatory and must be provided, and they provided them. To my knowledge, TiVo was in full compliance with the GPLv2, including the inclusion of instructions for installation as required under GPLv2. People were able to reinstall Linux on their TiVo boxes thanks to our enforcement action; community resources on how to take advantage of GPL reinstallation rights on TiVos (of that era) are still readily available! At the time0, TiVo was doing the right thing in providing what the GPLv2 requires — including the ability to reinstall GNU and Linux software onto the actual device. Keep in mind: this enforcement action, and the compliance achieved from it, occurred years before the GPLv3 process began.

Understanding The Tivoization Rhetoric

So, what did TiVo do that was so objectionable? What behavior, which TiVo was allowed to engage in under GPLv2, did Stallman set out to prevent when he went to work drafting GPLv3? It's not, as others widely misreport, that TiVo forbade reinstallation “of the GPL'd software” itself. To my knowledge, TiVo never prevented such reinstallation. No one involved, including me, Stallman, TiVo, or anyone at FSF at the time believed that GPLv2 permitted TiVo to withhold the installation information for the GPL'd software itself. FSF demanded that TiVo provide its users the ability to reinstall Linux (and other GPL'd software, such as GNU bash). What TiVo later did, which some software freedom activists (including Stallman) found objectionable, was that TiVo designed the reinstallation process of that GPLv2'd software to cause the proprietary TiVo application to cease to function. I recall this being widely discussed when TiVo Series 3 was released in mid-2006, and my understanding was that all Series 3 devices had this particular anti-feature. (There were rumors that some of the Series 2 had this anti-feature as well, but not all models.) In other words, if you decided to modify your copy of Linux for the TiVo device and reinstall Linux, the TiVo userspace application would realize that cryptographic lockdown had been breached, and that proprietary software would no longer function. By exercising your reinstallation rights under GPLv2, you'd turn your TiVo DVR into a stand-alone server with some video processing equipment attached. You could use Kodi (which at the time had a different name) to turn that former-TiVo into a FOSS DVR, but your ability to use the proprietary DVR software from TiVo was lost — likely permanently.

Most have of course heard of the negative term “tivoization” that Richard Stallman popularized during the GPLv3 process — which was contemporaneous with the release of the TiVo Series 3. I nevertheless asked Stallman to not use that term — both then and many times since. I still disagree with Stallman's policy position on the narrow issue of preserving proprietary userspace functionality. Specifically, I just don't think it matters whether, upon upgrading your copylefted software, the proprietary software that was (to use GPLv2's terminology) “merely aggregated” alongside the copylefted software continues to function. I felt and still feel that it's actually better policy to break the (“merely aggregated”) proprietary software (as GPLv2 permits). My policy view is that this breakage inspires and encourages users to install a FOSS alternative for the userspace applications after they've reinstalled the FOSS operating system. Nevertheless, Stallman found this practice (using crypto lock-down to force the proprietary software to fail) illegitimate. He noted publicly that GPLv2 didn't prevent this behavior, and wanted (and wrote, as explained below) a GPLv3 draft that prohibited that behavior.

How Discussion Focused on Cryptographic Lockdown Generally

To this day, I remain frustrated that many pro-GPLv3 advocates, during the GPLv3 drafting process, saw fit to imply ideas that they had no basis to believe were true about GPLv2. We all knew, long before GPLv3 drafting began, that there was a clear installation requirement in GPLv2 — the word “install” appears prominently. The training materials that I developed for FSF (described above) were vetted through Stallman and FSF's legal counsel before I used them to teach CLE classes. If anyone received a different impression, it was surely a miscommunication due to the aggressive “GPLv3 is much better” rhetoric of the time.

Meanwhile, much of the debate about cryptographic lockdown under GPL centered around the question of disclosure of specific authorization keys. It was said, probably correctly, that GPLv2 did not mandate disclosure of any specific authorization key. What was often left unsaid (apparently in an effort to make GPLv2 seem weaker than it actually was) was what GPLv2 did still require: a functional installation method without disclosure of authorization keys. For example, it would, in my personal opinion, be entirely compliant with the GPLv2 to simply disable the secure boot chain, providing no path back to the vendor-provided cryptographically signed firmware1, and allow the user to reinstall only the GPLv2'd components on the device — never to return to the stock vendor firmware. I suspect such a restriction would be prohibited under GPLv3, since GPLv3 clearly requires not just that you give a viable install path (as GPLv2 does), but additionally that you disclose the authorization keys.

We can debate whether this copyleft expansion under GPLv3 was good policy. What is not up for debate is the simple concept that adding more requirements to a later revision of a licensing document does not change the intent or standing requirements in the older document. That's true even if the authors of the original document, for marketing or other reasons, choose later to denigrate their own past work. As it turns out, historically, we know what GPLv2 intended because its author, Richard Stallman, talked so extensively about what he sought to accomplish by creating GPLv2.

Going back to the early 1990s and contemporaneous with GPLv2's publication, Stallman himself has been quite fond of telling his experience with the broken MIT printer, for which he begged for the source code and didn't receive it. Stallman doesn't end this story with: “what I really wanted was to get the source code to that printer so I could build my own printer from scratch and then compile and make a fresh install of that printer software on a new printer”. No, Stallman was clear that his goal was to fix the bugs on the printer that MIT already had, using the source code for that very same printer. Stallman expected that the source code for the printer would include information sufficient for him to recompile and reinstall the software onto the very same device. Larger printers of that era were simply embedded devices of unusual size. They have only minor technical differences from the TVs, wireless routers, and dozens of other Linux-based embedded devices we have today. Computers are tiny today when they were large before, but their functioning and basic methods of operations have not changed. Install meant install then. Install means install now. And FSF, Conservancy, every software freedom activist and every legitimate copyleft theorist that I've ever met still agrees with this! The intent of the GPLv2 is clear and always has been: to allow reinstallation of modified versions of the GPL'd software into the same place where the binaries were installed when you got the computer in the first place, and to reap the benefits of that change. It's ludicrous to suggest Stallman meant anything other than that when he wrote GPLv2.

Recent Confirmation from FSF

Nevertheless, opponents of users' right to repair their software persist in their claims that GPLv2 doesn't intend this. We at Conservancy hear it regularly; GPL violators frequently send us a recently compiled dossier of curated comments by FSF — quoted (and some even misquoted) completely out of context — that purport to “prove” that FSF does not want users to repair their embedded devices that contain GPLv2'd software. My affiliations with FSF had already ended by the time this dossier started making the rounds, so we did what any reasonable person would do: we asked FSF to clarify their opinion for us directly.

The opportunity to ask presented itself about a year ago, in May 2020, when Conservancy worked with FSF's Executive Director (John Sullivan), FSF's Licensing and Compliance Manager (Donald Robertson), and FSF's (then) legal counsel (Marc Jones) on a joint GPLv2 enforcement matter against a pernicious and intentional violator who had infringed the copyrights of GNU Bash and Linux. (The violator was using a GPLv2'd fork of Bash.) We took the opportunity then to reaffirm our joint understanding of this 18-year-old interpretation of the GPLv2 as part of that specific joint embedded device enforcement action. We discussed the matter at length and confirmed everyone's understanding remained unchanged from the prior FSF positions going back (at least) 18 years.

At the end of our discussions, on 2020-05-11, I wrote to Sullivan saying: “I just want to summarize what I believe was our mutual view on the phone call last Friday. If you could confirm that I have summarized correctly, we'll use the below as a basis of our response to [those who are currently inquiring about this issue]:”

The GPLv2 does not have any specific requirement for preservation of the ability to reinstall proprietary-software-centric vendor-provided firmwares (even if such firmwares contain some GPLv2'd works) on embedded systems, provided that the downstream user (i.e., the consumer with the device) can build, install, and (repeatedly and successfully) reinstall a firmware containing only the copylefted components (such as Linux+Bash).

John replied on 2020-05-13 with: “Bradley, We suggest just a couple of small tweaks:”

The GPLv2 does not have any specific requirement for preservation of the ability to reinstall proprietary-software-centric vendor-provided firmwares (even if such firmwares contain some GPLv2'd works) on embedded systems, provided that the downstream user (i.e., the consumer with the device) can build, install, run, and (repeatedly and successfully) reinstall a firmware containing at least the copylefted components (such as Linux+Bash).

As you can see, Sullivan advocates inclusion of the term “run” (which admittedly I had accidentally failed to include in my original draft!). It was a great addition, and Sullivan's statement matched exactly the historical interpretation that FSF espoused when I worked there in 2003. Indeed, it read to me almost exactly as Chassell had originally taught it to me when I was volunteering for FSF in the 1990s. Furthermore, the quote from Sullivan above matches the position that I vetted with Stallman throughout my time at FSF, right up until the end of my affiliation with FSF in 2019. Thus, FSF's position, as stated above, on the question of installation under GPLv2 has remained consistent from 2003-2020.

This leaves me to wonder: how is it that so many people came to conclude that FSF's view was that the GPLv2 didn't speak to “install” at all? I can only speculate, but my view is that (a) people heard what they wanted to hear, (b) a few (but not most, or even many) Linux developers spoke widely that it was their personal view that installation information isn't required by GPLv2 (notwithstanding the obvious textual requirement), and (c) in their fervor to ballyhoo the GPLv3 as an improvement, some GPLv3 advocates chose to denigrate GPLv2 as “not good enough” — in an apparent effort to frighten pro-GPLv2 copyleft activists to rush away from GPLv2 as quickly as possible.

Stallman on GPLv3 Installation Information

In April 2012, I started an email thread with Stallman yet again about the term “tivoization”. I again urged him to stop using the term, because, in my view, what TiVo did for GPLv2 compliance was not bad for software freedom. I wrote to Stallman at that time to again remind him that upgrades of TiVo's Linux installation “can be done successfully” and that (at least for the TiVo products that FSF declared in compliance) the only offense was one that GPLv2 permits: merely disabling the proprietary components from working after reinstallation of the GPLv2'd components. At that time, Stallman informed me that he had indeed designed the GPLv3 to deal with this situation. Specifically, I asked him on 2012-05-05:

[so], these words in GPLv3: “The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.” mean that the proprietary software that is not a combined work with the GPLv3'd work must also function?

Stallman replied on 2012-05-06 with:

Absolutely. And I wrote it specifically to do that!

Why This History Must Be Told

Generally speaking, long narratives of past events that have hitherto lived only in oral history make for great podcast or post-conference-dinner fodder, but they rarely make for good blog posts. Nevertheless, I've explained all this here in painstaking detail to counter the rising swell of opposition to users' right to repair their GPLv2'd software installations. Initially, these efforts to curtail the right to reinstall under GPLv2 were done clandestinely — for example, by spreading the aforementioned misleading dossier. Recently, however, the effort has gone public.

In mid-July 2021, a lawyer named McCoy Smith, who tries to make his living (in part) representing GPL violators, published an article that makes outrageous and inaccurate claims about these long-standing positions held by both Conservancy and FSF. We at Conservancy don't fear transparency, and we urge you to read McCoy's response to Denver's article, as well as Denver's original article, and then reread this one that responds to McCoy's argument. You should decide for yourself who has the better argument, and decide whether or not we've adequately answered McCoy's outrageous and inaccurate claims. In our view, McCoy spins a false narrative about the differences between GPLv2 and GPLv3 regarding install, and provides specious evidence for this claim. I hope that the historical facts that I describe above clarify this issue.

A few of McCoy's fundamental arguments are easily disputed by the historical facts that I outlined above:

  • McCoy accused me and Conservancy of “Historical Revisionism”, by claiming my words about GPLv2§3 were “a recent effort … to reinterpret the requirements of GPLv2”. I've shown above, using reliable and accurate revision history logs, that those words, which McCoy claims were recent, were written and published in May 2003.
  • McCoy states that the objection to TiVo was regarding prohibition of reinstallation of the GPL'd binaries. I've confirmed above that during FSF's enforcement action against TiVo, TiVo agreed to allow reinstallation of the GPL'd binaries but caused the proprietary software not to function, and that FSF took the position that GPLv2 required reinstallation of GPLv2'd binaries to function.
  • McCoy claims that GPLv2's original intent was never to allow installation. I've shown above that Stallman, the author of GPLv2, specifically knew about situations of embedded device proprietarization before GPLv2 was drafted, and, in contemporaneous and ongoing rhetoric, spoke clearly that he intended to preserve and advance users' right to repair their software by engaging in truly functional reinstallation of GPL'd binaries into the actual location.
  • McCoy claims that FSF does not share Conservancy's position about installation under GPLv2. I've shown specific text written by FSF's Executive Director, which was also verified by FSF's legal counsel and FSF's Licensing and Compliance Manager as recently as May 2020 — wherein Conservancy and FSF are in full agreement. I can also further confirm that I spoke with John Sullivan on the telephone earlier this week, and he reconfirmed that he still agrees with the paragraph as written as correct policy for the situation.


I then quote my 2012 exchange with Stallman to point out clearly: the installation information definition in GPLv3 expands the requirements and does not reduce the existing installation requirements that we all saw as present in GPLv2 from its first publication in 1992. McCoy's article contains a simple logical fallacy: it assumes that since the installation information requirements in the GPLv3 are (in some respects) more expansive than those in GPLv2, the requirements for installation information in GPLv2 are non-existent and/or are diminished merely through public discussion of GPLv3's policy goals by FSF during GPLv3's drafting. As I show above, it's clear that Stallman's rhetoric about extending installation information requirements in GPLv3 had complex additional policy goals that don't exist in GPLv2. Specifically, I don't think that GPLv2 reinstallation requires that all “merely aggregated” works continue to function as designed upon reinstallation of the GPLv2 works. Stallman has agreed with that GPLv2 interpretation, but differed from me regarding forward-looking policy (in that he finds such disabling of proprietary software deplorable) and thus Stallman wrote GPLv3 to prevent that practice that GPLv2 permits.

Furthermore, and most importantly, I quote the recent May 2020 exchange with Sullivan to point out that FSF policy regarding GPLv2 and installation has not wavered since FSF established it during Bob Chassell's time, which continued on into my time as FSF's Executive Director and then into Peter Brown's and John Sullivan's time, too. As I've shown, this interpretation of GPLv2 installation requirements is at least a 17-year unbroken chain from May 2003 all the way through to May 2020. If Bob Chassell were still alive today, I'm sure he could account for that position remaining consistent in the 1990s, too.

There is also a central and inherent flaw in McCoy's underlying argument: the idea that FSF's view, or Linus' view, or any single individual's or organization's view, is what matters. The license says what it says. If the license steward has a view, it would not mean their view is dispositive, and I say that knowing that their view happens to agree with mine! Indeed, Linus Torvalds has stated he doesn't agree with FSF's views about GPLv2, and has his own views, which McCoy himself quotes. (I'm not sure why McCoy thinks that forwards his argument, because Linus' view differing from FSF undercuts McCoy's argument that what FSF said during GPLv3 drafting is relevant.) Other contributors to Linux disagree with both FSF's and Linus' view; many prominent Linux developers have told me that they agree with Conservancy and/or FSF about this. Others have told me they have an even broader interpretation of the installation requirements under GPLv2 than I do!

Thus, McCoy makes a classic “appeal to authority” fallacy as the center of his argument. Regardless of McCoy's mostly unsupported opinion, I suspect even he would agree that only three things will really definitively matter regarding this issue: (a) what the wronged party who didn't get their complete, corresponding source code believes, (b) what the entity refusing to give them that source code believes, and (c) what the Court says when the former sues the latter. All else is simply bluster — full of sound and fury, but signifying nothing.

A Challenge to Debate

As I was completing drafting on this article, the Linux Foundation sent me a rejection letter for my talk about this issue at their annual Open Source Summit (which took place in September 2021), and simultaneously announced McCoy will speak on this matter instead. I invited McCoy to not take the easy way out of presenting his work unquestioned to a friendly audience. I would have been glad to come to the Open Source Summit in September and debate McCoy publicly on this issue during this session. I believe the audience would have benefited from hearing more than just McCoy's anti-software-repair view of this issue. Sadly, McCoy did not wish to debate me at Open Source Summit, yet still quoted me and Denver extensively in his talk without giving us the chance to respond.

Finally, as a reminder, please keep in mind that (as I already said in the text above), I no longer have any affiliation with FSF (since October 2019) and do not speak for them — which is precisely why I quote the words they told me.

0 Please note that I have not personally looked into TiVo's GPL compliance since late 2003. As such, it's entirely possible that TiVo models released from 2004 onward may have violated GPLv2§3¶2 and failed to include required “scripts used to control compilation and installation of the executable”. However, any later non-compliance is not capitulation by me, FSF, Conservancy or anyone else that McCoy's and others' interpretation of that clause is correct.

1 Please be abundantly clear that even as I give an interpretation of what I happen to believe is correct at this given moment, I'm a flawed human being capable of error. (Also, IANAL and TINLA.) I can misspeak, misstate, and otherwise just be plain wrong about something one way or the other. This is also true of FSF, its representatives, and all the other pundits like McCoy Smith who opine on this question. One of the horrible “race to the bottom” traps that GPL violators constantly lay for us is unrelenting pressure that we choose between (a) reducing what we believe a given license requires, or (b) suing them to ask the Court to uphold our view. No one escapes that pressure cooker unscathed; nearly every pro-copyleft activist (including me) has fallen into this trap, and succumbed to the pressure of (a) at least once. I know, even as I write this footnote, that someday I'm going to have a GPL violator's lawyer quoting this blog post back to me in a deposition about some esoteric, “race to the bottom” issue of GPL compliance. They're going to look for a way to twist my words to argue that somehow I've given their client carte blanche to trample users' rights that GPL protects. Everyone who stands up for copyleft faces this constant challenge now that intentional GPL violations are the norm rather than the exception. Conservancy simply will not capitulate when standing up for users' rights to copy, share, modify, repair, reinstall and reinstall modified versions of their software on the devices they own.

Tags: conservancy, GPL, law, licensing
