Conservancy Blog
Displaying posts tagged law
If Software is My Copilot, Who Programmed My Software?
by
on February 3, 2022Software freedom is our goal. Copyleft is a strategy to reach that goal. That tenet is oft forgotten by activists. Copyleft is even abused to advance proprietary goals. We too often see concern about the future of copyleft overshadow the necessary fundamental question: does a particular behavior or trend — and the inevitable outcomes of those behaviors and trends — increase or decrease users’ rights to copy, share, modify, and reinstall modified versions of their software? That question remains paramount as we face new challenges.
Introduced first by Microsoft’s GitHub in their Copilot product, computer-assisted software authorship by way of machine learning models presents a formidable challenge to software freedom’s future. Yet, we can, in fact, imagine a software freedom utopia that embodies this technology. Imagine that all software authors have access to the global archive of machine learning models — and they are fullly reproducible. Everyone has equal rights to fork these models, train them further with their own datasets, provided that they must release new models (and the input code) freely in the global archive. All code produced by these models is also made freely available under copyleft. All code that builds the models, all historical input sets, and all trained models are all also made available to everyone under copyleft licenses.
While activists might quibble about minor details to optimize imagined utopia, this thought experiment shows computer-assisted software authorship does not inherently negate software freedom. Rather, the rules, requirements, and policies that apply will determine whether software freedom is respected. To paraphrase Hamlet: there is nothing either good or bad, but the policy makes it so.
What’s the Worse That Could Happen?
[They are] not a good [person] who, without a protest, allows wrong to be committed … with the means which [they] help to supply.
— John Stewart Mill, University of St. Andrews, 1 February 1867
Obviously, ignoring machine learning for computer-assisted software authorship will not usher in this software freedom utopia. Copyleft activists cannot stand idly by in this situation, but we must temper our attention by considering the likelihood of dystopian and problematic outcomes, and the options available to prevent them.
In response to Copilot’s announcement, pundits speculated, without evidence, a prevailing feeling of “Free Software had a good run, but I guess that’s over now”. Such predictions seem consistent with the well-documented overoptimism of artificial intelligence success. Rapid replacement of traditional software development methodologies seem unlikely. As such, we should not overestimate the likelihood that these new systems will both accelerate proprietary software development, while we simultaneously fail to prevent copylefted software from enabling that activity. The former may not come to pass, so we should not unduly fret about the latter, lest we misdirect resources. In short, AI is usually slow-moving, and produces incremental change far more often than it produces radical change. The problem is thus not imminent nor the damage irreversible. However, we must respond deliberately with all due celerity — and begin that work immediately.
Currently, there are two factors that influence the timing of our response. First, if GitHub’s Copilot becomes a non-beta product available to the programming public, that would indicate necessity of an urgent response. Microsoft and GitHub are unlikely to share their product plans, so we cannot know for sure when this will occur. However, in the seven months since the first beta was made available, we’ve consistently heard anecdotally that more and more developers (particularly, FOSS developers!) have received beta invitations. Based on these (admittedly incomplete) facts, we must assume that a move from private beta to public deployment is imminent in 2022. This indicates some urgency of the problem.
Second, we already know that some of our worst fears are definitely true. Namely, that Microsoft and GitHub used copylefted software as part of Copilot’s training set.
Copilot was trained on “billions of lines of public code … written by others”. While GitHub has refused requests to release even a list of repositories included in the training set, the use of the word “public” indicates that only software with source-available licenses (even if not FOSS licenses) were input into Copilot. Furthermore, GitHub admits that during training, the system encountered a copy of the GPL more than 700,000 times. This effectively confirms that copylefted public code appears in the training set.
When questioned, former GNOME developer and GitHub CEO0, Nat Friedman, declared publicly “(1) training ML systems on public data is fair use (2) the output belongs to the operator”. Friedman himself, as well as Microsoft and GitHub’s other executives and lawyers, have ignored Software Freedom Conservancy’s requests for clarification and/or evidence supporting these statements.
Meanwhile, GitHub continues to improve this system, trained only on publicly source-available software, and seeks to market it to new users, including those who otherwise use FOSS development tools. Users continue to report gaining access to the beta and are noticing improvements. Microsoft and GitHub’s public position is meanwhile clear: they claim to have no copyleft obligations for training the model, the model itself, and deploying the service. They also believe there are no licensing obligations for the output.
While Friedman ignored the community’s requests publicly, we inquired privately with Friedman0 and other Microsoft and GitHub representatives in June 2021, asking for solid legal references for GitHub’s public legal positions of (1) and (2) above. They provided none, and reiterated, without evidence, that they believed the model does not contain copies of the software, and output produced by Copilot can be licensed under any license. We further asked if there are no licensing concerns on either side, why did Microsoft not also train the system on their large proprietary codebases such as Office? They had no immediate answer. Microsoft and GitHub promised to get back to us, but have not.
This secrecy and non-cooperativeness is expected from a proprietary software company and its subsidiary, but leaves us only with speculative conclusions to inform a strategy for copyleft here. We can reliably guess that the companies will claim “fair use” as their primary justification for creating the model and offering the service, and will argue that both the output and the trained model are not “work[s] based on the Program” (GPLv2) nor do they “copy from or adapt all or part of the work[s] in a fashion requiring copyright permission” (GPLv3/AGPLv3). Furthermore, we can reliably conclude, given the continuing product promotion, that the companies have at least a medium-term commitment to Copilot.
In short, they have already hunkered down for a protracted disagreement. Their positions are now incumbent — using their resources and power to successfully charge copyleft activists to “prove them wrong”. But we do not have to accept their unsubstantiated arguments at face value. In fact, these areas are so substantially novel that almost every issue has no definitive answers, but we must nevertheless begin to formulate our position and our response to Microsoft and GitHub’s assault on copyleft.
Trained Models, Fair Use, and Copyright Infringement
Consider GitHub’s claim that “training ML systems on public data is fair use”. We have not found any case of note — at least in the USA — that truly contemplates that question. The only legal case in the USA to look near this question is Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015). The Supreme Court denied certiorari on this case; it is not legal precedent in all jurisdictions where Microsoft and GitHub operate.
Even more, that case considered a fact pattern centered around search, not authorship of new/derived works. Google had made copies of entire copyrighted books, not for the purpose of displaying them, but so users could (1) run search queries, and (2) see a “snippet” of the search hits (i.e., to see the search hit in context). The Second Circuit held Google’s copying of the books was “fair use” because searching and providing context added value exceeding what a user could obtain from their own copies, and Google’s product did not substitute the market for the books.
The analogous fact pattern for code is obvious: GitHub could offer a search tool that assists users in finding key public repositories (and specific lines of code within those repositories) that seemed to solve tasks of interest. Developers could then easily utilitize those codebases in the usual, license-compliant ways. The actual Copilot fact pattern is not this one.
Meanwhile, the Authors Guild case begins and ends the list of major cases regarding machine learning systems and “fair use”. We should simply ignore GitHub’s risible claim that the “fair use question” on machine learning is settled.
Perhaps most importantly, in the USA, “fair use” is an affirmative defense to answer copyright infringement. In concrete terms, that means — particularly in cases where the circumstances are novel — a copyright holder brings an infringement lawsuit and then the alleged infringer shows in court that their actions met the relevant factors for “fair use” sufficiently. Frankly, we refuse to do these companies’ job for them. Copyleft activists need not tell Microsoft and GitHub why this isn’t “fair use”, rather, they need to tell us why training the model with copylefted code is “fair use” and prove that the trained model itself is not a “work based on” the GPL’d software.
GitHub has meanwhile artfully avoided the question of whether the trained model is a “work based on” the input. We contend that it probably is. However, given that “fair use” is an affirmative defense to copyright infringement, they are obviously anticipating a claim that the trained model is, in fact, a “work based on” the inputs to the model. Why else would they even bring up “fair use”, rather than simply say their use is fully non-infringing? Anyway, we have no way to even explore these questions authoritatively without examining the model, fully affixed in its tangible medium. We don’t expect GitHub to produce that unless compelled by a third party.
Indeed, discussion of these questions outside of a courtroom is moot. For this novel and contentious fact pattern, only a court decision can settle the matter adequately. As a strategic matter, copyleft activists should keep their own counsel about what we anticipate in the opposition’s “fair use” and/or non-infringement defenses, and the counter-arguments that we plan.
Copilot Users Should Worry
GitHub’s position does a great disservice to Copilot users. Their claim that “the output belongs to the operator” creates a false sense of legal justification. Users have already shown that Copilot can generate a substantial amount of unique, GPL’d code, and then (rather ironically, given GitHub’s claim that they removed the text of the GPL from the training set) also suggest a license that is non-copyleft. Friedman’s statement surely does not qualify as an indemnity for Copilot users who might face GPL enforcement actions. Users almost surely must construct their own “fair use” or “not copyrightable” defenses for Copilot’s output.
The length and detail of what Copilot can generate for users seems unbounded. The glaring example above appears primia facie to be copyright infringement; we expect further such problems. Consider the sheer amount that a fully functional and successful Copilot would generate. Surely, AI researchers seek the ability for Copilot to “figure out” that you are trying to solve some specific task when programming. The better Copilot gets at handing ready-made solutions to its users, the more likely it becomes that its output may offer the user copylefted software.
Copilot leaves copyleft compliance as an exercise for the user. Users likely face growing liability that only increases as Copilot improves. Users currently have no methods besides serendipity and educated guesses to know whether Copilot’s output is copyrighted by someone else. Proprietary software companies such as Synopsys provide so-called “scanning tools” — that can search your proprietary codebase and find hidden copylefted software. However, the FOSS tools for that job are in their infancy and unlikely to develop quickly, since historically those who want those tools are companies that primarily develop proprietary software and seek to avoid copylefted software.
We recommend users who wish to avoid infringing the copyrights of others simply avoid Copilot.
On Copyleft Maximalism and Unilateral Capitulation
Draconian copyright law generally horrifies software freedom activists for good reason. Nearly all copyleft activists would prefer a true, multilateral rewriting of copyright rules that prioritized the interest of the general public and software rights. Copyleft exists primarily because of the long-standing political non-viability of a copyright law reboot. Nothing has changed in this regard; if anything, changing legislation has become an even more expensive lobbying proposition than it was at copyleft’s advent. Copyleft activists should expect, indefinitely, for proprietary software companies and media oligarchs to control copyright legislation.
Fortunately, copyleft was designed specifically for this eventuality. Activists have called copyleft the “judo move” of software freedom, since copyleft uses the powerful copyright force (invented primarily by our opposition) against itself. That realization leads to a painful, but pragmatically necessary, awkwardness.
The issues herein — from training of machine learning models, to the copyright questions about those models, to the derivation questions about their output — are novel copyright questions. As software freedom activists, we are uniquely qualified to invent an ideal copyright structure for these technologies. But, without a path to promulgate such replacement copyright rules into the incumbent system, that exercise is futile. Furthermore, systems outside of copyright — including but not limited to EULAs, business agreements and patents — have long been used to proprietarize software without the need of copyright. Reality of facts on the ground dictate that we not concede the only wedge we have to compel software freedom; that wedge is copyleft.
Meanwhile, proprietary software companies regularly exploit any unilateral concessions on weakening of copyleft that FOSS projects make, while continuing to pursue copyright maximalism for their works. Particularly in novel areas, we must assume a copyleft maximalist approach — until courts or the legislature disarm all mechanisms to control users’ rights with regard to software. That adversarial process will frustrate us, but ultimately by choosing copyright as our primary tool, we already chose the courts as our battleground for contentious issues.
We all surely have our opinions about how copyleft should operate in these novel situations. We have even expressed some such opinions herein. But, ultimately, strong copyleft licenses do not defer the “what’s covered?” question to one individual or organization. The “judo” power comes from strong copyleft reaching to all of what copyright governs. When those issues are novel — and companies flaunt that novel manipulation of copylefted works — only a court can answer definitively.
A Community-Led Response
While these companies will likely not succeed in their efforts to disarm copyleft, they have nevertheless attacked the entire copyleft infrastructure. We must mount an effective response.
Software Freedom Conservancy has spent the last six months in deep internal discussions about this novel threat to the very efficacy of copyleft. We have a few ideas — a mix of short-term, medium-term and long-term strategies to address the problem. However, we recognize that a community (rather than the traditional BDFL) approach is needed — at least for this problem. Thus, putting first things first, we realized that we should gather the best minds in the software freedom community with direct experience in copyleft theory and practice. We will convene these individuals to a committee specifically chartered by Software Freedom Conservancy to — as quickly as reasonably possible – publish a series of recommendations to the community on how we should respond to both the immediate threat to copyleft found in Copilot, and (long-term) analyze the more general threat that AI-assisted programming techniques pose to the strategy of copyleft.
While we are not actively seeking applications for this committee, we do welcome anyone whom we have not yet solicited to participate to contact us and inquire. We will surely be unable to include everyone who is interested on the committee — either due to Conflicts of Interest or due to simple logistics of creating too large a committee. However, we will carefully consider anyone who expresses bona fide interest to participate.
Finally, as much as can be done during the pandemic using FOSS tools available, we will attempt to convene public discussions as much as possible. We will contemporaneously publish the committee’s minutes publicly. If you’d like to get involved today in public discussions about this issue, please join the mailing we launched today for this topic.
0In November 2021, Nat Friedman was replaced by Thomas Dohmke as GitHub’s CEO. However, to our knowledge, Dohmke has not retracted or clarified Friedman's comments, and at the time of writing, no one from GitHub or Microsoft that we spoke to had responded to our requests for clarification.
First Update on the Vizio lawsuit
by
on November 30, 2021Yesterday, we received from Vizio their first official response in our pending litigation against Vizio for their copyleft license violations. So, what was their response?
Did Vizio release the source code — as the GPL and LGPL require — for the modified versions of Linux, alsa-utils, GNU bash, GNU awk, BusyBox, dmesg, findutils, dmsetup, GNU tar, mount and selinux found in their TV’s firmwares? No.
Did Vizio propose a CCS candidate for us to review, provide them with additional feedback, so that we could help them get consumers who bought their TVs the source code they deserve? Nope.
Did Vizio argue that we had erred, and in fact, none of those programs we list above appear in their firmware? Not that either. (Unlikely though — after all, they surely know those programs are in their firmware!)
Instead, Vizio filed a request to “remove” the case from California State Court (into US federal court), which indicates Vizio's belief that consumers have no third-party beneficiary rights under copyleft! In other words, Vizio’s answer to this complaint is not to comply with the copyleft licenses, but instead imply that Software Freedom Conservancy — and all other purchasers of the devices who might want to assert their right under GPL and LGPL to complete, corresponding source — have no right to even ask for that source code.
That’s right: Vizio’s filing implies that only copyright holders, and no one else, have a right to ask for source code under the GPL and LGPL. While we expected Vizio held this position (since they ultimately ignored us during our discussions with them in years past), Vizio has gone a disturbing step further and asked the federal United States District Court for the Central District of California to agree to the idea that not only do you as a consumer have no right to ask for source code, but that Californians have no right to even ask their state courts to consider the question!
Vizio’s strategy is to deny consumers their rights under copyleft licenses, and we intend to fight back.
We believe in complete transparency of the copyleft compliance process, and so encourage everyone to read the filings. We’ve even paid the Pacer fees and used the Recap browser plugin, so that all the documents in the case are freely available via the Recap project archives.
Software Freedom Conservancy’s annual fundraiser is happening right now! Please help us continue our work by becoming a Sustainer. Donate now and have your donation matched by a group of generous individuals who care deeply about software freedom.
"Tivoization" & Your Right to Install Under Copyleft & GPL
by
on July 23, 2021Two schools of thought about the purpose of copyleft have been at odds for some time. Simply put, the question is: are copyleft licenses designed primarily to protect the rights of large companies that produce electronics and software products, or is copyleft designed primarily to protect individual users' rights to improve, modify, repair, and reinstall their software?
This debate quickly gets deep into complex policy questions. In the last few years, that general debate has slowly but surely focused almost entirely on the issue of users' ability to make effective use of FOSS on their own hardware by reinstalling their modified versions.
Historically, these nuanced policy questions about copyleft requirements have generally been discussed only in semi-public venues, and often fall prey to the tactic du jour: post-fact politics. I have realized in recent months that the failure to properly document and explain key historical narratives in copyleft history leaves software freedom activism at a disadvantage: well-resourced copyleft violators and their lawyers can use the ambiguity and confusion in the scant public record to spin false narratives and draw legal conclusions. While such legal conclusions should not be drawn (absent a Court ruling), companies have nevertheless pushed their views forward quite loudly recently. To use Herman and Chomsky's insightful phrasing, the incumbent power structures manufacture consent to their worldview to serve their interests, merely by being the loudest and most commonly heard voices.
Specifically on the issue of protections for the right to repair and reinstallation under copyleft licenses such as GPL, I am fortunate to have been a direct observer to many of the events that now serve as the connective tissue to build these false narratives about the GPL. However, I admit I have failed to write down and impart that knowledge to the general public in adequate measure, which has, in turn, inadvertently aided in promulgation of these false narratives. So, at least on the issue of “scripts used to control compilation and installation of the executable”, I hope this essay will serve as remedy. I, and everyone at Conservancy, all believe in intellectual transparency, and we strive to provide it wherever possible. The truth will out.
Installation Requirement Under GPLv2
Recent debates on this issue focus on the question of what is required to comply with the first two sentences of GPLv2§3¶2, which reads:
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable.
Before explaining the historical understanding of these terms, I will, first of all, point out that any company or lawyer that seeks to do the bare minimum for compliance is likely not prioritizing users' rights to repair their software. In all compliance-related systems, bad actors seek a “race to the bottom”. Rules like GPL are similar to environmental regulations, workplace safety requirements, and the like. The more minimalistic the interpretation of the requirements, the more companies can profit from only doing the bare minimum.
Nevertheless, it has been the goal of organizations that advocate for software freedom — such as Conservancy ourselves or FSF — to state clearly our view about the minimum requirements as best we can. I often wonder if this strategy has been beneficial to software freedom. Sadly, the answer from the industry has primarily been to hear us clearly about the minimum requirements, and then work over time to lower the GPL compliance bar — even if it requires inaccurately quoting FOSS leaders and misleading the public about history. Most recently, industry has engaged in this bar-lowering process with GPLv2§3¶2 and installation information under GPLv2 generally. My hope herein is to fully explain the history of interpretation of GPLv2§3¶2 by pro-copyleft advocates, and explore the misdirection of arguments of those who seek to curtail users' rights to install modified versions of their GPL'd software.
FSF CLE Classes, 2003-2004
I began volunteering for licensing and GPL enforcement work for FSF in 1997, and officially worked on my first GPL enforcement action in 1999. I became an FSF employee that year, and worked there until 2005. I thereafter remained affiliated with the organization in various roles until my final affiliations ended with FSF in October 2019. Most notably, I was the Executive Director of FSF from 2001-2005. During that time, I led FSF's GPL enforcement and copyleft education measures, including the CLE classes (first taught in 2003-2004).
In preparation for teaching those courses, I began to write the tutorial which later became the Copyleft Guide. To begin that effort, I collected, curated, and verified interpretations and intent of the GPLv2 with Richard Stallman, Bob Chassell (a key but oft-forgotten leader of FSF during the 1980s and 1990s), and FSF's legal counsels. One of the many outcomes of that endeavor was that I wrote these words on 2003-05-09:
GPLv2§3 requires that the source code include “meta-material” like scripts, interface definitions, and other material that is used to “control compilation and installation” of the binaries.
In GPL enforcement actions at the time, during our “complete, corresponding source (CCS) checks”, we verified that the source code was not only complete, but that it corresponded to the binaries on the vendors' devices, and that we could install modified versions of the software. This was a standard part of any check to verify GPLv2 compliance. Passing this check was required, then and now, by FSF and Conservancy before distribution rights are restored after a violation.
That position was not controversial when I, along with then FSF counsel (Daniel Ravicher), taught it to lawyers in 2003 and 2004 on FSF's behalf. Nevertheless, today, many act as if this interpretation and intent of GPLv2§3¶2 is a recent and novel phenomena, rather than a long standing position held by all copyleft activists for at least 18 years. Today, most companies and lawyers argue (incorrectly, IMO) that users have no rights to reinstall their GPLv2'd software.
The 2003 TiVo GPL Enforcement Action
Even before teaching those CLE classes, as (then) FSF's Executive Director, I led the GPLv2 enforcement effort against TiVo. I've often seen those with only a passing familiarity with the subject jump to inaccurate conclusions about that enforcement action that tend to conveniently fit their policy agenda. I herein recount the entire history regarding the TiVo GPLv2 violation and how it led to the “tivoization” rhetoric. Since that rhetoric is often treated as dispositive truthiness that GPLv2 does not ensure the users' rights to repair by reinstalling their modified GPLv2'd software, we should examine the actual facts that back the rhetoric, and examine the conclusions that others make about GPLv2 based on it.
First and foremost, TiVo's GPL violation initially had nothing specific to do with GPLv2§3¶2. TiVo never raised any intention to not comply with that section. In fact, to my recollection, TiVo never disputed nor disagreed with FSF's interpretation on that section. The initial violation was a standard GPLv2§3(b) violation, wherein some distributions of the TiVo device had an offer for source that could not be successfully exercised. (At the time) acting on behalf of FSF, I contacted TiVo on 2002-06-11 to raise this issue, and, TiVo responded favorably and indicated they wanted to resolve the matter. As is usual practice in all GPL enforcement matters, I and my (then) team did our due diligence to verify full compliance, including any other potential issues under GPLv2. Eventually, my FSF colleague (David Turner) and I did a CCS check of TiVo's software. The procedures, criteria, and interpretations that Turner and I used then are exactly the same as the ones that Denver and I use today at Conservancy. To my knowledge (based on recent personal conversations with FSF staff), FSF still uses when these same procedures, criteria, and interpretations when FSF has the rare occasion to do GPLv2 enforcement these days.
Once the GPLv2§3(b) violation was resolved, Turner and I discovered — as has been true in nearly every one of the hundreds of GPL compliance matter that I've worked — that “the scripts used to control compilation and installation of the executable” were incomplete. When this was identified, TiVo's solution was to, in fact, agree with the interpretation that that such instructions are mandatory and must be provided and they provided them. To my knowledge, TiVo was in full compliance with the GPLv2, including the inclusion of instructions for installation as required under GPLv2. People were able to reinstall Linux on their TiVo boxes thanks to our enforcement action; community resources on how to take advantage of GPL reinstallation rights on TiVos (of that era) are still readily available! At the time0, TiVo was doing the right thing in providing what the GPLv2 requires — including the ability to reinstall GNU and Linux software onto the actual device. Keep in mind: this enforcement action, and the compliance achieved from it, occurred years before the GPLv3 process began.
Understanding The Tivoization Rhetoric
So, what did TiVo do that was so objectionable? What was the behavior that Stallman went to work drafting GPLv3 to prevent that TiVo was allowed to do under GPLv2? It's not, as others widely misreport, that TiVo forbade reinstallation “of the GPL'd software” itself. To my knowledge, TiVo never prevented such reinstallation. No one involved, including me, Stallman, TiVo, or anyone at FSF at the time believed that GPLv2 permitted TiVo to withhold the installation information for the GPL'd software itself. FSF demanded that TiVo provided its users the ability to reinstall Linux (and other GPL'd software, such as GNU bash). What TiVo later did, which some software freedom activists (including Stallman) found objectionable, was that TiVo designed the reinstallation process of that GPLv2'd software to cause the proprietary TiVo application to cease to function. I recall this being widely discussed when TiVo Series 3 was released in mid-2006, and my understanding was that all Series 3 devices had this particular anti-feature. (There were rumors that some of the Series 2 had this anti-feature as well, but not all models.) In other words, if you decided to modify your copy of Linux for the TiVo device and reinstall Linux, the TiVo userspace application would realize that cryptographic lockdown had been breached, and that proprietary software would no longer function. By exercising your reinstallation rights under GPLv2, you'd turn your TiVo DVR into a stand-alone server with some video processing equipment attached. You could use Kodi (which at the time had a different name) to turn that former-TiVo into a FOSS DVR, but your ability to use the proprietary DVR software from TiVo was lost — likely permanently.
Most have of course heard of the negative term “tivoization” that Richard Stallman popularized during the GPLv3 process — which was contemporaneous with the release of the TiVo Series 3. I nevertheless asked Stallman to not use that term — both then and many times since. I still disagree with Stallman's policy position on the narrow issue of preserving proprietary userspace functionality. Specifically, I just don't think it matters if, upon upgrading your copylefted software, that the proprietary software that was (to use GPLv2's terminology) “merely aggregated” alongside the copylefted software continue to function. I felt and still feel that it's actually better policy to break the (“merely aggregated”) proprietary software (as GPLv2 permits). My policy view is that this breakage inspires and encourages users to install a FOSS alternative for the userspace applications after they've reinstalled the FOSS operating system. Nevertheless, Stallman found this practice (using crypto lock-down to force the proprietary software to fail) illegitimate. He noted publicly that GPLv2 didn't prevent this behavior, and wanted (and wrote, as explained below) a GPLv3 draft that prohibited that behavior.
How Discussion Focused on Cryptographic Lockdown Generally
To this day, I'll remain frustrated that many pro-GPLv3 advocates, during the GPLv3 drafting process, saw fit to imply ideas that they had no basis to believe were true about GPLv2. We all knew, long before GPLv3 drafting began, that there was a clear installation requirement in GPLv2 — the word “install” appears prominently. The training materials that I developed for FSF (described above) were vetted through Stallman and FSF's legal counsel before using them to teach CLE classes. If anyone received a different impression, it was surely a miscommunication due to the aggressive “GPLv3 is much better” rhetoric of the time.
Meanwhile, much of the debate about cyptographic lockdown under GPL centered around the question of disclosure of specific authorization keys. It was said, probably correctly, that GPLv2 did not mandate disclosure of an any specific authorization key. What was often left unsaid (apparently in an effort to make GPLv2 seem weaker than it actually was) was what GPLv2 did still require: a functional installation method without disclosure of authorization keys. For example, it would, in my personal opinion, be entirely compliant with the GPLv2 to simply disable the secure boot chain, providing no path back to the vendor-provided cryptographically signed firmware1, and allow the user to reinstall only the GPLv2'd components on the device — never to return to the stock vendor firmware. I suspect such restriction would be prohibited under GPLv3, since GPLv3 clearly requires not that you just give a viable install path (as GPLv2 does), but GPLv3 additionally requires disclosure of the authorization keys.
We can debate whether this copyleft expansion under GPLv3 was good policy. What is not up for debate is the simple concept that: more requirements added to a later revision of a licensing document does not change the intent or standing requirements in the older document. That's true even if the authors of the original document, for marketing or other reasons, choose later to denigrate their own past work. As it turns out, historically, we know what GPLv2 intended because its author, Richard Stallman, talked so extensively about what he sought to accomplish by creating GPLv2.
Going back to the early 1990s and contemporaneous with GPLv2's publication, Stallman himself has been quite fond of telling his experience with the broken MIT printer, for which he begged for the source code and didn't receive it. Stallman doesn't end this story with: “what I really wanted was to get the source code to that printer so I could build my own printer from scratch and then compile and make a fresh install of that printer software on a new printer”. No, Stallman was clear that his goal was to fix the bugs on the printer that MIT already had, using the source code for that very same printer. Stallman expected that the source code for the printer would include information sufficient for him to recompile and reinstall the software onto the very same device. Larger printers of that era were simply embedded devices of unusual size. They have only minor technical differences from the TVs, wireless routers, and dozens of other Linux-based embedded devices we have today. Computers are tiny today when they were large before, but their functioning and basic methods of operations have not changed. Install meant install then. Install means install now. And FSF, Conservancy, every software freedom activist and every legitimate copyleft theorist that I've ever met still agrees with this! The intent of the GPLv2 is clear and always has been: to allow reinstallation of modified versions of the GPL'd software into the same place where the binaries were installed when you got the computer in the first place, and to reap the benefits of that change. It's ludicrous to suggest Stallman meant anything other than that when he wrote GPLv2.
Recent Confirmation from FSF
Nevertheless, opponents of users' right to repair their software persist in their claims that GPLv2 doesn't intend this. We at Conservancy hear it regularly; GPL violators frequently send us a recently compiled dossier of curated comments by FSF — quoted (and some even misquoted) completely out context — that purport to “prove” that FSF does not want users to repair their embedded devices that contain GPLv2'd software. My affiliations with FSF had already ended by the time this dossier started making the rounds, so we did what any reasonable person would do: we asked FSF to clarify their opinion for us directly.
The opportunity to ask presented itself about a year ago, in May 2020, when Conservancy worked with FSF's Executive Director (John Sullivan), FSF's Licensing and Compliance Manager (Donald Robertson), and FSF's (then) legal counsel (Marc Jones) on a joint GPLv2 enforcement matter against a pernicious and intentional violator who had infringed the copyrights of GNU Bash and Linux. (The violator was using a GPLv2'd fork of Bash.) We took the opportunity then to reaffirm our joint understanding of this 18-year-old interpretation of the GPLv2 as part of that specific joint embedded device enforcement action. We discussed the matter at length and confirmed everyone's understanding remained unchanged from the prior FSF positions going back (at least) 18 years.
At the end of our discussions, on 2020-05-11, I wrote to Sullivan saying: “I just want to summarize what I believe was our mutual view on the phone call last Friday. If you could confirm that I have summarized correctly, we'll use the below as a basis of our response to [those who are currently inquiring about this issue]:”
The GPLv2 does not have any specific requirement for preservation of the ability to reinstall proprietary-software-centric vendor-provided firmwares (even if such firmwares contain some GPLv2'd works) on embedded systems, provided that the downstream user (i.e., the consumer with the device) can build, install, and (repeatedly and successfully) reinstall a firmware containing only the copylefted components (such as Linux+Bash).
John replied on 2020-05-13 with: “Bradley, We suggest just a couple of small tweaks:”
The GPLv2 does not have any specific requirement for preservation of the ability to reinstall proprietary-software-centric vendor-provided firmwares (even if such firmwares contain some GPLv2'd works) on embedded systems, provided that the downstream user (i.e., the consumer with the device) can build, install, run, and (repeatedly and successfully) reinstall a firmware containing at least the copylefted components (such as Linux+Bash).
As you can see, Sullivan advocates inclusion of the term “run” (which admittedly I had accidentally failed to include in my original draft!). It was a great addition, and Sullivan's statement matched exactly the historical interpretation that FSF espoused when I worked there in 2003. Indeed, it read to me almost exactly what Chassell had originally taught it to me when I was volunteering for FSF in the 1990s. Furthermore, the quote from Sullivan above matches the position that I vetted with Stallman throughout my time at FSF, right up until the end of my affiliation with FSF in 2019. Thus, FSF's position, as stated above, on the question of installation under GPLv2 has remained consistent from 2003-2020.
This leaves me to wonder: how is it that so many people came to conclude that FSF's view was that the GPLv2 didn't speak to “install” at all? I can only speculate, but my view is that (a) people heard what they wanted to hear, (b) a few (but not most, or even many) Linux developers spoke widely that it was their personal view that installation information isn't required by GPLv2 (notwithstanding the obvious textual requirement), and (c) in their fervor to ballyhoo the GPLv3 as an improvement, some GPLv3 advocates chose to denigrate GPLv2 as “not good enough” — in an apparent effort to frighten pro-GPLv2 copyleft activists to rush away from GPLv2 as quickly as possible.
Stallman on GPLv3 Installation Information
In April 2012, I started an email thread with Stallman yet again about the term “tivoization”. I again urged him to stop using the term, because, in my view, what TiVo did for GPLv2 compliance was not bad for software freedom. I wrote to Stallman at that time to again remind him that upgrades of TiVo's Linux installation “can be done successfully” and (at least for TiVo product that FSF declared in compliance), the only offense was one that GPLv2 permits: merely disabling the proprietary components from working after reinstallation of the GPLv2'd components. At that time, Stallman informed me that he had indeed designed the GPLv3 to deal with this situation. Specifically, I asked him on 2012-05-05:
[so], these words in GPLv3: “The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.” mean that the proprietary software that is not a combined work with the GPLv3'd work must also function?
Stallman replied on 2012-05-06 with:
Absolutely. And I wrote it specifically to do that!
Why This History Must Be Told
Generally speaking, long narratives of past events that have hitherto lived only in oral history usually have academic interest only. They make for great podcast or post-conference-dinner fodder, but they rarely make for good blog posts. Nevertheless, I've explained all this here in painstaking detail to counter the rising swell of opposition to users' right to repair their GPLv2'd software installations. Initially, these efforts to curtail the right to reinstall under GPLv2 have been done clandestinely — for example, by spreading the aforementioned misleading dossier. Recently, however, the effort has gone public.
In mid-July 2021, a lawyer named McCoy Smith, who tries to make his living (in part) representing GPL violators, published an article that makes outrageous and inaccurate claims about these long-standing positions held by both Conservancy and FSF. We at Conservancy don't fear transparency, and we urge you to read McCoy's response to Denver's article, as well as Denver's original article, and then reread this one that responds to McCoy's argument. You should decide for yourself who has the better argument, and decide whether or not we've adequately answered McCoy's outrageous and inaccurate claims. In our view, McCoy spins a false narrative about the differences between GPLv2 and GPLv3 regarding install, and provides specious evidence for this claim. I hope that the historical facts that I describe above clarify this issue.
A few of McCoy's fundamental arguments are easily disputed by the historical facts that I outlined above:
- McCoy accused me and Conservancy of “Historical Revisionism”, by claiming my words about GPLv2§3 were “a recent effort … to reinterpret the requirements of GPLv2”. I've shown above, using reliable and accurate revision history logs, that those words, which McCoy claims were recent, were written and published in May 2003.
- McCoy states that the objection to TiVo was regarding prohibition of reinstallation of the GPL'd binaries. I've confirmed above that during FSF's enforcement action against TiVo, TiVo agreed to allow reinstallation of the GPL'd binaries but caused the proprietary software not to function, and that FSF took the position that GPLv2 required reinstallation of GPLv2'd binaries to function.
- McCoy claims that GPLv2's original intent was never to allow installation. I've shown above that Stallman, the author of GPLv2, specifically knew about situations of embedded device proprietarization before GPLv2 was drafted, and, in contemporaneous and ongoing rhetoric, spoke clearly that he intended to preserve and advance users' right to repair their software by engaging in truly functional reinstallation of GPL'd binaries into the actual location.
- McCoy claims that FSF does not share Conservancy's position about installation under GPLv2. I've shown specific text written by FSF's Executive Director, which was also verified by FSF's legal counsel and FSF's Licensing and Compliance Manager as recently as May 2020 — wherein Conservancy and FSF are in full agreement. I can also further confirm that I spoke with John Sullivan on the telephone earlier this week, and he reconfirmed that he still agrees with the paragraph as written as correct policy for the situation.
Conclusion
I then quote my 2012 exchange with Stallman to point out clearly: the installation information definition in GPLv3 expands the requirements and does not reduce the existing installation requirements that we all saw as present in GPLv2 from its first publication in 1992. McCoy's article contains a simple logical fallacy: it assumes that since the installation information requirements in the GPLv3 are (in some respects) more expansive than those in GPLv2, that the requirement for installation information in GPLv2 are non-existent and/or are diminished merely through public discussion of GPLv3's policy goals by FSF during GPLv3's drafting. As I show above, it's clear that Stallman's rhetoric about extending installation information requirements in GPLv3 had complex additional policy goals that don't exist in GPLv2. Specifically, I don't think that GPLv2 reinstallation requires that all “merely aggregated” works continue to function as designed upon reinstallation of the GPLv2 works. Stallman has agreed with that GPLv2 interpretation, but differed from me regarding forward-looking policy (in that he finds such disabling of proprietary software deplorable) and thus Stallman wrote GPLv3 to prevent that practice that GPLv2 permits.
Furthermore, and most importantly, I quote the May 2020 recent exchange with Sullivan to point out that FSF policy regarding GPLv2 and installation has not wavered since FSF established it during Bob Chassell's time, which continued on into my time as FSF's Executive Director and then into Peter Brown's and John Sullivan's time, too. As I've shown, this interpretation of GPLv2 installation requirements is at least a 17-year unbroken chain from at May 2003 all the way through to May 2020. If Bob Chassell were still alive today, I'm sure he could account for that position remaining consistent in the 1990s, too.
There is also a central and inherent flaw in McCoy's underlying argument: the idea that FSF's view, or Linus' view, or any single individual's or organization's view, is what matters. The license says what it says. If the license steward has a view, it would not mean their view is dispositive, and I say that knowing that their view happens to agree with mine! Indeed, Linus Torvalds has stated he doesn't agree with FSF's views about GPLv2, and has his own views, which McCoy himself quotes. (I'm not sure why McCoy thinks that forwards his argument, because Linus' view differing from FSF undercuts McCoy's argument that what FSF said during GPLv3 drafting is relevant.) Other contributors to Linux disagree with both FSF's and Linus' view; many prominent Linux developers have told me that they agree with Conservancy and/or FSF about this. Others have told me they have an even broader interpretation of the installation requirements under GPLv2 than I do!
Thus, McCoy makes a classic “appeal to authority” fallacy as the center of his argument. Regardless of McCoy's mostly unsupported opinion, I suspect even he would agree that only three things will really definitively matter regarding this issue: (a) what the wronged party who didn't get their complete, corresponding source code believes, (b) what the entity refusing to give them that source code believes, and (c) what the Courts says when the former sues the latter. All else is simply bluster — full of sound and fury, but signifying nothing.
Finally, as a reminder, please keep in mind that (as I already said in the text above), I no longer have any affiliation with FSF (since October 2019) and do not speak for them — which is precisely why I quote the words they told me.
0 Please note that I have not personally looked into TiVo's GPL compliance since late 2003. As such, it's entirely possible that TiVo models released from 2004 onward may have violated GPLv2§3¶2 and failed to include required “scripts used to control compilation and installation of the executable”. However, any later non-compliance is not capitulation by me, FSF, Conservancy or anyone else that McCoy's and others' interpretation of that clause is correct.
1Please be abundantly clear that even as I give an interpretation of what I happen to believe is correct at this given moment, I'm a flawed human being capable of error. (Also, IANAL and TINLA.) I can misspeak, misstate, and otherwise just be plain wrong about something one way or the other. This is also true of FSF, its representatives, and all the other pundits like McCoy Smith who opine on this question. One of the horrible “race to the bottom” traps that GPL violators constantly lay for us is unrelenting pressure that we choose between (a) reducing what we believe a given license requires, or (b) suing them to ask the Court to uphold our view. No one escapes that pressure cooker unscathed; nearly every pro-copyleft activist (including me) has fallen into this trap, and succumbed to the pressure of (a) at least once. I know, even as I write this footnote, that someday I'm going to have a GPL violator's lawyer quoting this blog post back to me in a deposition about some esoteric, “race to the bottom” issue of GPL compliance. They're going to look for a way to twist my words to argue that somehow I've given their client carte blanche to trample users' rights that GPL protects. Everyone who stands up for copyleft faces this constant challenge now that intentional GPL violations are the norm rather than the exception. Conservancy simply will not capitulate when standing up for users' rights to copy, share, modify, repair, reinstall and reinstall modified versions of their software on the devices they own.
It Matters Who Owns Your Copylefted Copyrights
by
on June 30, 2021Throughout the history of Free and Open Source software (FOSS), copyright assignment has simultaneously been controversial and accepted as the norm in our FOSS communities. This paradox, I believe, stems entirely from some key misunderstandings that perpetuate. This issue requires urgent discussion, as two of the most important FOSS projects in history (GCC and glibc) are right now considering substantial and swift changes to long-standing copyright policies that date back to the 1980s. This event, and other recent events over the last few years in the area of GPL compliance and corporate FOSS adoption, point to long-term problems for projects. This essay works through these nuances, and will hopefully assist FOSS contributors as they make difficult decisions about copyright ownership for their projects. At the end, I provide a summary list of issues to consider when creating copyright ownership policies for FOSS.
The Default Is: You Give It Away To Your Boss
I was at one time bemused, but now am mostly aghast, at the true paradox of common wisdom about copyright assignment for FOSS projects. I don't believe anyone has done a statistically valid study, but anecdotally it's rare to find a FOSS contributor who enjoys assigning copyright to another entity. FOSS contributors who do assign copyright to a non-profit charity that collects copyrights (such as the FSF or Conservancy) often complain about the complexity and annoyance of the paperwork to get started as a contributor. Even more commonly, contributors express disgust or annoyance at the pressure from the project to give away something that should be rightfully theirs.
I am sympathetic on both points, and have worked over my career to design, implement, and improve copyright assignment processes for projects — in hopes to address that first complaint. However, the second concern — the preference of FOSS contributors to keep their own copyrights, is an ironic complaint. Here's why:
Almost no contributors to larger FOSS projects hold their own copyrights. If they have not gone through a process to assign them to a charity, the most likely scenario is that their employers own those copyrights. My colleagues at Conservancy have been working on the Contract Patch project — which seeks to educate FOSS contributors about their inherent right to employment agreement negotiation, and seek more favorable terms in those employment contracts. However, over the years of our Contract Patch work, we still find that only dozens (among the thousands of FOSS contributors) have insisted that their company allow them to keep their own copyrights. If you work for a company, check your employment agreement. I'll bet that your employer owns your copyrights for everything you do at work — including your contributions to FOSS — unless there's a separate agreement that gives those copyrights to a charity like Conservancy or FSF.
As a result, in debates about copyright ownership, discussions of what policy contributors want regarding the fruits of their labor is sadly moot. Without a clear, organized mitigation strategy to assure that FOSS contributors keep their own copyrights, a project (such as GCC or glibc) that switches from a standing “(nearly) all copyrights assigned to a charity” model to a plain Developer Certificate of Origin (DCO) or naked inbound=outbound contributor arrangement will, after a period of years, mostly likely have copyrights that are primarily held by the employers of the most prolific contributors, rather than by the contributors themselves.
The Faustian bargain that causes this default is the “work-for-hire” doctrine in the USA (and similar arrangements that exist around the world). When you take a job, in most places in the world, by default, your employer owns and/or effectively controls all your copyrights. They argue that they're paying you handsomely for this and as such they have every right to own it all. Perhaps that argument has merit with regard to software that will ultimately be proprietarized, but for copylefted software, the value proposition is different and history shows that the arrangement leads to surprising results.
Copyleft Is Weakened When Most Copyright Holders Oppose Enforcement
Copyleft licenses are not magic pixie dust. Copyleft is not a legislatively-mandated regulation (e.g., pollution regulations) — which are enforced by government staff. Ultimately, an entity — most commonly the copyright holders themselves — must proactively enforce GPL. This entity can be an organization, an individual, a group of individuals, a group of organizations, or a mix of all of those. Someone must enforce the rules; so-called “spontaneous compliance” is a myth promulgated by those who oppose copyleft.
Yet that myth's apparent unexamined truthiness has taken hold; many FOSS contributors reasonably think that it does not matter who owns the copyright of their copylefted works. If it's GPL'd, they surmise, surely someone out there will enforce the GPL when it's violated. Or, perhaps the contributors think that GPL violations on their particular codebase are rare. Either way, most contributors avoid the effort required to figure out who owns their copyrights and ponder whether that entity will uphold copyleft in a reasonable way. Nevertheless, I really urge FOSS contributors to think through this issue with care and healthy skepticism, as they may be surprised at the outcome. What lurks in the backchannels may well surprise you:
Once upon a time, copyleft enforcement was not considered controversial by many otherwise-pro-FOSS companies. In recent years, companies that prefer non-copylefted FOSS have succeeded in making even the most mundane GPL enforcement appear as controversial and industry-destabilizing. Many developers who contribute to GCC and glibc — even ones that work for historically FOSS-friendly companies — will be surprised to learn that their employers (or at least their legal departments) are opposed to any GPL enforcement. Conservancy, as the main organization actively doing GPL enforcement, hears this kind of pressure regularly. Our colleagues at FSF historically told us that they hear the same. It's rarely done publicly — but, when public, criticisms are delivered by proxies in a subtle, “you don't really need copyleft to be enforced anyway, do you?” manner. The public arguments are usually couched in a careful “dog whistle” style. While the message comes through loud and clear to executives, lawyers, companies and their trade associations, it's easily missed by contributors — who understandably have a much more important task to do: writing great FOSS code!
We expect that you may be surprised by this, but we can tell you that: in private, companies that contribute to key copylefted projects are not so subtle. They have communicated to those of us in the GPL enforcement community on several occasions — a message delivered by many executives over a period of many years — that they would prefer both Conservancy and FSF to curtail, or outright end, GPL enforcement. Their view, as communicated to us, is that GPL enforcement causes customers to think copyleft is problematic. As one executive once said to me: “We consider it a failure if any customer ever even asks us if they have to worry about GPL compliance”. These companies usually say: “well, we will always comply with the license, but we'd prefer if anyone who is our customer (or potential customer) simply is just left alone if they violate”. Once, an executive actually claimed to me that his company, a huge multinational software company, would: “cease to build any products on Linux and other GPL'd software at all if Conservancy and FSF don't cease their enforcement”. That statement was fortunately bluster and bluff; the company in question is still heavily invested in copylefted software, including Linux, GCC, and glibc, but the sentiment is a common one that we hear privately often. And, that company has since funded work to discredit GPL enforcement. Conservancy faces constant pressure and threats regarding its GPL enforcement efforts, and we're aware that FSF has faced to similar pressure.
In parallel, these companies have also steadily worked to hold more of the copyrights in key copylefted works — primarily by hiring the key developers that contribute to key copylefted projects. While I oppose this outcome, it's undeniably a rational approach: if companies hold the copyrights, they can decide when, how, where and most important if copyleft licenses are ever enforced. This process is well underway for copylefted works that don't have any policies about copyright ownership. For example, the “Who Contributes to Linux?” reports by LWN show starkly over recent years: fewer contributions coming copyrighted by individuals, and more contributions copyrighted by companies. While corporate involvement is always welcome in FOSS projects, we (as a FOSS community) have not responded with the necessary care to question why the companies, rather than the individual contributors, should own our FOSS. Only very few developers have even questioned this issue; we thank them for their efforts, but such questions are simply not raised enough.
As such, the most important consideration in ending copyright assignment to a charity is to consider what the employers of most of the developers will do with those copyrights, and whether they will enforce the copyleft in the best interest of the community. Even worse, might they eventually sell those copyrights to someone who will use enforcement in a manner that a charity never would — for their own avarice rather than the good of the community?
While the remainder of this essay addresses at length other secondary considerations and nuances regarding copyright ownership in copyleft projects, this point is the absolutely most important one:
Most for-profit copyright holders prefer copyleft to function much like non-copylefted FOSS. Namely, “it's certainly nice when folks voluntarily release their changes, improvements and ‘scripts used to control compilation and installation’, but if they don't, their failure to do so should be begrudgingly tolerated, and compliance should never be compelled for those who refuse”.
Violations Are More Common Than You Think
During this recent upheaval in the GCC community, I spoke at length with a GCC developer who has contributed to the codebase regularly for decades. After I'd explained some of issues above, the developer stated: “Well, is any of this really an issue for GCC? When was the last time there was a GPL violation on GCC?”. I glibly answered: “Well, I haven't heard about one for about a week, but I'll surely hear about another one soon.” The developer was flummoxed to learn that part-and-parcel to nearly every embedded Linux GPL violation (which have been, for years, innumerable) is a companion violation on GCC. Specifically, most system-on-chip (SoC) vendors not only have a stock Linux build given to the OEMs, but also include a toolchain. More often than not, a GPL violator who has what we call an “incomplete Complete, Corresponding Source (CCS)” violation on Linux will also have a “no-source-or-offer” or “incomplete CCS” violation on GCC. The usual reason is that part of the SDK given to OEMs must include an appropriate binary toolchain (sometimes with vendor-specific patches, and almost always with a complex configuration that's difficult to reverse engineer) so that the SoC integrators can compile other software for the system. This scenario is so common that we in the GPL enforcement community coined the shorthand term “toolchain violation” to refer to the scenario where the GCC violation “tags along” with a Linux violation. While glibc violations of this nature are admittedly less common than GCC, glibc violations do occur relatively often in the same manner.
Upstream developers are mostly unaware of how bad the compliance problem is regarding their code for one simple reason: the folks impacted most by GPL violations are downstream users, not upstream developers. Furthermore, correction of compliance problems usually produces code releases that “work” for the given device on which the original violation occurred, but rarely is that CCS released resulting from a GPL enforcement action trivially upstreamable. After all, this is code written, modified, and/or integrated by developers whose plan was to “get away” with a GPL violation, so why would they have invested the time to provide the improvements in a manner that it was immediately digestible for upstream? My view is that users matter, and FOSS contributors should care about their experience with the downstream consequences of their code. I urge FOSS contributors to implement copyright ownership schemes that have the most likely chance of enabling enforcement actions principled parties.
The Power of Collective Action
Regarding my views on FOSS copyright assignment, I often quote the famous gaffe of (former USA presidential candidate and senator) John Kerry, “I voted for it before I was against it” when I'm talking about copyright assignment. But I don't quote it purely for self-deprecation; I never felt the press was all that fair to Kerry, because I think policy makers should be willing to change their positions over time as new information comes to light, or when policies that we thought were beneficial in theory prove problematic in practice. I once thought universal copyright assignment for a copylefted work to a single charity was an absolutely necessity. Then, Conservancy's successful and continuing work with various BusyBox copyright holders, and our collaboration with Christoph Hellwig, showed me that there was huge value in individuals having copyrights and making their own choices about enforcing copyleft. Still later, I realized that individuals like Erik Andresen and Christoph Hellwig are quite rare, as most FOSS contributors — even if they've kept their own copyrights — ultimately have to take on such political and career risk to enforce copyleft that they must often chose not to because they could easily get blacklisted from major employers for doing so.
The central thread here is collective action by principled people who will use copyleft primarily as a tool for rights of users and for the improvement of copylefted projects. Ultimately, copyleft functions best when leaders of a project have the agency, wherewithal, commitment, and collective consensus to either ensure copyleft functions to protect their users' rights, or enable another entity to do that job for them in a principled way that serves the public good (usually in contrast to the interests of for-profit industry). (Note that even if that interest is “vendor-neutral”, as that means the interests served are the common business interests of the vendors, but not their customers.)
These issues are complex. Every policy change, or lack of policy change, will have many unintended consequences. An apparent “simple move” away from mandated assignment to a centralized organization could well be a decision to ultimately move copyright holding to for-profit corporations and their trade associations — all of whom are unlikely to stand up for the GPL and user rights. Beyond all else, I urge caution and slow deliberation before any policy changes about copyright control. I ask FOSS contributors to do the substantial and necessary homework of not only reading and considering the points of this essay, but those points raised by all parties. As you do, keep in mind that every party, including me, has a policy goal in mind for copyleft. I and Conservancy are very transparent that our policy goal is to see copyleft licenses enforced regularly on behalf of end users who buy products on the open market that contain copylefted code. We believe, as a policy matter, that copylefted copyrights are best held by entities that have a bona-fide track record of serving the public good, act in a principled and ethical manner in doing so — the most important principles being to never prioritize financial gain over users' rights — and who won't back down and give up when faced with the inevitable anti-copyleft backlash. Others, probably including your employer and the trade associations to which they pay membership dues, probably disagree with this position, but you should ask questions and listen carefully to see if you're getting a straight answer. After you've heard everyone's position, decide for yourself who deserves to have your copyrights: your employer, a charity, or yourself. Once you've decided, you'll have hard work ahead to change the default (which will likely be your employer once projects drop their copyright assignment mandate).
A Word on FSF's GPL Enforcement
The elephant in the room that I've not yet mentioned is the current status of the FSF as an organization. There is no question, given the recent mass resignation of their management, and other rapid leadership change surprises, that the FSF faces serious challenges in the next few years. I personally was previously affiliated with the FSF. That affiliation ended in 2019, so I have no real information about the internal status of the FSF since then. What I do know is that FSF currently lacks sufficient capacity to follow up on any GPL violations on GCC and glibc that we've reported for years. That said, if I have to chose between the strict dichotomy of: (a) copyrights held by the FSF (which has the possibility of recovery in the future and restoration to an organization that actively enforces the GPL) vs. (b) copyrights held by companies — many of whom are known to oppose enforcement of the GPL, I would still pick the FSF. Remember that copyright does not expire for a very, very long time. GCC and glibc are both codebases that prove the half-life of well-written software is very long, and that software stays relevant and useful for much longer than any of us usually admit. We have to think long term about the copyright ownership of copylefted works. While there are serious short-term and medium-term problems at the FSF, we have to consider carefully who is the best long-term steward of copyrights: companies or charities. Most importantly, changing 30 years of careful planning about the copyright inventory of important codebases is certainly not a decision to be finalized in a matter of just a few weeks or even months. Thus, most of all, I ask FOSS contributors to take care and ample time in their deliberations on such matters.
A Summary of Recommendations and Considerations
As promised at the start of this essay, here is a laundry list of recommendations and considerations that any copylefted project should consider regarding how to structure its copyright ownership rules. This list is provided as a quick reference only, and readers should keep in mind the complexity of the topic before jumping to a conclusion about the right policies. Not every issue below was addressed in detail in the essay above, but if there is interest from the community, I will publish further essays on other topics.
- Without your active work to avoid it (such as by modifying your contract or demanding assignment to another entity), for-profit employers will control your copyrights. You typically have no say into how or whether the license of your project is enforced if your employer holds your copyrights.
- Modern copyright assignments to charities such as the FSF or Conservancy usually take two forms: either your employer has a blanket copyright assignment, or you must individually assign copyright. In the latter case, there are typically two components: a disclaimer from copyright from your employer, and an assignment from you. In either case, you should speak at length with your employer's legal department to figure out what form, if any, is in place, and what will happen to those systems if your project changes its copyright aggregation policies.
- FOSS contributors have safety in numbers. A strong consensus that your project wants to see copyleft enforced for your users can create leverage that mitigates risk of “blacklisting” of those who chose to enforce copyleft licenses. When policies like copyright assignment to a charity or copyright ownership only by individuals are set project-wide, companies historically have simply accepted it. (The Samba project is an excellent example of a project where the developers have control.) By contrast, each contributor faces a steep climb in negotiating their ownership of the copyrights in the absence of such policies.
- While assignment to a charity can be annoying and feel problematic, this is often the most expedient way to assure that the copyleft license of your project is upheld. Conservancy, for example, will accept one-off assignment from FOSS contributors to strategically important FOSS projects. Even if you (collectively) chose to not mandate assignment for your project, it's valuable to assist and encourage your contributors to either develop together a collective individual copyright ownership plan, or assign to different charities, or recommend a mixture of the two.
- As an upstream developer, you're likely going to be unaware of most copyleft violations on your code. If possible, work with an entity that has the time and resources to investigate violations for you, and empower that entity to act on your behalf if possible.
- Pay attention to the details of Contributor Licensing Agreements (CLAs). Often, these require that you give up almost as many rights as a copyright assignment would require you to give up anyway. Avoid asking your contributors to agree to a CLA.
- Developer Certificate of Origins (DCOs) and other inbound=outbound contribution regimes are very good, useful and recommended. However ultimately such systems are entirely orthogonal to who holds the copyright. DCOs and contribution terms are general statements that attest to your right and permission to make contributions under the project's license, but are, by design, silent on the details. A DCO is absolutely compatible with a project that separately requires copyright assignment, or with a project that has other recommendations about how developers should handle copyright ownership.
- Seek and consider advice from all sources. Everyone with an opinion on these topics has a policy agenda. Ask what their policy agenda is when they advise you. Ask them their view on copyleft enforcement. Ask them if they've ever investigated copyleft violations on your software, and if so what they found and what their opinion is. If you're considering assigning your copyrights to someone, most notably your employer (who, again, is likely to by default receive your copyrights) be sure you've fully understood their policy and views on these matters. Insist that they put their policy and views in writing for you for future reference. Ask for strong commitments, and be skeptical if you cannot receive them.