A Comprehensive Analysis of the GPL Issues With the Red Hat Enterprise Linux (RHEL) Business Model
Posted on June 23, 2023
This article was originally published primarily as a response to IBM's Red Hat's change to no longer publish complete, corresponding source (CCS) for RHEL and the prior discontinuation of CentOS Linux (which are related events, as described below). We hope that this will serve as a comprehensive document that discusses the history of Red Hat's RHEL business model, the related source code provisioning, and the GPL compliance issues with RHEL.
For approximately twenty years, Red Hat (now a fully owned subsidiary of IBM) has experimented with building a business model for operating system deployment and distribution that looks, feels, and acts like a proprietary one, but nonetheless complies with the GPL and other standard copyleft terms. Software rights activists, including SFC, have spent decades talking to Red Hat and its attorneys about how the Red Hat Enterprise Linux (RHEL) business model courts disaster and is actively unfriendly to community-oriented Free and Open Source Software (FOSS). These pleadings, discussions, and encouragements have, as far as we can tell, been heard and seriously listened to by key members of Red Hat's legal and OSPO departments, and even by key C-level executives, but they have ultimately been rejected and ignored — sometimes even with a “fine, then sue us for GPL violations” attitude. Activists have found this discussion frustrating, but kept the nature and tenure of these discussions as an “open secret” until now because we all had hoped that Red Hat's behavior would improve. Recent events show that the behavior has simply gotten worse, and is likely to get even worse.
What Exactly Is the RHEL Business Model?
The most concise and pithy way to describe RHEL's business model is: “if you exercise your rights under the GPL, your money is no good here”. Specifically, IBM's Red Hat offers copies of RHEL to its customers, and each copy comes with a support and automatic-update subscription contract. As we understand it, this contract clearly states that the terms do not intend to contradict any rights to copy, modify, redistribute and/or reinstall the software as many times and in as many places as the customer likes (see §1.4). Additionally, though, the contract indicates that if the customer engages in these activities, Red Hat reserves the right to cancel that contract and make no further contracts with the customer for support and update services. In essence, Red Hat requires their customers to choose between (a) their software freedom and rights, and (b) remaining a Red Hat customer. In some versions of these contracts that we have reviewed, Red Hat even reserves the right to “Review” a customer (effectively a BSA-style audit) to examine how many copies of RHEL are actually installed (see §10), presumably for the purpose of Red Hat getting the information they need to decide whether to “fire” the customer.
Red Hat's lawyers clearly take the position that this business model complies with the GPL (though we aren't so sure), on grounds that nothing in the GPL agreements requires an entity to keep a business relationship with any other entity. They have further argued that such business relationships can be terminated based on any behaviors, including exercising rights guaranteed by the GPL agreements. Whether that analysis is correct is a matter of intense debate, and likely only a court case that disputed this particular issue would yield a definitive answer on whether that disagreeable behavior is permitted (or not) under the GPL agreements. Debates continue, even today, in copyleft expert circles, whether this model itself violates the GPL. There is, however, no doubt that this provision is not in the spirit of the GPL agreements. The RHEL business model is unfriendly, captious, capricious, and cringe-worthy.
Furthermore, this RHEL business model remains, to our knowledge, rather unique in the software industry. IBM's Red Hat definitely deserves credit for so carefully constructing their business model such that it has spent most of the last two decades in murky territory of “probably not violating the GPL”.
Does The RHEL Business Model Violate the GPL Agreements?
Perhaps the biggest problem with a murky business model that skirts the line of GPL compliance is that violations can and do happen — since even a minor deviation from the business model clearly violates the GPL agreements. Pre-IBM Red Hat deserves a certain amount of credit, as SFC is aware of only two documented incidents of GPL violations that have occurred since 2006 regarding the RHEL business model. We've decided to share some general details of these violations for the purpose of explaining where this business model can so easily cross the line.
In the first violation, a large Fortune 500 company (which we'll call Company A), who both used RHEL internally and also built public-facing Linux-based products, decided to create a consumer-facing product (which we'll call Product P) based primarily on CentOS Linux, but P included a few packages built from RHEL sources. Company A did not seek nor ask for support or update services for this separate Product P. Red Hat later became aware that Product P contained some part of RHEL, and Red Hat demanded royalty payments for Product P. Red Hat threatened to revoke the support and update services on Company A's internal RHEL servers if such royalties were not paid.
Since Company A was powerful and had good lawyers and savvy business development staff, they did not acquiesce. Company A ultimately continued (to our knowledge) on as a RHEL customer for their internal servers and continued selling Product P without royalty payments. Nevertheless, a demand for royalties for distribution is clearly a violation as that demand creates a “further restriction” on the permissions granted by GPL. As stated in GPLv3:
You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License.
Red Hat tried to impose a further restriction in this situation, and therefore violated the GPL. The violation was resolved since no royalty was paid and Company A faced no consequences. SFC learned of the incident later, and informed Red Hat that the past royalty demand was a violation. Red Hat did not dispute nor agree that it was a violation, and did informally agree such demands would not be made in future.
In another violation incident, we learned that Red Hat, in a specific non-USA country, was requiring that any customer who lowered the number of RHEL machines under service contract with Red Hat sign an additional agreement. This additional agreement promised that the customer had deleted every copy of RHEL in their entire organization other than the copies of RHEL that were currently contracted for service with Red Hat. Again, this is a “further restriction”. The GPL agreements give everyone the unfettered right to make and keep as many copies of the software as they like, and a distributor of GPL'd software may not require a user to attest that they've deleted these legitimate, licensed copies of third-party-licensed software under the GPL. SFC informed Red Hat's legal department of this violation, and we were assured that this additional agreement would no longer be presented to any Red Hat customers in the future.
In both these situations, we at SFC were worried they were merely a “tip of the proverbial iceberg”. For years, we have heard from Red Hat customers who are truly confused. It's common in the industry to talk about RHEL “seat licenses”, and many software acquisition specialists in the industry are not aware of the nuances of the RHEL business model and do not understand their rights. We remain very concerned that RHEL salespeople purposely confuse customers to sell more “seat licenses”. It's often led us to ask: “If a GPL violation happens in the woods, and everyone involved doesn't hear it, how does anyone know that software rights have indeed been trampled upon in those woods?”. As we do for as many GPL violation reports as we can, we zealously pursue RHEL-related GPL violations that are reported to us, and if you're aware of one, please do email us at <email@example.com> immediately. We fear that be it through incompetence or malice, many RHEL salespeople and business development professionals may regularly violate GPL and no one knows about it. That said, the business model as described by IBM's Red Hat may well comply with the GPL — it's just so murky that any tweak to the model in any direction seems to definitely violate, in our experience.
Furthermore, Red Hat exploits the classic “caveat emptor” approach — popular in many a shady business deal throughout history. While, technically speaking, a careful reader of the GPL and the RHEL agreements understands the bargain they're making, we suspect most small businesses just don't have the FOSS licensing acumen and knowledge to truly understand that deal.
Why Was an Independent CentOS So Important?
Until Red Hat's “acquisition” of CentOS in early 2014, CentOS provided an excellent counterbalance to the problems with the RHEL business model. Specifically, CentOS was a community-driven project, with many volunteers, supported by some involvement from small businesses, to re-create RHEL releases using the CCS releases made for RHEL. Our pre-2014 view was that CentOS was the “canary in the murky coalmine” of the RHEL business. If CentOS seemed vibrant, usable, and a viable alternative to RHEL for those who didn't want to purchase Red Hat's updates and services, the community could rest easy. Even if there were GPL violations by Red Hat on RHEL, CentOS' vibrancy assured that such violations were having only a minor negative impact on the FOSS community around RHEL's codebase.
Red Hat, however, apparently knew that this vibrant community was cutting into their profits. Starting in 2013, Red Hat engaged in a series of actions that increased their grip. First, they “acquired” CentOS. This was initially couched as a cooperation agreement, but Red Hat systematically made job offers that key CentOS volunteers couldn't refuse, acquired the small businesses who might ultimately build CentOS into a product, and otherwise integrated CentOS into Red Hat's own operations.
After IBM acquired Red Hat, the situation got worse. Having gotten rights to the CentOS brand as part of the “acquisition”, Red Hat slowly began to change what CentOS was. CentOS Linux quickly ceased to be a check-and-balance on RHEL, and just became a testing ground for RHEL. Then, in 2020, when most of us were distracted by the worst of the COVID-19 pandemic, Red Hat unilaterally terminated all CentOS Linux development. Later (during the Delta variant portion of the pandemic in late 2021) Red Hat ended CentOS Linux entirely. IBM's Red Hat then used the name “CentOS Stream” to refer to experimental source packages related to RHEL. These were (and are) not actually the RHEL source releases; rather, they appear to be primarily a testing ground for what might appear in RHEL later.
Finally, Red Hat announced two days ago that RHEL CCS will no longer be publicly available in any way. Now, to be clear, the GPL agreements did not obligate Red Hat to make its CCS publicly available to everyone. This is a common misconception about GPL's requirements. While the details of CCS provisioning vary in the different versions of the GPL agreements, the general principle is that CCS need to be provided either (a) along with the binary distributions to those who receive, or (b) to those who request pursuant to a written offer for source. In a normal situation, with no mitigating factors, the fact that a company moved from distributing CCS publicly to everyone to only giving it to customers who received the binaries already would not raise concerns.
In this situation, however, this completes what appears to be a decade-long plan by Red Hat to maximize the level of difficulty of those in the community who wish to “trust but verify” that RHEL complies with the GPL agreements. Namely, Red Hat has badly thwarted efforts by entities such as Rocky Linux and Alma Linux. These entities are de facto the intellectual successors to the CentOS Linux project that Red Hat carefully dismantled over the last decade. These organizations sought to build Linux-based distributions that mirrored RHEL releases, and it is now unclear if they can do that effectively, since Red Hat will undoubtedly capriciously refuse to sell them exactly-one RHEL service and update “seat license” at a reasonable price. It appears that, as of this week, one must have at least that to get timely access to RHEL CCS.
What Should Those Who Care About Software Rights Do About RHEL?
Due to this ongoing bad behavior by IBM's Red Hat, the situation has become increasingly complex and difficult to face. No third party can effectively monitor RHEL compliance with the GPL agreements, since customers live in fear of losing their much-needed service contracts. Red Hat's legal department has systematically refused SFC's requests in recent years to set up some form of monitoring by SFC. (For example, we asked to review the training materials and documents that RHEL salespeople are given to convince customers to buy RHEL, and Red Hat has not been willing to share these materials with us.) Nevertheless, since SFC serves as the global watchdog for GPL compliance, we welcome reports of RHEL-related violations.
We finally express our sadness that this long road has led the FOSS community to such a disappointing place. I personally remember standing with Erik Troan in a Red Hat booth at a USENIX conference in the late 1990s, and meeting Bob Young around the same time. Both expressed how much they wanted to build a company that respected, collaborated with, engaged with, and most of all treated as equals the wide spectrum of individuals, hobbyists, and small businesses that make the plurality of the FOSS community. We hope that the modern Red Hat can find their way back to this mission under IBM's control.
An Erroneous Preliminary Injunction Granted in Neo4j v. PureThink
Posted on March 30, 2022
Bad Early Court Decision for AGPLv3 Has Not Yet Been Appealed
We at Software Freedom Conservancy proudly and vigilantly watch out for your rights under copyleft licenses such as the Affero GPLv3. Toward this goal, we have studied the Neo4j, Inc. v. PureThink, LLC ongoing case in the Northern District of California, and the preliminary injunction appeal decision in the Ninth Circuit Court this month. The case is complicated, and we've seen much understandable confusion in the public discourse about the status of the case and the impact of the Ninth Circuit's decision to continue the trial court's preliminary injunction while the case continues. While it's true that part of the summary judgment decision in the lower court bodes badly for an important provision in AGPLv3§7¶4, the good news is that the case is not over, nor was the appeal (decided this month) even an actual appeal of the decision itself! This lawsuit is far from completion.
A Brief Summary of the Case So Far
The primary case in question is a dispute between Neo4j, a proprietary relicensing company, and a very small company called PureThink, run by an individual named John Mark Suhy. Studying the docket of the case, and a relevant related case, and other available public materials, we've come to understand some basic facts and events. To paraphrase LeVar Burton, we encourage all our readers to not take our word (or anyone else's) for it, but instead take the time to read the dockets and come to your own conclusions.
After canceling their formal, contractual partnership with Suhy, Neo4j alleged multiple claims in court against Suhy and his companies. Most of these claims centered around trademark rights regarding “Neo4j” and related marks. However, the claims central to our concern relate to a dispute between Suhy and Neo4j regarding Suhy's clarification in downstream licensing of the Enterprise version that Neo4j distributed.
Specifically, Neo4j attempted to license the codebase under something they (later, in their Court filings) dubbed the “Neo4j Sweden Software License” — which consists of a LICENSE.txt file containing the entire text of the Affero General Public License, version 3 (“AGPLv3”) (a license that I helped write), and the so-called “Commons Clause” — a toxic proprietary license. Neo4j admits that this license mash-up (if legitimate, which we at Software Freedom Conservancy and Suhy both dispute), is not an “open source license”.
There are many complex issues of trademark and breach of other contracts in this case; we agree that there are lots of interesting issues there. However, we focus on the matter of most interest to us and many FOSS activists: Suhy's permissions to remove the “Commons Clause”. Neo4j accuses Suhy of improperly removing the “Commons Clause” from the codebase (and subsequently redistributing the software under pure AGPLv3) in paragraph 77 of their third amended complaint. (Note that Suhy denied these allegations in court, asserting that his removal of the “Commons Clause” was legitimate and permitted.)
Neo4j filed for summary judgment on all the issues, and throughout their summary judgment motion, Neo4j argued that the removal of the “Commons Clause” from the license information in the repository (and/or Suhy's suggestions to others that removal of the “Commons Clause” was legitimate) constituted behavior that the Court should enjoin or otherwise prohibit. The Court partially granted Neo4j's motion for summary judgment. Much of that ruling is not particularly related to FOSS licensing questions, but the section regarding licensing deeply concerns us. Specifically, to support the Court's order that temporarily prevents Suhy and others from saying that the Neo4j Enterprise edition that was released under the so-called “Neo4j Sweden Software License” is a “free and open source” version and/or alternative to proprietary-licensed Neo4j EE, the Court held that removal of the “Commons Clause” was not permitted. (BTW, the court confuses “commercial” and “proprietary” in that section — it seems they do not understand that FOSS can be commercial as well.)
In this instance, we're not as concerned with the names used for the software as with the copyleft licensing question, because it's the software's license, not its name, that either assures or prevents users from exercising their fundamental software rights. Notwithstanding our disinterest in the naming issue, we'd all likely agree that, if “AGPLv3 WITH Commons-Clause” were a legitimate form of licensing, such a license is not FOSS. The primary issue, therefore, is not about whether or not this software is FOSS, but whether or not the “Commons Clause” can be legitimately removed by downstream licensees when presented with a license of “AGPLv3 WITH Commons-Clause”. We believe the Court held incorrectly by concluding that Suhy was not permitted to remove the “Commons Clause”. Their order that enjoins Suhy from saying that such removal is permitted is problematic because the underlying holding (if later upheld on appeal) could seriously harm FOSS and copyleft.
The Confusion About the Appeal
Because this was an incomplete summary judgment and the case is ongoing, the injunction against Suhy making such statements is a preliminary injunction, and cannot be made permanent until the case actually completes in the trial court. The decision by the Ninth Circuit appeals court regarding this preliminary injunction has been widely reported by others as an “appeal decision” on the issue of what can be called “open source”. However, this is not an appeal of the entire summary judgment decision, and certainly not an appeal of the entire case (which cannot even be appealed until the case completes). The Ninth Circuit decision merely affirms that Suhy remains under the preliminary injunction (which prohibits him and his companies from taking certain actions and saying certain things publicly) while the case continues. In fact, the standard that an appeals Court uses when considering an appeal of a preliminary injunction differs from the standard for ordinary appeals. Generally speaking, appeals Courts are highly deferential to trial courts regarding preliminary injunctions, and appeals of actual decisions have a much more stringent standard.
The Affero GPL Right to Restriction Removal
In their partial summary judgment ruling, the lower Court erred because they rejected an important and (in our opinion) correct counter-argument made by Suhy's attorneys. Specifically, Suhy's attorneys argued that Neo4j's license expressly permitted the removal of the “Commons Clause” from the license. AGPLv3 was, in fact, drafted to permit such removal in this precise fact pattern.
Specifically, the AGPLv3 itself has the following provisions (found in AGPLv3§0 and AGPLv3§7¶4):
- “This License” refers to version 3 of the GNU Affero General Public License.
- “The Program” refers to any copyrightable work licensed under this License. Each licensee is addressed as “you”.
- If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term.
That last term was added to address a real-world, known problem with GPLv2. Frequently throughout the time when GPLv2 was the current version, original copyright holders and/or licensors would attempt to license work under the GPL with additional restrictions. The problem was rampant and caused much confusion among licensees. As an attempted solution, the FSF (the publisher of the various GPLs) loosened its restrictions on reuse of the text of the GPL, in hopes that would provide a route for reuse of some GPL text, while also avoiding confusion for licensees. Sadly, many licensors continued to take the confusing route of using the entire text of a GPL license with an additional restriction attached either before or after, or both. Their goals were obvious and nefarious: they wanted to confuse the public into “thinking” the software was under the GPL, but in fact restrict certain other activities (such as commercial redistribution). They combined this practice with proprietary relicensing (i.e., a sole licensor selling separate proprietary licenses while releasing a (seemingly FOSS) public version of the code as demoware for marketing). Their goal is to build on the popularity of the GPL, but in direct opposition to the GPL's policy goals; they manipulate the GPL to open-wash bad policies rather than give actual rights to users. This tactic even permitted bad actors to sell “gotcha” proprietary licenses to those who were legitimately confused. For example, a company would look for users who were operating commercially with the code in compliance with GPLv2, but who hadn't noticed the company's code had the statement: “Licensed GPLv2, but not for commercial use”. The user had seen GPLv2, and knew from its brand reputation that it gave certain rights, but hadn't realized that the additional restriction outside of the GPLv2's text might actually be valid. The goal was to catch users in a sneaky trap.
Neo4j tried to use the AGPLv3 to set one of those traps. Neo4j, despite the permission in the FSF's GPL FAQ to “use the GPL terms (possibly modified) in another license provided that you call your license by another name and do not include the GPL preamble”, left the entire AGPLv3 intact as the license of the software — adding only a note at the front and at the end. However, their users can escape the trap, because GPLv3 (and AGPLv3) added a clause (which doesn't exist in GPLv2) to defend users from this. Specifically, AGPLv3§7¶4 includes a key provision to help this situation.
Specifically, the clause was designed to give more rights to downstream recipients when bad actors attempt this nasty trick. Indeed, I recall from my direct participation in the A/GPLv3 drafting that this provision was specifically designed for the situation where the original, sole copyright holder/licensor[0] added additional restrictions. And, I'm not the only one who recalls this. Richard Fontana (now a lawyer at IBM's Red Hat, but previously legal counsel to the FSF during the GPLv3 process), wrote on a mailing list[1] in response to the Neo4j preliminary injunction ruling:
For those who care about anecdotal drafting history … the whole point of the section 7 clause (“If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term.”) was to address the well known problem of an original GPL licensor tacking on non-GPL, non-FOSS, GPL-norm-violating restrictions, precisely like the use of the Commons Clause with the GPL. Around the time that this clause was added to the GPLv3 draft, there had been some recent examples of this phenomenon that had been picked up in the tech press.
Fontana also pointed us to the FSF's own words on the subject, written during their process of drafting this section of the license (emphasis ours):
Unlike additional permissions, additional requirements that are allowed under subsection 7b may not be removed. The revised section 7 makes clear that this condition does not apply to any other additional requirements, however, which are removable just like additional permissions. Here we are particularly concerned about the practice of program authors who purport to license their works under the GPL with an additional requirement that contradicts the terms of the GPL, such as a prohibition on commercial use. Such terms can make the program non-free, and thus contradict the basic purpose of the GNU GPL; but even when the conditions are not fundamentally unethical, adding them in this way invariably makes the rights and obligations of licensees uncertain.
While the intent of the original drafter of a license text is not dispositive over the text as it actually appears in the license, all this information was available to Neo4j as they drafted their license. Many voices in the community had told them that provision in AGPLv3§7¶4 was added specifically to prevent what Neo4j was trying to do. The FSF, the copyright holder of the actual text of the AGPLv3, also publicly gave Neo4j permission to draft a new license, using any provisions they like from AGPLv3 and putting them together in a new way. But Neo4j made a conscious choice to not do that, but instead constructed their license in the exact manner that allowed Suhy's removal of the “Commons Clause”.
In addition, that provision in AGPLv3§7¶4 has little meaning if it's not intended to bind the original licensor! Many other provisions (such as AGPLv3§10¶3) protect the users against further restrictions imposed later in the distribution chain of licensees. This clause was targeted from its inception against the exact, specific bad behavior that Neo4j did here.
We don't dispute that copyright and contract law give Neo4j authority to license their work under any terms they wish — including terms that we consider unethical or immoral. In fact, we already pointed out above that Neo4j had permission to pick and choose only some text from AGPLv3. As long as they didn't use the name “Affero”, “GNU” or “General Public” or include any of the Preamble text in the name/body of their license — we'd readily agree that Neo4j could have put together a bunch of provisions from the AGPLv3, and/or the “Commons Clause”, and/or any other license that suited their fancy. They could have made an entirely new license. Lawyers commonly do share text of licenses and contracts to jump-start writing new ones. That's a practice we generally support (since it's sharing a true commons of ideas freely — even if the resulting license might not be FOSS).
But Neo4j consciously chose not to do that. Instead, they license their software “subject to the terms of the GNU AFFERO GENERAL PUBLIC LICENSE Version 3, with the Commons Clause”. (The name “Neo4j Sweden Software License” only exists in the later Court papers, BTW, not with “The Program” in question.) Neo4j defines “This License” to mean “version 3 of the GNU Affero General Public License.”. Then, Neo4j tells all licensees that “If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term”. Yet, after all that, Neo4j had the audacity to claim to the Court that they didn't actually mean that last sentence, and the Court rubber-stamped that view.
Simply put, the Court erred when it said: “Neither of the two provisions in the form AGPLv3 that Defendants point to give licensees the right to remove the information at issue.”. The Court then used that error as a basis for its ruling to temporarily enjoin Suhy from stating that software with “Commons Clause” removed by downstream is “free and open source”, or tell others that he disagrees with the Court's (temporary) conclusion about removing the “Commons Clause” in this situation.
The case isn't over. The lower Court still has various issues to consider — including a DMCA claim regarding Suhy's removal of the “Commons Clause”. We suspect that's why the Court only made a preliminary injunction against Suhy's words, and did not issue an injunction against the actual removal of the clause! The issue as to whether the clause can be removed is still pending, and the current summary judgment decision doesn't address the DMCA claim from Neo4j's complaint.
Sadly, the Court has temporarily enjoined Suhy from “representing that Neo4j Sweden AB’s addition of the Commons Clause to the license governing Neo4j Enterprise Edition violated the terms of AGPL or that removal of the Commons Clause is lawful, and similar statements”. But they haven't enjoined us, and our view on the matter is as follows:
Clearly, Neo4j gave explicit permission, pursuant to the AGPLv3, for anyone who would like to remove the “Commons Clause” from their LICENSE.txt file in version 3.4 and other versions of their Enterprise edition where it appears. We believe that you have full permission, pursuant to AGPLv3, to distribute that software under the terms of the AGPLv3 as written. In saying that, we also point out that we're not a law firm, our lawyers are not your lawyers, and this is not legal advice. However, after our decades of work in copyleft licensing, we know well the reason and motivations of this policy in the license (described above), and given the error by the Court, it's our civic duty to inform the public that the licensing conclusions (upon which they based their temporary injunction) are incorrect.
Meanwhile, despite what you may have read last week, the key software licensing issues in this case have not been decided — even by the lower Court. For example, the DMCA issue is still before the trial court. Furthermore, if you do read the docket of this case, it will be obvious that neither party is perfect. We have not analyzed every action Suhy took, nor do we have any comment on any action by Suhy other than this: we believe that Suhy's removal of the “Commons Clause” was fully permitted by the terms of the AGPLv3, and that Neo4j gave him that permission in that license. Suhy also did a great service to the community by taking action that obviously risked litigation against him. Misappropriation and manipulation of the strongest and most freedom-protecting copyleft license ever written to bolster a proprietary relicensing business model is an affront to FOSS and its advancement. It's even worse when the Courts are on the side of the bad actor. Neo4j should not have done this.
Finally, we note that the Court was rather narrow on what it said regarding the question of “What Is Open Source?”. The Court ruled that one individual and his companies — when presented with ambiguous licensing information in one part of a document, who then finds another part of the document grants permission to repair and clarify the licensing information, and does so — is temporarily forbidden from telling others that the resulting software is, in fact, FOSS, after making such a change. The ruling does not set precedent, nor does it bind anyone other than the Defendants as to what they can or cannot say is FOSS, which is why we can say it is FOSS, because the AGPLv3 is an OSI-approved license and the AGPLv3 permits removal of the toxic “Commons Clause” in this situation.
We will continue to follow this case and write further when new events occur.
0 We were unable to find anywhere in the Court record that shows Neo4j used a Contributor Licensing Agreement (CLA) or Copyright Assignment Agreement (©AA) that sufficiently gave them exclusive rights as licensor of this software. We did however find evidence online that Neo4j accepted contributions from others. If Neo4j is, in fact, also a licensor of others' AGPLv3'd derivative works that have been incorporated into their upstream versions, then there are many other arguments (in addition to the one presented herein) that would permit removal of the “Commons Clause”. This issue remains an open question of fact in this case.
1 Fontana made these statements on a mailing list governed by an odd confidentiality rule called CHR (which was originally designed for in-person meetings with a beginning and an end, not a mailing list). Nevertheless, Fontana explicitly waived CHR (in writing) to allow me to quote his words publicly.
Copyleft Won't Solve All Problems, Just Some of Them
byon March 17, 2022
Toward a Broad Ethical Software Licensing Coalition
We are passionate about and dedicated to the cause of software freedom and rights because proprietary software harmfully takes control of and agency in software away from users. In 2014, we started talking about FOSS as fundamental to “ethical software” (and, more broadly “ethical technology”) — which contrasts FOSS with the unethical behavior that Big Tech carries out with proprietary software. Some FOSS critics (circa 2018) coined the phrase “ethical source” — which outlined a new approach to these issues — based on the assumption that software freedom activists were inherently complicit in the bad behavior of Big Tech and other bad actors since the inception of FOSS. These folks argue that copyleft — the only form of software licensing that makes any effort to place ethical and moral requirements on FOSS redistributors/reusers — has fundamentally ignored the larger problems of society such as human rights abuses and unbridled capitalism. They propose new copyleft-like licenses that, rather than focusing on the requirement of disclosure of source code, instead use the mechanisms of copyleft to mandate behaviors in areas of ethics generally unrelated to software. For example, the Hippocratic License molds a copyleft clause into a generalized mechanism for imposing a more comprehensive moral code on software redistributors/re-users. In essence, they argue that copylefted software (such as software under the GPL) is unethical software. This criticism of copyleft reached a crescendo in the last three weeks as pundits began to criticize FOSS licenses for failing to prohibit Putin from potentially using FOSS in his Ukrainian invasion or other bad acts.
We have in the past avoided a comprehensive written response to the so-called “ethical source” arguments — lest our response create acrimony with an adjacent community of activists who mean well and with whom we share some goals, but with whose strategies (and conclusions about our behavior and motivations) we disagree. Nevertheless, the recent events have shown that a single, comprehensive response would help clarify our position on a matter of active, heated public debate and fully answer these ongoing criticisms of FOSS and our software freedom principles.
The primary criticism is that FOSS licensing over-prioritizes the rights of software freedom above substantially more important rights and causes — such as sanctions against war criminals. This rhetoric implies that software freedom activists have “tunnel vision” about the relatively minor issue of the rights to copy, modify, redistribute and reinstall software while we ignore bigger societal problems. This essay gives a comprehensive explanation of the specific reasons why copyleft avoids the “scope creep” of handling moral and ethical issues that relate only tangentially to software — even though those moral issues are indeed more urgent and dire than the moral issue of software freedom.
Software Freedom Isn't The Most Important Human Right
I personally, and many of my colleagues, have been admittedly imperfect advocates for software freedom. For the last thirty years, Big Tech and their allies have unfortunately successfully convinced the public that rights for users to control their own software are unimportant, and even trivial. (Apple has even successfully convinced their biggest fans that Apple's ironclad device lock-down is in your interest as a consumer.) In that climate, software freedom activists often overcompensated for the tech community's trivialization of software rights — specifically, overstating the relative importance of software freedom when compared to other human rights. Our error left a political vulnerability, allowing the opposition to successfully even further trivialize users' rights. Critics capitalized on this miscommunication, and often claim that FOSS activists believe that software freedom is the most important human right. Of course, none of us believe that.
I suspect most software freedom activists agree with me on the following: while I do believe software freedom should be a human right, I don't believe that our society should urgently pursue universal software freedom at the expense of upholding the many other essential rights (such as those listed in the Universal Declaration of Human Rights). Clearly many other rights are more fundamental. In a society that fails to guarantee those fundamental human rights, software freedom (by itself) is virtually useless. Those who would violate the most basic human rights will simply ignore the issue of software freedom, too. Or, even worse, such bad actors will gladly use any software, flagrantly in violation of any license, to bolster their efforts to violate other human rights.
Software freedom as a general cause becomes essential and relevant when a society has already reached a minimal level of justice. Indeed, I've spent much of my career as a software rights activist considering whether I should instead work on a more urgent cause — such as ending human trafficking, animal rights, or remedying climate change. Personally, the only valid moral justification for my personal focus on software freedom instead of those other rights is four-fold: (a) there is an increasingly limited number of qualified people who are willing to work on software freedom as a charitable cause at all, (b) there is an increasing number of talented people who are actively working to create more proprietary software and seeking to thwart software freedom and copyleft, (c) my personal talents are in the area of software production and authorship, not in areas directly applicable to other causes, and (d) an increasingly digitized society means software rights slowly increase in importance as an “enabler right” to defend and protect other rights (just as Free Speech enables activists to expose (and hopefully prevent) atrocities and their cover-ups). In other words, I am unlikely to make any useful impact on any other cause in my whole career, whereas due to the unique match of my skills to the cause of software freedom, I have made measurable positive impact on software rights. I generally encourage activists to focus on tasks that directly coincide with their existing talents, and have tried to do the same myself.
So my argument starts in fervent agreement with the first point made by proponents of adding non-software ethical issues into copyleft licensing: yes, I absolutely agree there are social justice causes that are more urgent than the right to copy, modify, redistribute and reinstall software. That begs their question: then, why not immediately begin using all the tools, mechanisms and strategies used for FOSS advocacy to advocate for these other causes? The TL;DR answer is simple: because these tools, mechanisms and strategies are highly unlikely to have any measurable impact on those other causes, while using them for these other causes would ultimately minimize software freedom and rights unjustly.
Indeed, we need to make progress on the issue of software freedom, precisely because even while others are working to address and redress these other social justice issues, proprietary software (such as through proprietary AI-based advertising software that manipulates public opinion) is currently used to undermine these other causes. Universal software freedom would thwart Big Tech's efforts to undermine other causes. Proprietarization of software isn't the most heinous human rights violation possible; nevertheless, proprietarization of software does assist companies to do harm regarding other social justice causes. I conclude from that realization that our society should seek to make progress on both upholding the existing human rights already listed in the Universal Declaration of Human Rights, and also seek to make simultaneous progress on key rights not listed there, such as software freedom. We also err as activists if one group of activists seeks to thwart another by falsely claiming the other group is “complicit in human rights abuses” merely due to a strategic disagreement.
Ultimately, copyleft (and other FOSS licensing) is a strategy, not a moral principle unto itself. The moral principle is that proprietary software is harmful to people because it forbids their right to control their own software, learn how it works, and remove spyware from it (among many other ills). That moral principle remains valuable and deserves some of our collective attention, even if there are other more urgent moral principles that deserve even more attention.
Copyleft Is The Worst Strategy, Except for all the Others
So, if the production of proprietary software harms society, then why not focus all efforts on lobbying legislators to make proprietary software illegal? This should be the first question any new software freedom activist asks themselves. After all, for those of us who live in societies with relatively minimal corruption and that are governed by the rule of law, we should seek to make criminal those acts that harm others.
Criminalizing proprietary software has always been, and remains, politically unviable. We should constantly reevaluate that political viability (which software freedom activists have done throughout the last three decades). But as of the time of writing, this strategy remains unviable, primarily due to the worldwide domination of incumbent unbridled capitalism and a near universal poor understanding of the harm that proprietary software causes and enables.
Another possible approach to ending proprietary software is a universal boycott on authorship of proprietary software (perhaps through mass unionization of software developers). This is one of my favorite “thought experiments”, as it shows how much power individual software developers have regarding proprietary software. However, this universal boycott is also politically unviable, at least as long as proprietary software companies continue to pay such exorbitant salaries relative to other fields of endeavor.
So, if we can't make proprietary software illegal, and we can't dissuade developers from taking piles of money to write proprietary software, what's the next best strategy? The answer is to organize people to write alternative software that is not proprietary. This was the strategy that the software freedom movement pursued in earnest beginning in the early 1980s, and currently remains our best politically viable strategy. However, this approach always contained a fundamental problem: such software can easily be used as a basis for proprietary software. Thus non-copylefted FOSS competes against itself, rather hopelessly, since the proprietary version will likely always be a feature or two ahead, and the FOSS version a bug or two behind. Copyleft is the innovative strategy designed specifically to address that specific problem. Without copyleft, the only possible approach to answering the harm of proprietary software is the aforementioned general strike of all software development, since non-copyleft FOSS can be and is regularly used as a basis for advancing proprietary software and Big Tech's interests.
Copyleft generally works reasonably well as a strategy, but it admittedly requires constant vigilance. Copyleft needs someone to enforce it, and resources to do that. Copyleft must withstand the pressure of proprietary software companies who seek to erode and question its validity. The primary conceit of those who seek to use a copyleft-style strategy to address other software-tangential social injustices is their apparent belief that merely writing policy into a software license has any chance of changing behavior on its own. It simply doesn't.
Other Mechanisms Are More Effective If Politically Viable
The Hippocratic License and similar efforts have a laudable goal: they seek to assure that companies who deal in software always respect human rights. However, advocacy for universally recognized human rights, as a social justice cause, does have access to better advocacy mechanisms that software freedom activism does not.
Most notably, almost everything listed in the Universal Declaration of Human Rights is illegal in the USA and in most other industrialized nations where the bulk of software development occurs. Also, it is certainly politically viable to improve those laws — for those rare cases where a violation of a particular universally recognized human right is legal. In short, because these other rights are much more widely accepted as fundamental by the public, we can employ other, better means (including those listed above that don't work for FOSS) to compel compliance by companies with these other principles.
Furthermore, copyleft is ill-suited as a mechanism to enforce any rights in places where human rights violations are common. For example, no one has ever bothered to enforce copyleft licenses in jurisdictions where corruption is rampant and the judiciary is easily bribed. Over the last twenty years, we've received many reports of GPL violations in the Russian Federation, but we don't pursue them — not because they shouldn't be addressed, but because, under Putin's regime, it's highly unlikely we can get a fair hearing to uphold software freedom and rights for Russian citizens. Copyleft relies on a well-formed rule-of-law for contracts and copyright to protect people's rights (of any kind). In jurisdictions that already hold human life and the rights of its people in low regard (or simply have an exceedingly corrupt government), it's a pointless symbolic act to also take away the permissions of software redistribution and modification for bad behavior (of any kind). Companies and oligarchs operating in a corrupt, unjust society will successfully ignore those injunctions, too.
Meanwhile, in jurisdictions with relatively less corruption, other systems besides distribution licenses function well to curtail bad behavior. For example, I've owned exactly two cars in my life here in the USA. While I concede there are many problems with corruption here, we have a relatively just society that usually respects the rule of law for contracts and copyrights. The cars that I purchased here did not have a license that said: if you drive dangerously with the vehicle, you cannot purchase and utilize cars in the future from that manufacturer. We don't look to the car manufacturers to enforce the ethical use of vehicles; we instead make traffic laws, with various escalating penalties, including a driver licensing structure that can be revoked temporarily or permanently for egregious acts. We don't require manufacturers to contract with drivers to pollute less; we instead create and enforce environmental regulation and incentives both before and after the time of purchase. Because such systems exist and because there is widespread societal consensus about what is or is not ethical driving behavior, there is no point in enforcing these rules using copyright and contracts that bind the vehicle's purchaser. A more resilient system (of traffic and environmental laws, and their enforcement) works to deal with the problem, and improving those laws is politically viable. Additional licensing terms from the car manufacturers (imposed at the point of sale of vehicles) would create a useless redundancy, since the penalties and remedies available under that license are substantially less severe than those available under the laws that regulate drivers.
There are strategies other than licensing changes that would likely work well to both build a stronger coalition for software freedom and rights and curtail the atrocities committed by Big Tech and their customers. These strategies might become politically viable, and are worth pursuing in parallel and in coalition. For example, widespread unionization of tech workers (not over wages, which are generally high, but over other issues, such as bad behavior and policy by their employers) could both improve companies' respect of software freedom and handle many problems raised by those who seek tangential expansion of copyleft into non-software issues. For our part, Software Freedom Conservancy has done some work in this area by encouraging developers to begin insisting on better terms in their employment contracts. I do worry that a functioning coalition on these matters is exceedingly difficult to build (and the very fact this essay ultimately became necessary hints at the difficulty in building that coalition). We'd be glad to work in coalition with such activists to further those causes if they include software freedom as an issue that belongs on the coalition's agenda.
But that's a long-term, speculative action. Meanwhile, for software freedom, copyleft is the best-available compromise strategy — since software rights are not and cannot be defended in a more robust way (such as through direct legislation, as opposed to indirectly relying on the copyright and contract legal systems to assure the rights). Copyleft is a round-about strategy. Using copyleft as a strategy to impact broader ills that have more effective mechanisms to address those ills is (at the very least) wasted time and (possibly) downright counter-productive.
Copyleft Focuses On Coalition
In our increasingly politically divided society, omnibus social justice reform has always been exceedingly difficult. Copyleft works precisely because it holds together a very thin coalition — by confining the issues to only those that happen with software.
Consider this example: I became a vegetarian in 1992. It does bother me that software that I've written could potentially assist a slaughterhouse to run more efficiently. I obviously have considered licensing my software under terms that would forbid use in a slaughterhouse (and a dozen other activities that I personally morally oppose, including for the waging of war). However, hand-picking my most important social justice causes and stringing a copyleft clause on them would dissolve a rather thinly-held coalition of copyleft proponents. Successful advocacy for a given cause relies on building broad coalitions among people with widely disparate views on other topics. Imagine how difficult activism on climate change would be if activists working to end human trafficking claimed that activists working to address climate change were complicit in human trafficking because The Paris Climate Agreement does not include penalties if participating nation-states fail to meet benchmarks on reducing human trafficking. Coalition building is complex. Context matters.
In a diverse political ecosystem, elegant solutions that work “ok” often fare better than comprehensive-but-complex solutions. Copyleft's innovation is that the only action you can take that revokes your right to copy, modify, redistribute and reinstall the software is failure to give that same right to someone else. This elegance makes the copyleft strategy powerful and effective. “Porting” the copyleft strategy to other causes may seem that it would yield “more of a good thing”. But, in practice, that approach turns copyleft licenses into complex omnibus legislation around which coalitions will evaporate.
Relatedly, the most difficult hurdle of copyleft has always been the creation of software that was so enticingly useful that political opponents (i.e., proprietary software companies) would gladly give users the rights to copy, modify, and reinstall the software — in direct exchange for having the benefit of building their new software on top of the existing copylefted components (rather than rewriting it themselves). I do not see a viable path to create the necessary coalition that would, after agreeing on an omnibus list of social justice issues, also find the funding and volunteer labor necessary to build software (under that license) that would entice those who currently work against that list of social justice causes to stop working against those causes merely because they'd gain so much more from the software than they gain from violating the principles. Copylefted software in a vacuum, adopted only by other copyleft activists, does not change behavior of bad actors. For example, imagine if we wrote into our licenses that all who copy, modify and distribute the software must cease use of fossil fuels. That's an important cause, but it's hard to imagine our software would be so useful that companies would accelerate their reduction of fossil fuel use merely to gain immediate permission to copy, modify and redistribute that software.
Copyleft Requires Constant Vigilance
Copyleft isn't magic pixie dust that liberates software. In fact, likely one of the biggest flaws in copyleft design has been a gross underestimation of resources required for enforcement in the scenario we now have. Broad adoption of key copylefted components remains an important step to curtail proprietary software developers' mistreatment of users. The situation slowly improves as such developers incorporate copylefted software like Linux into their essential computing systems — provided that this is done in compliance with the license. However, violations on essential GPL'd components such as Linux and GCC are rampant and limited funding is available to resolve these violations and restore users' rights in the software. Big Tech has also been relentless and highly creative in thwarting our enforcement efforts.
Thus, even if not for my earlier strategic reasons that I oppose adding ethical-but-software-unrelated restrictions to FOSS licenses, I'd still oppose it on tactical grounds. Namely, there is no clear funding path whereby additional terms seeking to protect and advance software-tangential social justice causes could be adequately enforced to make a measurable difference in advancement of those causes.
FOSS Must Still Have a Conscience on Non-Software Issues
This essay merely argues that FOSS licenses are not an effective tool to advance social justice causes other than software freedom. It does not argue that FOSS communities have no duties to other causes and issues; in fact, they do have such a moral obligation. For example, FOSS developers should refuse to work specifically on bug reports from companies who don't pay their workers a living wage. I also recommend that FOSS communities create (alongside their Codes of Conduct for behavior inside the project) written rules about the types of entities that the projects will officially assist with volunteer labor, or (in the case of a commercial FOSS community or organization) the types of entities with which the community will engage in business deals.
At Software Freedom Conservancy, we regularly discuss at both the staff and Board of Directors level which other social justice issues we have a moral obligation to incorporate. Most notably, we've been the home for Outreachy, a program our own Executive Director, Karen Sandler, helped create, and for which we are glad to have Sage Sharp on staff to work on full-time. We know that FOSS lags behind proprietary software development in welcoming and providing opportunities for underrepresented groups. We dedicate significant organizational resources to these issues through Outreachy and other newer programs (such as the Institute for Computing Research). We made a public statement that Trump's travel ban directly thwarted FOSS. We go beyond the mere legal requirements to create ethical and equitable hiring practices that are without bias. In defending the rights of users under copyleft, we do not leave other issues behind. I believe that the critics have simply not paid attention to, or are willfully ignoring, the holistic and intersectional approach that we have brought to FOSS.
Regarding Putin's FOSS Permissions Upon Invasion of Ukraine
Initially, only a few FOSS critics insisted on this radical change to copyleft licensing structure. The issue had fallen into the far background of our community — until the last few weeks. Specifically, many recently began asking whether we should redraft FOSS licenses to impose sanctions on Putin in retaliation for his violent and unprovoked invasion of Ukraine. Admittedly, FOSS licenses do not prevent Putin from incorporating existing FOSS already in his possession into his war machine. I personally have been a conscientious objector to all military action since 1990, so I am sympathetic. I have always felt the OSI's framing the discussion of military use of FOSS as a “field of use restriction” misses the point; it inappropriately analogizes software to physical materiel, and analogizes those who write FOSS to de-facto military contractors. Software, fundamentally, is the written word; while it “feels” like more than that to us, factually speaking, software is merely a written record of knowledge, methods, and instructions for how to solve digital problems. It is disturbing that the plans for heinous acts can sometimes be modeled as digital problems, and that some of those problems may be solvable with existing FOSS. But we must curtail and punish actual actions, not knowledge nor writing, nor the unfettered sharing of generally useful technical information. Particularly in cyber-warfare circles, some folks tend to talk about software as we did during the days when sharing encryption software was banned: as if certain software is more like bombs than books. I don't think we should concede that rhetoric; all software remains much more like books than it is like bombs.
Even if we choose to not take away the right to read from the Russian people, that does not mean that FOSS activists concede that nothing can be done. Our nations can, should, and many currently do, forbid commerce with Russia during this period. This can and should include embargoes of selling new books, new copies of software, providing services for improvements to software, and any other commercial activities that could inadvertently aid Putin's war effort. Every FOSS license in existence permits capricious distribution; software freedom guarantees the right to refuse to distribute new versions of the software. (That is, copyleft does not require that you publish all your software on the Internet for everyone, or that you give equal access to everyone — rather, it merely requires that those whom you choose to give legitimate access to the software also receive CCS.) FOSS projects should thus avoid providing Putin easy access to updates to their FOSS. Indeed, FOSS licenses planned well for how to manage bad actors who want your software: all FOSS licensing authorities have upheld the right to capricious distribution — precisely so that the license would not compel any developer to provide software to a bad actor.
I suspect activists will continue to disagree about whether we have a moral imperative to change FOSS licenses themselves to contractually forbid Putin to copy, modify, redistribute and reinstall the FOSS he already has (or surreptitiously downloaded by circumventing sanctions). However, these horrendous events in Ukraine offer real world examples to consider the viability of expanding copyleft terms beyond software, and consider how it might work. My analysis is that such changes would only give us the false sense of having “done something”. Ultimately enforcement of such licensing changes would either be impossible or pointless. The very entities (such as the varied international courts and treaty organizations) that could enforce such terms will also have plenty of other war crimes and sanctions violations to bring against Putin and his cronies anyway. The penalties for the actions of war that Putin took will be much stronger than any contractual breach or copyright infringement claim that could be brought against Putin under a modified copyleft license and/or the Hippocratic License.
Copyleft licensing is a powerful strategy. As a strategy, copyleft has both its upsides and downsides in its ability to advance the software freedom and rights of users. However, the proverbial hammer of copyleft will not help you when your problem is more like a screw than a nail. Having already dedicated my entire career to advance the copyleft strategy, I do feel honored that folks who care deeply (as I do) about other important social justice causes are seeking to apply that strategy to new types of problems. However, despite my lifelong love and excitement for copyleft, and perhaps because of it, it's my duty to point out that copyleft is not a panacea for all that ails our troubled world.
Copyleft works because it's the best strategy we have for software freedom, and because copyleft elegantly confines itself to the software rights of users. Attempts to apply the copyleft strategy to software-unrelated causes will (at the very least) fail to achieve the intended results, and at their worst, will primarily serve to trivialize the important issue of software freedom that copyleft was invented to accentuate.
If Software is My Copilot, Who Programmed My Software?
byon February 3, 2022
Software freedom is our goal. Copyleft is a strategy to reach that goal. That tenet is oft forgotten by activists. Copyleft is even abused to advance proprietary goals. We too often see concern about the future of copyleft overshadow the necessary fundamental question: does a particular behavior or trend — and the inevitable outcomes of those behaviors and trends — increase or decrease users’ rights to copy, share, modify, and reinstall modified versions of their software? That question remains paramount as we face new challenges.
Introduced first by Microsoft’s GitHub in their Copilot product, computer-assisted software authorship by way of machine learning models presents a formidable challenge to software freedom’s future. Yet, we can, in fact, imagine a software freedom utopia that embodies this technology. Imagine that all software authors have access to the global archive of machine learning models — and they are fully reproducible. Everyone has equal rights to fork these models and train them further with their own datasets, provided that they release new models (and the input code) freely in the global archive. All code produced by these models is also made freely available under copyleft. All code that builds the models, all historical input sets, and all trained models are also made available to everyone under copyleft licenses.
While activists might quibble about minor details to optimize this imagined utopia, this thought experiment shows that computer-assisted software authorship does not inherently negate software freedom. Rather, the rules, requirements, and policies that apply will determine whether software freedom is respected. To paraphrase Hamlet: there is nothing either good or bad, but the policy makes it so.
What’s the Worst That Could Happen?
[They are] not a good [person] who, without a protest, allows wrong to be committed … with the means which [they] help to supply.
— John Stuart Mill, University of St. Andrews, 1 February 1867
Obviously, ignoring machine learning for computer-assisted software authorship will not usher in this software freedom utopia. Copyleft activists cannot stand idly by in this situation, but we must temper our attention by considering the likelihood of dystopian and problematic outcomes, and the options available to prevent them.
In response to Copilot’s announcement, pundits voiced, without evidence, a prevailing feeling of “Free Software had a good run, but I guess that’s over now”. Such predictions seem consistent with the well-documented overoptimism about artificial intelligence’s success. Rapid replacement of traditional software development methodologies seems unlikely. As such, we should not overestimate the likelihood that these new systems will accelerate proprietary software development while we simultaneously fail to prevent copylefted software from enabling that activity. The former may not come to pass, so we should not unduly fret about the latter, lest we misdirect resources. In short, AI is usually slow-moving, and produces incremental change far more often than it produces radical change. The problem is thus not imminent, nor is the damage irreversible. However, we must respond deliberately with all due celerity, and begin that work immediately.
Currently, there are two factors that influence the timing of our response. First, if GitHub’s Copilot becomes a non-beta product available to the programming public, that would indicate the necessity of an urgent response. Microsoft and GitHub are unlikely to share their product plans, so we cannot know for sure when this will occur. However, in the seven months since the first beta was made available, we’ve consistently heard anecdotally that more and more developers (particularly, FOSS developers!) have received beta invitations. Based on these (admittedly incomplete) facts, we must assume that a move from private beta to public deployment is imminent in 2022. This indicates some urgency of the problem.
Second, we already know that some of our worst fears are definitely true. Namely, Microsoft and GitHub used copylefted software as part of Copilot’s training set.
Copilot was trained on “billions of lines of public code … written by others”. While GitHub has refused requests to release even a list of repositories included in the training set, the use of the word “public” indicates that only software with source-available licenses (even if not FOSS licenses) was input into Copilot. Furthermore, GitHub admits that during training, the system encountered a copy of the GPL more than 700,000 times. This effectively confirms that copylefted public code appears in the training set.
When questioned, former GNOME developer and GitHub CEO[0], Nat Friedman, declared publicly “(1) training ML systems on public data is fair use (2) the output belongs to the operator”. Friedman himself, as well as Microsoft and GitHub’s other executives and lawyers, have ignored Software Freedom Conservancy’s requests for clarification and/or evidence supporting these statements.
Meanwhile, GitHub continues to improve this system, trained only on publicly source-available software, and seeks to market it to new users, including those who otherwise use FOSS development tools. Users continue to report gaining access to the beta and are noticing improvements. Microsoft and GitHub’s public position is meanwhile clear: they claim to have no copyleft obligations for training the model, the model itself, and deploying the service. They also believe there are no licensing obligations for the output.
While Friedman ignored the community’s requests publicly, we inquired privately with Friedman[0] and other Microsoft and GitHub representatives in June 2021, asking for solid legal references for GitHub’s public legal positions of (1) and (2) above. They provided none, and reiterated, without evidence, that they believed the model does not contain copies of the software, and that output produced by Copilot can be licensed under any license. We further asked: if there are no licensing concerns on either side, why did Microsoft not also train the system on their large proprietary codebases such as Office? They had no immediate answer. Microsoft and GitHub promised to get back to us, but have not.
This secrecy and non-cooperativeness is expected from a proprietary software company and its subsidiary, but leaves us only with speculative conclusions to inform a strategy for copyleft here. We can reliably guess that the companies will claim “fair use” as their primary justification for creating the model and offering the service, and will argue that both the output and the trained model are not “work[s] based on the Program” (GPLv2) nor do they “copy from or adapt all or part of the work[s] in a fashion requiring copyright permission” (GPLv3/AGPLv3). Furthermore, we can reliably conclude, given the continuing product promotion, that the companies have at least a medium-term commitment to Copilot.
In short, they have already hunkered down for a protracted disagreement. Their positions are now incumbent: they are using their resources and power to successfully charge copyleft activists to “prove them wrong”. But we do not have to accept their unsubstantiated arguments at face value. In fact, these areas are so substantially novel that almost no issue has a definitive answer, but we must nevertheless begin to formulate our position and our response to Microsoft and GitHub’s assault on copyleft.
Trained Models, Fair Use, and Copyright Infringement
Consider GitHub’s claim that “training ML systems on public data is fair use”. We have not found any case of note — at least in the USA — that truly contemplates that question. The only legal case in the USA to look near this question is Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015). The Supreme Court denied certiorari on this case; it is not legal precedent in all jurisdictions where Microsoft and GitHub operate.
Moreover, that case considered a fact pattern centered around search, not authorship of new/derived works. Google had made copies of entire copyrighted books, not for the purpose of displaying them, but so users could (1) run search queries, and (2) see a “snippet” of the search hits (i.e., to see the search hit in context). The Second Circuit held Google’s copying of the books was “fair use” because searching and providing context added value exceeding what a user could obtain from their own copies, and Google’s product did not substitute for the books in the market.
The analogous fact pattern for code is obvious: GitHub could offer a search tool that assists users in finding key public repositories (and specific lines of code within those repositories) that seemed to solve tasks of interest. Developers could then easily utilize those codebases in the usual, license-compliant ways. The actual Copilot fact pattern is not this one.
Meanwhile, the Authors Guild case begins and ends the list of major cases regarding machine learning systems and “fair use”. We should simply ignore GitHub’s risible claim that the “fair use question” on machine learning is settled.
Perhaps most importantly, in the USA, “fair use” is an affirmative defense to answer copyright infringement. In concrete terms, that means (particularly in cases where the circumstances are novel) a copyright holder brings an infringement lawsuit and then the alleged infringer shows in court that their actions met the relevant factors for “fair use” sufficiently. Frankly, we refuse to do these companies’ job for them. Copyleft activists need not tell Microsoft and GitHub why this isn’t “fair use”; rather, they need to tell us why training the model with copylefted code is “fair use” and prove that the trained model itself is not a “work based on” the GPL’d software.
GitHub has meanwhile artfully avoided the question of whether the trained model is a “work based on” the input. We contend that it probably is. However, given that “fair use” is an affirmative defense to copyright infringement, they are obviously anticipating a claim that the trained model is, in fact, a “work based on” the inputs to the model. Why else would they even bring up “fair use”, rather than simply say their use is fully non-infringing? Anyway, we have no way to even explore these questions authoritatively without examining the model, fully affixed in its tangible medium. We don’t expect GitHub to produce that unless compelled by a third party.
Indeed, discussion of these questions outside of a courtroom is moot. For this novel and contentious fact pattern, only a court decision can settle the matter adequately. As a strategic matter, copyleft activists should keep their own counsel about what we anticipate in the opposition’s “fair use” and/or non-infringement defenses, and the counter-arguments that we plan.
Copilot Users Should Worry
GitHub’s position does a great disservice to Copilot users. Their claim that “the output belongs to the operator” creates a false sense of legal justification. Users have already shown that Copilot can generate a substantial amount of unique, GPL’d code, and then (rather ironically, given GitHub’s claim that they removed the text of the GPL from the training set) also suggest a license that is non-copyleft. Friedman’s statement surely does not qualify as an indemnity for Copilot users who might face GPL enforcement actions. Users almost surely must construct their own “fair use” or “not copyrightable” defenses for Copilot’s output.
The length and detail of what Copilot can generate for users seems unbounded. The glaring example above appears prima facie to be copyright infringement; we expect further such problems. Consider the sheer amount that a fully functional and successful Copilot would generate. Surely, AI researchers seek the ability for Copilot to “figure out” that you are trying to solve some specific task when programming. The better Copilot gets at handing ready-made solutions to its users, the more likely it becomes that its output may offer the user copylefted software.
Copilot leaves copyleft compliance as an exercise for the user. Users likely face growing liability that only increases as Copilot improves. Users currently have no methods besides serendipity and educated guesses to know whether Copilot’s output is copyrighted by someone else. Proprietary software companies such as Synopsys provide so-called “scanning tools” that can search your proprietary codebase and find hidden copylefted software. However, the FOSS tools for that job are in their infancy and unlikely to develop quickly, since historically those who want those tools are companies that primarily develop proprietary software and seek to avoid copylefted software.
We recommend users who wish to avoid infringing the copyrights of others simply avoid Copilot.
On Copyleft Maximalism and Unilateral Capitulation
Draconian copyright law generally horrifies software freedom activists for good reason. Nearly all copyleft activists would prefer a true, multilateral rewriting of copyright rules that prioritized the interest of the general public and software rights. Copyleft exists primarily because of the long-standing political non-viability of a copyright law reboot. Nothing has changed in this regard; if anything, changing legislation has become an even more expensive lobbying proposition than it was at copyleft’s advent. Copyleft activists should expect proprietary software companies and media oligarchs to control copyright legislation indefinitely.
Fortunately, copyleft was designed specifically for this eventuality. Activists have called copyleft the “judo move” of software freedom, since copyleft uses the powerful copyright force (invented primarily by our opposition) against itself. That realization leads to a painful, but pragmatically necessary, awkwardness.
The issues herein (from training of machine learning models, to the copyright questions about those models, to the derivation questions about their output) are novel copyright questions. As software freedom activists, we are uniquely qualified to invent an ideal copyright structure for these technologies. But, without a path to promulgate such replacement copyright rules into the incumbent system, that exercise is futile. Furthermore, systems outside of copyright (including but not limited to EULAs, business agreements and patents) have long been used to proprietarize software without the need of copyright. The reality of facts on the ground dictates that we not concede the only wedge we have to compel software freedom; that wedge is copyleft.
Meanwhile, proprietary software companies regularly exploit any unilateral concessions on weakening of copyleft that FOSS projects make, while continuing to pursue copyright maximalism for their works. Particularly in novel areas, we must assume a copyleft maximalist approach — until courts or the legislature disarm all mechanisms to control users’ rights with regard to software. That adversarial process will frustrate us, but ultimately by choosing copyright as our primary tool, we already chose the courts as our battleground for contentious issues.
We all surely have our opinions about how copyleft should operate in these novel situations. We have even expressed some such opinions herein. But, ultimately, strong copyleft licenses do not defer the “what’s covered?” question to one individual or organization. The “judo” power comes from strong copyleft reaching to all of what copyright governs. When those issues are novel — and companies flaunt that novel manipulation of copylefted works — only a court can answer definitively.
A Community-Led Response
While these companies will likely not succeed in their efforts to disarm copyleft, they have nevertheless attacked the entire copyleft infrastructure. We must mount an effective response.
Software Freedom Conservancy has spent the last six months in deep internal discussions about this novel threat to the very efficacy of copyleft. We have a few ideas: a mix of short-term, medium-term and long-term strategies to address the problem. However, we recognize that a community (rather than the traditional BDFL) approach is needed, at least for this problem. Thus, putting first things first, we realized that we should gather the best minds in the software freedom community with direct experience in copyleft theory and practice. We will convene these individuals as a committee specifically chartered by Software Freedom Conservancy to publish, as quickly as reasonably possible, a series of recommendations to the community on how we should respond to the immediate threat to copyleft found in Copilot, and (long-term) to analyze the more general threat that AI-assisted programming techniques pose to the strategy of copyleft.
While we are not actively seeking applications for this committee, we do welcome anyone whom we have not yet solicited to contact us and inquire about participating. We will surely be unable to include everyone who is interested on the committee, either due to conflicts of interest or due to the simple logistics of creating too large a committee. However, we will carefully consider anyone who expresses bona fide interest in participating.
Finally, as much as can be done during the pandemic using the FOSS tools available, we will attempt to convene public discussions. We will contemporaneously publish the committee’s minutes. If you’d like to get involved today in public discussions about this issue, please join the mailing list we launched today for this topic.
[0] In November 2021, Nat Friedman was replaced by Thomas Dohmke as GitHub’s CEO. However, to our knowledge, Dohmke has not retracted or clarified Friedman’s comments, and at the time of writing, no one from GitHub or Microsoft that we spoke to had responded to our requests for clarification.