Conservancy Blog
Displaying posts by Bradley M. Kuhn
Open Source AI Definition Erodes the Meaning of “Open Source”
by
on October 31, 2024This week, the Open Source Initiative (OSI) made their new Open Source Artificial Intelligence Definition (OSAID) official with its 1.0 release. With this announcement, we have reached the moment that software freedom advocates have feared for decades: the definition of “open source” — with which OSI was entrusted — now differs in significant ways from the views of most software freedom advocates.
There has been substantial acrimony during the drafting process of OSAID, and this blog post does not summarize all the community complaints about the OSAID and its drafting process. Other bloggers and the press have covered those. The TLDR here, IMO is simply stated: the OSAID fails to require reproducibility by the public of the scientific process of building these systems, because the OSAID fails to place sufficient requirements on the licensing and public disclosure of training sets for so-called “Open Source” systems. The OSI refused to add this requirement because of a fundamental flaw in their process; they decided that “there was no point in publishing a definition that no existing AI system could currently meet”. This fundamental compromise undermined the community process, and amplified the role of stakeholders who would financially benefit from OSI's retroactive declaration that their systems are “open source”. The OSI should have refrained from publishing a definition yet, and instead labeled this document as ”recommendations” for now.
As the publication date of the OSAID approached, I could not help but
remember a fascinating statement that Donald E. Knuth, one of the founders
of the field of computer
science, once
said: [M]y role is to be on the bottom of things. … I try to
digest … knowledge into a form that is accessible to people who don't
have time for such study
. If we wish to engage in the
highly philosophical (and easily politically corruptible) task
of defining what terms like “software freedom” and
“open source” mean, we must learn to be on the “bottom of
things”. OSI made an unforced error in this regard. While they could
have humbly announced this as “recommendations” or “guidelines”,
they instead formalized it as a “definition” — with equivalent authority to their
OSD.
Yet, OSI itself only turned its attention to AI only recently, when they announced their “deep dive” — for which Microsoft's GitHub was OSI's “Thought Leader”. OSI has responded too rapidly to this industry ballyhoo. Their celerity of response made OSI an easy target for regulatory capture.
By comparison, the original OSD was first published in February 1999. That was at least twelve years after the widespread industry adoption of various FOSS programs (such as the GNU C Compiler and BSD). The concept was explored and discussed publicly (under the moniker “Free Software”) for decades before it was officially “defined”. The OSI announced itself as the “marketing department for Free Software” and based the OSD in large part on the independently developed Debian Free Software Guidelines (DFSG). The OSD was thus the culmination of decades of thought and consideration, and primarily developed by a third-party (Debian) — which provided a balance on OSI's authority. (Interestingly, some folks from Debian are attempting to check OSI's authority again due to the premature publication of the OSAID.)
OSI claims that they must move quickly so that they can counter the software companies from coopting the term “open source” for their own aims. But OSI failed to pursue trademark protection for “open source” in the early days, so the OSI can't stop Mark Zuckerberg and his cronies in any event from using the “open source” moniker for his Facebook and Instagram products — let alone his new Llama product. Furthermore, OSI's insistence that the definition was urgently needed and that the definition be engineered as a retrofit to apply to an existing, available system has yielded troublesome results. Simply put, OSI has a tiny sample set to examine, in 2024, of what LLM-backed generative AI systems look like. To make a final decision about the software freedom and rights implications of such a nascent field led to an automatic bias to accept the actions of first movers as legitimate. By making this definition official too soon, OSI has endorsed demonstrably bad LLM-backed generative AI systems as “open source” by definition!
OSI also disenfranchised the users and content creators in this process. FOSS activists should be engaging with the larger discussions with impacted communities of content creators about what “open source” means to them, and how they feel about incorporation of their data in the training sets into these third-party systems. The line between data and code is so easily crossed with these systems that we cannot rely on old, rote conclusions that the “data is separate and can be proprietary (or even unavailable), and yet the system remains ‘open source’”. That adage fails us when analyzing this technology, and we must take careful steps — free from the for-profit corporate interest of AI fervor — as we decide how our well-established philosophies apply to these changes.
FOSS activists err when we unilaterally dictate and define what is ethical, moral, open and Free in areas outside of software. Software rights theorists can (and should) make meaningful contributions in these other areas, but not without substantial collaboration with those creative individuals who produce the source material. Where were the painters, the novelists, the actors, the playwrights, the musicians, and the poets in the OSAID drafting process? The OSD was (of course) easier because our community is mostly programmers and developers (or folks adjacent to those fields); software creators knew best how to consider philosophical implications of pure software products. The OSI, and the folks in its leadership, definitely know software well, but I wouldn't name any of them (or myself) as great thinkers in these many areas outside software that are noticeably impacted by the promulgation of LLMs that are trained on those creative works. The Open Source community remains consistently in danger of excessive insularity, and the OSAID is an unfortunate example of how insular we can be.
Meanwhile, I have spent literally months of time over the last 30 years trying to make sure the coalition of software freedom & rights activists remained in basic congruence (at least publicly) with those (like OSI) who are oriented towards a more for-profit and corporate open source approach. Until today, I was always able to say: “I believe that anything the OSI calls ‘open source’ gives you all the rights and freedoms that you deserve”. I now cannot say that again unless/until the OSI revokes the OSAID. Unfortunately, that Rubicon may have now been permanently crossed! OSI has purposely made it politically unviable for them to revoke the OSAID. Instead, they plan only incremental updates to the OSAID. Once entities begin to rely on this definition as written, OSI will find it nearly impossible to later declare systems that were “open source” under 1.0 as no longer so (under later versions). So, we are likely stuck with OSAID's key problems forever. OSI undermines its position as a philosophical leader in Open Source as long as OSAID 1.0 stands as a formal defintion.
I truly don't know for sure (yet) if the only way to respect user rights in an LLM-backed generative AI system is to only use training sets that are publicly available and licensed under Free Software licenses. I do believe that's the ideal and preferred form for modification of those systems. Nevertheless, a generally useful technical system that is built by collapsing data (in essence, via highly lossy compression) into a table of floating point numbers is philosophically much more complicated than binary software and its Corresponding Source. So, having studied the issue myself, I believe the Socratic Epiphany currently applies. Perhaps there is an acceptable spot for compromise regarding the issues of training set licensing, availability and similar reproducibility issues. My instincts, after 25 years as a software rights philosopher, lead me to believe that it will take at least a decade for our best minds to find a reasonable answer on where the bright line is of acceptable behavior with regard to these AI systems. While OSI claims their OSAID is humble, I beg to differ. The humble act now is to admit that it was just too soon to publish a “definition” and rebrand these the OSAID 1.0 as “current recommendations”. That might not grab as many headlines or raise as much money as the OSAID did, but it's the moral and ethical way out of this bad situation.
Finally, rather than merely be a pundit on this matter, I am instead today putting myself forward to try to be part of the solution. I plan to run for the OSI Board of Directors at the next elections on a single-issue platform: I will work arduously for my entire term to see the OSAID repealed, and republished not as a definition, but merely recommendations, and to also issue a statement that OSI published the definition sooner than was appropriate. I'll write further about the matter as the next OSI Board election approaches. I also call on other software rights activists to run with me on a similar platform; the OSI has myriad seats that are elected by different constituents, so there is opportunity to run as a ticket on this issue. (Please contact me privately if you'd like to be involved with this ticket at the next OSI Board election. Note, though, that election results are not actually binding, as OSI's by-laws allow the current Board to reject results of the elections.)
RHEL Panel Discussion at FOSSY 2023
by
on July 19, 2023This past weekend, July 13-16th, 2023, Software Freedom Conservancy (SFC) hosted and ran a new conference, FOSSY (Free and Open Source Software Yearly) in Portland, Oregon, USA. I was glad to host the keynote panel discussion on the recent change made by Red Hat (now a subsidiary of IBM) regarding the public source code releases for Red Hat Enterprise Linux (RHEL).
The panelists included (in alphabetical order) Jeremy Alison, software engineer at CIQ (focused on Rocky Linux) and Samba co-founder, myself, Bradley M. Kuhn, policy fellow at SFC, benny Vasquez, the Chair of the AlmaLinux OS Foundation, and James (Jim) Wright, who is Oracle’s Chief Architect for Open Source Policy, Strategy, Compliance, and Alliances.
Red Hat themselves did not reply to our repeated requests to join us on this panel, but we were able to gather the key organizations impacted by Red Hat's recent decision to cease public distribution of RHEL sources. SUSE was also invited but let us know they were unable to send someone on short notice to Portland for the panel.
We're very glad to make the video available to everyone who has been following this evolving story. FOSSY is a new event, and we've hopefully shown how running a community-led FOSS event here in Portland each summer creates an environment where these kinds of important discussions can be held to explore issues impacting FOSS users around the world.
I thank our panelists again for booking last-minute travel to be with us for this exciting panel and thank all the FOSSY attendees for their excellent questions during the panel.
I hope to see all of you at next years' FOSSY!
A Comprehensive Analysis of the GPL Issues With the Red Hat Enterprise Linux (RHEL) Business Model
by
on June 23, 2023This article was originally published primarily as a response to IBM's Red Hat's change to no longer publish complete, corresponding source (CCS) for RHEL and the prior discontinuation of CentOS Linux (which are related events, as described below). We hope that this will serve as a comprehensive document that discusses the history of Red Hat's RHEL business model, the related source code provisioning, and the GPL compliance issues with RHEL.
For approximately twenty years, Red Hat (now a fully owned subsidiary of IBM) has experimented with building a business model for operating system deployment and distribution that looks, feels, and acts like a proprietary one, but nonetheless complies with the GPL and other standard copyleft terms. Software rights activists, including SFC, have spent decades talking to Red Hat and its attorneys about how the Red Hat Enterprise Linux (RHEL) business model courts disaster and is actively unfriendly to community-oriented Free and Open Source Software (FOSS). These pleadings, discussions, and encouragements have, as far as we can tell, been heard and seriously listened to by key members of Red Hat's legal and OSPO departments, and even by key C-level executives, but they have ultimately been rejected and ignored — sometimes even with a “fine, then sue us for GPL violations” attitude. Activists have found this discussion frustrating, but kept the nature and tenure of these discussions as an “open secret” until now because we all had hoped that Red Hat's behavior would improve. Recent events show that the behavior has simply gotten worse, and is likely to get even worse.
What Exactly Is the RHEL Business Model?
The most concise and pithy way to describe RHEL's business model is: “if you exercise your rights under the GPL, your money is no good here”. Specifically, IBM's Red Hat offers copies of RHEL to its customers, and each copy comes with a support and automatic-update subscription contract. As we understand it, this contract clearly states that the terms do not intend to contradict any rights to copy, modify, redistribute and/or reinstall the software as many times and as many places as the customer likes (see §1.4). Additionally, though, the contract indicates that if the customer engages in these activities, that Red Hat reserves the right to cancel that contract and make no further contracts with the customer for support and update services. In essence, Red Hat requires their customers to choose between (a) their software freedom and rights, and (b) remaining a Red Hat customer. In some versions of these contracts that we have reviewed, Red Hat even reserves the right to “Review” a customer (effectively a BSA-style audit) to examine how many copies of RHEL are actually installed (see §10) — presumably for the purpose of Red Hat getting the information they need to decide whether to “fire” the customer.
Red Hat's lawyers clearly take the position that this business model complies with the GPL (though we aren't so sure), on grounds that that nothing in the GPL agreements requires an entity keep a business relationship with any other entity. They have further argued that such business relationships can be terminated based on any behaviors — including exercising rights guaranteed by the GPL agreements. Whether that analysis is correct is a matter of intense debate, and likely only a court case that disputed this particular issue would yield a definitive answer on whether that disagreeable behavior is permitted (or not) under the GPL agreements. Debates continue, even today, in copyleft expert circles, whether this model itself violates GPL. There is, however, no doubt that this provision is not in the spirit of the GPL agreements. The RHEL business model is unfriendly, captious, capricious, and cringe-worthy.
Furthermore, this RHEL business model remains, to our knowledge, rather unique in the software industry. IBM's Red Hat definitely deserves credit for so carefully constructing their business model such that it has spent most of the last two decades in murky territory of “probably not violating the GPL”.
Does The RHEL Business Model Violate the GPL Agreements?
Perhaps the biggest problem with a murky business model that skirts the line of GPL compliance is that violations can and do happen — since even a minor deviation from the business model clearly violates the GPL agreements. Pre-IBM Red Hat deserves a certain amount of credit, as SFC is aware of only two documented incidents of GPL violations that have occurred since 2006 regarding the RHEL business model. We've decided to share some general details of these violations for the purpose of explaining where this business model can so easily cross the line.
In the first violation, a large Fortune 500 company (which we'll call Company A), who both used RHEL internally and also built public-facing Linux-based products, decided to create a consumer-facing product (which we'll call Product P) based primarily on CentOS Linux, but P included a few packages built from RHEL sources. Company A did not seek nor ask for support or update services for this separate Product P. Red Hat later became aware that Product P contained some part of RHEL, and Red Hat demanded royalty payments for Product P. Red Hat threatened to revoke the support and update services on Company A's internal RHEL servers if such royalties were not paid.
Since Company A was powerful and had good lawyers and savvy business development staff, they did not acquiesce. Company A ultimately continued (to our knowledge) on as a RHEL customer for their internal servers and continued selling Product P without royalty payments. Nevertheless, a demand for royalties for distribution is clearly a violation as that demand creates a “further restriction” on the permissions granted by GPL. As stated in GPLv3:
You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License.
Red Hat tried to impose a further restriction in this situation, and therefore violated the GPL. The violation was resolved since no royalty was paid and Company A faced no consequences. SFC learned of the incident later, and informed Red Hat that the past royalty demand was a violation. Red Hat did not dispute nor agree that it was a violation, and did informally agree such demands would not be made in future.
In another violation incident, we learned that Red Hat, in a specific non-USA country, was requiring that any customer who lowered the number of RHEL machines under service contract with Red Hat sign an additional agreement. This additional agreement promised that the customer had deleted every copy of RHEL in their entire organization other than the copies of RHEL that were currently contracted for service with Red Hat. Again, this is a “further restriction”. The GPL agreements give everyone the unfettered right to make and keep as many copies of the software as they like, and a distributor of GPL'd software may not require a user to attest that they've deleted these legitimate, licensed copies of third-party-licensed software under the GPL. SFC informed Red Hat's legal department of this violation, and we were assured that this additional agreement would no longer be presented to any Red Hat customers in the future.
In both these situations, we at SFC were worried they were merely a “tip of the proverbial iceberg”. For years, we have heard from Red Hat customers who are truly confused. It's common in the industry to talk about RHEL “seat licenses”, and many software acquisition specialists in the industry are not aware of the nuances of the RHEL business model and do not understand their rights. We remain very concerned that RHEL salespeople purposely confuse customers to sell more “seat licenses”. It's often led us to ask: “If a GPL violation happens in the woods, and everyone involved doesn't hear it, how does anyone know that software rights have indeed been trampled upon in those woods?”. As we do for as many GPL violation reports as we can, we zealously pursue RHEL-related GPL violations that are reported to us, and if you're aware of one, please do email us at <compliance@sfconservancy.org> immediately. We fear that be it through incompetence or malice, many RHEL salespeople and business development professionals may regularly violate GPL and no one knows about it. That said, the business model as described by IBM's Red Hat may well comply with the GPL — it's just so murky that any tweak to the model in any direction seems to definitely violate, in our experience.
Furthermore, Red Hat exploits the classic “caveat emptor” approach — popular in many a shady business deal throughout history. While, technically speaking, a careful reader of the GPL and the RHEL agreements understands the bargain they're making, we suspect most small businesses just don't have the FOSS licensing acumen and knowledge to truly understand that deal.
Why Was an Independent CentOS So Important?
Until Red Hat's “aquisition” of CentOS in early 2014, CentOS provided an excellent counterbalance to the problems with the RHEL business model. Specifically, CentOS was a community-driven project, with many volunteers, supported by some involvement from small businesses, to re-create RHEL releases using the CCS releases made for RHEL. Our pre-2014 view was that CentOS was the “canary in the murky coalmine” of the RHEL business. If CentOS seemed vibrant, usable, and a viable alternative to RHEL for those who didn't want to purchase Red Hat's updates and services, the community could rest easy. Even if there were GPL violations by Red Hat on RHEL, CentOS' vibrancy assured that such violations were having only a minor negative impact on the FOSS community around RHEL's codebase.
Red Hat, however, apparently knew that this vibrant community was cutting into their profits. Starting in 2013, Red Hat engaged in a series of actions that increased their grip. First, they “acquired” CentOS. This was initially couched as a cooperation agreement, but Red Hat systematically made job offers that key CentOS volunteers couldn't refuse, acquired the small businesses who might ultimately build CentOS into a product, and otherwise integrated CentOS into Red Hat's own operations.
After IBM acquired Red Hat, the situation got worse. Having gotten rights to the CentOS brand as part of the “aquisition”, Red Hat slowly began to change what CentOS was. CentOS Linux quickly ceased to be a check-and-balance on RHEL, and just became a testing ground for RHEL. Then, in 2020, when most of us were distracted by the worst of the COVID-19 pandemic, Red Hat unilaterally terminated all CentOS Linux development. Later (during the Delta variant portion of the pandemic in late 2021) Red Hat ended CentOS Linux entirely. IBM's Red Hat then used the name “CentOS Stream” to refer to experimental source packages related to RHEL. These were (and are) not actually the RHEL source releases — rather, they appear to be primarily a testing ground for what might appear in RHEL later.
Finally, Red Hat announced two days ago that RHEL CCS will no longer be publicly available in any way. Now, to be clear, the GPL agreements did not obligate Red Hat to make its CCS publicly available to everyone. This is a common misconception about GPL's requirements. While the details of CCS provisioning vary in the different versions of the GPL agreements, the general principle is that CCS need to be provided either (a) along with the binary distributions to those who receive, or (b) to those who request pursuant to a written offer for source. In a normal situation, with no mitigating factors, the fact that a company moved from distributing CCS publicly to everyone to only giving it to customers who received the binaries already would not raise concerns.
In this situation, however, this completes what appears to be a decade-long plan by Red Hat to maximize the level of difficulty of those in the community who wish to “trust but verify” that RHEL complies with the GPL agreements. Namely, Red Hat has badly thwarted efforts by entities such as Rocky Linux and Alma Linux. These entities are de-facto the intellectual successors to CentOS Linux project that Red Hat carefully dismantled over the last decade. These organizations sought to build Linux-based distributions that mirrored RHEL releases, and it is now unclear if they can do that effectively, since Red Hat will undoubtedly capriciously refuse to sell them exactly-one RHEL service and update “seat license” at a reasonable price. It appears that, as of this week, one must have at least that to get timely access to RHEL CCS.
What Should Those Who Care About Software Rights Do About RHEL?
Due to this ongoing bad behavior by IBM's Red Hat, the situation has become increasingly complex and difficult to face. No third party can effectively monitor RHEL compliance with the GPL agreements, since customers live in fear of losing their much-needed service contracts. Red Hat's legal department has systematically refused SFC's requests in recent years to set up some form of monitoring by SFC. (For example, we asked to review the training materials and documents that RHEL salespeople are given to convince customers to buy RHEL, and Red Hat has not been willing to share these materials with us.) Nevertheless, since SFC serves as the global watchdog for GPL compliance, we welcome reports of RHEL-related violations.
We finally express our sadness that this long road has led the FOSS community to such a disappointing place. I personally remember standing with Erik Troan in a Red Hat booth at a USENIX conference in the late 1990s, and meeting Bob Young around the same time. Both expressed how much they wanted to build a company that respected, collaborated with, engaged with, and most of all treated as equals the wide spectrum of individuals, hobbyists, and small businesses that make the plurality of the FOSS community. We hope that the modern Red Hat can find their way back to this mission under IBM's control.
Trademark Was Made to Prevent Attack of the “Clones” Problem in App Stores
by
on July 11, 2022Suppose you go to your weekly MyTown market. The market runs Saturday and Sunday, and vendors set up booths to sell locally made products and locally grown and produced food. On Saturday, you buy some delicious almond milk from a local vendor — called Al's Awesome Almond Milk. You realize that Al's Awesome would make an excellent frozen dessert, so you make your new frozen dessert, which you name Betty's Best Almond Frozen Dessert. You get a booth for Sunday for yourself, and you sell some, but not as much as you'd like.
The next week, you realize you might sell more if you call it Al's Awesome Almond Frozen Dessert instead of your own name. Folks at the market know Al, but not you. So you change the name. Is this a morally and legally acceptable thing to do?
This is a question primarily regarding trademarks. We spend a lot of time in the Free and Open Source Software (FOSS) community talking about copyrights and patents, but another common area of legal issues that face FOSS projects (in addition to copyright and patent) is trademark.
In fact, FOSS projects probably don't spend enough time thinking about their trademark. Nearly ten years ago, Pam Chestek — a lawyer and expert in trademark law as it relates to FOSS and board member of OSI — gave an excellent talk at FOSDEM (2013), wherein she explored how FOSS projects can use trademarks better and to ensure rights of consumers — particularly when dealing with bad actors. Our own Executive Director, Karen Sandler, had also spoken about this issue as well. These older talks, in turn, spawned an ongoing conversation that continues to this day in FOSS policy circles.
Specifically, last week, we learned that the Microsoft Store was changing their policies, ostensibly to deal with folks (probably some of whom are unscrupulous) rebuilding binaries for well-known FOSS projects and uploading them to the Microsoft Store. Yet, this is a longstanding issue in FOSS policy. FOSS experts in this area would have been happy to share what's been learned over the last ten years of studying this issue.
The problem Microsoft faces here is the same problem that the MyTown market folks face if you show up trying to sell Al's Awesome Almond Frozen Dessert. The store/market can set rules that you will no longer be able to sell if you are found to infringe the trademark of another seller. The market could simply require the trademark holder to take trademark action themselves, or it could offer some form of assistance, arbitration, or other-extra-legal resolution mechanism0.
There is often temptation in FOSS to give special status to maintainers, or the original developer, or the copyright holder, or some other entity that is considered “official”. In FOSS, though, the only mechanism of officialness is the trademark — the name of the upstream project (or the fork). The entire point of FOSS is that for the code itself, everyone should have equal rights to the original developers, to the maintainers, or to any other entity.
We have faced this with our member project, Inkscape. While the Inkscape Project Leadership Committee has chosen not to charge for the version of Inkscape that they upload on Microsoft Store, we did see this very problem for many years before these app stores even existed. Namely, it was common for third-parties to sell Windows binaries on CD's for Inkscape in an effort to make a quick buck. We did trademark enforcement in these cases — not forbidding these vendors from selling — but simply requiring the vendors to clearly say that the product was a modified version of Inkscape. Or, if it was unmodified redistribution of Inkscape's own binaries, we required the vendor to note that the Inkscape project's website was the official source for these binaries.
I have often written to complain about copyright and patent law. I have my complaints about trademark law (and I've seen trademark grossly abused, even), but trademark laws tenets are really reasonable and solid: to ensure consumers know the source and quality of the products they receive.
The problem of concern here is one well handled by trademark. It doesn't need excessive app store rules; we don't need FOSS licenses to be usurped or superseded by Draconian policy. And, this solution to this particular problem has been long-known by FOSS. Pam's talk in 2013 explained it quite well!
The MyTown Market doesn't need to create a policy that forbids you from buying Al's Awesome Almond Milk on Saturday and reselling a product based on it on Sunday. They just need to let Al know his rights under trademark, and maybe offer a lightweight provisional suspension of your booth if the trademark complaint seems primia facie valid. But, most importantly, before it announces new rules with a 30 day clock, MyTown's leadership really should discuss it with the citizens first to find a policy that takes into account concerns of the people. Even if they fail to do that, there are MyTown's elected officials whose actions are accountable to the people. App store companies are accountable only to their shareholders, not the authors of the apps. Companies could benefit by learning that the FOSS community prioritizes respecting authors, protecting consumers' and users' rights, and by understanding that the line between user and contributor should blur. The FOSS marketplace functions because the community works.
Footnotes
0 I hesitate to even suggest that an app store should create an extra-legal process regarding trademark enforcement beyond the typical governmental mechanisms — lest they decide they have to do it. A major problem with app stores is that they create rules for software distribution that are capricious, and arbitrary. We all do want FOSS available on Microsoft, Apple, and Google-based platforms — and as such are forced to negotiate (or, rather, try to negotiate) for FOSS-friendly terms. Ultimately, though, the story of major vendor-controlled app stores is always the story of “just barely” being able to put FOSS on them, because the goal of these entities is to profit themselves, not serve the community. We prefer app stores like F-Droid that are community-organized and are not run for-profit.