Displaying posts by Brett Smith
Parsing GitHub’s data on queer participation in FOSS communities
by Brett Smith on June 6, 2017
Earlier this year, GitHub conducted a broad survey of “those who use, build, and maintain open source software.” They just released the results, and for those of us who care deeply about the inclusiveness of FOSS communities, it’s a lot of sobering reading. There’s still a dearth of women participating. The results also put numbers to incidents of bad behavior, and to the impacts those have on our communities.
There’s potentially one bright spot in the demographic data, though—and you get the sense the authors were happy to find it, too, since they call it out themselves. They note:
Along other dimensions [than gender], representation is stronger: 1% of respondents identify as transgender (including 9% of women…), and 7% identify as lesbian, gay, bisexual, asexual, or another minority sexual orientation.
As far as I know, this is the first attempt to broadly quantify queer participation in the FOSS community, and I’m really grateful GitHub made it. As discussions about diversity in our communities have come to the fore, I’ve been frustrated that it’s been hard to include queer identities in them, because we didn’t have basic information like whether or not we’re even underrepresented in the first place. GitHub’s results start to help us answer those questions.
I say start to help us answer them, because no one survey will ever answer them authoritatively. Before people run out to declare we’re succeeding at building queer-inclusive communities, I want to contextualize these results a little to help people better understand what they do and don’t tell us.
One limit in this data is in the audience surveyed. GitHub “collected responses from 5,500 randomly sampled respondents sourced from over 3,800 open source repositories on GitHub.com, and over 500 responses from a non-random sample of communities that work on other platforms.” This skews the audience towards relatively technical participants in FOSS communities in a couple of different ways. First, surveying people who are active on GitHub or comparable development platforms means we’re only surveying people who work directly with the tools of developing software. This survey doesn’t collect responses from people who draw and post UI mockups, e-mail in suggested revisions to documentation, or answer other users’ questions on external forums. More than anything, I would love to see a similar survey conducted with a more expansive view of who participates in FOSS.
Second, including more respondents from GitHub biases the audience toward people who are working on newer or more modernized projects. Think of all the FOSS projects that predate GitHub and still host their development elsewhere: GNU, GNOME, Firefox, LibreOffice… when you think about the applications individuals use every day, these are a lot of the most popular ones. Including 500 responses from other communities helps mitigate that, but it’s not clear that’s a representative ratio, and the fact that they were chosen non-randomly is less than ideal too (although I recognize it’s not obvious how to incorporate those responses in a way that would be both random and fair).
Another reason these results aren’t authoritative is that sexual orientation and gender identity are complex. A single question about each on a survey will never be sufficient to accurately capture the community’s full diversity. Most survey results are sensitive to how their questions are worded, and this is famously true for these sorts of questions about identity. Wikipedia’s article “Demographics of sexual orientation” provides a good primer on these issues if you want to learn more. Briefly, it matters a lot whether you ask whether the respondent identifies themselves a certain way, versus whether others would identify them that way, or whether they’ve engaged in activities that could be classified that way. Words like “gay” are also categories that were invented in the west, so people from other countries and cultures may not recognize or identify themselves with them. I think GitHub’s survey asks the two most useful yes/no questions you can ask to inform discussions about queer participation in FOSS, but there’s lots of room for other surveys to dig deeper on these topics.
None of this is to say the survey is flawed or should’ve been done differently. There are many trade-offs involved in designing a survey like this, and I think the trade-offs GitHub made are both clear and justifiable. The best way to understand where we truly stand is not to try to craft a single perfect survey, but to have many surveys with different structures. Then we can learn the most by comparing and contrasting their results. I hope more surveys follow GitHub’s lead to ask about sexual orientation and gender identity, whether they’re small projects surveying their users, large cross-community surveys like this, or anywhere in between.
All that said, the numbers in these results seem to be on the high end when compared against similar surveys of large general populations. I think the authors are right to call them out as a bright spot, and I’m personally encouraged by them too.
Let’s be optimistic for a moment and assume these results mean that queers are at least proportionally represented in FOSS communities. Does that mean we’re queer-friendly?
Not necessarily. Just like a workplace can have both gender balance and rampant sexism—in wage gaps, in promotion choices, in who does and doesn’t get heard at meetings—our communities can have both proportional queer participation and hostility toward us. Identity policing, bi erasure, transphobia—we see these problems in spaces that are explicitly or even exclusively queer. Of course they can arise in FOSS communities too.
While we work on getting more numbers, we should also be working to defend against these problems. There are a couple of concrete things we can do.
First, we should be working to adopt strong codes of conduct in more of our communities. Any code of conduct worth its salt, like Geek Feminism’s, already prohibits harassment based on gender identity and sexual orientation. We should be joining this work, both out of self-interest and to help our allies who have been looking out for us in turn.
(An aside to my fellow queer men: this goes extra for us, because this is one of those times when we can wield our male privilege as a force for good. Since it’s mostly been women leading the charge for codes of conduct so far, it’s easy for opponents to try to minimize this work as women “just” advocating for themselves. Tell your community you want a code of conduct, tell them you want it for your own wellbeing, and shut down that train of thought before it even leaves the station.)
Second, us queers need to be out more in our communities, to build personal networks that can identify, discuss, and resolve these issues when they arise. This is easier said than done. Most of our interactions in the community happen on channels focused on getting work done: planning development, putting together documentation, reviewing changes. There’s rarely a good time to say “hey, I’m queer” in these spaces. It’s easy for it not to come up until the annual conference after-party.
We have to be more out than that, for the sake of new or occasional participants. When queers are considering getting involved in a project, seeing people like them already invested can help demonstrate to them that this is a place where they’re welcome. If they’re harassed, they’re more likely to report it if it’s easy to find someone they feel will understand and be receptive.
We can’t wait to come out until the big meetup. We have to be out on the mailing lists, in the chat rooms, on social media. I’m not saying you have to make a dedicated coming out thread, but try putting a sign in your avatars, e-mail signatures, or personal bios. (Personally I love to paste the rainbow flag emoji 🏳️‍🌈 anywhere I can get away with it, but I know that symbol doesn’t work for everyone. Here’s to more representation in future Unicode standards.) Some people may ask you not to bring “politics” or “sexuality” into the community, but being out is more fundamental than that: it’s about making sure queer people can be in the space at all. If a straight person complains that you bring up your queerness too much, that means you’re undeniably out, and that’s the goal.
LibreHealth’s Michael Downey on Why He’s a Conservancy Supporter
by Brett Smith on January 2, 2017
Michael Downey is one of the developers at the helm of our newest member project, LibreHealth. He was eager for the project to join Conservancy because, as he put it, the organization is “a really important player taking on responsibilities that are often neglected in our projects.” Join Michael as a Conservancy Supporter now to help us continue to provide these services to more projects.
Debian’s Luke Faraone on Why He’s a Conservancy Supporter
by Brett Smith on December 28, 2016
Luke Faraone is a Debian developer involved in our Debian Copyright Aggregation Project. He’s also a Conservancy Supporter because, in his words, Conservancy is “one of the best defenders of the ideals of free software.” Join Luke as a Conservancy Supporter today to help sustain that important work through 2017.
Report from the 2016 Reproducible Builds Summit
by Brett Smith on December 26, 2016
A couple of weeks ago I was at the Reproducible Builds Summit in Berlin. Over sixty representatives from all kinds of projects came together for three days to share information and ideas, plan solutions, and even squeeze in a little time to hack. It was my first real opportunity to dive into this work. I learned a ton, even enough to chip in a little, and I’m looking forward to working more on reproducible builds from here on out.
When we talk about reproducible builds, what we mean is a build process that produces the exact same binary every time you run it with the exact same inputs (like source code versions and compiler settings). If you’re interested in the details, check out the definition on the Reproducible Builds site—a bunch of folks hammered that out during the Summit.
You might think most build processes would be reproducible most of the time, but often the binaries include small inputs that are hard to reproduce, such as timestamps or build paths. Much of the work toward reproducible builds so far has focused on improving the inputs: removing inputs that aren’t really necessary to the final product, and better recording the ones that are. Once that’s done, most build processes are as reproducible as you’d expect. There’s still more to do there, but there’s enough of a foundation that we can start seeing some benefits from reproducible builds. Many of the discussions at the Summit were about planning those next steps.
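To make the timestamp problem concrete, here is a minimal sketch in Python. It is only an illustration, not a tool discussed at the Summit: the "build" is just gzip compression, and the `build` and `digest` helpers are hypothetical names made up for this example. The gzip header stores a modification time, so the same source "built" at two different moments yields different bytes, while pinning that time makes the output repeatable. Pinning the timestamp is the same idea behind the Reproducible Builds project's `SOURCE_DATE_EPOCH` convention, which this sketch does not implement.

```python
import gzip
import hashlib


def build(source: bytes, build_time: int) -> bytes:
    """Toy 'build': compress the source, storing build_time in the gzip header."""
    return gzip.compress(source, mtime=build_time)


def digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


source = b"print('hello, world')\n"

# Same source, but the embedded timestamp differs by one minute:
# the artifacts are not byte-for-byte identical.
first = build(source, build_time=1482700000)
second = build(source, build_time=1482700060)
print(digest(first) == digest(second))  # False

# Pin the timestamp to an agreed value and the build becomes repeatable.
pinned = 1482700000
third = build(source, build_time=pinned)
fourth = build(source, build_time=pinned)
print(digest(third) == digest(fourth))  # True
```

Both artifacts decompress to the same source either way. The only difference is an input that was never necessary to the final product, which is exactly the kind of input this work tries to remove or record.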
Conservancy is really excited to support reproducible builds. Having a clear and trusted link from source code to binary helps the community in many different ways:
- The most obvious is security. When builds are reproducible, everyone can check for themselves that binaries they download actually come from the expected source code. We can demonstrate that unwanted code isn’t being added to distributors’ binaries, either accidentally or maliciously.
- A reproducible build is a documented build. When everyone can see exactly what inputs and build steps generated a binary, everyone can review and comment on that build process. It becomes easier to find binaries with “bad” inputs (like a version of a library with a critical bug) and plan an upgrade process for them.
- Reproducible builds can make license compliance easier for binary distributors. When a free software license requires distributors to provide source code, sometimes it can take a little work for them to figure out exactly what the right source code is. For example, if they have three versions of a development library installed on their build system, how do they know for sure which one went into the binary and should be included in the source code release? Reproducible builds record the answer unambiguously, in a format that can make it simple to put all the source code together.
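The security check in the first point above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how any particular distribution does it: `sha256_of` and `matches_rebuild` are hypothetical helper names, and the "downloaded" and "rebuilt" files are stand-ins created in a temporary directory. The idea is simply that, once a build is reproducible, anyone can rebuild from source and compare cryptographic digests with the binary they downloaded.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large binaries need not fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_rebuild(downloaded: Path, rebuilt: Path) -> bool:
    """True if the downloaded binary is byte-identical to a local rebuild."""
    return sha256_of(downloaded) == sha256_of(rebuilt)


# Stand-in files for illustration only; a real check would compare a
# distributor's binary with one you built yourself from the same inputs.
with tempfile.TemporaryDirectory() as workdir:
    downloaded = Path(workdir) / "downloaded.bin"
    rebuilt = Path(workdir) / "rebuilt.bin"
    tampered = Path(workdir) / "tampered.bin"

    downloaded.write_bytes(b"\x7fELF example payload")
    rebuilt.write_bytes(b"\x7fELF example payload")
    tampered.write_bytes(b"\x7fELF example payload plus unwanted code")

    print(matches_rebuild(downloaded, rebuilt))   # True
    print(matches_rebuild(tampered, rebuilt))     # False
```

A mismatch does not say what changed, only that something did. Tracking down the difference is a separate job.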
We’ll reap the most benefits if there’s support at every level of the stack. Debian kickstarted the reproducible builds effort, and at the Summit there was a lot of great discussion about reaching out to other communities. Right now the focus is on other package distributors, so it was great to see representatives from Fedora, openSUSE, F-Droid, and Nix there. But our discussions also recognized the need for outreach to other projects that can play a role in this work, like build tools and other software that generates binaries that get shipped to users (such as filesystems or bytecode compilers). If you’re involved in a project like that, I encourage you to join us on the general mailing list for reproducible builds and introduce yourself. The more people working on this, the merrier!
Many thanks to all the Summit organizers for planning and running a productive working space. I’m already looking forward to the next reproducible builds meeting.