The Parable of the Code Review

Last week’s events, with Linus Torvalds pledging to stop behaving like an asshole, instituting a code of conduct in Linux kernel development, and all but running off to join a monastery, have made a lot of waves. The last bastion of meritocracy has fallen! Linus, the man with five middle fingers on each hand, was going to save free software from ruin by tellin’ it like it is to all those writers of bad patches. Now he has gone over to the Dark Side, etc., etc.

There is one thing that struck me when reading the arguments last week, that I never realized before (as I guess I tend to avoid reading this type of material): the folks who argue against, are convinced that the inevitable end result of respectful behaviour is a weakening of technical skill in free software. I’ve read from many sources last week the “meritocracy or bust” argument that meritocracy means three things: the acceptance of patches on no other grounds than technical excellence, the promotion of no other than technically excellent people to maintainer positions within projects, and finally the freedom to disrespect people who are not technically excellent. As I understand these people’s arguments, the meritocracy system works, so removing any of these three pillars is therefore bound to produce worse results than meritocracy. Some go so far as to say that treating people respectfully, would mean taking technically excellent maintainers and replacing them with less proficient people chosen for how nice1 they are.

I never considered the motivations that way; maybe I didn’t give much thought to why on earth someone would argue in favour of behaving like an asshole. But it reminded me of a culture shift that happened a number of years ago, and that’s what this post is about.

Back in the bad old days…

It used to be that we didn’t have any code review in the free software world.

Well, of course we have always had code review; you would post patches to something like Bugzilla or a mailing list and the maintainer would review them and commit them, ask for a revision, or reject them (or, if the maintainer was Linus Torvalds, reject them and tell you to kill yourself.)

But maintainers just wrote patches and committed them, and didn’t have to review them! They were maintainers because we trusted them absolutely to write bug-free code, right?2 Sure, it may be that maintainers committed patches with mistakes sometimes, but those could hardly have been avoided. If you made avoidable mistakes in your patches, you didn’t get to be a maintainer, or if you did somehow get to be a maintainer then you were a bad one and you would probably run your project into the ground.

Somewhere along the line we got this idea that every patch should be reviewed, even if it was written by a maintainer. The reason is not because we want to enable maintainers who make mistakes all the time! Rather, because we recognize that even the most excellent maintainers do make mistakes, it’s just part of being human. And even if your patch doesn’t have a mistake, another pair of eyes can sometimes help you take it to the next level of elegance.

Some people complained: it’s bureaucratic! it’s for Agile weenies! really excellent developers will not tolerate it and will leave! etc. Some even still believe this. But even our tools have evolved over time to expect code review — you could argue that the foundational premise of the GitHub UI is code review! — and the perspective has shifted in our community so that code review is now a best practice, and what do you know, our code has gotten better, not worse. Maintainers who can’t handle having their code reviewed by others are rare these days.

By the way, it may not seem like such a big deal now that it’s been around for a while, but code review can be really threatening if you aren’t used to it. It’s not easy to watch your work be critiqued, and it brings out a fight-or-flight response in the best of us, until it becomes part of our routine. Even Albert Einstein famously wrote scornfully to a journal editor after a reviewer had pointed out a mistake in his paper, that he had sent the manuscript for publication, not for review.

And now imagine a future where we could say…

It used to be that we treated each other like crap in the free software world.

Well, of course we didn’t always treat each other like crap; you would submit patches and sometimes they would be gratefully accepted, but other times Linus Torvalds would tell you to kill yourself.

But maintainers did it all in the name of technical excellence! They were maintainers because we trusted them absolutely to be objective, right? Sure, it may be that patches by people who didn’t fit the “programmer” stereotype were flamed more often, and it may be that people got sick of the disrespect and left free software entirely, but the maintainers were purely objectively looking at technical excellence. If you weren’t purely objective, you didn’t get to be a maintainer, or if you somehow did get to be a maintainer then you were a bad one and you would probably run your project into the ground.

Somewhere along the line we got this idea that contributors should be treated with respect and not driven away from projects, even if the maintainer didn’t agree with their patches. The reason is not because we want to force maintainers to be less objective about technical excellence! Rather, because we recognize that even the most objective maintainers do suffer from biases, it’s just part of being human. And even if someone’s patch is objectively bad, treating them nonetheless with respect can help ensure they will stick around, contribute their perspectives which may be different from yours, and rise to a maintainer’s level of competence in the future.

Some people complained: it’s dishonest! it’s for politically correct weenies! really excellent developers will not tolerate it and will leave! etc. Some even still believe this. But the perspective has shifted in our community so that respect is now a best practice, and what do you know, our code (and our communities) have gotten better, not worse. Maintainers who can’t handle treating people respectfully are rare these days.

By the way, it may not seem like such a big deal now that it’s been around for a while, but confronting and acknowledging your own biases can be really threatening if you aren’t used to it… I think by now you get the idea.

Conclusion, and a note for the choir

I generally try not to preach to the choir anymore, and leave that instead to others. So if you are in the choir, you are not the audience for this post. I’m hoping, possibly vainly, that this actually might convince someone to think differently about meritocracy, and consider this a bug report.

But here’s a small note for us in the choir: I believe we are not doing ourselves any favours by framing respectful behaviour as the opposite of meritocracy, and I think that’s part of why the pro-disrespect camp have such a strong reaction against it. I understand why the jargon developed that way: those driven away by the current, flawed, implementation of meritocracy are understandably sick of hearing about how meritocracy works so well, and the term itself has become a bit poisoned.

If anything, we are simply trying to fix a bug in meritocracy3, so that we get an environment where we really do get the code written by the most technically excellent people, including those who in the current system get driven away by abusive language and behaviour.


[1] To be clear, I strive to be both nice and technically excellent, and the number of times I’ve been forced to make a tradeoff between those two things is literally zero. But that’s really the whole point of this essay

[2] A remnant of these bad old days of absolute trust in maintainers, that still persists in GNOME to this day, is that committer privileges are for the whole GNOME project. I can literally commit anything I like, to any repository in gitlab.gnome.org/GNOME, even repositories that I have no idea what they do, or are written in a programming language that I don’t know!

[3] A point made eloquently by Matthew Garrett several years ago

More Memory, More Problems

In GJS we recently committed a patch that has been making waves. Thanks to GJS contributor Georges Basile “Feaneron” Stavracas Neto, some infamous memory problems with GNOME Shell 3.28 have been mitigated. (What’s the link between GNOME Shell and GJS? GNOME Shell uses GJS as its internal Javascript engine, in which some of the UI and all of the extensions are implemented.)

There is a technical explanation, having to do with toggle-refs, a GObject concept which we use to interface the JS engine’s garbage collector with GObject’s reference counting system. Georges has already provided a fantastic introduction to the technical details so I will not do another one here. This post will be more about social issues, future plans, and answers to some myths I’ve seen in various comments recently. To read this post, you only need to know that the problem has to do with toggle-refs and that toggle-refs are difficult to reason about.

Not a Memory Leak

I really don’t want to call this a memory leak, much less “the GNOME memory leak” that’s become common in the press coverage lately. I find that that sets the wrong expectations for users suffering from these memory problems. You might say that for the end user it makes no difference, their computer’s memory is being occupied by GNOME Shell, so what’s the point in not calling it a memory leak? And you would be partially right. The effect is no different. The expectations are different though, especially for users who have some technical knowledge. A memory leak is a simple problem to fix. When you have one, you run your software under Valgrind or ASAN, you get a backtrace that shows where the memory was allocated that you didn’t free, and you free it. Problem solved. You can even run Valgrind in your automatic tests to prevent new leaks. That’s not the case here, and if we refer to it as a memory leak then it can only cause frustration on the part of users who are aware how simple it is to fix a memory leak.

This problem is different. It’s not a leak in the traditional sense. The memory does eventually get freed, but GNOME Shell holds onto it for too long; long enough to cause problems on some systems. As GJS contributor Andy Holmes put it, it’s a “tardy GC sweep.” I think that has a catchy ring to it, so I’ll call it the “tardy sweep problem” from now on.

Scruffy the janitor from Futurama, relaxing in a chair, captioned

Meme by Andy Holmes, used with kind permission

To be honest, I found that the OMG!Ubuntu article about “the memory leak” attracted a lot of comments that don’t sit well with me, and I think that the wrong expectations set by calling it a “memory leak” are partly to blame. With this post, I hope to give a better idea of what GNOME users can expect.

On the bright side, due to the recent publicity and especially the OMG!Ubuntu article, more GNOME developers are talking about the memory problems and suggesting things, which is causing an exciting confluence of ideas that I couldn’t have come up with on my own.

Edit: I want to be absolutely clear that with the above I’m not blaming bug reporters for not knowing whether something is a “proper” memory leak or not. This is intended to bring some attention to the wrong expectations that arise, especially among technically savvy users, when GNOME developers and the tech press use the term “memory leak,” and illustrate why we ourselves should not use the term here.

 

The Big Hammer

Hammer smashing a peanut

CC0 licensed image, by stevepb

I’ve been calling this patch the “Big Hammer” because it’s a drastic measure: starting a whole new garbage collection cycle in order to clean up some objects that we already know should be cleaned up.

The tardy sweep problem has now been mitigated with the Big Hammer, but reducing GNOME Shell’s memory usage has been a battle for years, and it has very little to do with memory leaks.

There are many other causes of high memory usage in GNOME Shell. Some are real memory leaks, that generally get fixed before too long. GNOME Shell developers have had their suspicions about NVidia drivers for years. Another cause is JS memory leaks in GNOME Shell. (Contrary to popular belief, it is possible to leak memory in pure Javascript code. Andy’s new heapgraph tool is useful when tracking these down, but throughout most of the life of GNOME Shell this tool didn’t exist.) There’s also memory fragmentation, which can look like a memory leak in a resource monitor.1 In addition, when diagnosing reports from users, configurations vary wildly. Memory usage simply differs from system to system. Finally, people have different configurations of Shell extensions, some of which leak memory as well.

The Sad Lifecycle of a GNOME Shell Memory Leak Bug Report

  1. User reports “I have a memory leak”
  2. Developer runs Valgrind, sifts through Valgrind trace, finds a small leak and fixes it
  3. Problem isn’t fixed
  4. Repeat 2 and 3 until no leaks shown in the Valgrind trace
  5. Problem still isn’t fixed
  6. User eagerly awaits each point release hoping for relief, and is disappointed each time
  7. Bug report gains popularity, accretes followers like a katamari, some of whom vent about unrelated bugs, hound the developer, or become abusive, until the original point of the bug report is lost
  8. Developer can’t do anything productive with the bug report at this point. They know there’s still a memory problem and it’s not a traditional leak, but the bug report is not helping them find it
  9. Users don’t accept that answer
  10. Developer closes the bug report and makes users even angrier. (Or, developer ignores the bug report until it fizzles out, and makes users sad.)

Here are some examples of long-running bug reports where you can see this dynamic in action. It’s quite sad to observe, because everybody involved is doing what makes perfect sense from their perspective (except for a few people behaving badly), yet the result is a mess.

I hope this illustrates why it’s important to assume that people are acting in good faith.

I know some people will argue that the developer mustn’t close the bug report until the bug is “fixed”, meaning that there is no more unnecessary memory usage. But in my opinion that’s just not a useful way to think of bug reports. GNOME Shell developers know (and so do I, from the GJS side) that GNOME Shell uses a lot of memory. I agree it’s nice to keep a bug report open so that users know that we’re aware of it and it’s on our to-do lists somewhere, but very soon the time we spend dealing with the noise on the bug report eclipses whatever benefit it might bring to the community.

I hope GitLab will improve things a bit here, since if you feel strongly about an issue in the bugtracker, you can upvote it or add an emoji reaction to it. This is a good way for users to show that an issue is important to them, and if enough people use it then it’s a good indicator for me to see which issues are prioritized highest by users and contributors.

5 Myths About GNOME Shell’s Memory Problems (Paraphrased)

“GNOME developers never cared about the tardy sweep problem until OMG!Ubuntu reported on it. They don’t care about users until someone makes them.”

Carlos Garnacho, of GNOME Shell fame, pointed out to me that in a more global perspective there has been a very active hunt for actual memory leaks across many of GNOME Shell’s dependencies for quite some time, and he has personally patched leaks in IBus, AccountsService, libgweather, gnome-desktop, and more.

It seems the tardy sweep problem has gotten worse in recent versions of GNOME, although it’s hard to measure between different systems with different configurations. I don’t know why it’s gotten worse.2

It seems to have been known for a long time, though: for history buffs, it was alluded to in a comment in commit ae34ec49, back in 2011. That knowledge was apparently lost when GJS was without a maintainer for a couple of years. To be honest, it has taken me well over a year to get familiar enough with the toggle-ref code (which integrates the JS garbage collector with GObject’s refcounting system), that I feel even remotely comfortable making or reviewing changes to it. I only fully realized the implications of the tardy sweep problem after talking to Georges and seeing his memory graph.

It seems that a previous GJS maintainer, Giovanni Campagna, was trying to mitigate the tardy sweep problem already five years ago, with a patch that allowed objects to escape the tardy sweep by opting out of the whole toggle-ref system in some cases. Unfortunately, as far as I can tell from my bug tracker archaeology, his patch went through a few reviews and the answer was always “Wow, this is really complicated, I need to study it some more in order to understand it.” Then it fell by the wayside when he stepped down from GJS maintainership.

I picked the patch up again late last year and fixed up most of the bit-rot. It still had a few problems with it. I never found the time to fix it up completely, which Georges kindly took over for me. I initially preferred Giovanni’s patch above the Big Hammer, but unfortunately for me, Georges proved that it wasn’t as effective as we thought it would be, only clearing up about 5% of the tardy sweep memory.3

“GNOME fixed the problem with a shoddy solution. This will decrease performance but they won’t notice because they all have top-of-the-line machines.”

I don’t call it the Big Hammer for nothing. We were concerned about performance regressions too, so that’s reasonable. However, as you can read in the bug tracker, we did actually do some testing on lower-end hardware before merging the Big Hammer, and it was not as bad as I expected. Carlos has been doing some measurements and found that garbage collection accounts for about 2–3% of the time that GNOME Shell occupies the CPU.

However, it’s exactly because we want to be cautious that the Big Hammer has only been committed to master, which will first be released in the unstable GNOME 3.29.2 snapshot. I don’t plan to release it on a stable branch until we’ve run it some more.

Ubuntu has already put the Big Hammer in their LTS version. That’s more of a risk than I would have recommended, but it’s not my decision to make, and I am grateful that we will be getting some testing through that avenue. Endless is also considering putting the Big Hammer in their stable version.

(And alas, I don’t have a top-of-the-line machine. Feel free to donate me one if that’s what’s required to make me conform to some stereotype of GNOME developers. 😇)

“The problem should be fixed now. GNOME Shell will run smooth from now on.”

No, it’s not. GNOME Shell still isn’t that great with memory.

Carlos is working on some merge requests which are approaching being ready to merge, which should make things a bit more memory-efficient. He’s also had some success with experiments trying to reduce memory fragmentation, taking better advantage of SpiderMonkey’s compacting garbage collector.

We are also bouncing around some ideas for making the Big Hammer into a smaller hammer. In particular, we’re trying to see if the extra garbage collections can be restricted to only the JS objects that represent GObjects, since those are the only objects that are affected by the tardy sweep problem. We’re also trying to see if there’s a way to return black-marked (reachable) objects to their original white-marked (eligible for collection) state when a GObject is toggled down in the middle of a garbage collection.

Another approach to investigate is to make better use of incremental garbage collection. SpiderMonkey offers this facility but we don’t use it yet. The idea is, instead of pausing and doing a big garbage collection, we do a slice of a few milliseconds whenever we have time. I don’t know yet whether this will have a large or small effect, or even render the Big Hammer unnecessary.

We’re also going to update to SpiderMonkey 60 in GNOME 3.30 which will hopefully bring in another year’s worth of Mozilla’s garbage collector research and optimization.

Finally, I’m gradually working on another unfinished merge request left over from Giovanni’s tenure as GJS maintainer, that should drastically increase the performance of GNOME Shell’s animations (though not necessarily help with memory.)

“GNOME has no business releasing any new versions until this problem is fixed.”

GNOME has a fixed release schedule, so they release new versions on the release dates, with whatever is aboard the train at that time. That’s not going to change.

“This version of GNOME is going into Ubuntu LTS! GNOME needs to work harder to fix this.”

Of course, I want whatever version of GNOME ships with any Linux distribution to be as good as possible. But as the upstream GJS maintainer, I have no say over what a downstream Linux distribution chooses to ship. The best way for a Linux distro to make sure their release is shipshape, is to contribute resources towards fixing whatever they consider a blocker.

That sounds a bit callous, as if I refuse to fix any bugs that Ubuntu wants fixed; that’s not what I mean at all. But my free time is limited. I’m paid for a part of my GJS maintainer work, but only for specific features. I can’t work to anyone’s external deadlines in my free time, because otherwise I’ll burn out and that’s not good for anyone with any interest in GJS either. Sometimes I have other priorities besides sitting at the computer; sometimes I do have time but no ideas about a particular problem; sometimes my brain isn’t up to fixing a difficult memory problem and I choose to work on something easier.4 Bugfixing work isn’t fungible.

I picked Ubuntu to illustrate this example, because contributing is exactly what the Ubuntu team has done; Ubuntu contributors fixed stability bugs in GJS, as well as GNOME Shell and Mutter, for GNOME 3.28. To say nothing of contributors from other downstreams, as well. That’s great and I’m looking forward to more of it! Some commenters seem to see downstreams fixing bugs as something that GNOME developers should be ashamed of, but I believe everyone is better off for it when that happens!

Acknowledgements

Thanks to Carlos Garnacho and Andy Holmes, who commented on a draft version of this blog post. Thanks in addition to Andy who coined the term “tardy sweep” and provided Scruffy as the mascot; Heartbleed has branding, why shouldn’t we? And of course, thanks to Georges who kicked off the whole research in the first place!


[1] and has often made people angry in bug reports when told it’s not a memory leak

[2] I have a hunch, though. When I updated SpiderMonkey to version 38 in GNOME 3.24, we went from a conservative collector to an exact-rooted, moving one — see this Wikipedia article for definitions of those terms. It may be that the old garbage collector, though generally considered inferior, did actually mitigate the tardy sweeps a little, because I think back then it would have been possible for more objects to make their way into an ongoing sweep. It’s also possible that it was made worse earlier than that, by some adjustments in GNOME Shell that adjusted how often the garbage collector was called.

[3] Technical explanation: Tweener, which is the animation framework used by GNOME Shell, renders many objects ineligible to opt out of the toggle-ref system. I would like to see Tweener replaced with Clutter implicit animations in GNOME Shell, which would make Giovanni’s patch much more effective, but that’s a big project.

[4] Like writing a blog post about a difficult memory problem. Joke’s on me, that’s actually really hard

Rain in the Desert, Part II

Recently I wrote about my experience with microblogging at a physics conference. I was gratified to find out that people actually read and enjoyed it, and it might even have had an effect on next year’s conference. Roy Meijer was kind enough to send me some tips on how to use social media at scientific conferences. I’ve said what I wanted to say about the experience, but I want to discuss my further thoughts about two inappropriately sexist messages that showed up on the big Twitter screens at the conference.

A conference-goer who made one of the sexist comments wrote a comment on this blog, anonymously, objecting to me calling him a socially retarded asshole in my essay. Let it be absolutely clear that I don’t take kindly to people who create an unwelcoming or unpleasant environment for women in physics. But it made me think anyway: he seemed genuinely convinced that women enjoy this kind of attention, and indeed, one shouldn’t ascribe others’ objectionable behavior to malice when it could be just cluelessness. So maybe I should just have said socially retarded.

But no matter whether it’s malice or cluelessness, I know that physicists can do better. Not just can, but have to. Sexist attitudes just don’t belong in the world today, and most other professions have gotten with the program. But physicists apparently still live in 1972. Here are some examples of what women in physics have to put up with:

  • A few years ago a professor at our Institute gave a really horrifying speech at the Christmas party, in which he thanked the secretaries for being our mothers who took care of us, and the technicians for being our fathers who brought us toys to play with. As if that piece of gender stereotyping weren’t enough of a train wreck by itself, he had actually started out his speech by telling an off-color story that involved mistresses, then said he’d heard that story from a Jewish colleague and that it was typical Jewish humor. (Although this post is about sexism, not racial stereotypes, and I can only be outraged about one thing at a time if I want to keep my message on track.)
  • Another professor at our Institute told grad students that to be a successful physicist, you have to have a supportive family, so his wife stayed home and took care of the children. Yes, there were female grad students in the audience. Perhaps there were gay grad students too.
  • At a conference I was at, a professor put a slide of a bikini model into his talk. He had photoshopped her head to be the professor organizing the conference (who is a woman).
  • If you go to the poster session at a physics conference, you’ll always see a crowd of nerds clustered around a few posters. At the center of each cluster is a female physicist presenting her poster. If I put myself in her place, I figure the attention to one’s research is gratifying — as a male physicist, I always have to work really hard to get anyone to even stop and take a look at my poster — but on the other hand, as a male physicist, I never have to worry about whether the attention comes from sincere interest in one’s research, or ulterior motives.

[Man writes incorrect equation on chalkboard] ONLOOKER: Wow, you suck at math. [Woman writes incorrect equation on chalkboard] ONLOOKER: Wow, girls suck at math.

Dear readers, I respect you, really I do. I know everybody on the whole internet links to this XKCD cartoon and you've seen it fifty thousand times already. But there's just no possible way to illustrate sexism in science more accurately and simply than Randall Munroe does.

So the question is what in particular is wrong with these comments I objected to. The Geek Feminism Wiki’s page on technical conferences has a list of problems with which women are often confronted. My experience is that FOM does a good job at making sure that most of these problems don’t occur at the Veldhoven conference. (Although my personal experience doesn’t really count, does it, since I’m lucky enough not to have to face these challenges.)

However, our conference-goer who made the inappropriate comment on the public screen, in my opinion, has made the mistake of falling into the “You should be flattered” trap: the misguided belief that since he was actually making a compliment, it should be OK. I quote (and slightly paraphrase) from that link on why this belief is wrong:

  • “It attempts to dictate women’s emotional responses to such comments, in particular perpetrating the idea that women are socially obliged to be pleasant and accommodating;
  • “It places the blame on women for responding negatively to attention which is wholly inappropriate in [a professional context];
  • “It reminds women that they are subject to men’s approval […];
  • “It reminds women who aren’t the object of the comment that they are also subject to men’s approval;
  • “It ignores the fact that many women have had negative experiences with sexual attention, such as immediate or eventual criticism or violence, and therefore do not view it with unmixed (or any) pleasure;
  • “It makes non-straight women feel particularly marginalised;
  • “Focusing on women’s appearance contributes to feelings of exceptionalism and conveys the judgment that a woman’s [physics] expertise is less valuable than her attractiveness.”

In my mind, there’s still an unsolved question. Does the conference organizer have a responsibility towards the conference-goers to prevent this stuff from happening or mitigate it when it does? On the one hand, you have to assume that people will comport themselves decently in public, and you can’t prevent every possible turd that people might drop in the punchbowl. On the other hand, allowing someone to create a poisonous environment for a minority lasts much longer than the conference does.

Addendum 1: I debated with myself whether to name-and-shame the professors in the examples above. They certainly deserve it, but I’m not sure it would serve any purpose other than spite. The time for denouncing the first and third incidents was right after they happened, and I will always regret that I said nothing in both cases. I wasn’t present at the second incident myself; I only heard about it from people who were there.

Addendum 2: Please realize that I’m not trying to flog a dead horse by chastising someone for an inappropriate comment at a conference that has been over for two months now. I am writing this because I think that the problem is an important one and I think that we have a responsibility to educate our colleagues so that physics can move out of the social Dark Ages, and women will actually feel welcome in our field, and nobody will argue about affirmative action policies, because we won’t need them.