Friday, November 28, 2008

Sage Patch Review

I just spent a lot of time during the last few days refereeing Sage patches on the Sage trac server. Basically, we got behind and had nearly 100 patches in there marked as "[with patch; needs review]", which means that the patch is done and is just waiting on review. Some of the patches in this state hadn't been commented on since August! In many cases they had "bit rotted", meaning they were broken, since related parts of Sage had changed after the patches were posted.

In June 2008 at Sage Days 8.5 we had a long meeting and set up a patch referee editor system, but it turned out to be a total failure. Our system was set up to be like a journal editor system, except with the addition of a weekly meeting. Even in person, our one and only meeting was inefficient, and the system never worked over irc or email.

Review is a Generic Problem


Robert Bradshaw recently went to a Google Summer of Code Mentors summit, and he told me that the problem discussed most at the summit, the number one problem open source projects have, is patch review: in particular, patches not getting reviewed in a timely fashion.

Review is also a big problem with mathematics journals, though with journals the turnaround time is expected to be months. Or even years -- I have one paper that's been in the review process at Mathematics of Computation for 3 years now, and I've almost never gotten a first referee report back from that journal in less than a year from submission time! But for some reason math journals tend to work. That said, I had some discussions with a prominent publisher about a journal that had fallen apart because the chief editor had seriously dropped the ball, and said publisher had to spend a massive amount of his own time fixing the situation.

Patch Review and Paper Review



Reviewing a math paper is a much different experience from reviewing a software patch. When you review a math paper, you basically get a potentially very interesting paper at the cutting edge of research, one you would be crazy not to want to read anyway (or one you can easily decline to recommend for publication). When you review the paper, you excitedly read it, and if things don't make sense, or the style is bad, or whatever, you get to ask the author to explain clearly any point that troubles you, and the author definitely will. It's a lot of work, but it is mathematical research.

Reviewing a patch is different from reviewing a math paper. First, very often a patch either fixes a subtle bug in Sage or introduces some new functionality. If the patch fixes a subtle bug, then of course the fix itself could easily introduce a new subtle bug, or not really fix the original bug correctly, and the referee has to figure this out. If the patch introduces new functionality, it could easily contain bugs, often ones the author of the patch hasn't thought to test for. Often when refereeing a patch, I bounce it back with a use case that exhibits a bug. Some of the most common and dangerous patches are the ones that speed up existing code. These are often slight changes in a few places, which seem safe enough and make some code a bit faster. Last time I looked at one of these, I decided to loop over the first 100 obvious test cases; around the 90th input the new code crashed (the old code worked fine on that example). I would never have noticed that problem had I just read the patch and tried the tests included with it; it was critical for me to creatively think of new tests.
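To make that concrete, here is a minimal sketch (in Python, with hypothetical function names) of the kind of sweep I mean: run the patched code over a range of easy inputs and compare it against the old behavior, rather than trusting only the doctests shipped with the patch.

    # Hypothetical sketch: compare a patched function against the old one
    # on many simple inputs, not just the doctests included with the patch.
    def sweep_test(old_func, new_func, inputs):
        """Return the inputs where the patched code crashes or disagrees
        with the old code on cases the old code handled fine."""
        failures = []
        for x in inputs:
            try:
                expected = old_func(x)
            except Exception:
                continue  # the old code never handled this input, so skip it
            try:
                result = new_func(x)
            except Exception as e:
                failures.append((x, "crash: %s" % e))
                continue
            if result != expected:
                failures.append((x, "expected %r, got %r" % (expected, result)))
        return failures

    # For example, the first 100 obvious inputs to an integer-valued function:
    #   sweep_test(old_factor, new_factor, range(1, 101))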

Of course, when refereeing a math paper one also looks for subtle flaws and tests theorems for consistency, but it still has a different feel from refereeing a patch.
And with refereeing patches there is also a time window, which is itself a huge problem. Patches stop working because the rest of the Sage system evolves past them; math papers don't just stop working because somebody out there proves more theorems.


What Should We Do?


As I said above, I believe that patch review is a major problem for the Sage project. I think the best solution is to have one person at any given time who acts as the "patch review manager". This person could vary from month to month. The position would be the analogue of the release manager, or more directly of the chief editor at a traditional journal. Every few days, this person would look at the list of all patches that need review and go through every single one, either pinging people about them or refereeing them if possible. This person must be bold enough to understand code anywhere in Sage, and have a good sense of who can look at what. If somebody did that, I believe our patch review problem would be solved.
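For what it's worth, the triage step itself is simple enough to script. Here is a minimal sketch (hypothetical; how the ticket list is pulled out of trac is left abstract) of sorting the needs-review queue so the tickets that have been idle longest get pinged first:

    # Hypothetical sketch: each ticket is a dict with 'id', 'summary', and
    # 'last_activity' (a datetime); fetching them from trac is not shown.
    from datetime import datetime, timedelta

    def stale_needs_review(tickets, idle_days=14, now=None):
        """Return the needs-review tickets idle longer than the threshold, oldest first."""
        now = now or datetime.now()
        cutoff = now - timedelta(days=idle_days)
        stale = [t for t in tickets if t["last_activity"] < cutoff]
        return sorted(stale, key=lambda t: t["last_activity"])

    # for t in stale_needs_review(tickets):
    #     print("#%s %s (idle since %s)" % (t["id"], t["summary"], t["last_activity"].date()))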

If we don't do this, the Sage project will operate at reduced efficiency, and new and experienced developers alike will get needlessly frustrated. (NOTE: We currently do have Michael Abshoff, who does pretty much what I describe above, but he has other things he should be doing, since he is the release manager.) The Sage project will have a harder time reaching its goal of being a viable alternative to Magma, Maple, Mathematica, and Matlab. We have to work smarter and do a better job, for the sake of all the people out there currently forced to use Magma, Maple, Mathematica, or Matlab instead of an open source, peer reviewed, scientifically viable system like Sage aims to be.