Repeatability in Computer Systems Research

Christian Collberg and Todd A. Proebsting published an interesting piece in Communications of the ACM, “Repeatability in Computer Systems Research” [1].

They are concerned with computing research that reports new ideas and techniques, supposedly demonstrated by the results of software created as part of the research. Can other investigators confirm the results, not to mention extend or improve them? Without access to the source code, it is impossible even to repeat the results, let alone confirm that they are valid.

The problem is, of course, that the source code underlying published papers is not routinely available.

Their paper recounts their investigation of whether source code really can be obtained and, if so, whether the code is usable (i.e., whether it builds).

The reader will not be surprised that in most cases the source is difficult to find and often does not compile easily, even with help from the authors. Probing the situation, they document a number of barriers that will scarcely surprise anyone doing research.

Sometimes the code in question simply isn’t designed to be handed out. Given that researchers, especially academic researchers, are not funded to do software development, it is unrealistic to expect them to produce usable code, even if they want to, and even if they know how to do so. Funding is eleven tenths of the law in the academy.

Sometimes the code is proprietary or depends on proprietary code, and so cannot be made public. Other times the code simply depends on an obscure and possibly poorly understood software configuration. Software is extremely complex, and the boundaries of “the system” may resemble a fractal, with dependencies upon dependencies, to the point that it is hard to know where the edge lies. (E.g., code might depend on a library that runs only on Linux. Which versions of the library and Linux might it work with? Who knows?)
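
To make the “which versions?” question concrete, here is a minimal sketch of one partial mitigation: recording the exact environment a result was produced in. This is my own illustration, not anything from Collberg and Proebsting; it assumes a pip-managed Python 3.8+ installation, and the output filename is an arbitrary choice.

```python
# A minimal sketch (my illustration, not from the paper): snapshot the
# environment a result was computed in, so that "which versions of the
# library and the OS?" at least has a recorded answer.
# Assumes a pip-managed Python 3.8+ installation; "environment.json"
# is an arbitrary filename.
import json
import platform
import sys
from importlib import metadata

snapshot = {
    "python": sys.version,
    "os": platform.platform(),  # exact OS/kernel string, e.g. a Linux release
    # Every installed distribution, pinned to its exact version.
    "packages": {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions()
    },
}

# Archive this file alongside the results it describes.
with open("environment.json", "w") as f:
    json.dump(snapshot, f, indent=2, sort_keys=True)
```

A manifest like this doesn’t solve the fractal-boundary problem, of course; it only documents one layer of the dependency stack at the moment the results were produced.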

Furthermore, software ages very fast. Whatever was running when the paper was written has surely evolved, as has the development team. The time to obsolescence can be months—which is shorter than the review cycle for many conferences.

It’s a mess.

Collberg and Proebsting are pragmatic. They would like to see authors at least state whether code is available and expected to work. They would also like to shift some funding into reproducibility engineering. I wish them luck on that one.

This is a topic I’ve been aware of and concerned about for quite a while. I have personally seen pretty much all the cases they discuss: licensing issues (this is a minefield), lack of resources (and skills) to produce usable software, turnover in personnel and technology, and just plain lack of motivation. I have even been in the position where I, the author, could no longer replicate the software I had written about a few months earlier.

Software is hard to do, software research is hard to do, and researchers are generally not in a position to publish code.

This is an issue not just for computer science: vast amounts of research have a significant software component, even outside science and engineering. Business, agriculture, social science, and even humanities studies may now include significant data manipulation and simulations. If professional software researchers have difficulty making their software replicable, how will “English majors” fare?

I’ll add one more note: code is scarcely the only “irreproducible” result coming out of computers these days. Various kinds of big-data and crowd-sourced research are based on one-time samples, or on the undocumented contributions of millions of individuals or individual signals.

For example, in a study such as Galloway et al. [2], which used data selected “[w]ith the help of over 80,000 volunteers providing over 16 million classifications of over 300,000 galaxies”, it isn’t possible to describe very precisely how the results were achieved, let alone to replicate them independently.

Software is enabling new and exciting research, but it is certainly “disrupting” conventional notions of reproducibility.

  1. Collberg, Christian and Todd A. Proebsting, Repeatability in computer systems research. Commun. ACM, 59 (3):62-69, 2016.
  2. Galloway, Melanie A., Kyle W. Willett, Lucy F. Fortson, Carolin N. Cardamone, Kevin Schawinski, Edmond Cheung, Chris J. Lintott, Karen L. Masters, Thomas Melvin, and Brooke D. Simmons, Galaxy Zoo: the effect of bar-driven fuelling on the presence of an active galactic nucleus in disc galaxies. Monthly Notices of the Royal Astronomical Society, 448 (4):3442-3454, April 21, 2015. http://mnras.oxfordjournals.org/content/448/4/3442.abstract http://arxiv.org/abs/1502.01033