I’ve been worrying about the reproducibility of research results for quite a while now (since the late 90s [2, 3, 5]). As the digital and network technology we developed in the 80s and 90s has been taken up, scientific and technical research has become digital, computational, and digitally published. These technical advances are super useful, but they raise many issues for evaluating research results. They also revolutionize the notions of “publication” and “reproducing” research.
So we worry about not just the data and software, but also the computational steps involved. New technologies (e.g., cloud computing, blockchain) may or may not help track the complexities of data and computation underlying specific results.
But all the technology in the world can’t solve the problem. To make results “reproducible” requires authors to do a bunch of work to maintain and publish adequate descriptions of the technical underpinnings of their claims. And it requires publishers to publish and archive not just papers, but digital data and metadata. (I’ll note that universities need to expand their mission to require data and metadata be deposited as part of thesis and other academic projects.)
We’ve been pushing these requirements since the early days of the World Wide Web [2, 3, 5]. So I’m glad to see more and more publishers and professional societies moving to finally deal with these issues.
I should note that the astronomy community has long led in this field. For decades now, all major astronomy publications have required that the relevant datasets be deposited in open archives at the time that a paper is published. (Sensei Ray Plante pioneered some of the early efforts.) Well done.
The IEEE Computer Society is catching up.
The report sketches the interested parties and proposes some steps for the professional organization, which is an important publisher. Ironically, even in this savvy and well-funded professional field, the field that created the digital and network technologies in question, 60% of the publications do not have any processes in place to support reproducibility.
In the end, the main proposals are to (1) enable and require submission of data and code along with manuscripts to be published, and (2) link the archived code and data with the published paper. Just like astronomers have been doing for twenty years.
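To make "link the archived code and data with the published paper" a bit more concrete, here is a minimal sketch of one way it could be done: a small manifest that ties a paper's DOI to checksums of the deposited files. This is my own illustration, not anything the report specifies. The function names, the manifest fields, and the placeholder DOI are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paper_doi: str, artifact_dir: Path) -> dict:
    """Describe every file under artifact_dir and tie it to paper_doi.

    The field names ("paper_doi", "files", ...) are assumptions made
    for this sketch, not a published metadata standard.
    """
    files = [
        {
            "path": p.relative_to(artifact_dir).as_posix(),
            "sha256": sha256_of(p),
            "bytes": p.stat().st_size,
        }
        for p in sorted(artifact_dir.rglob("*"))
        if p.is_file()
    ]
    return {"paper_doi": paper_doi, "files": files}


if __name__ == "__main__":
    # "10.0000/example" is a placeholder DOI, not a real paper.
    manifest = build_manifest("10.0000/example", Path("."))
    print(json.dumps(manifest, indent=2))
```

The checksums are the useful part: a reader who downloads the archive years later can check that the files are the same ones the paper was built on. A real archive would presumably record much more (licenses, software versions, the computational steps), but even this much is more than most papers carry today.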
So I say, “yes, please”. This was the right idea when we wrote about it last century, and it’s long overdue now.
- Joanna Goodrich, Study Shows Ensuring Reproducibility in Research Is Needed, in IEEE Spectrum – News, September 30, 2021. https://spectrum.ieee.org/study-shows-ensuring-reproducibility-in-research-is-needed
- Robert E. McGrath, Joe Futrelle, Ray Plante, and Damien Guillaume. Digital Library Technology for Locating and Accessing Scientific Data. In ACM Digital Libraries ’99, 1999, 188-194. http://dx.doi.org/10.1145/313238.313305
- James D. Myers, Alan R. Chappell, Matthew Elder, Al Geist, and Jens Schwidder, Re-Integrating The Research Record. Computing in Science and Engineering, 5 (3):44-50, May/June 2003. http://ieeexplore.ieee.org/document/1196306/
- National Academies of Sciences, Engineering, and Medicine, Reproducibility and Replicability in Science. The National Academies Press, Washington, DC, 2019. https://doi.org/10.17226/25303
- Raymond L. Plante, The NCSA Astronomy Digital Image Library: The Challenges of the Scientific Data Library. D-Lib Magazine, October 1997. http://www.dlib.org/dlib/october97/adil/10plante.html