The practice of reproducible research: case studies and lessons from the data-intensive sciences 9780520967779, 9780520294745, 9780520294752, 0520294742, 0520294750

"The Practice of Reproducible Research presents concrete examples of how researchers in the data-intensive sciences


English | Pages: xxv, 337 pages : illustrations ; 24 cm | Year: 2018




Table of contents :
Part I. Practicing reproducibility. Assessing reproducibility / Ariel Rokem, Ben Marwick, and Valentina Staneva
The basic reproducible workflow template / Justin Kitzes
Case studies in reproducible research / Daniel Turek and Fatma Deniz
Lessons learned / Kathryn Huff
Building toward a future where reproducible, open science is the norm / Karthik Ram and Ben Marwick --
Part II. High-level case studies. Case Study 1. Processing of airborne laser altimetry data using cloud-based Python and relational database tools / Anthony Arendt, Christian Kienholz, Christopher Larsen, Justin Rich, and Evan Burgess
Case Study 2. The trade-off between reproducibility and privacy in the use of social media data to study political behavior / Pablo Barberá
Case Study 3. A reproducible R Notebook using Docker / Carl Boettiger
Case Study 4. Estimating the effect of soldier deaths on the military labor supply / Garret Christensen
Case Study 5. Turning simulations of Quantum Many-Body Systems into a Provenance-Rich publication / Jan Gukelberger and Matthias Troyer
Case Study 6. Validating statistical methods to detect data fabrication / Chris Hartgerink
Case Study 7. Feature extraction and data wrangling for predictive models of the brain in Python / Chris Holdgraf
Case Study 8. Using observational data and numerical modeling to make scientific discoveries in climate science / David Holland and Denise Holland
Case Study 9. Analyzing bat distributions in a human-dominated landscape with autonomous acoustic detectors and machine learning models / Justin Kitzes
Case Study 10. An analysis of household location choice in major US metropolitan areas using R / Andy Krause and Hossein Estiri
Case Study 11. Analyzing cosponsorship data to detect networking patterns in Peruvian legislators / José Manuel Magallanes
Case Study 12. Using R and related tools for reproducible research in archaeology / Ben Marwick
Case Study 13. Achieving full replication of our own published CFD results, with four different codes / Olivier Mesnard and Lorena A. Barba
Case Study 14. Reproducible applied statistics: is tagging of therapist-patient interactions reliable? / K. Jarrod Millman, Kellie Ottoboni, Naomi A.P. Stark, and Philip B. Stark
Case Study 15. A dissection of computational methods used in a biogeographic study / K.A.S. Mislan
Case Study 16. A statistical analysis of salt and mortality at the level of nations / Kellie Ottoboni
Case Study 17. Reproducible workflows for understanding large-scale ecological effects of climate change / Karthik Ram
Case Study 18. Reproducibility in human neuroimaging research: a practical example from the analysis of diffusion MRI / Ariel Rokem
Case Study 19. Reproducible computational science on high-performance computers: a view from neutron transport / Rachel Slaybaugh
Case Study 20. Detection and classification of cervical cells / Daniela Ushizima
Case Study 21. Enabling astronomy image processing with cloud computing using Apache Spark / Zhao Zhang --
Part III. Low-level case studies. Case Study 22. Software for analyzing supernova light curve data for cosmology / Kyle Barbary
Case Study 23. pyMooney: generating a database of two-tone Mooney images / Fatma Deniz
Case Study 24. Problem-specific analysis of molecular dynamics trajectories for biomolecules / Konrad Hinsen
Case Study 25. Developing an open, modular simulation framework for nuclear fuel cycle analysis / Kathryn Huff
Case Study 26. Producing a journal article on probabilistic tsunami hazard assessment / Randall J. LeVeque
Case Study 27. A reproducible neuroimaging workflow using the automated build tool "Make" / Tara Madhyastha, Natalie Koh, and Mary K. Askren
Case Study 28. Generation of uniform data products for AmeriFlux and FLUXNET / Gilberto Pastorello
Case Study 29. Developing a reproducible workflow for large-scale phenotyping / Russell Poldrack
Case Study 30. Developing and testing stochastic filtering methods for tracking objects in videos / Valentina Staneva
Case Study 31. Developing, testing, and deploying efficient MCMC algorithms for hierarchical models using R / Daniel Turek.
