Reproducible survey research
Research is defined as reproducible when its results can be replicated using the documented data, code, and methods implemented by the author, without the need for any additional information. The main benefit is transparency: fellow researchers can check and validate findings, and can build on them in further research.
Reproducible research is all the buzz in academic circles, but how does it apply to survey research? It is great in theory, but how feasible is it when dealing with the often complicated, diverse datasets typically found in the social sciences? This presentation promotes the use of open source software and tools as a means of achieving reproducibility, scientific rigour and research hygiene. We will explore some useful open source tools available to researchers in the social sciences, namely Jupyter Notebooks and R Markdown.
These tools are powerful and extensible. Whilst they require some programming skill, they offer many advantages for processing and dissemination. We will explore these advantages by showcasing a number of good examples from the field. Research should not simply be about publishing findings, but also about publishing the underlying methods. As such, we will also cover a number of online hosting services available for researchers to freely publish their work.
This ecosystem of tools ties data collection and data processing much more closely to the analysis and reporting of results, making it easier to package up the entire research workflow for an external audience.