Danny Smith is a Senior Data Scientist at the Social Research Centre. Danny has worked as a survey programmer, analyst and data scientist for 10 years.
His main interest and expertise is in research systems architecture: building systems and associated tools that support the automation of data workflows and processes. He is an avid R user and supporter of free and open source software.
Thinking like a programmer: open approaches to quantitative research
Open source software began as a licensing model born from the free software movement. As social science researchers, our exposure to the open source world is often just the tools developed under open source licenses. In an increasingly computational research world, however, there is much to be learnt from the tools and approaches that support this development.
This talk introduces the open source paradigm and discusses its importance and relevance to quantitative research. We discuss open source software development as a model for successful decentralised open collaboration, and how the tools, collaboration frameworks and approach behind it can be, and are being, leveraged to advance quantitative research tools and methods.
Survey research datasets and R
Although R began as a specialist statistical programming language, the R ecosystem has grown rapidly over the past few years, making it a viable general-purpose research environment across the whole research lifecycle.
Survey research datasets come from a diverse range of sources and often carry richer metadata than your average data frame. This workshop provides a practical demonstration of several packages for accessing and working with survey data, associated metadata and official statistics in R.
We will demonstrate:
Working with data files from common statistical packages and spreadsheets (SPSS, SAS, Stata, Excel), and their quirks
Easily working with categorical data in R with the “labelled” R package
Accessing external databases in an R native way using DBI and dbplyr
Accessing publicly available data in R scripts via the web
Resources for accessing official statistics data in R
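As a rough illustration of the categorical data item above, the sketch below attaches a variable label and value labels to a coded survey column with the "labelled" package, then converts it to a factor. The data frame, the column names and the response codes are hypothetical and are not taken from the workshop materials.

```r
library(labelled)

# Hypothetical survey responses: numeric codes as they might arrive
# from an SPSS or Stata export.
survey <- data.frame(id = 1:5)

# Build a labelled vector: the codes, their value labels, and a variable label.
survey$satisfaction <- labelled(
  c(1, 3, 2, 3, 9),
  labels = c(Low = 1, Medium = 2, High = 3, Refused = 9),
  label = "Overall satisfaction"
)

# Inspect the metadata that now travels with the column.
var_label(survey$satisfaction)
val_labels(survey$satisfaction)

# Convert the coded values to a factor whose levels are the value labels.
survey$satisfaction_f <- to_factor(survey$satisfaction)
table(survey$satisfaction_f)
```

Keeping the numeric codes alongside their labels means the original coding survives until the point where a factor is actually needed for tabulation or modelling.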
Participants should have a basic working knowledge of R to follow along with examples, but beginners are also welcome.
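As a minimal sketch of the DBI and dbplyr item in the list above, the example below queries a database table with ordinary dplyr verbs. It assumes the RSQLite package and uses an in-memory SQLite database as a stand-in for a real external database. The table name, columns and values are hypothetical.

```r
library(DBI)
library(dplyr)  # dbplyr is used behind the scenes and must also be installed

# Assumption: an in-memory SQLite database stands in for an external database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical respondent table, written to the database for illustration.
dbWriteTable(con, "respondents", data.frame(
  id = 1:6,
  state = c("VIC", "NSW", "VIC", "QLD", "NSW", "VIC"),
  age = c(34, 51, 27, 45, 62, 39)
))

# tbl() creates a lazy reference to the table; the dplyr verbs below are
# translated to SQL and are not executed until the result is collected.
by_state <- tbl(con, "respondents") %>%
  group_by(state) %>%
  summarise(n = n(), mean_age = mean(age, na.rm = TRUE)) %>%
  arrange(state)

# Inspect the SQL that dbplyr generated.
show_query(by_state)

# Run the query in the database and bring the summary into R.
result <- collect(by_state)

dbDisconnect(con)
```

Because the aggregation runs inside the database, only the small summary table is transferred into R, which matters when the source table is large.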