Data Management in R: A Guide for Social Scientists

Elff, Martin. 2020. Data Management in R: A Guide for Social Scientists. London: SAGE Publications.


This page provides material to accompany my recent book Data Management in R: A Guide for Social Scientists, which is published by SAGE Publications. The material is organised into separate pages, each corresponding to a chapter of the book.

The material consists, firstly, of R-scripts and, where possible, R-data files that allow you to run the code shown in the book. Unfortunately, not all data sets used in the examples in the book can be made available here, because of restrictions on redistribution. In some instances, redistribution is prohibited by the data providers, and downloaders have to confirm that they agree to this. In other cases, downloading the data requires registration with the data providers, which indicates that they do not agree to redistribution of the data by third parties. In these cases, the supporting material indicates how to obtain the data sets from the relevant data providers.

The R-scripts on the following pages are accompanied by Jupyter notebooks (see https://jupyter.org), which contain the input and the output produced by running the scripts. Each notebook is rendered as a page of this website which, where applicable, also explains how to obtain the data used in the script and the notebook. Each rendered notebook also contains a link to a dedicated container at https://dataman-r.elff.eu/, where the code in the notebook can be run interactively. The link is marked by the JupyterHub icon.

It is also possible to run RStudio on the notebook server, using the link https://dataman-r.elff.eu/user-redirect/rstudio. This allows you to run the R-scripts in a virtual computing environment. To do this successfully, the directory that contains the R-script must be made the working directory. To make this easy, each directory (introduction, etc.) contains an .Rproj file, which can be clicked on in the Files pane of RStudio.
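If a script is run outside an RStudio project, for example from a plain R console, the working directory can also be set by hand. The following is only a minimal sketch: the helper name use_chapter_dir and its root argument are hypothetical, and it assumes the chapter directories (such as introduction) sit directly below root.

```r
# Hypothetical helper (not part of the book's material): make a chapter
# directory the working directory and return the previous one.
use_chapter_dir <- function(chapter, root = ".") {
  target <- file.path(root, chapter)
  if (!dir.exists(target)) {
    stop("Directory not found: ", target)
  }
  # setwd() returns the previous working directory invisibly
  old <- setwd(target)
  invisible(old)
}
```

A call such as old <- use_chapter_dir("introduction") switches to the chapter directory, and setwd(old) switches back afterwards.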

To access the notebook container, you have to log in via ORCID. See https://info.orcid.org/what-is-orcid/ for an overview of what ORCID is.

The sources of the R-scripts and notebooks, as well as the R-markdown files, are available in the GitHub repository https://github.com/melff/dataman-r. A ZIP-file with the contents of the repository can be downloaded from https://github.com/melff/dataman-r/archive/main.zip.
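The ZIP-file can also be fetched and unpacked from within R. The sketch below uses download.file() and unzip() from the utils package. The function name fetch_dataman and the default target directory are assumptions made for illustration; only the URL is taken from this page.

```r
# Hypothetical helper: download the repository archive and unpack it.
fetch_dataman <- function(url = "https://github.com/melff/dataman-r/archive/main.zip",
                          exdir = "dataman-r") {
  zip_file <- tempfile(fileext = ".zip")
  on.exit(unlink(zip_file))  # remove the temporary archive afterwards
  # mode = "wb" keeps the binary archive intact on Windows
  status <- utils::download.file(url, destfile = zip_file, mode = "wb")
  if (status != 0) {
    stop("Download failed with status ", status)
  }
  # unzip() returns the paths of the extracted files (invisibly)
  utils::unzip(zip_file, exdir = exdir)
}
```

Calling fetch_dataman() requires network access. GitHub archives usually unpack into a single top-level directory named after the repository and branch, so the chapter directories are likely to be found one level below exdir.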