Data Management in R: A Guide for Social Scientists

Elff, Martin. 2021. Data Management in R: A Guide for Social Scientists. London: Sage.

This page provides material to accompany my recent book Data Management in R: A Guide for Social Scientists, which is being published by Sage Publications. The material is organised into separate pages, each corresponding to a chapter of the book.

The material consists, firstly, of R-scripts and, where possible, R-data files that allow you to run the code shown in the book. Unfortunately, not all data sets used in the examples in the book can be made available here, because of restrictions on redistribution. In some instances redistribution is prohibited by the data providers, and downloaders have to confirm that they agree to this. In other cases, downloading the data requires registration with the data providers, which indicates that they do not agree to redistribution of the data by third parties. In these cases, the supporting material indicates how to obtain the data sets from the relevant data providers. The R-scripts are all available in a dedicated GitHub repository, namely https://github.com/melff/dataman-r

Most of the R-scripts on the following pages are accompanied by Jupyter notebooks (see https://jupyter.org), except where running the script would take so long that an interactive notebook would not be practical. The sources of these notebooks, i.e. files with the filename suffix .ipynb, are available in the same GitHub repository, https://github.com/melff/dataman-r. A ZIP-file with the contents of the repository can be downloaded from https://github.com/melff/dataman-r/archive/main.zip
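
For readers who prefer to fetch the ZIP-file from within R, the following is a minimal sketch. The helper name fetch_dataman and the target directory "dataman-r" are assumptions made for illustration and are not part of the book's material; only the URL is taken from the text above.

```r
# URL of the repository archive, as given in the text above
zip_url <- "https://github.com/melff/dataman-r/archive/main.zip"

# Hypothetical helper (not part of the book's material):
# download the archive to a temporary file and unpack it into `dest_dir`.
fetch_dataman <- function(url = zip_url, dest_dir = "dataman-r") {
  zip_file <- tempfile(fileext = ".zip")
  on.exit(unlink(zip_file))                 # remove the temporary ZIP afterwards
  download.file(url, destfile = zip_file, mode = "wb")  # "wb" keeps the binary file intact
  unzip(zip_file, exdir = dest_dir)
  invisible(dest_dir)
}

# Usage (requires an internet connection):
# fetch_dataman()
# list.files("dataman-r", recursive = TRUE)
```

The function is only defined here and not called, so nothing is downloaded until fetch_dataman() is run explicitly.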

For each of these Jupyter notebooks, you will find:

  1. R input and output within a rendered Jupyter notebook as part of a page of this website
  2. a download link with the appropriate icon
  3. a link to a rendering of the notebook at https://nbviewer.jupyter.org, indicated by the nbviewer badge
  4. a link that opens an interactive notebook instance hosted by https://mybinder.org, indicated by the mybinder badge

    https://mybinder.org is a free service provided by a federation of organisations and companies. Please do not overuse the resources of this service, as this would be unethical and in the long run could lead to its suspension.

It is also possible to run RStudio on the mybinder server, using the link https://mybinder.org/v2/gh/melff/dataman-r/main?urlpath=rstudio. This allows you to run the R-scripts in a virtual computing environment. To do this successfully, the directory that contains the R-script must be made the working directory. To make this easy, each directory (introduction etc.) contains an .Rproj file which can be clicked on in the files pane of RStudio.
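
Outside RStudio, the same working-directory requirement can be met from the R console. The sketch below uses the chdir argument of source(), which temporarily switches the working directory to the script's folder while the script runs. The directory "demo-chapter" and the files example.R and example.csv are placeholders created only for this illustration; they are not files from the repository.

```r
# Create a throwaway directory with a small script and a data file
# (placeholders, not files from the dataman-r repository)
demo_dir <- file.path(tempdir(), "demo-chapter")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("x,y\n1,2\n3,4", file.path(demo_dir, "example.csv"))

# The script refers to its data file by a relative path,
# so it only works when run from its own directory
writeLines('dat <- read.csv("example.csv")', file.path(demo_dir, "example.R"))

# chdir = TRUE makes source() change into the script's directory
# for the duration of the run and change back afterwards
old_wd <- getwd()
source(file.path(demo_dir, "example.R"), chdir = TRUE)

# The data were read successfully and the working directory is unchanged
print(dat)
print(identical(getwd(), old_wd))
```

An alternative is to call setwd() with the script's directory before running it, which is in effect what opening the .Rproj file does in RStudio.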