Appending data frames

First we load some example data from the British Election Study 2010. The data set bes2010feelings-for-append.RData is prepared from the original available at https://www.britishelectionstudy.com/data-object/2010-bes-cross-section/ by removing identifying information and scrambling the data.


load("bes2010feelings-for-append.RData")

We now have two BES data frames, one from the pre-election wave and another from the post-election wave. They contain the same variables, but in a different order:


str(bes2010flngs_pre)
'data.frame':   1935 obs. of  14 variables:
 $ flng.brown  : num  6 3 8 4 5 5 5 4 7 4 ...
 $ flng.cameron: num  3 7 7 4 5 0 3 6 2 2 ...
 $ flng.clegg  : num  3 5 4 3 5 4 2 7 4 8 ...
 $ flng.salmond: num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.jones  : num  5 3 10 7 5 1 7 1 6 4 ...
 $ flng.labour : num  5 1 3 6 8 5 6 2 8 3 ...
 $ flng.cons   : num  6 6 4 6 4 1 3 3 3 3 ...
 $ flng.libdem : num  4 7 5 5 5 4 0 5 4 9 ...
 $ flng.snp    : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.pcym   : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.green  : num  7 6 5 5 4 4 1 5 5 5 ...
 $ flng.ukip   : num  3 0 0 3 NA 0 NA 2 3 1 ...
 $ flng.bnp    : num  0 0 0 2 2 0 0 0 0 0 ...
 $ region      : Factor w/ 3 levels "England","Scotland",..: 1 NA 1 1 NA 1 1 1 1 1 ...

str(bes2010flngs_post)
'data.frame':   3075 obs. of  14 variables:
 $ flng.jones  : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.labour : num  5 2 9 7 0 2 6 5 7 2 ...
 $ flng.ukip   : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.libdem : num  4 5 4 4 6 NA 4 4 7 7 ...
 $ flng.brown  : num  5 2 5 7 0 2 3 2 5 2 ...
 $ flng.bnp    : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.snp    : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.salmond: num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.pcym   : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.cons   : num  5 5 3 10 10 3 3 8 7 7 ...
 $ flng.cameron: num  5 6 5 3 8 10 7 8 8 7 ...
 $ flng.green  : num  NA NA NA NA NA NA NA NA NA NA ...
 $ flng.clegg  : num  NA 4 3 NA 6 3 5 4 7 6 ...
 $ region      : Factor w/ 3 levels "England","Scotland",..: 1 1 1 1 1 1 1 1 1 1 ...
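Whether two data frames hold the same variables, regardless of order, can also be checked without reading through the str() output. The following is a minimal sketch on two small made-up data frames; `pre` and `post` are hypothetical stand-ins, not the BES data:

```r
# Toy stand-ins for the two waves (hypothetical data, not the BES files)
pre  <- data.frame(a = 1:2, b = 3:4,
                   region = factor(c("England", "Wales")))
post <- data.frame(b = 5:6,
                   region = factor(c("Scotland", "England")),
                   a = 7:8)

# Same set of variable names, ignoring their order?
same_vars <- setequal(names(pre), names(post))

# Same names in the same positions?
same_order <- identical(names(pre), names(post))

print(same_vars)
print(same_order)
```

Applied to the real data, the same two calls would take names(bes2010flngs_pre) and names(bes2010flngs_post) as arguments.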

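If the sets of variables in the two data frames differ, rbind() fails. Here, dropping the first column of each data frame removes flng.brown from the pre-election wave but flng.jones from the post-election wave, so the remaining variable names no longer match: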


bes2010flngs_prepost <- rbind(bes2010flngs_pre[-1],
                              bes2010flngs_post[-1])
Error in match.names(clabs, names(xi)): names do not match previous names
Traceback:

1. rbind(bes2010flngs_pre[-1], bes2010flngs_post[-1])
2. rbind(deparse.level, ...)
3. match.names(clabs, names(xi))
4. stop("names do not match previous names")
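The failure can be reproduced on a small scale, and setdiff() shows which variables cause it. This is only a sketch with made-up data frames (`pre` and `post` are hypothetical); tryCatch() is used so that the error message is captured instead of stopping the script:

```r
# Toy data frames with the same variables in a different order
pre  <- data.frame(a = 1:2, b = 3:4, c = 5:6)
post <- data.frame(c = 7:8, a = 9:10, b = 11:12)

# Dropping the first column of each leaves different variable sets:
# pre[-1] keeps b and c, post[-1] keeps a and b
only_pre  <- setdiff(names(pre[-1]), names(post[-1]))
only_post <- setdiff(names(post[-1]), names(pre[-1]))
print(only_pre)
print(only_post)

# Capture the error message rather than aborting
result <- tryCatch(
  rbind(pre[-1], post[-1]),
  error = function(e) conditionMessage(e)
)
print(result)
```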

If the variables in the two data frames are the same but differ in their order, rbind() succeeds: it matches the columns by name and arranges them in the order of the first data frame before combining the rows into a single data frame:


bes2010flngs_prepost <- rbind(bes2010flngs_pre,
                              bes2010flngs_post)
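The matching by name can be seen more easily on a toy example. The data frames below are made up for illustration and are not part of the BES data:

```r
# Two toy data frames with the same variables in a different order
pre  <- data.frame(a = 1:2, b = c("x", "y"))
post <- data.frame(b = c("z", "w"), a = 3:4)

both <- rbind(pre, post)

# The result uses the column order of the first argument,
# and values from `post` end up under the matching names
print(names(both))
print(both)
```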

We compare the tail ends of the resulting data frame bes2010flngs_prepost and of the data frame given as the second argument to rbind(). The values in the tail ends are identical; only the order of the variables and the row labels differ.


options(width=200)
tail(bes2010flngs_prepost)

tail(bes2010flngs_post)
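Instead of comparing the printed output by eye, the comparison can be done programmatically. The sketch below uses made-up data frames (`pre`, `post` and `both` are hypothetical names); it reorders the columns and resets the row names before comparing:

```r
# Toy data frames with the same variables in a different order
pre  <- data.frame(a = 1:3, b = 4:6)
post <- data.frame(b = 7:9, a = 10:12)
both <- rbind(pre, post)

# Bring the combined tail into the column order of `post`
tail_both <- tail(both, 2)[names(post)]
tail_post <- tail(post, 2)

# Row names differ (5:6 versus 2:3), so reset them before comparing
rownames(tail_both) <- NULL
rownames(tail_post) <- NULL

same <- isTRUE(all.equal(tail_both, tail_post))
print(same)
```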
