Recoding data from the British Election Study

The following makes use of the memisc package. You may need to install it from CRAN using the code install.packages("memisc") if you want to run this on your computer. (The package is already installed on the notebook container, however.)


library(memisc)
Loading required package: lattice

Loading required package: MASS


Attaching package: ‘memisc’


The following objects are masked from ‘package:stats’:

    contr.sum, contr.treatment, contrasts


The following object is masked from ‘package:base’:

    as.array


The following code picks up the British Election Study data from the previous script and uses the data file created there. For convenience, this data file is also available for download.


load("BES-1983-classvot.RData")

# This code collapses the categories of the vote variable into just four:
BES.1983.classvot <- within(BES.1983.classvot,{
    vote.new <- vote
    vote.new[vote %in% 3:5]        <- 3
    vote.new[vote %in% c(6:10,97)] <- 4
})
# Checking the result:
codebook(BES.1983.classvot$vote.new)
================================================================================

   BES.1983.classvot$vote.new '[IF VOTED] PARTY VOTED FOR'

--------------------------------------------------------------------------------

   Storage mode: double
   Measurement: nominal
   Missing values: 95, 96, 97, 98, 99

   Values and labels         N Valid Total

    0   'SKIPPED'          660  17.1  16.7
    1   'CON'             1432  37.1  36.2
    2   'LAB'              937  24.3  23.7
    3   'ALLIANCE'         788  20.4  19.9
    4   'LIB'               46   1.2   1.2
    5   'SOCIAL  DEMOCR'     0   0.0   0.0
    6   'SNP'                0   0.0   0.0
    7   'PLAID   CYMRU'      0   0.0   0.0
    8   'ECOLOGY PARTY'      0   0.0   0.0
    9   'NATNL   FRONT'      0   0.0   0.0
   10   'COMNIST PARTY'      0   0.0   0.0
   95 M 'REFUSED'           92         2.3
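
To see what the two indexed assignments do in isolation, here is a minimal base R sketch on a made-up vector. The values in `vote` are hypothetical and chosen only to cover each rule; they are not taken from the BES data.

```r
# Hypothetical vote codes, for illustration only (not BES data)
vote <- c(1, 2, 3, 4, 5, 6, 10, 97, 0, 95)

# Start from a copy, then overwrite the selected positions.
# The index always refers to the original 'vote', so a value changed by
# the first rule cannot be picked up again by the second one.
vote.new <- vote
vote.new[vote %in% 3:5]         <- 3   # codes 3, 4, 5 collapse into 3
vote.new[vote %in% c(6:10, 97)] <- 4   # codes 6 to 10 and 97 collapse into 4

# Cross-tabulate old against new codes to verify the mapping
print(table(old = vote, new = vote.new))
```

Codes that match neither rule (here 0, 1, 2 and 95) keep their original value, because they were copied before any assignment took place.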


# It is somewhat more convenient to use the 'recode()' function from the
# 'memisc' package:
BES.1983.classvot <- within(BES.1983.classvot,{
    vote.new <- recode(vote,
                       3 <- 3:5,
                       4 <- c(6:10,97),
                       otherwise="copy"
                       )
})
# Checking the result:
codebook(BES.1983.classvot$vote.new)
================================================================================

   BES.1983.classvot$vote.new '[IF VOTED] PARTY VOTED FOR'

--------------------------------------------------------------------------------

   Storage mode: double
   Measurement: nominal
   Missing values: 95, 96, 97, 98, 99

   Values and labels     N Valid Total

    0   'SKIPPED'      660  17.1  16.7
    1   'CON'         1432  37.1  36.2
    2   'LAB'          937  24.3  23.7
    3   'ALLIANCE'     788  20.4  19.9
    4   'LIB'           46   1.2   1.2
   95 M 'REFUSED'       92         2.3
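
The effect of otherwise="copy" can be illustrated with a rough base R analogue written as a lookup table. This is only a sketch of the idea, not how 'recode()' is implemented; the vector `vote` and the names `mapping` and `recoded` are hypothetical.

```r
# Hypothetical vote codes, for illustration only (not BES data)
vote <- c(1, 2, 3, 4, 5, 6, 10, 97, 0, 95)

# Lookup table: names are the old codes, values are the new codes
mapping <- c("3" = 3, "4" = 3, "5" = 3,
             "6" = 4, "7" = 4, "8" = 4, "9" = 4, "10" = 4, "97" = 4)

# Look up each old code; codes without a rule come back as NA
recoded <- unname(mapping[as.character(vote)])

# The "copy" step: where no rule applied, keep the original code
vote.new <- ifelse(is.na(recoded), vote, recoded)
print(vote.new)
```

Without the final copy step, codes not covered by any rule would remain NA in this sketch, which appears to be what 'recode()' also does by default when 'otherwise' is not given.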


# Since 'BES.1983.classvot' is not a data frame, but a "data.set" object, we can
# provide value labels while recoding:
BES.1983.classvot <- within(BES.1983.classvot,{
    vote.new <- recode(vote,
                       Conservative  = 1 <- 1,
                       Labour        = 2 <- 2,
                       Alliance      = 3 <- 3:5,
                       Other         = 4 <- c(6:10,97),
                       "Didn't vote" = 5 <- 0,
                       DK            = 8 <- 98,
                       Refused       = 9 <- 95)
    missing.values(vote.new) <- c(5,9)
})
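# Checking the result (the warning about '8 <- 98' appears to arise because no
# observation has the code 98, so that rule changes nothing):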
codebook(BES.1983.classvot$vote.new)
Warning message in recode(vote, Conservative = 1 <- 1, Labour = 2 <- 2, Alliance = 3 <- 3:5, :
“recoding 8 <- 98 has no consequences”
================================================================================

   BES.1983.classvot$vote.new '[IF VOTED] PARTY VOTED FOR'

--------------------------------------------------------------------------------

   Storage mode: double
   Measurement: nominal
   Missing values: 5, 9

   Values and labels      N Valid Total

   1   'Conservative'  1432  44.7  36.2
   2   'Labour'         937  29.3  23.7
   3   'Alliance'       788  24.6  19.9
   4   'Other'           46   1.4   1.2
   5 M 'Didn't vote'    660        16.7
   9 M 'Refused'         92         2.3
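
The difference between the "Valid" and "Total" columns follows from the declared missing values. As a rough check, the percentages in the last codebook output can be reproduced from the counts alone, assuming they are simple shares of the listed counts. The variable names below are made up for illustration.

```r
# Counts copied from the codebook output above
n <- c(Conservative = 1432, Labour = 937, Alliance = 788, Other = 46,
       "Didn't vote" = 660, Refused = 92)

# Categories whose codes (5 and 9) were declared as missing values
missing.cats <- c("Didn't vote", "Refused")

# "Valid" percentages: shares among the non-missing categories only
valid     <- n[!(names(n) %in% missing.cats)]
valid.pct <- round(100 * valid / sum(valid), 1)

# "Total" percentages: shares among all listed categories
total.pct <- round(100 * n / sum(n), 1)

print(valid.pct)
print(total.pct)
```

Declaring "Didn't vote" and "Refused" as missing therefore changes only the denominator of the "Valid" column: the party shares are computed among respondents who reported a vote.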

Downloadable R script and interactive version

Explanation

The link with the “jupyterhub” icon directs you to an interactive Jupyter [1] notebook, which runs inside a Docker container [2]. There are two variants of the interactive notebook. One shuts down after 60 seconds and does not require a sign-in. The other requires signing in with your ORCID [3] credentials, but shuts down only after 24 hours. (There is no guarantee that such a container persists that long; it may be shut down earlier for maintenance purposes.) After shutdown, all data within the container will be reset, i.e. all files created by the user will be deleted. [4]

Above you see a rendered version of the Jupyter notebook. [5]

[1] For more information about Jupyter see http://jupyter.org. The Jupyter notebooks make use of the IRkernel package.

[2] For more information about Docker see https://docs.docker.com/. The container images were created with repo2docker, while containers are run with docker spawner.

[3] ORCID is a free service for the authentication of researchers. It also allows researchers to showcase publications and contributions to the academic community, such as peer review. See https://info.orcid.org/what-is-orcid/ for more information.

[4] The Jupyter notebooks come with NO WARRANTY whatsoever. They are provided for educational and illustrative purposes only. Do not use them for production work.

[5] The notebook is rendered with the help of the nbsphinx extension.