Reshaping data frames: An example with data from the British Election Study

First we load an R data file that contains data from the 2010 British Election Study. The data set bes2010feelings-prepost.RData was prepared from the original, available at https://www.britishelectionstudy.com/data-object/2010-bes-cross-section/, by removing identifying information and scrambling the data.

load("bes2010feelings-prepost.RData")
names(bes2010flngs_pre)
 [1] "flng.brown"   "flng.cameron" "flng.clegg"   "flng.salmond" "flng.jones"  
 [6] "flng.labour"  "flng.cons"    "flng.libdem"  "flng.snp"     "flng.pcym"   
[11] "flng.green"   "flng.ukip"    "flng.bnp"     "region"      

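A sensible way to bring these data into long format is to treat the feelings towards the parties and towards their leaders as multiple measurements, with one measurement occasion per party. Feelings were recorded for eight parties but for only five party leaders, so we add a placeholder variable na that contains only missing values and use it for the three parties without a leader rating. We then reshape the data into the appropriate long format: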

bes2010flngs_pre_long <- reshape(
              within(bes2010flngs_pre,
                     na <- NA),
              varying=list(
                  # Parties
                  c("flng.cons","flng.labour","flng.libdem",
                    "flng.snp","flng.pcym",
                    "flng.green","flng.ukip","flng.bnp"),
                  # Party leaders
                  c("flng.cameron","flng.brown","flng.clegg",
                    "flng.salmond","flng.jones",
                    "na","na","na")
              ),
              v.names=c("flng.parties",
                        "flng.leaders"),
              times=c("Conservative","Labour","LibDem",
                      "SNP","Plaid Cymru",
                      "Green","UKIP","BNP"),
              timevar="party",
              direction="long")
head(bes2010flngs_pre_long,n=14)
                 region        party flng.parties flng.leaders id
1.Conservative  England Conservative            6            3  1
2.Conservative     <NA> Conservative            6            7  2
3.Conservative  England Conservative            4            7  3
4.Conservative  England Conservative            6            4  4
5.Conservative     <NA> Conservative            4            5  5
6.Conservative  England Conservative            1            0  6
7.Conservative  England Conservative            3            3  7
8.Conservative  England Conservative            3            6  8
9.Conservative  England Conservative            3            2  9
10.Conservative England Conservative            3            2 10
11.Conservative    <NA> Conservative            6            4 11
12.Conservative England Conservative            3            2 12
13.Conservative England Conservative            0            4 13
14.Conservative England Conservative            5            5 14
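
The mechanics of the call above may be easier to follow on a small example. The following is a minimal sketch with made-up values: the data frame toy and all of its column names are hypothetical and are not part of the BES data. It uses the same ingredients: two groups of varying columns, an all-NA placeholder column for a measurement that does not exist, and character-valued times.

```r
# Toy wide-format data (hypothetical values, not taken from the BES)
toy <- data.frame(
  region   = c("England", "Wales", "Scotland"),
  party.a  = c(6, 4, 1),
  party.b  = c(5, 3, 8),
  party.c  = c(2, 7, 0),
  leader.a = c(3, 7, 4),
  leader.b = c(6, 8, 5)
)

# There is no leader rating for party "C", so add an all-NA
# placeholder column that can stand in for it
toy$na <- NA

toy_long <- reshape(
  toy,
  varying = list(
    c("party.a", "party.b", "party.c"),   # first group: parties
    c("leader.a", "leader.b", "na")       # second group: leaders
  ),
  v.names   = c("flng.party", "flng.leader"),
  times     = c("A", "B", "C"),           # one label per occasion
  timevar   = "party",
  direction = "long"
)

# Three respondents times three occasions
nrow(toy_long)

# Leader ratings for party "C" come from the placeholder column
toy_long[toy_long$party == "C", ]
```

Each element of varying must have the same length as times, which is why the placeholder column is needed to pad the shorter group.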

The following demonstrates Reshape(), the convenience variant of reshape() provided by the memisc package. If you want to run this on your own computer, you may need to install the package from CRAN using install.packages("memisc"). (The package is already installed on the notebook container, however.)

library(memisc)
Loading required package: lattice
Loading required package: MASS

Attaching package: 'memisc'

The following objects are masked from 'package:stats':

    contr.sum, contr.treatment, contrasts

The following object is masked from 'package:base':

    as.array

With the Reshape() function the syntax is simpler than with reshape() from the stats package:

bes2010flngs_pre_long <- Reshape(bes2010flngs_pre,
       # Note that "empty" places designate measurement
       # occasions that are to be filled with NAs.
       # In the present case these are feelings about
       # party leaders that were not asked about in the
       # BES 2010 questionnaires.
       flng.leaders=c(flng.cameron,flng.brown,
                      flng.clegg,flng.salmond,
                      flng.jones,,,),
       flng.parties=c(flng.cons,flng.labour,
                      flng.libdem,flng.snp,
                      flng.pcym,flng.green,
                      flng.ukip,flng.bnp),
       party=c("Conservative","Labour","LibDem",
               "SNP","Plaid Cymru",
               "Green","UKIP","BNP"),
       direction="long")

In the long format produced by Reshape(), the observations are sorted so that the variable that distinguishes measurement occasions (the party variable) changes faster than the variable that distinguishes individuals:

head(bes2010flngs_pre_long)
                region        party flng.leaders flng.parties id
1.Conservative England Conservative            3            6  1
1.Labour       England       Labour            6            5  1
1.LibDem       England       LibDem            3            4  1
1.SNP          England          SNP           NA           NA  1
1.Plaid Cymru  England  Plaid Cymru            5           NA  1
1.Green        England        Green           NA            7  1
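
By contrast, the output of reshape() shown earlier is stacked occasion by occasion, so that the id changes fastest. The following base R sketch shows one way to bring such a result into the ordering described here. It uses hypothetical toy data and illustrates the ordering only; it makes no claim about how Reshape() is implemented internally.

```r
# Toy wide-format data (hypothetical values)
toy <- data.frame(
  group = c("g1", "g2"),
  x.a = c(1, 2),
  x.b = c(3, 4),
  x.c = c(5, 6)
)
occasions <- c("A", "B", "C")

long <- reshape(
  toy,
  varying   = list(c("x.a", "x.b", "x.c")),
  v.names   = "x",
  times     = occasions,
  timevar   = "occasion",
  direction = "long"
)

# reshape() stacks the data occasion by occasion,
# so the id changes fastest
long$id

# Re-sort so that the occasion changes fastest within each id;
# match() keeps the occasions in their original order
long_sorted <- long[order(long$id, match(long$occasion, occasions)), ]
long_sorted
```

Sorting by id first and by occasion second keeps all measurements of one individual in adjacent rows.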

As with reshape(), reshaping back from long into wide format takes (almost) the same syntax as reshaping from wide into long format:

bes2010flngs_pre_wide <- Reshape(bes2010flngs_pre_long,
       # Note that "empty" places designate measurement
       # occasions that are to be filled with NAs.
       # In the present case these are feelings about
       # party leaders that were not asked about in the
       # BES 2010 questionnaires.
       flng.leaders=c(flng.cameron,flng.brown,
                      flng.clegg,flng.salmond,
                      flng.jones,,,),
       flng.parties=c(flng.cons,flng.labour,
                      flng.libdem,flng.snp,
                      flng.pcym,flng.green,
                      flng.ukip,flng.bnp),
       party=c("Conservative","Labour","LibDem",
               "SNP","Plaid Cymru",
               "Green","UKIP","BNP"),
       direction="wide")

After reshaping into wide format, the variables are arranged by measurement occasion, so that each leader variable sits next to the corresponding party variable (for example, flng.cameron next to flng.cons):

head(bes2010flngs_pre_wide)
                region id flng.cameron flng.cons flng.brown flng.labour
1.Conservative England  1            3         6          6           5
2.Conservative    <NA>  2            7         6          3           1
3.Conservative England  3            7         4          8           3
4.Conservative England  4            4         6          4           6
5.Conservative    <NA>  5            5         4          5           8
6.Conservative England  6            0         1          5           5
               flng.clegg flng.libdem flng.salmond flng.snp flng.jones
1.Conservative          3           4           NA       NA          5
2.Conservative          5           7           NA       NA          3
3.Conservative          4           5           NA       NA         10
4.Conservative          3           5           NA       NA          7
5.Conservative          5           5           NA       NA          5
6.Conservative          4           4           NA       NA          1
               flng.pcym flng.green flng.ukip flng.bnp
1.Conservative        NA          7         3        0
2.Conservative        NA          6         0        0
3.Conservative        NA          5         0        0
4.Conservative        NA          5         3        2
5.Conservative        NA          4        NA        2
6.Conservative        NA          4         0        0
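
For comparison, here is a minimal sketch of the same round trip with reshape() from the stats package, using hypothetical toy data. A data frame returned by reshape() stores the reshaping information as an attribute, so calling reshape() on it again with only the direction argument reverses the operation.

```r
# Toy wide-format data (hypothetical values)
toy <- data.frame(
  region = c("England", "Wales"),
  x.a = c(1, 2),
  x.b = c(3, 4)
)

# Wide to long
long <- reshape(
  toy,
  varying   = list(c("x.a", "x.b")),
  v.names   = "x",
  times     = c("A", "B"),
  timevar   = "occasion",
  direction = "long"
)

# Long back to wide: no varying, v.names or timevar needed,
# because they are stored as an attribute of `long`
wide_again <- reshape(long, direction = "wide")
wide_again
```

The restored data frame contains the original measurement columns plus the id column that reshape() created in the long step.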

Finally, we save the long-format data frame to an RData file:

save(bes2010flngs_pre_long,file="bes2010flngs-pre-long.RData")
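
A file written with save() can be read back with load(), which restores the objects under the names they had when they were saved. The sketch below uses a hypothetical toy data frame and a temporary file, so it does not touch the file created above. Loading into a separate environment avoids overwriting objects in the workspace.

```r
# Toy long-format data (hypothetical values)
toy_long <- data.frame(
  id    = c(1, 1, 2, 2),
  party = c("A", "B", "A", "B"),
  flng  = c(3, 6, 7, 4)
)

# Save to a temporary file instead of the working directory
path <- file.path(tempdir(), "toy-long.RData")
save(toy_long, file = path)

# Load into a fresh environment; load() returns the names
# of the objects it restored
restored <- new.env()
loaded_names <- load(path, envir = restored)
loaded_names

# The restored object is identical to the one that was saved
identical(restored$toy_long, toy_long)
```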