Constructing a survey design object from data of the 2016 American National Election Study (ANES).
The following makes use of the memisc package. You may need to install it from
CRAN using the code
install.packages("memisc")
if you want to run this on your computer. (The
package is already installed on the notebook container, however.)
library(memisc)
Loading required package: lattice
Loading required package: MASS
Attaching package: 'memisc'
The following objects are masked from 'package:stats':
contr.sum, contr.treatment, contrasts
The following object is masked from 'package:base':
as.array
The code makes use of the data file “anes_timeseries_2016.sav”,
which is not included in the supporting material. In order to
obtain this data file (and run this notebook successfully), you need to download it from
the ANES website for 2016 and upload it to the virtual
machine that runs this notebook. To do this,
- Pull down the “File” menu and select “Open”.
- An overview of the folder that contains the notebook opens.
- The folder view has a button labelled “Upload”. Use this to upload the file that you downloaded from the ANES website.
Note that the uploaded data will disappear once you “Quit” the notebook (and the Jupyter instance).
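Before loading the data, it can help to confirm that the upload succeeded. The following is a minimal sketch in base R; the helper name `data_file_available` is hypothetical and not part of the original notebook or of memisc, while the file name is the one the notebook expects.

```r
# Hypothetical helper (an assumption for illustration):
# returns TRUE if the file exists and is non-empty.
data_file_available <- function(path) {
  file.exists(path) && isTRUE(file.size(path) > 0)
}

sav_file <- "anes_timeseries_2016.sav"
if (data_file_available(sav_file)) {
  message("Found ", sav_file)
} else {
  message("Could not find ", sav_file, " in ", getwd(),
          "; upload it before running the remaining cells.")
}
```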
anes_2016_sav <- spss.file("anes_timeseries_2016.sav")
File character set is 'UTF-8'.
Converting character set to the local 'ascii'.
Loading a subset: Only pre-election waves and only face-to-face interviews
anes_2016_pre_work_ds <- subset(anes_2016_sav,
    V160501 == 1,
    select=c(
        # According to docs, these are the
        # sample weights for the
        # face-to-face component
        pre_w_f2f = V160101f,
        # Face-to-face strata
        strat_f2f = V160201f,
        psu_f2f = V160202f,
        pre_voted12 = V161005,
        pre_recall12 = V161006,
        pre_voted = V161026,
        pre_vote = V161027,
        pre_intov = V161030,
        pre_voteint = V161031
    ))
library(magrittr) # For the '%<>%' operator
anes_2016_pre_work_ds %<>% within({
    # Setting up recalled votes of 2012
    # Since a "default" value for the remaining conditions
    # is used, we use 'check.xor = FALSE' to avoid warnings.
    recall12 <- cases(
        'Did not vote' =  9 <- pre_voted12 == 2,
        'Obama'        =  1 <- pre_recall12 == 1,
        'Romney'       =  2 <- pre_recall12 == 2,
        'Other'        =  3 <- pre_recall12 == 5,
        'Inap'         = 99 <- TRUE, check.xor = FALSE
    )
    # Early voters
    vote16_1 <- cases(
        'Clinton' =  1 <- pre_voted == 1 & pre_vote == 1,
        'Trump'   =  2 <- pre_voted == 1 & pre_vote == 2,
        'Other'   =  3 <- pre_voted == 1 & pre_vote %in% 3:5,
        'Inap'    = 99 <- TRUE, check.xor = FALSE)
    # Vote intentions
    vote16 <- cases(
        'Clinton' =  1 <- pre_intov == 1 & pre_voteint == 1,
        'Trump'   =  2 <- pre_intov == 1 & pre_voteint == 2,
        'Other'   =  3 <- pre_intov == 1 & pre_voteint %in% 3:6,
        'Will not vote/Not registered' = 8 <- pre_intov %in% c(-1,2),
        'Inap'    = 99 <- TRUE, check.xor = FALSE)
    vote16[] <- ifelse(vote16 == 99 & vote16_1 != 99,
                       vote16_1,
                       vote16)
    measurement(pre_w_f2f) <- "ratio"
})
anes_2016_prevote <- as.data.frame(anes_2016_pre_work_ds)
save(anes_2016_prevote,file="anes-2016-prevote.RData")
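The recoding above relies on two ideas: a catch-all condition at the end supplies a default code (99) for everything not matched earlier, and the early vote fills in where no vote intention is recorded. The following base-R sketch mimics both on invented toy vectors. It does not use memisc, and all variable names and values are illustrative assumptions.

```r
# Invented toy codes: 1 and 2 are named choices, 5 is some other answer
choice <- c(1, 2, 5, NA)

# Nested ifelse(): earlier conditions take precedence,
# 99 is the default for everything else (including NA)
recoded <- ifelse(choice %in% 1, 1,
           ifelse(choice %in% 2, 2, 99))
recoded   # 1 2 99 99

# Invented toy codes: 1 = Clinton, 2 = Trump, 8 = will not vote, 99 = inapplicable
vote_intention <- c(1, 2, 99, 99, 8)
early_vote     <- c(99, 99, 1, 99, 99)

# Use the early vote only where no intention is recorded
combined <- ifelse(vote_intention == 99 & early_vote != 99,
                   early_vote, vote_intention)
combined  # 1 2 1 99 8
```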
# Unweighted cross-tabulation
xtabs(~ vote16 + recall12,
data=anes_2016_prevote)
                              recall12
vote16                         Obama Romney Other Did not vote Inap
  Clinton                        326     12     2           59    6
  Trump                           29    242     5           70    8
  Other                           30     28     7           16    4
  Will not vote/Not registered    28     41     0          139    5
  Inap                            46     27     2           31   17
The following makes use of the survey package. You may need to install it from
CRAN using the code
install.packages("survey")
if you want to run this on your computer. (The
package is already installed on the notebook container, however.)
library(survey)
Loading required package: grid
Loading required package: Matrix
Loading required package: survival
Attaching package: 'survey'
The following object is masked from 'package:graphics':
dotchart
anes_2016_prevote_desgn <- svydesign(id = ~psu_f2f,
                                     strata = ~strat_f2f,
                                     weights = ~pre_w_f2f,
                                     data = anes_2016_prevote,
                                     nest = TRUE)
anes_2016_prevote_desgn
Stratified 1 - level Cluster Sampling design (with replacement)
With (65) clusters.
svydesign(id = ~psu_f2f, strata = ~strat_f2f, weights = ~pre_w_f2f,
data = anes_2016_prevote, nest = TRUE)
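To see what the arguments of `svydesign()` do, here is a minimal sketch on an invented toy data set (the data frame `toy` and its columns are assumptions for illustration, not ANES data). The PSU labels repeat across strata, which is the situation `nest = TRUE` is meant for; without it, `svydesign()` typically reports that clusters are not nested in strata.

```r
library(survey)

# Invented toy data: two strata, each with PSUs labelled 1 and 2,
# so PSU labels are unique only within a stratum.
toy <- data.frame(
  stratum = rep(c("A", "B"), each = 4),
  psu     = rep(c(1, 2, 1, 2), each = 2),
  w       = c(10, 10, 12, 12, 8, 8, 9, 9),
  y       = c(1, 0, 1, 1, 0, 0, 1, 0)
)

# nest = TRUE tells svydesign() to treat PSU labels as nested within strata
toy_design <- svydesign(id = ~psu, strata = ~stratum, weights = ~w,
                        data = toy, nest = TRUE)

# Weighted mean of y with a design-based standard error
est <- svymean(~y, toy_design)
est
```

The point estimate equals the weighted mean `sum(w * y) / sum(w)`; the design information only affects the standard error.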
To make use of the survey design object later, we save it to a file.
save(anes_2016_prevote_desgn,file="anes-2016-prevote-desgn.RData")
We reduce the number of digits after the decimal point …
ops <- options(digits=2)
(tab <- svytable(~ vote16 + recall12,
                 design = anes_2016_prevote_desgn))
                              recall12
vote16                         Obama Romney Other Did not vote Inap
  Clinton                      316.0   11.7   1.1         69.9  8.6
  Trump                         35.9  228.8   4.2         73.0  5.1
  Other                         34.1   24.4   6.6         13.9  5.3
  Will not vote/Not registered  28.8   41.4   0.0        150.2  4.3
  Inap                          44.8   25.0   1.9         28.3 16.0
and drop the counts in the ‘Inap’ column of recall12 before we compute percentages.
percentages(vote16 ~ recall12, data=tab[-6,-5])
                              recall12
vote16                         Obama Romney Other Did not vote
  Clinton                       68.8    3.5   8.0         20.8
  Trump                          7.8   69.1  30.6         21.8
  Other                          7.4    7.4  47.6          4.1
  Will not vote/Not registered   6.3   12.5   0.0         44.8
  Inap                           9.7    7.5  13.9          8.4
options(ops) # To undo the change in the options.
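The percentages above are column percentages: within each category of recall12 they add up to 100. As a cross-check, the same figures can be approximated in base R with `prop.table()`. The sketch below re-enters the weighted counts from the printed table by hand (so they are already rounded, and the last digit of a result may differ); the object names are illustrative.

```r
# Weighted counts copied from the printed svytable() output,
# without the 'Inap' column of recall12
weighted <- matrix(
  c(316.0,  11.7, 1.1,  69.9,
     35.9, 228.8, 4.2,  73.0,
     34.1,  24.4, 6.6,  13.9,
     28.8,  41.4, 0.0, 150.2,
     44.8,  25.0, 1.9,  28.3),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    vote16   = c("Clinton", "Trump", "Other",
                 "Will not vote/Not registered", "Inap"),
    recall12 = c("Obama", "Romney", "Other", "Did not vote"))
)

# margin = 2: proportions within each column (each recall12 category)
col_pct <- prop.table(weighted, margin = 2) * 100
round(col_pct, 1)
```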
Here we compute an F-test of independence for the table, which uses the Rao-Scott second-order correction with a Satterthwaite approximation of the denominator degrees of freedom.
summary(tab)
Warning in chisq.test(svytable(formula, design, Ntotal = N), correct = FALSE):
Chi-squared approximation may be incorrect
                              recall12
vote16                         Obama Romney Other Did not vote Inap
  Clinton                        316     12     1           70    9
  Trump                           36    229     4           73    5
  Other                           34     24     7           14    5
  Will not vote/Not registered    29     41     0          150    4
  Inap                            45     25     2           28   16
Pearson's X^2: Rao & Scott adjustment
data: svychisq(~vote16 + recall12, design = anes_2016_prevote_desgn, statistic = "F")
F = 29.235, ndf = 9.3968, ddf = 310.0952, p-value < 2.2e-16
The more conventional Pearson chi-squared test, adjusted with a design-effect estimate, is obtained by a slight modification.
summary(tab, statistic="Chisq")
Warning in chisq.test(svytable(formula, design, Ntotal = N), correct = FALSE):
Chi-squared approximation may be incorrect
                              recall12
vote16                         Obama Romney Other Did not vote Inap
  Clinton                        316     12     1           70    9
  Trump                           36    229     4           73    5
  Other                           34     24     7           14    5
  Will not vote/Not registered    29     41     0          150    4
  Inap                            45     25     2           28   16
Pearson's X^2: Rao & Scott adjustment
data: svychisq(~vote16 + recall12, design = anes_2016_prevote_desgn, statistic = "Chisq")
X-squared = 778.41, df = 16, p-value < 2.2e-16
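As the "data:" lines in the output indicate, `summary()` of a survey table calls `svychisq()`, so the tests can also be requested directly from a design object. The sketch below illustrates this with the `api` example data that ships with the survey package rather than with the ANES data; the variables `sch.wide` and `stype` belong to that example data set, not to this notebook.

```r
library(survey)

# Example data shipped with the survey package;
# apiclus1 is a one-stage cluster sample of school districts
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# F statistic with the Rao-Scott second-order correction (the default)
res_f <- svychisq(~sch.wide + stype, design = dclus1, statistic = "F")

# Pearson chi-squared statistic adjusted by a design-effect estimate
res_chisq <- svychisq(~sch.wide + stype, design = dclus1, statistic = "Chisq")

res_f$p.value
res_chisq$p.value
```

Both calls return objects of class `htest`, so the statistic and p-value can be extracted for further use instead of being read off the printed summary.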
- R file: survey-design-objects-ANES2016.R
- Rmarkdown file: survey-design-objects-ANES2016.Rmd
- Jupyter notebook file: survey-design-objects-ANES2016.ipynb