Reshaping data to long format with tidyr¶
The following makes use of the packages tidyr and readr. You may need to install them from CRAN using the code install.packages(c("tidyr","readr"))
if you want to run this on your computer. (The packages are already installed on the notebook container, however.)
# Inspecting the file
substr(readLines("gini-oecd.tsv",n=5),start=1,stop=50)
[1] "Data table for: Income inequality, Gini coefficien"
[2] "Location \t 2007\t2008\t 2009\t2010\t2011\t2012\t2013\t201"
[3] "Australia\t\t\t\t\t\t0.326\t\t0.337\t\t0.330\t"
[4] "Austria\t0.284\t0.281\t0.289\t0.280\t0.281\t0.275\t0.279\t"
[5] "Belgium\t0.277\t0.266\t0.272\t0.267\t0.270\t0.265\t0.265\t"
library(readr)
gini.oecd <- read_tsv("gini-oecd.tsv",
skip=1)
Parsed with column specification:
cols(
Location = col_character(),
`2007` = col_double(),
`2008` = col_double(),
`2009` = col_double(),
`2010` = col_double(),
`2011` = col_double(),
`2012` = col_double(),
`2013` = col_double(),
`2014` = col_double(),
`2015` = col_double(),
`2016` = col_double(),
`2017` = col_double()
)
gini.oecd
Location 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
1 Australia NA NA NA NA NA 0.326 NA 0.337 NA 0.330 NA
2 Austria 0.284 0.281 0.289 0.280 0.281 0.275 0.279 0.274 0.276 0.284 NA
3 Belgium 0.277 0.266 0.272 0.267 0.270 0.265 0.265 0.266 0.268 0.266 NA
4 Brazil NA NA 0.485 NA 0.483 NA 0.470 NA NA NA NA
5 Canada 0.317 0.315 0.316 0.316 0.313 0.317 0.320 0.313 0.318 0.307 0.310
6 Chile NA NA 0.480 NA 0.471 NA 0.465 NA 0.454 NA 0.460
7 China (People's Republic of) NA NA NA NA 0.514 NA NA NA NA NA NA
8 Costa Rica NA NA NA 0.472 0.480 0.483 0.494 0.485 0.479 0.484 0.480
9 Czech Republic 0.256 0.259 0.257 0.259 0.257 0.253 0.259 0.257 0.258 0.253 NA
10 Denmark NA NA NA NA 0.251 0.249 0.254 0.256 0.263 0.261 NA
11 Estonia NA NA NA NA NA NA 0.357 0.346 0.330 0.314 NA
12 Finland 0.269 0.264 0.259 0.264 0.264 0.260 0.262 0.257 0.260 0.259 0.266
13 France NA NA NA NA NA 0.305 0.291 0.293 0.295 0.291 NA
14 Germany NA 0.285 NA NA 0.291 0.289 0.292 0.289 0.293 0.294 NA
15 Greece 0.329 0.328 0.330 0.336 0.333 0.338 0.342 0.339 0.340 0.333 NA
16 Hungary 0.257 0.246 0.241 0.266 0.270 0.280 0.277 0.282 0.284 0.282 NA
17 Iceland 0.286 0.305 0.266 0.249 0.252 0.253 0.241 0.246 0.255 NA NA
18 India NA NA NA NA 0.495 NA NA NA NA NA NA
19 Ireland 0.304 0.295 0.312 0.298 0.307 0.310 0.308 0.298 0.297 0.309 NA
20 Israel NA NA NA NA 0.371 0.371 0.360 0.365 0.360 0.346 0.344
21 Italy 0.313 0.317 0.315 0.327 0.327 0.330 0.325 0.326 0.333 0.328 NA
22 Japan NA NA 0.336 NA NA 0.330 NA NA 0.339 NA NA
23 Korea NA NA NA NA NA NA NA NA 0.352 0.355 0.355
24 Latvia 0.374 0.373 0.354 0.347 0.352 0.347 0.351 0.350 0.346 0.346 NA
25 Lithuania 0.338 0.358 0.366 0.329 0.322 0.350 0.352 0.381 0.372 0.378 NA
26 Luxembourg NA NA NA NA NA NA NA NA 0.306 0.304 NA
27 Mexico NA NA NA NA NA 0.457 NA 0.459 NA 0.458 NA
28 Netherlands NA NA NA NA 0.289 0.288 0.287 0.303 0.288 0.285 NA
29 New Zealand NA NA NA NA 0.323 0.333 NA 0.349 NA NA NA
30 Norway NA 0.250 0.245 0.249 0.250 0.253 0.252 0.257 0.272 0.262 0.262
31 Poland 0.316 0.308 0.303 0.305 0.301 0.297 0.299 0.298 0.292 0.284 NA
32 Portugal 0.361 0.355 0.337 0.341 0.337 0.337 0.341 0.338 0.336 0.331 NA
33 Russia NA NA NA NA 0.376 NA NA NA NA 0.331 NA
34 Slovak Republic 0.245 0.256 0.264 0.263 0.261 0.250 0.269 0.247 0.251 0.241 NA
35 Slovenia 0.239 0.234 0.245 0.244 0.244 0.249 0.254 0.251 0.250 0.244 NA
36 South Africa NA NA NA NA NA NA NA NA 0.620 NA NA
37 Spain 0.324 0.327 0.333 0.339 0.341 0.334 0.345 0.344 0.345 0.341 NA
38 Sweden NA NA NA NA NA NA 0.268 0.274 0.278 0.282 0.282
39 Switzerland 0.312 0.306 0.297 0.298 0.289 0.285 0.295 0.297 0.296 NA NA
40 Turkey NA NA NA NA 0.403 0.399 0.390 0.398 0.404 NA NA
41 United Kingdom 0.373 0.369 0.374 0.351 0.354 0.351 0.358 0.356 0.360 0.351 0.357
42 United States NA NA NA NA NA 0.396 0.394 0.390 0.391 0.390 NA
library(tidyr)
gini.oecd %>% gather(`2007`,`2008`,`2009`,`2010`,`2011`,`2012`,
`2013`,`2014`,`2015`,`2016`,`2017`,
key="year",value="gini") -> gini.oecd.long
gini.oecd.long
Location year gini
1 Australia 2007 NA
2 Austria 2007 0.284
3 Belgium 2007 0.277
4 Brazil 2007 NA
5 Canada 2007 0.317
6 Chile 2007 NA
7 China (People's Republic of) 2007 NA
8 Costa Rica 2007 NA
9 Czech Republic 2007 0.256
10 Denmark 2007 NA
11 Estonia 2007 NA
12 Finland 2007 0.269
13 France 2007 NA
14 Germany 2007 NA
15 Greece 2007 0.329
16 Hungary 2007 0.257
17 Iceland 2007 0.286
18 India 2007 NA
19 Ireland 2007 0.304
20 Israel 2007 NA
21 Italy 2007 0.313
22 Japan 2007 NA
23 Korea 2007 NA
24 Latvia 2007 0.374
25 Lithuania 2007 0.338
26 Luxembourg 2007 NA
27 Mexico 2007 NA
28 Netherlands 2007 NA
29 New Zealand 2007 NA
30 Norway 2007 NA
⋮ ⋮ ⋮ ⋮
433 France 2017 NA
434 Germany 2017 NA
435 Greece 2017 NA
436 Hungary 2017 NA
437 Iceland 2017 NA
438 India 2017 NA
439 Ireland 2017 NA
440 Israel 2017 0.344
441 Italy 2017 NA
442 Japan 2017 NA
443 Korea 2017 0.355
444 Latvia 2017 NA
445 Lithuania 2017 NA
446 Luxembourg 2017 NA
447 Mexico 2017 NA
448 Netherlands 2017 NA
449 New Zealand 2017 NA
450 Norway 2017 0.262
451 Poland 2017 NA
452 Portugal 2017 NA
453 Russia 2017 NA
454 Slovak Republic 2017 NA
455 Slovenia 2017 NA
456 South Africa 2017 NA
457 Spain 2017 NA
458 Sweden 2017 0.282
459 Switzerland 2017 NA
460 Turkey 2017 NA
461 United Kingdom 2017 0.357
462 United States 2017 NA
gini.oecd %>% gather(-Location,
key="year",value="gini") -> gini.oecd.long
gini.oecd.long
Location year gini
1 Australia 2007 NA
2 Austria 2007 0.284
3 Belgium 2007 0.277
4 Brazil 2007 NA
5 Canada 2007 0.317
6 Chile 2007 NA
7 China (People's Republic of) 2007 NA
8 Costa Rica 2007 NA
9 Czech Republic 2007 0.256
10 Denmark 2007 NA
11 Estonia 2007 NA
12 Finland 2007 0.269
13 France 2007 NA
14 Germany 2007 NA
15 Greece 2007 0.329
16 Hungary 2007 0.257
17 Iceland 2007 0.286
18 India 2007 NA
19 Ireland 2007 0.304
20 Israel 2007 NA
21 Italy 2007 0.313
22 Japan 2007 NA
23 Korea 2007 NA
24 Latvia 2007 0.374
25 Lithuania 2007 0.338
26 Luxembourg 2007 NA
27 Mexico 2007 NA
28 Netherlands 2007 NA
29 New Zealand 2007 NA
30 Norway 2007 NA
⋮ ⋮ ⋮ ⋮
433 France 2017 NA
434 Germany 2017 NA
435 Greece 2017 NA
436 Hungary 2017 NA
437 Iceland 2017 NA
438 India 2017 NA
439 Ireland 2017 NA
440 Israel 2017 0.344
441 Italy 2017 NA
442 Japan 2017 NA
443 Korea 2017 0.355
444 Latvia 2017 NA
445 Lithuania 2017 NA
446 Luxembourg 2017 NA
447 Mexico 2017 NA
448 Netherlands 2017 NA
449 New Zealand 2017 NA
450 Norway 2017 0.262
451 Poland 2017 NA
452 Portugal 2017 NA
453 Russia 2017 NA
454 Slovak Republic 2017 NA
455 Slovenia 2017 NA
456 South Africa 2017 NA
457 Spain 2017 NA
458 Sweden 2017 0.282
459 Switzerland 2017 NA
460 Turkey 2017 NA
461 United Kingdom 2017 0.357
462 United States 2017 NA
gini.oecd %>% pivot_longer(-Location,
names_to="year",
values_to="gini") -> gini.oecd.long
gini.oecd.long
Location year gini
1 Australia 2007 NA
2 Australia 2008 NA
3 Australia 2009 NA
4 Australia 2010 NA
5 Australia 2011 NA
6 Australia 2012 0.326
7 Australia 2013 NA
8 Australia 2014 0.337
9 Australia 2015 NA
10 Australia 2016 0.330
11 Australia 2017 NA
12 Austria 2007 0.284
13 Austria 2008 0.281
14 Austria 2009 0.289
15 Austria 2010 0.280
16 Austria 2011 0.281
17 Austria 2012 0.275
18 Austria 2013 0.279
19 Austria 2014 0.274
20 Austria 2015 0.276
21 Austria 2016 0.284
22 Austria 2017 NA
23 Belgium 2007 0.277
24 Belgium 2008 0.266
25 Belgium 2009 0.272
26 Belgium 2010 0.267
27 Belgium 2011 0.270
28 Belgium 2012 0.265
29 Belgium 2013 0.265
30 Belgium 2014 0.266
⋮ ⋮ ⋮ ⋮
433 Turkey 2010 NA
434 Turkey 2011 0.403
435 Turkey 2012 0.399
436 Turkey 2013 0.390
437 Turkey 2014 0.398
438 Turkey 2015 0.404
439 Turkey 2016 NA
440 Turkey 2017 NA
441 United Kingdom 2007 0.373
442 United Kingdom 2008 0.369
443 United Kingdom 2009 0.374
444 United Kingdom 2010 0.351
445 United Kingdom 2011 0.354
446 United Kingdom 2012 0.351
447 United Kingdom 2013 0.358
448 United Kingdom 2014 0.356
449 United Kingdom 2015 0.360
450 United Kingdom 2016 0.351
451 United Kingdom 2017 0.357
452 United States 2007 NA
453 United States 2008 NA
454 United States 2009 NA
455 United States 2010 NA
456 United States 2011 NA
457 United States 2012 0.396
458 United States 2013 0.394
459 United States 2014 0.390
460 United States 2015 0.391
461 United States 2016 0.390
462 United States 2017 NA
Downloadable R script and interactive version
- R Script: reshaping-to-long-with-tidyr.R
- Interactive version (shuts down after 60s):
- Interactive version (sign in required):
Explanation
The link with the “jupyterhub” icon directs you to an interactive Jupyter1 notebook, which runs inside a Docker container2. There are two variants of the interative notebook. One shuts down after 60 seconds and does not require a sign it. The other requires sign in using your ORCID3 credentials, yet shuts down only after 24 hours. (There is no guarantee that such a container persists that long, it may be shut down earlier for maintenance purposes.) After shutdown all data within the container will be reset, i.e. all files created by the user will be deleted.4
Above you see a rendered version of the Jupyter notebook.5
- 1
-
For more information about Jupyter see
http://jupyter.org
. The Jupyter notebooks make use of the IRKernel package. - 2
-
For more information about Docker see
https://docs.docker.com/
. The container images were created with repo2docker, while containers are run with docker spawner. - 3
-
ORCID is a free service for the authentication of researchers. It also allows to showcase publications and contributions to the academic community such as peer review.. See
https://info.orcid.org/what-is-orcid/
for more information. - 4
-
The Jupyter notebooks come with NO WARRANTY whatsoever. They are provided for educational and illustrative purposes only. Do not use them for production work.
- 5
-
The notebook is rendered with the help of the nbsphinx extension.