Dates, Times, and Time Series

Temporal data, consisting of dates and times, pose their own challenges. Time is measured in non-decimal units: hours, minutes, and seconds. Dates can be recorded according to various calendar systems and are complicated by leap days and leap seconds. R provides facilities to convert times and dates between calendar systems, to format temporal data, and to import temporal data recorded in different formats. This is the first topic of this chapter. The second topic is time series and similar data structures (such as panels). Basic time series consist of measurements taken at regular intervals, but R also supports irregular time series. The chapter therefore also discusses the construction and manipulation of regular and irregular time series.

Below is the supporting material for the various sections of the chapter.

Dates and Times

  • Date objects and date formatting

    • Script file: date-objects-date-formatting.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      as.Date(20,origin="1970-01-01")
      
      [1] "1970-01-21"
      In [4]:
      d <- as.Date("1990-11-09")
      
      In [5]:
      format(d,"%e %B %Y")
      
      [1] " 9 November 1990"
      In [6]:
      format(d,"%b %d, %y")
      
      [1] "Nov 09, 90"
      In [7]:
      format(d,"%Y-%m-%d")
      
      [1] "1990-11-09"
      In [8]:
      as.Date("11/09/90", format="%m/%d/%y")
      
      [1] "1990-11-09"
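
      The conversion specifications shown above work in both directions. The following sketch is not part of the original script; the variable names s and d.back are chosen for illustration. It formats a date with numeric fields only (month names produced by %B or %b depend on the locale) and parses the result back with the same specification.

      ```r
      # Format a date with numeric fields only (independent of the locale)
      d <- as.Date("1990-11-09")
      s <- format(d, "%d.%m.%Y")
      s
      # Parse the string back using the same format specification
      d.back <- as.Date(s, format = "%d.%m.%Y")
      d.back == d
      # A string that does not match the format yields NA instead of an error
      as.Date("1990-11-09", format = "%d.%m.%Y")
      ```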
  • Date arithmetic

    • Script file: date-arithmetic.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      # R knows the lengths of months, e.g. that March has 31 days:
      d0 <- as.Date("1968-03-01")
      d0 + 31
      
      [1] "1968-04-01"
      In [3]:
      # R also knows that 1968 was a leap year,
      d1 <- as.Date("1968-02-28")
      d1 + 1
      
      [1] "1968-02-29"
      In [4]:
      # that 1900 was not a leap year,
      d2 <- as.Date("1900-02-28")
      d2 + 1
      
      [1] "1900-03-01"
      In [5]:
      # that 2000 was a leap year,
      d3 <- as.Date("2000-02-28")
      d3 + 1
      
      [1] "2000-02-29"
      In [6]:
      # and that leap years are 366 days long
      d3 + 366
      
      [1] "2001-02-28"
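
      Because R knows the calendar, sequences of dates can also be generated in calendar steps rather than in fixed numbers of days. The following lines are an illustrative addition, not part of the original script; they use seq() with by = "month" and recover the month lengths of a leap year with diff().

      ```r
      # First days of January to April in the leap year 2020
      firsts <- seq(as.Date("2020-01-01"), by = "month", length.out = 4)
      firsts
      # The differences between consecutive dates are the lengths of
      # January, February and March in days
      diff(firsts)
      ```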
  • POSIXct time objects

    • Script file: POSIXct-time-objects.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      as.POSIXct(7200,origin="1970-01-01")
      
      [1] "1970-01-01 03:00:00 CET"
      In [3]:
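      # The time zone can be set when the object is created ...
      t0 <- as.POSIXct(7200,origin="1970-01-01",tz="GMT")
      # ... or afterwards, by changing the "tzone" attribute:
      t0 <- as.POSIXct(7200,origin="1970-01-01")
      attr(t0,"tzone") <- "GMT"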
      
      In [4]:
      as.POSIXct(c("97/11/12 12:45","98/01/23 14:20"),
                 format="%y/%m/%d %H:%M",tz="GMT")
      
      [1] "1997-11-12 12:45:00 GMT" "1998-01-23 14:20:00 GMT"
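
      It may help to see what a "POSIXct" object contains. The sketch below is an addition to the original script: it shows that the object is a number of seconds since the start of 1970 (UTC) and that the time zone only governs how this number is displayed. The object t1 and the choice of "Europe/Berlin" are illustrative assumptions.

      ```r
      t0 <- as.POSIXct(7200, origin = "1970-01-01", tz = "GMT")
      # Internally, the object is the number of seconds since 1970-01-01 00:00 UTC
      as.numeric(t0)
      # The time zone is used when the time point is formatted
      format(t0, "%Y-%m-%d %H:%M:%S", tz = "UTC")
      # Changing the "tzone" attribute leaves the underlying number unchanged
      t1 <- t0
      attr(t1, "tzone") <- "Europe/Berlin"
      as.numeric(t1)
      ```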
  • Time arithmetic

    • Script file: time-arithmetic.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      # A string in the standard format does not need a format specification
      # in order to be converted:
      t0 <- as.POSIXct("2020-02-01 00:00",tz="GMT")
      t0
      
      [1] "2020-02-01 GMT"
      In [3]:
      # Adding 3600 seconds means adding an hour:
      t0 + 3600
      
      [1] "2020-02-01 01:00:00 GMT"
      In [4]:
      # Subtracting seconds may also change the date:
      t0 - 1
      
      [1] "2020-01-31 23:59:59 GMT"
      In [5]:
      # A day is 24 times 3600 seconds
      day <- 24*3600
      t0 + day
      
      [1] "2020-02-02 GMT"
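
      Counting seconds by hand becomes error-prone for longer durations. As a hedged alternative that is not part of the original script, a duration can be given with an explicit unit via as.difftime(), and regularly spaced time points can be generated with seq(). The name steps is chosen for illustration.

      ```r
      t0 <- as.POSIXct("2020-02-01 00:00", tz = "GMT")
      # Stating the unit explicitly avoids converting to seconds by hand
      t0 + as.difftime(90, units = "mins")
      # A sequence of time points in steps of six hours
      steps <- seq(t0, by = "6 hours", length.out = 4)
      steps
      ```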
  • POSIXlt time objects

    • Script file: POSIXlt-time-objects.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      t0 <- as.POSIXlt(0,origin="2020-02-01",tz="GMT")
      
      In [3]:
      (t1 <- as.POSIXlt(t0 + 3630))
      
      [1] "2020-02-01 01:00:30 GMT"
      In [4]:
      # Get the seconds component of the time point
      t1$sec
      
      [1] 30
      In [5]:
      # Get the minutes component of the time point
      t1$min
      
      [1] 0
      In [6]:
      # Get the hours component
      t1$hour
      
      [1] 1
      In [7]:
      # Get the day of the month
      t1$mday
      
      [1] 1
      In [8]:
      # Get the (numeric) month; months are counted from 0, so 1 is February
      t1$mon
      
      [1] 1
      In [9]:
      # Get the (numeric) year; years are counted from 1900
      t1$year
      
      [1] 120
      In [10]:
      # Get the (numeric) day of the week; 0 is Sunday, so 6 is Saturday
      t1$wday
      
      [1] 6
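
      The components of a "POSIXlt" object use offsets that are easy to overlook. The following lines are an illustrative addition, not part of the original script, that convert the components into the usual calendar values.

      ```r
      t1 <- as.POSIXlt("2020-02-01 01:00:30", tz = "GMT")
      # Months are counted from 0 (January), so add 1 for the calendar month
      t1$mon + 1
      # Years are counted from 1900
      t1$year + 1900
      # Days of the week are counted from 0 (Sunday), so 6 is a Saturday
      t1$wday
      # Days of the year are also counted from 0, so 1 February is day 31
      t1$yday
      ```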
  • Creation of date and time data for given years, months, and days

    • Script file: ISOdate.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      # Here we create the first days of all months in the year 2000:
      # By default the time is noon
      ISOdate(2000,1:12,1)
      
       [1] "2000-01-01 12:00:00 GMT" "2000-02-01 12:00:00 GMT"
       [3] "2000-03-01 12:00:00 GMT" "2000-04-01 12:00:00 GMT"
       [5] "2000-05-01 12:00:00 GMT" "2000-06-01 12:00:00 GMT"
       [7] "2000-07-01 12:00:00 GMT" "2000-08-01 12:00:00 GMT"
       [9] "2000-09-01 12:00:00 GMT" "2000-10-01 12:00:00 GMT"
      [11] "2000-11-01 12:00:00 GMT" "2000-12-01 12:00:00 GMT"
      In [3]:
      # To get the start of the day we have to set the hour to midnight:
      ISOdate(2000,1:12,1,hour=0)
      
       [1] "2000-01-01 GMT" "2000-02-01 GMT" "2000-03-01 GMT" "2000-04-01 GMT"
       [5] "2000-05-01 GMT" "2000-06-01 GMT" "2000-07-01 GMT" "2000-08-01 GMT"
       [9] "2000-09-01 GMT" "2000-10-01 GMT" "2000-11-01 GMT" "2000-12-01 GMT"
      In [4]:
      # We can of course also create a sequence of days:
      ISOdate(2000,2,1:29,hour=0)
      
       [1] "2000-02-01 GMT" "2000-02-02 GMT" "2000-02-03 GMT" "2000-02-04 GMT"
       [5] "2000-02-05 GMT" "2000-02-06 GMT" "2000-02-07 GMT" "2000-02-08 GMT"
       [9] "2000-02-09 GMT" "2000-02-10 GMT" "2000-02-11 GMT" "2000-02-12 GMT"
      [13] "2000-02-13 GMT" "2000-02-14 GMT" "2000-02-15 GMT" "2000-02-16 GMT"
      [17] "2000-02-17 GMT" "2000-02-18 GMT" "2000-02-19 GMT" "2000-02-20 GMT"
      [21] "2000-02-21 GMT" "2000-02-22 GMT" "2000-02-23 GMT" "2000-02-24 GMT"
      [25] "2000-02-25 GMT" "2000-02-26 GMT" "2000-02-27 GMT" "2000-02-28 GMT"
      [29] "2000-02-29 GMT"
      In [5]:
      # 'Impossible' dates result in NA:
      ISOdate(2000,2,29:31,hour=0)
      
      [1] "2000-02-29 GMT" NA               NA              
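
      The fact that impossible dates result in NA can be used to check whether a combination of year, month and day exists. The helper is.valid.date() below is hypothetical and not part of the original script; it is a minimal sketch built on ISOdate().

      ```r
      # Hypothetical helper: TRUE if year, month and day form a valid calendar date
      is.valid.date <- function(year, month, day) !is.na(ISOdate(year, month, day))
      # 1900 was not a leap year, 2000 was
      is.valid.date(c(1900, 2000), 2, 29)
      ```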
  • Time differences

    • Script file: time-differences.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      # It does not matter whether we have "POSIXct" or "POSIXlt" objects,
      # we can always obtain differences between the times
      t0 <- as.POSIXlt(0,origin="2020-02-01",tz="GMT")
      t1 <- as.POSIXct(0,origin="2020-02-01 3:00",tz="GMT")
      t2 <- as.POSIXlt(0,origin="2020-02-01 3:45",tz="GMT")
      t3 <- as.POSIXct(0,origin="2020-02-01 3:45:06",tz="GMT")
      
      In [3]:
      # The unit of measurement for time differences is selected
      # automatically. Usually it is the largest sensible unit:
      t1 - t0
      
      Time difference of 3 hours
      In [4]:
      t2 - t1
      
      Time difference of 45 mins
      In [5]:
      t3 - t2
      
      Time difference of 6 secs
      In [6]:
      t3 - t0
      
      Time difference of 3.751667 hours
      In [7]:
      # The last difference is in hours and fractions of hours. It might be more
      # sensible to have seconds as the unit of measurement.
      diff.t <- t3 - t0
      units(diff.t) <- "secs"
      diff.t
      
      Time difference of 13506 secs
      In [8]:
      # It is also possible to compute differences between dates:
      d0 <- as.Date("2020-01-31")
      d1 <- as.Date("2020-02-28")
      d2 <- as.Date("2020-03-31")
      
      In [9]:
      # Usually the difference is in days:
      d1 - d0
      
      Time difference of 28 days
      In [10]:
      d2 - d0
      
      Time difference of 60 days
      In [11]:
      # We may also want to see the difference in hours:
      diff.d <- d1 - d0
      units(diff.d) <- "hours"
      diff.d
      
      Time difference of 672 hours
      In [12]:
      # It is also possible to create time durations from scratch
      # From strings:
      as.difftime("0:30:00")
      
      Time difference of 30 mins
      In [13]:
      # and from numbers; here it is necessary to specify the unit of measurement
      as.difftime(30, units="mins")
      
      Time difference of 30 mins
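
      Instead of changing the unit afterwards, it can be requested when the difference is computed. The sketch below is an addition to the original script; it uses difftime() with an explicit units argument and converts a time difference into a plain number.

      ```r
      t0 <- as.POSIXct("2020-02-01 00:00:00", tz = "GMT")
      t3 <- as.POSIXct("2020-02-01 03:45:06", tz = "GMT")
      # Request the unit when the difference is computed
      difftime(t3, t0, units = "mins")
      # Convert to a plain number in a given unit, e.g. for further calculations
      as.numeric(t3 - t0, units = "secs")
      ```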

Time Series

Regular time series

  • Approval of US presidents

    • Script file: presidents-timeseries.R
    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      

      The following line is not strictly necessary; it is used here only to indicate that presidents is a pre-installed example data set.

      In [2]:
      data(presidents)
      

      The data set contains quarterly data about presidents' popularity. The function tsp() returns the time series properties: the starting point, the end point, and the frequency, i.e. the number of times per year that popularity is measured.

      In [3]:
      tsp(presidents)
      
      [1] 1945.00 1974.75    4.00

      With the functions start(), end() and frequency() we can obtain the respective time series properties.

      In [4]:
      start(presidents)
      
      [1] 1945    1
      In [5]:
      end(presidents)
      
      [1] 1974    4
      In [6]:
      frequency(presidents)
      
      [1] 4
      In [7]:
      presidents[1:12]
      
       [1] NA 87 82 75 63 50 43 32 35 60 54 55
      In [8]:
      window(presidents,
             start=1945,
             end=c(1947,4))
      
           Qtr1 Qtr2 Qtr3 Qtr4
      1945 NA   87   82   75  
      1946 63   50   43   32  
      1947 35   60   54   55  
      In [9]:
      nixon <- window(presidents,
                      start=1969,
                      end=c(1974,2))
      nixon
      
           Qtr1 Qtr2 Qtr3 Qtr4
      1969 59   65   65   56  
      1970 66   53   61   52  
      1971 51   48   54   49  
      1972 49   61   NA   NA  
      1973 68   44   40   27  
      1974 28   25            
      In [10]:
      plot(nixon)
      
      In [11]:
      time(nixon)
      
           Qtr1    Qtr2    Qtr3    Qtr4   
      1969 1969.00 1969.25 1969.50 1969.75
      1970 1970.00 1970.25 1970.50 1970.75
      1971 1971.00 1971.25 1971.50 1971.75
      1972 1972.00 1972.25 1972.50 1972.75
      1973 1973.00 1973.25 1973.50 1973.75
      1974 1974.00 1974.25                
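
      A regular time series with the same structure can also be built from scratch. The following sketch uses artificial numbers that are not part of the original script; it creates a quarterly series with ts(), inspects it with tsp() and cycle(), and aggregates it to annual means.

      ```r
      # Eight artificial quarterly observations starting in the first quarter of 2000
      x <- ts(c(3, 5, 4, 6, 7, 9, 8, 10), start = c(2000, 1), frequency = 4)
      tsp(x)
      # cycle() gives the position of each observation within its year
      cycle(x)
      # Annual means, obtained by aggregating the four quarters of each year
      x.annual <- aggregate(x, nfrequency = 1, FUN = mean)
      x.annual
      ```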
  • OECD unemployment data

    • Script file: OECD-unemployment.R

      Data set used in the script: unemployment.csv, which was originally downloaded from https://data.oecd.org

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      unemployment <- read.csv("unemployment.csv")
      
      In [3]:
      unemployment.ts <- ts(unemployment[2:5],
                            start = 1970)
      
      In [4]:
      plot(unemployment.ts)
      
      In [5]:
      window(unemployment.ts,
             start=1980,
             end=1989)
      
           Germany France Italy  Netherlands
      1980 3.190    6.246  5.574  4.015     
      1981 4.505    7.396  6.269  5.818     
      1982 6.441    8.041  6.918  8.519     
      1983 7.921    8.253  7.694 10.987     
      1984 7.932    9.660  8.504 10.604     
      1985 8.002   10.234  8.611  9.191     
      1986 7.661   10.373  9.896  8.394     
      1987 7.611   10.479 10.248  7.982     
      1988 7.598    9.975 10.451  7.785     
      1989 6.863    9.348 10.214  6.917     
      In [6]:
      delta.unemployment.ts <- diff(unemployment.ts)
      
      In [7]:
      plot(delta.unemployment.ts)
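
      What diff() does to a time series is easier to see with a short series. The numbers below are artificial stand-ins and the names u and du are chosen for illustration; the sketch is not part of the original script.

      ```r
      # Artificial annual series standing in for one column of the data
      u <- ts(c(3.2, 4.5, 6.4, 7.9), start = 1980)
      # diff() subtracts the previous value from each observation; the result
      # starts one year later than the original series
      du <- diff(u)
      du
      start(du)
      # The original values can be rebuilt from the first value and the changes
      cumsum(c(u[1], du))
      ```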
      
  • Artificial time series data

    • Script file: time-arithmetic.R

      Data set used in the script: unemployment.csv

Irregular time series and the zoo package

  • Creating a “zoo” object from the presidents time series

    • Script file: creating-zoo-objects-presidents.R

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      npresidents <- as.numeric(presidents)
      library(zoo)
      
      Attaching package: ‘zoo’
      
      
      The following objects are masked from ‘package:base’:
      
          as.Date, as.Date.numeric
      
      
      
      In [3]:
      years <- 1945:1974
      quarters <- 1:4
      presi.times <- yearqtr(
          rep(years,each=4) +  # each year is repeated 4 times
          rep((quarters-1)/4,30) # the quarters are repeated 30 times
      )
      zpresidents <- zoo(npresidents,order.by=presi.times)
      zpresidents
      
      1945 Q1 1945 Q2 1945 Q3 1945 Q4 1946 Q1 1946 Q2 1946 Q3 1946 Q4 1947 Q1 1947 Q2 
           NA      87      82      75      63      50      43      32      35      60 
      1947 Q3 1947 Q4 1948 Q1 1948 Q2 1948 Q3 1948 Q4 1949 Q1 1949 Q2 1949 Q3 1949 Q4 
           54      55      36      39      NA      NA      69      57      57      51 
      1950 Q1 1950 Q2 1950 Q3 1950 Q4 1951 Q1 1951 Q2 1951 Q3 1951 Q4 1952 Q1 1952 Q2 
           45      37      46      39      36      24      32      23      25      32 
      1952 Q3 1952 Q4 1953 Q1 1953 Q2 1953 Q3 1953 Q4 1954 Q1 1954 Q2 1954 Q3 1954 Q4 
           NA      32      59      74      75      60      71      61      71      57 
      1955 Q1 1955 Q2 1955 Q3 1955 Q4 1956 Q1 1956 Q2 1956 Q3 1956 Q4 1957 Q1 1957 Q2 
           71      68      79      73      76      71      67      75      79      62 
      1957 Q3 1957 Q4 1958 Q1 1958 Q2 1958 Q3 1958 Q4 1959 Q1 1959 Q2 1959 Q3 1959 Q4 
           63      57      60      49      48      52      57      62      61      66 
      1960 Q1 1960 Q2 1960 Q3 1960 Q4 1961 Q1 1961 Q2 1961 Q3 1961 Q4 1962 Q1 1962 Q2 
           71      62      61      57      72      83      71      78      79      71 
      1962 Q3 1962 Q4 1963 Q1 1963 Q2 1963 Q3 1963 Q4 1964 Q1 1964 Q2 1964 Q3 1964 Q4 
           62      74      76      64      62      57      80      73      69      69 
      1965 Q1 1965 Q2 1965 Q3 1965 Q4 1966 Q1 1966 Q2 1966 Q3 1966 Q4 1967 Q1 1967 Q2 
           71      64      69      62      63      46      56      44      44      52 
      1967 Q3 1967 Q4 1968 Q1 1968 Q2 1968 Q3 1968 Q4 1969 Q1 1969 Q2 1969 Q3 1969 Q4 
           38      46      36      49      35      44      59      65      65      56 
      1970 Q1 1970 Q2 1970 Q3 1970 Q4 1971 Q1 1971 Q2 1971 Q3 1971 Q4 1972 Q1 1972 Q2 
           66      53      61      52      51      48      54      49      49      61 
      1972 Q3 1972 Q4 1973 Q1 1973 Q2 1973 Q3 1973 Q4 1974 Q1 1974 Q2 1974 Q3 1974 Q4 
           NA      NA      68      44      40      27      28      25      24      24 
      In [4]:
      str(zpresidents)
      
      ‘zoo’ series from 1945 Q1 to 1974 Q4
        Data: num [1:120] NA 87 82 75 63 50 43 32 35 60 ...
        Index:  'yearqtr' num [1:120] 1945 Q1 1945 Q2 1945 Q3 1945 Q4 ...
      
      In [5]:
      coredata(zpresidents)[1:15] # To save space we only look at the
      
       [1] NA 87 82 75 63 50 43 32 35 60 54 55 36 39 NA
      In [6]:
      index(zpresidents)[1:15]    # first 15 elements.
      
       [1] "1945 Q1" "1945 Q2" "1945 Q3" "1945 Q4" "1946 Q1" "1946 Q2" "1946 Q3"
       [8] "1946 Q4" "1947 Q1" "1947 Q2" "1947 Q3" "1947 Q4" "1948 Q1" "1948 Q2"
      [15] "1948 Q3"
      In [7]:
      time(zpresidents)[1:15]
      
       [1] "1945 Q1" "1945 Q2" "1945 Q3" "1945 Q4" "1946 Q1" "1946 Q2" "1946 Q3"
       [8] "1946 Q4" "1947 Q1" "1947 Q2" "1947 Q3" "1947 Q4" "1948 Q1" "1948 Q2"
      [15] "1948 Q3"
      In [8]:
      zpresidents[1:8]
      
      1945 Q1 1945 Q2 1945 Q3 1945 Q4 1946 Q1 1946 Q2 1946 Q3 1946 Q4 
           NA      87      82      75      63      50      43      32 
      In [9]:
      # Saved for later use:
      save(zpresidents,file="zpresidents.RData")
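
      The time points do not have to be assembled by hand from years and quarters. As a hedged alternative to the construction above, and not part of the original script, the sketch below takes them from the "ts" object itself with time() and converts them with as.yearqtr().

      ```r
      library(zoo)
      # The time points of the "ts" object, converted to quarters
      presi.times <- as.yearqtr(as.numeric(time(presidents)))
      zpresidents <- zoo(as.numeric(presidents), order.by = presi.times)
      # The first and the last time point of the series
      start(zpresidents)
      end(zpresidents)
      ```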
      
  • Creating a “zoo” object from OECD unemployment data

    • Script file: creating-zoo-objects-unemployment.R

      Data set used in the script: unemployment.csv, which was originally downloaded from https://data.oecd.org

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      unemployment <- read.csv("unemployment.csv")
      library(zoo)
      
      Attaching package: ‘zoo’
      
      
      The following objects are masked from ‘package:base’:
      
          as.Date, as.Date.numeric
      
      
      
      In [3]:
      unemployment.z <- zoo(unemployment[,2:7],
                            order.by=as.Date(
                                ISOdate(year=unemployment[,1],
                                        month=12,
                                        day=31)))
      
      In [4]:
      dim(unemployment.z)
      
      [1] 30  6
      In [5]:
      class(unemployment.z)
      
      [1] "zoo"
      In [6]:
      head(unemployment.z)
      
                 Germany France Italy Netherlands Belgium Luxembourg
      1970-12-31   0.557  2.477 4.000       0.868   1.913         NA
      1971-12-31   0.689  2.712 4.001       1.213   1.848         NA
      1972-12-31   0.912  2.806 4.711       2.114   2.350         NA
      1973-12-31   1.000  2.690 4.691       2.151   2.408         NA
      1974-12-31   2.132  2.853 3.942       2.624   2.523      0.067
      1975-12-31   3.965  4.028 4.312       3.772   4.522      0.200
      In [7]:
      start(unemployment.z)
      
      [1] "1970-12-31"
      In [8]:
      end(unemployment.z)
      
      [1] "1999-12-31"
      In [9]:
      end(unemployment.z) - start(unemployment.z)
      
      Time difference of 10592 days
      In [10]:
      # Saved for later use:
      save(unemployment.z,file="unemployment-z.RData")
      
  • Subsetting “zoo” objects

    • Script file: subsetting-zoo-objects.R

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

      Data file used in the script: zpresidents.RData created by earlier script creating-zoo-objects-presidents.R

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      library(zoo)
      
      Attaching package: ‘zoo’
      
      
      The following objects are masked from ‘package:base’:
      
          as.Date, as.Date.numeric
      
      
      
      In [3]:
      as.yearqtr("1945 Q2")
      
      [1] "1945 Q2"
      In [4]:
      load("zpresidents.RData")
      
      In [5]:
      zpresidents[as.yearqtr("1945 Q2")]
      
      1945 Q2 
           87 
      In [6]:
      qtrs3 <- as.yearqtr(paste(1960:1969,"Q3"))
      zpresidents[qtrs3]
      
      1960 Q3 1961 Q3 1962 Q3 1963 Q3 1964 Q3 1965 Q3 1966 Q3 1967 Q3 1968 Q3 1969 Q3 
           61      71      62      62      69      69      56      38      35      65 
      In [7]:
      qtrs <- paste(rep(1960:1964,each=4),rep(4:1,5),sep="-")
      qtrs
      
       [1] "1960-4" "1960-3" "1960-2" "1960-1" "1961-4" "1961-3" "1961-2" "1961-1"
       [9] "1962-4" "1962-3" "1962-2" "1962-1" "1963-4" "1963-3" "1963-2" "1963-1"
      [17] "1964-4" "1964-3" "1964-2" "1964-1"
      In [8]:
      zpresidents[as.yearqtr(qtrs)]
      
      1960 Q1 1960 Q2 1960 Q3 1960 Q4 1961 Q1 1961 Q2 1961 Q3 1961 Q4 1962 Q1 1962 Q2 
           71      62      61      57      72      83      71      78      79      71 
      1962 Q3 1962 Q4 1963 Q1 1963 Q2 1963 Q3 1963 Q4 1964 Q1 1964 Q2 1964 Q3 1964 Q4 
           62      74      76      64      62      57      80      73      69      69 
      In [9]:
      load("unemployment-z.RData")
      
      In [10]:
      unemployment.z[as.Date("1997-12-31")]
      
                 Germany France  Italy Netherlands Belgium Luxembourg
      1997-12-31  11.412 12.438 12.251        5.59  12.691      3.616
      In [11]:
      window(zpresidents,
             start = as.yearqtr("1969-1"),
             end   = as.yearqtr("1974-2"))
      
      1969 Q1 1969 Q2 1969 Q3 1969 Q4 1970 Q1 1970 Q2 1970 Q3 1970 Q4 1971 Q1 1971 Q2 
           59      65      65      56      66      53      61      52      51      48 
      1971 Q3 1971 Q4 1972 Q1 1972 Q2 1972 Q3 1972 Q4 1973 Q1 1973 Q2 1973 Q3 1973 Q4 
           54      49      49      61      NA      NA      68      44      40      27 
      1974 Q1 1974 Q2 
           28      25 
      In [12]:
      window(unemployment.z,
             start = as.Date("1980-12-31"),
             end   = as.Date("1989-12-31"))
      
                 Germany France  Italy Netherlands Belgium Luxembourg
      1980-12-31   3.190  6.246  5.574       4.015   8.029      0.721
      1981-12-31   4.505  7.396  6.269       5.818  10.279      1.042
      1982-12-31   6.441  8.041  6.918       8.519  12.030      1.302
      1983-12-31   7.921  8.253  7.694      10.987  13.319      1.630
      1984-12-31   7.932  9.660  8.504      10.604  13.363      1.750
      1985-12-31   8.002 10.234  8.611       9.191  12.442      1.688
      1986-12-31   7.661 10.373  9.896       8.394  11.792      1.478
      1987-12-31   7.611 10.479 10.248       7.982  11.461      1.710
      1988-12-31   7.598  9.975 10.451       7.785  10.422      1.564
      1989-12-31   6.863  9.348 10.214       6.917   9.377      1.420
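
      The series used above happen to be regularly spaced, but the same subsetting tools work for irregular series. The following sketch uses artificial data and is not part of the original script; the names days and z are chosen for illustration.

      ```r
      library(zoo)
      # Artificial measurements taken on irregularly spaced days
      days <- as.Date(c("2020-01-03", "2020-01-04", "2020-01-10", "2020-02-02"))
      z <- zoo(c(10, 12, 9, 15), order.by = days)
      # All observations from January 2020; the window limits need not be
      # time points that occur in the series
      z.jan <- window(z, start = as.Date("2020-01-01"), end = as.Date("2020-01-31"))
      z.jan
      # The same selection with a logical condition on the index
      z[index(z) < as.Date("2020-02-01")]
      ```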
  • Handling missing values

    • Script file: handling-missing-values.R

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

      Data file used in the script: zpresidents.RData created by earlier script creating-zoo-objects-presidents.R

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      library(zoo)
      load("zpresidents.RData")
      
      Attaching package: ‘zoo’
      
      
      The following objects are masked from ‘package:base’:
      
          as.Date, as.Date.numeric
      
      
      
      In [3]:
      # Leads to an error:
      presidents.o <- na.omit(presidents)
      
      Error in na.omit.ts(presidents): time series contains internal NAs
      Traceback:
      
      1. na.omit(presidents)
      2. na.omit.ts(presidents)
      3. stop("time series contains internal NAs")
      In [4]:
      zpresidents.o <- na.omit(zpresidents)
      
      In [5]:
      c("Original length" = length(zpresidents),
        "Length after dropping NAs"  = length(zpresidents.o))
      
                Original length Length after dropping NAs 
                            120                       114 
      In [6]:
      plot(zpresidents,lty=3)
      lines(na.contiguous(zpresidents),lwd=2)
      
      In [7]:
      plot(zpresidents,lwd=2)
      lines(na.approx(zpresidents),lty=2)
      lines(na.spline(zpresidents),lty=3)
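
      The effect of the different ways of filling in missing values is easier to follow with a few numbers than in a plot. The sketch below uses artificial data and is not part of the original script; na.locf() is a further zoo function that does not appear in the script above.

      ```r
      library(zoo)
      z <- zoo(c(1, NA, 3, NA, NA, 6), order.by = as.Date("2020-01-01") + 0:5)
      # Linear interpolation between the neighbouring observed values
      z.approx <- na.approx(z)
      z.approx
      # "Last observation carried forward": repeat the most recent observed value
      z.locf <- na.locf(z)
      z.locf
      ```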
      
  • Rolling statistics

    • Script file: rolling-statistics.R

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

      Data file used in the script: zpresidents.RData created by earlier script creating-zoo-objects-presidents.R

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      library(zoo)
      load("zpresidents.RData")
      
      Attaching package: ‘zoo’
      
      
      The following objects are masked from ‘package:base’:
      
          as.Date, as.Date.numeric
      
      
      
      In [3]:
      zpresidents.o <- na.omit(zpresidents)
      
      In [4]:
      zpresidents.o8 <- zpresidents.o[1:8]
      
      In [5]:
      rollmean(zpresidents.o8,k=7)
      
       1946 Q1  1946 Q2 
      61.71429 54.28571 
      In [6]:
      rollmean(zpresidents.o8,k=7,align="left")
      
       1945 Q2  1945 Q3 
      61.71429 54.28571 
      In [7]:
      rollmean(zpresidents.o8,k=7,align="right")
      
       1946 Q4  1947 Q1 
      61.71429 54.28571 
      In [8]:
      zpresidents.s <- na.spline(zpresidents)
      plot(zpresidents.s,lty=3)
      
      In [10]:
      zpresidents.m <- rollmean(zpresidents.s,k=9)
      plot(zpresidents.s,lty=3)
      lines(zpresidents.m,lwd=2)
      
      In [11]:
      zpresidents.sd <- rollapply(zpresidents.s,
                                  width=9,
                                  FUN=sd)
      
      In [12]:
      tv <- qt(.975,df=8)
      zpresidents.u <- zpresidents.m+tv*zpresidents.sd/sqrt(8)
      zpresidents.l <- zpresidents.m-tv*zpresidents.sd/sqrt(8)
      
      In [13]:
      plot(zpresidents.m,ylim=c(20,80))
      lines(zpresidents.u,lty=2)
      lines(zpresidents.l,lty=2)
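
      How the rolling window moves along a series can be checked by hand with a short example. The following lines use artificial data and are not part of the original script; they also show that rollapply() with FUN = mean gives the same result as rollmean().

      ```r
      library(zoo)
      z <- zoo(c(2, 4, 6, 8, 10), order.by = 1:5)
      # Centred rolling mean of width 3: the first window covers time points 1 to 3
      # and its mean is assigned to time point 2
      z.m <- rollmean(z, k = 3)
      z.m
      # rollapply() accepts any function; with FUN = mean it reproduces rollmean()
      z.a <- rollapply(z, width = 3, FUN = mean)
      z.a
      ```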
      
  • Time arithmetics with “zoo” objects

    • Script file: time-arithmetic.R

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

  • Merging (multivariate) time series

    • Script file: merging-timeseries.R

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

      Data file used in the script: unemployment-z.RData created by earlier script creating-zoo-objects-unemployment.R

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      library(zoo)
      load("unemployment-z.RData")
      
      Attaching package: ‘zoo’
      
      
      The following objects are masked from ‘package:base’:
      
          as.Date, as.Date.numeric
      
      
      
      In [3]:
      Netherlands <- unemployment.z[,4]
      length(Netherlands)
      
      [1] 30
      In [4]:
      Belgium <- unemployment.z[,5]
      length(Belgium)
      
      [1] 30
      In [5]:
      Luxembourg <- na.omit(unemployment.z[,6])
      length(Luxembourg)
      
      [1] 26
      In [6]:
      unemployment.benelux <- merge(Netherlands,
                                    Belgium,
                                    Luxembourg)
      head(unemployment.benelux,n=10)
      
                 Netherlands Belgium Luxembourg
      1970-12-31       0.868   1.913         NA
      1971-12-31       1.213   1.848         NA
      1972-12-31       2.114   2.350         NA
      1973-12-31       2.151   2.408         NA
      1974-12-31       2.624   2.523      0.067
      1975-12-31       3.772   4.522      0.200
      1976-12-31       4.067   5.934      0.332
      1977-12-31       3.916   6.745      0.531
      1978-12-31       3.827   7.321      0.797
      1979-12-31       3.648   7.581      0.725
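
      In the merged series above, the years in which Luxembourg has no data are filled with NA. If only the time points present in all series are wanted, merge() can be called with all = FALSE. The sketch below uses artificial data and is not part of the original script.

      ```r
      library(zoo)
      a <- zoo(1:3, order.by = as.Date("2020-01-01") + 0:2)
      b <- zoo(11:13, order.by = as.Date("2020-01-02") + 0:2)
      # By default all time points are kept and gaps are filled with NA
      m.all <- merge(a, b)
      m.all
      # With all = FALSE only the time points present in both series are kept
      m.common <- merge(a, b, all = FALSE)
      m.common
      ```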
  • Importing data into “zoo” objects

    • Script file: importing-zoo-objects.R

      The script makes use of the zoo package, which is available from https://cran.r-project.org/package=zoo

      Data file used in the script: unemployment-z.RData created by earlier script creating-zoo-objects-unemployment.R

    • Interactive notebook:

      In [1]:
      options(jupyter.rich_display=FALSE) # Create output as usual in R
      
      In [2]:
      library(zoo)
      
      Attaching package: ‘zoo’
      
      
      The following objects are masked from ‘package:base’:
      
          as.Date, as.Date.numeric
      
      
      
      In [3]:
      unemployment_z <- read.csv.zoo("unemployment.csv")
      str(unemployment_z)
      
      ‘zoo’ series from 1970 to 1999
        Data: num [1:30, 1:29] 0.557 0.689 0.912 1 2.132 ...
       - attr(*, "dimnames")=List of 2
        ..$ : NULL
        ..$ : chr [1:29] "Germany" "France" "Italy" "Netherlands" ...
        Index:  int [1:30] 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 ...
      
      In [4]:
      Text <- "2012/1/6 20
      2012/1/7 30
      2012/1/8 40
      "
      read.zoo(text=Text)
      
      2012-01-06 2012-01-07 2012-01-08 
              20         30         40 
      In [5]:
      read.zoo(text=Text,format="%Y/%m/%d")
      
      2012-01-06 2012-01-07 2012-01-08 
              20         30         40 
      In [6]:
      Text <- "date,time,x,y
      2011-05-08,22:45:21,4,41
      2011-05-08,22:45:22,5,42
      2011-05-08,22:45:23,5,42
      2011-05-08,22:45:24,6,43
      "
      zobj <- read.csv.zoo(text=Text,
                           index.column=1:2)
      zobj
      
                          x  y
      2011-05-08 22:45:21 4 41
      2011-05-08 22:45:22 5 42
      2011-05-08 22:45:23 5 42
      2011-05-08 22:45:24 6 43
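
      Dates in other notations can be imported by combining the format argument with the usual arguments of read.table(). The following sketch uses artificial data with dates written as day.month.year and is not part of the original script; the text is assembled with paste() only to keep the example self-contained.

      ```r
      library(zoo)
      # Artificial data with a header line and day.month.year dates
      Text <- paste("date,x", "06.01.2012,20", "07.01.2012,30", "08.01.2012,40",
                    sep = "\n")
      z <- read.zoo(text = Text, header = TRUE, sep = ",", format = "%d.%m.%Y")
      z
      ```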