1. Introduction to Time Series Analysis

Time series analysis comprises methods for analyzing time-ordered data to extract meaningful statistics and identify characteristics of the data. A time series is a sequence of observations recorded at successive points in time, typically at uniform intervals.

Formally, a time series is a realization of a stochastic process {Yt: t ∈ T}, where T is the index set representing time. Time series analysis aims to:

  • Describe the data (trend, seasonality, and dependence structure)
  • Model the underlying data-generating process
  • Forecast future values
  • Support control and decision-making based on those forecasts

2. Time Series Components

The classical decomposition model represents a time series as a combination of systematic components:

Additive Decomposition
Yt = Tt + St + Ct + It
Multiplicative Decomposition
Yt = Tt × St × Ct × It
Where:
Tt = Trend (long-term movement)
St = Seasonal (regular periodic fluctuations)
Ct = Cyclical (fluctuations with no fixed period, typically spanning more than a year)
It = Irregular/Random (unpredictable fluctuations)

Trend Component

The trend represents the long-term progression of the series. It can be linear, polynomial, exponential, or follow other functional forms. Trend can be extracted using moving averages, regression, or filtering methods.
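
As a minimal sketch, trend extraction with a centered moving average can be done with pandas; the monthly series here is synthetic, invented for the example:

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend + annual seasonality + noise
rng = np.random.default_rng(42)
t = np.arange(120)
y = pd.Series(10 + 0.5 * t + 5 * np.sin(2 * np.pi * t / 12)
              + rng.normal(0, 2, 120))

# Centered moving average of order 12; because the window is even, a
# second 2-term average re-centers it (the classical 2x12 MA)
trend = y.rolling(12, center=True).mean().rolling(2, center=True).mean()
detrended = y - trend  # de-trended series under the additive model
```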

Seasonal Component

Seasonality refers to regular, predictable patterns that repeat over fixed periods (daily, weekly, monthly, quarterly, annually). The seasonal period s is the number of observations per cycle.

3. Stationarity

Stationarity is a fundamental concept in time series analysis. A stationary process has statistical properties that do not change over time.

Strict Stationarity

A process is strictly stationary if the joint distribution of (Yt1, ..., Ytk) is identical to that of (Yt1+h, ..., Ytk+h) for all time points t1, ..., tk, all k, and all shifts h.

Weak (Covariance) Stationarity

A process is weakly stationary if:

  1. E[Yt] = μ (constant mean)
  2. Var(Yt) = σ² < ∞ (constant, finite variance)
  3. Cov(Yt, Yt+h) = γ(h) (autocovariance depends only on lag h)

Autocorrelation Function (ACF)
ρ(h) = γ(h) / γ(0) = Cov(Yt, Yt+h) / Var(Yt)
The ACF measures the correlation between observations separated by h time periods. For stationary processes, -1 ≤ ρ(h) ≤ 1.
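
A minimal NumPy sketch of the sample ACF computed directly from this formula, applied to a synthetic white-noise series:

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample ACF: lag-h autocovariance divided by the lag-0 autocovariance."""
    y = np.asarray(y, dtype=float)
    n, mean = len(y), y.mean()
    gamma0 = np.sum((y - mean) ** 2) / n  # gamma(0), the sample variance
    return np.array([np.sum((y[:n - h] - mean) * (y[h:] - mean)) / (n * gamma0)
                     for h in range(max_lag + 1)])

rng = np.random.default_rng(0)
print(sample_acf(rng.normal(size=500), 5).round(3))  # ~0 beyond lag 0
```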

Testing for Stationarity

Common tests include the Augmented Dickey-Fuller (ADF) test, whose null hypothesis is a unit root (non-stationarity), and the KPSS test, whose null hypothesis is stationarity. Applying both provides a cross-check, since each test alone has limited power.
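
A short sketch using the adfuller and kpss functions from statsmodels on a simulated random walk (the data are synthetic, invented for the example):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

rng = np.random.default_rng(1)
random_walk = np.cumsum(rng.normal(size=500))  # non-stationary by construction

adf_stat, adf_p, *_ = adfuller(random_walk)
kpss_stat, kpss_p, *_ = kpss(random_walk, regression="c", nlags="auto")
print(f"ADF p-value:  {adf_p:.3f}")   # large p: cannot reject unit root
print(f"KPSS p-value: {kpss_p:.3f}")  # small p: reject stationarity
```
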
Differencing to Achieve Stationarity

Non-stationary series can often be made stationary through differencing:

Differencing Operators
First difference: ∇Yt = Yt - Yt-1

d-th difference: ∇^d Yt = (1 - B)^d Yt
Where B is the backshift operator: BYt = Yt-1
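
In pandas, these operators map directly onto Series.diff; a brief sketch on a synthetic random walk:

```python
import numpy as np
import pandas as pd

y = pd.Series(np.cumsum(np.random.default_rng(2).normal(size=200)))  # random walk

first_diff = y.diff()          # first difference: Yt - Yt-1 (first value is NaN)
second_diff = y.diff().diff()  # second difference: (1 - B)^2 Yt
seasonal_diff = y.diff(12)     # seasonal difference: Yt - Yt-12
```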

4. Time Series Decomposition

Classical Decomposition

The classical method uses moving averages to estimate trend, then extracts seasonal indices:

  1. Estimate trend using centered moving average of order s
  2. De-trend: Compute Yt - Tt (additive) or Yt/Tt (multiplicative)
  3. Average de-trended values by season to get seasonal indices
  4. Normalize seasonal indices
  5. Remainder = Original - Trend - Seasonal (or Original / (Trend × Seasonal) in the multiplicative case)
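
statsmodels implements this classical procedure as seasonal_decompose; a sketch on a synthetic monthly series with trend and annual seasonality:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly series with trend and annual seasonality
idx = pd.date_range("2015-01", periods=96, freq="MS")
t = np.arange(96)
y = pd.Series(20 + 0.3 * t + 8 * np.sin(2 * np.pi * t / 12)
              + np.random.default_rng(3).normal(0, 1.5, 96), index=idx)

result = seasonal_decompose(y, model="additive", period=12)
# result.trend, result.seasonal, result.resid correspond to steps 1-5 above
print(result.seasonal.iloc[:12].round(2))  # one full cycle of seasonal indices
```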

STL Decomposition

Seasonal and Trend decomposition using Loess (STL) is a robust method that:

  • Handles seasonal periods of any length
  • Allows the seasonal component to change over time
  • Gives the user control over the smoothness of the trend
  • Can be made robust to outliers, so occasional unusual observations do not distort the trend and seasonal estimates
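
A sketch using the STL class from statsmodels on the same kind of synthetic series as above; robust=True enables the outlier-resistant fitting just mentioned:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

idx = pd.date_range("2015-01", periods=96, freq="MS")
t = np.arange(96)
y = pd.Series(20 + 0.3 * t + 8 * np.sin(2 * np.pi * t / 12)
              + np.random.default_rng(4).normal(0, 1.5, 96), index=idx)

stl = STL(y, period=12, robust=True).fit()  # robust=True downweights outliers
components = pd.DataFrame({"trend": stl.trend, "seasonal": stl.seasonal,
                           "remainder": stl.resid})
print(components.head(3).round(2))
```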

5. Exponential Smoothing Methods

Simple Exponential Smoothing (SES)

For series with no trend or seasonality:

Simple Exponential Smoothing
St = αYt + (1 - α)St-1
Where 0 < α ≤ 1 is the smoothing parameter. Larger α gives more weight to recent observations.
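
A direct implementation of this recursion, assuming the common initialization S0 = Y0 (an initialization choice, not part of the formula):

```python
import numpy as np

def ses(y, alpha):
    """Simple exponential smoothing: St = alpha*Yt + (1 - alpha)*St-1."""
    s = np.empty(len(y))
    s[0] = y[0]  # initialize with the first observation
    for t in range(1, len(y)):
        s[t] = alpha * y[t] + (1 - alpha) * s[t - 1]
    return s

y = np.array([3.0, 5.0, 9.0, 20.0, 12.0, 17.0])
print(ses(y, alpha=0.3).round(2))  # the h-step forecast is the last smoothed value
```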

Holt's Linear Method

Extends SES to capture linear trend:

Holt's Method Equations
Level: lt = αYt + (1 - α)(lt-1 + bt-1)

Trend: bt = β(lt - lt-1) + (1 - β)bt-1

Forecast: Ft+h = lt + hbt

Holt-Winters Method

Extends Holt's method to include seasonality:

Holt-Winters Additive
Level: lt = α(Yt - st-m) + (1 - α)(lt-1 + bt-1)

Trend: bt = β(lt - lt-1) + (1 - β)bt-1

Seasonal: st = γ(Yt - lt) + (1 - γ)st-m

Forecast: Ft+h = lt + hbt + st+h-m (for h ≤ m; beyond that, the most recent seasonal indices are reused)
Where m is the seasonal period and γ is the seasonal smoothing parameter.
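
A sketch using ExponentialSmoothing from statsmodels on a synthetic monthly series; dropping the seasonal arguments would give Holt's linear method from the previous subsection:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

idx = pd.date_range("2018-01", periods=72, freq="MS")
t = np.arange(72)
y = pd.Series(50 + 0.4 * t + 10 * np.sin(2 * np.pi * t / 12)
              + np.random.default_rng(5).normal(0, 2, 72), index=idx)

# trend="add" alone gives Holt's linear method; the seasonal arguments
# extend it to additive Holt-Winters with period m = 12
fit = ExponentialSmoothing(y, trend="add", seasonal="add",
                           seasonal_periods=12).fit()
print(fit.forecast(12).round(1))  # Ft+h = lt + h*bt + seasonal index
```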

6. Autoregressive (AR) Models

An AR(p) model expresses the current value as a linear combination of p past values plus white noise:

AR(p) Model
Yt = c + φ1Yt-1 + φ2Yt-2 + ... + φpYt-p + εt
Where φi are the autoregressive coefficients, c is a constant, and εt ~ WN(0, σ²) is white noise.

AR(1) Properties

For AR(1), Yt = c + φ1Yt-1 + εt. The process is stationary when |φ1| < 1, with mean μ = c/(1 - φ1), variance γ(0) = σ²/(1 - φ1²), and autocorrelations ρ(h) = φ1^h, which decay exponentially in h.

Partial Autocorrelation Function (PACF)

The PACF measures the correlation between Yt and Yt-h after removing the linear effect of intermediate lags. For AR(p), the PACF cuts off after lag p.
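
A quick simulation check on a synthetic AR(1) with φ1 = 0.7: the estimated PACF should be large at lag 1 and near zero afterwards:

```python
import numpy as np
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(6)
n, phi = 1000, 0.7
y = np.zeros(n)
for t in range(1, n):          # AR(1): Yt = 0.7*Yt-1 + eps_t
    y[t] = phi * y[t - 1] + rng.normal()

print(pacf(y, nlags=5).round(2))  # cuts off after lag p = 1
```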

7. Moving Average (MA) Models

An MA(q) model expresses the current value as a linear combination of current and past white noise terms:

MA(q) Model
Yt = μ + εt + θ1εt-1 + θ2εt-2 + ... + θqεt-q
Where θi are the moving average coefficients.
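
A simulation sketch of a synthetic MA(1) with θ1 = 0.6: the sample ACF should cut off after lag 1:

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(7)
theta = 0.6
eps = rng.normal(size=1001)
y = eps[1:] + theta * eps[:-1]    # MA(1): Yt = eps_t + 0.6*eps_{t-1}

print(acf(y, nlags=5).round(2))   # ~0.44 at lag 1, near zero afterwards
```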

MA(1) Properties

An MA(1) process is stationary for any θ1, with variance γ(0) = σ²(1 + θ1²), ρ(1) = θ1/(1 + θ1²), and ρ(h) = 0 for h > 1. It is invertible when |θ1| < 1, in which case it has an equivalent AR(∞) representation.

8. ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average) combines AR and MA components with differencing to handle non-stationary data.

ARIMA(p, d, q) Model
φ(B)(1 - B)^d Yt = c + θ(B)εt
Where:
p = order of AR component
d = degree of differencing
q = order of MA component
φ(B) = 1 - φ1B - ... - φpB^p
θ(B) = 1 + θ1B + ... + θqB^q

Box-Jenkins Methodology

  1. Identification: Use ACF, PACF, and stationarity tests to determine p, d, q
  2. Estimation: Estimate parameters using maximum likelihood
  3. Diagnostic Checking: Verify residuals are white noise
  4. Forecasting: Generate predictions with confidence intervals
Model      | ACF Pattern                  | PACF Pattern
AR(p)      | Exponential/sinusoidal decay | Cuts off after lag p
MA(q)      | Cuts off after lag q         | Exponential/sinusoidal decay
ARMA(p,q)  | Decay after lag q            | Decay after lag p
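
A compact sketch of steps 1-4 using the ARIMA class from statsmodels, on a synthetic series constructed to behave roughly like ARIMA(1,1,0):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic series: cumulative sum of AR(1) steps, i.e. roughly ARIMA(1,1,0)
rng = np.random.default_rng(8)
steps = np.zeros(300)
for t in range(1, 300):
    steps[t] = 0.5 * steps[t - 1] + rng.normal()
y = pd.Series(np.cumsum(steps))

res = ARIMA(y, order=(1, 1, 0)).fit()  # estimation by maximum likelihood
print(res.summary().tables[1])         # parameter estimates (step 2)
print("AIC:", round(res.aic, 1))       # basis for model comparison
```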

9. Seasonal ARIMA (SARIMA)

SARIMA extends ARIMA to handle seasonality by including seasonal AR, MA, and differencing terms:

SARIMA(p,d,q)(P,D,Q)m
φ(B)Φ(B^m)(1 - B)^d(1 - B^m)^D Yt = c + θ(B)Θ(B^m)εt
Where:
(p,d,q) = non-seasonal orders
(P,D,Q) = seasonal orders
m = seasonal period
Φ, Θ = seasonal AR and MA polynomials

Example: Monthly Sales Data

For monthly data with annual seasonality (m=12), a SARIMA(1,1,1)(1,1,1)12 model might be appropriate if:

  • First differencing removes trend
  • Seasonal differencing removes annual pattern
  • ACF/PACF show significant spikes at lags 1 and 12
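
A sketch of fitting this specification with SARIMAX from statsmodels on synthetic monthly sales data (the series and seed are invented for the example):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

idx = pd.date_range("2012-01", periods=120, freq="MS")
t = np.arange(120)
sales = pd.Series(100 + 0.8 * t + 15 * np.sin(2 * np.pi * t / 12)
                  + np.random.default_rng(9).normal(0, 3, 120), index=idx)

model = SARIMAX(sales, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
res = model.fit(disp=False)
print(res.forecast(steps=12).round(1))  # forecasts for the next year
```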

10. Model Diagnostics

Residual Analysis

A well-specified model should have residuals that are:

  • Uncorrelated (resembling white noise)
  • Zero-mean with constant variance
  • Approximately normally distributed (required for valid prediction intervals)

Ljung-Box Test

Ljung-Box Q Statistic
Q = n(n + 2) ∑_{k=1}^{h} rk² / (n - k)
Tests whether the residual autocorrelations as a group differ significantly from zero, where rk is the residual autocorrelation at lag k and n is the sample size. Under H0 (no autocorrelation), Q ~ χ²(h - p - q).
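
statsmodels exposes this test as acorr_ljungbox; a sketch on simulated stand-in residuals (recent versions return a DataFrame, an assumption worth checking against your installed version):

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(10)
resid = rng.normal(size=400)  # stand-in for the residuals of a fitted model

lb = acorr_ljungbox(resid, lags=[10, 20])  # DataFrame: lb_stat, lb_pvalue
print(lb)  # large p-values: no evidence of residual autocorrelation
```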

Information Criteria

For model selection, the two most common criteria are:

AIC = -2 ln(L) + 2k
BIC = -2 ln(L) + k ln(n)
Where L is the maximized likelihood, k is the number of estimated parameters, and n is the sample size. The model with the smallest criterion value is preferred; BIC penalizes complexity more heavily than AIC.
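
A minimal order-selection sketch: fit a small grid of ARIMA(p, 1, q) models to a synthetic series and keep the lowest AIC (convergence warnings on some orders are harmless here):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

y = pd.Series(np.cumsum(np.random.default_rng(11).normal(size=300)))

# Fit a small grid of ARIMA(p, 1, q) models and keep the lowest AIC
results = {(p, q): ARIMA(y, order=(p, 1, q)).fit().aic
           for p in range(3) for q in range(3)}
best_order = min(results, key=results.get)
print("best (p, q):", best_order, "AIC:", round(results[best_order], 1))
```
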
11. Forecasting

Point Forecasts

The minimum mean squared error forecast is the conditional expectation:

Ŷt+h|t = E[Yt+h | Yt, Yt-1, ...]

Prediction Intervals

For Gaussian models, the (1-α) prediction interval is:

Ŷt+h|t ± zα/2 σh
Where σh is the forecast standard error at horizon h, which increases with h.
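
With a fitted statsmodels ARIMA model, get_forecast returns both point forecasts and interval bounds; a sketch on synthetic data:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

y = pd.Series(np.cumsum(np.random.default_rng(12).normal(size=300)))
res = ARIMA(y, order=(1, 1, 0)).fit()

fc = res.get_forecast(steps=10)
out = pd.concat([fc.predicted_mean, fc.conf_int(alpha=0.05)], axis=1)
print(out.round(2))  # interval width grows with the horizon h
```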

Forecast Accuracy Measures

Accuracy is best judged on a hold-out sample. Common measures, with forecast error et = Yt - Ŷt, include:

  • MAE = mean(|et|) (mean absolute error)
  • RMSE = √(mean(et²)) (root mean squared error)
  • MAPE = mean(|et / Yt|) × 100% (undefined when Yt = 0)
  • MASE = MAE scaled by the in-sample MAE of a naive forecast
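
Minimal NumPy implementations of the first three measures, with invented actual and forecast values:

```python
import numpy as np

def mae(y, yhat):  return np.mean(np.abs(y - yhat))
def rmse(y, yhat): return np.sqrt(np.mean((y - yhat) ** 2))
def mape(y, yhat): return np.mean(np.abs((y - yhat) / y)) * 100  # requires y != 0

y    = np.array([112.0, 118.0, 132.0, 129.0])   # invented actuals
yhat = np.array([110.0, 120.0, 128.0, 131.0])   # invented forecasts
print(mae(y, yhat), round(rmse(y, yhat), 2), round(mape(y, yhat), 2))
```
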
References and Further Reading

  1. Box, G.E.P., Jenkins, G.M., Reinsel, G.C., & Ljung, G.M. (2015). Time Series Analysis: Forecasting and Control, 5th Edition. Wiley.
  2. Hyndman, R.J. & Athanasopoulos, G. (2021). Forecasting: Principles and Practice, 3rd Edition. OTexts.
  3. Hamilton, J.D. (1994). Time Series Analysis. Princeton University Press.
  4. Brockwell, P.J. & Davis, R.A. (2016). Introduction to Time Series and Forecasting, 3rd Edition. Springer.
  5. Shumway, R.H. & Stoffer, D.S. (2017). Time Series Analysis and Its Applications, 4th Edition. Springer.