1. Introduction to Time Series Analysis

Time series analysis comprises methods for analyzing time-ordered data to extract meaningful statistics and identify characteristics of the data. A time series is a sequence of observations recorded at successive points in time, typically at uniform intervals.

Formally, a time series is a realization of a stochastic process {Yt: t ∈ T}, where T is the index set representing time. Time series analysis aims to:

  • Describe the data (trend, seasonality, and dependence structure)
  • Model the underlying data-generating process
  • Forecast future values
  • Support control and decision-making based on those forecasts

2. Time Series Components

The classical decomposition model represents a time series as a combination of systematic components:

Additive Decomposition
Yt = Tt + St + Ct + It
Multiplicative Decomposition
Yt = Tt × St × Ct × It
Where:
Tt = Trend (long-term movement)
St = Seasonal (regular periodic fluctuations)
Ct = Cyclical (fluctuations with no fixed period, typically spanning more than a year)
It = Irregular/Random (unpredictable fluctuations)

Trend Component

The trend represents the long-term progression of the series. It can be linear, polynomial, exponential, or follow other functional forms. Trend can be extracted using moving averages, regression, or filtering methods.
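
As a minimal sketch, trend extraction with a centered moving average can be done with pandas; the monthly series here is synthetic, invented for the example:

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend + annual seasonality + noise
rng = np.random.default_rng(42)
t = np.arange(120)
y = pd.Series(10 + 0.5 * t + 5 * np.sin(2 * np.pi * t / 12)
              + rng.normal(0, 2, 120))

# Centered moving average of order 12; because the window is even, a
# second 2-term average re-centers it (the classical 2x12 MA)
trend = y.rolling(12, center=True).mean().rolling(2, center=True).mean()
detrended = y - trend  # de-trended series under the additive model
```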

Seasonal Component

Seasonality refers to regular, predictable patterns that repeat over fixed periods (daily, weekly, monthly, quarterly, annually). The seasonal period s is the number of observations per cycle.

3. Stationarity

Stationarity is a fundamental concept in time series analysis. A stationary process has statistical properties that do not change over time.

Strict Stationarity

A process is strictly stationary if the joint distribution of (Yt1, ..., Ytk) is identical to that of (Yt1+h, ..., Ytk+h) for all time points t1, ..., tk, all k, and all shifts h.

Weak (Covariance) Stationarity

A process is weakly stationary if:

  1. E[Yt] = μ (constant mean)
  2. Var(Yt) = σ² < ∞ (constant, finite variance)
  3. Cov(Yt, Yt+h) = γ(h) (autocovariance depends only on lag h)

Autocorrelation Function (ACF)
ρ(h) = γ(h) / γ(0) = Cov(Yt, Yt+h) / Var(Yt)
The ACF measures the correlation between observations separated by h time periods. For stationary processes, -1 ≤ ρ(h) ≤ 1.
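
A minimal NumPy sketch of the sample ACF computed directly from this formula, applied to a synthetic white-noise series:

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample ACF: lag-h autocovariance divided by the lag-0 autocovariance."""
    y = np.asarray(y, dtype=float)
    n, mean = len(y), y.mean()
    gamma0 = np.sum((y - mean) ** 2) / n  # gamma(0), the sample variance
    return np.array([np.sum((y[:n - h] - mean) * (y[h:] - mean)) / (n * gamma0)
                     for h in range(max_lag + 1)])

rng = np.random.default_rng(0)
print(sample_acf(rng.normal(size=500), 5).round(3))  # ~0 beyond lag 0
```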

Testing for Stationarity

Common tests include the Augmented Dickey-Fuller (ADF) test, whose null hypothesis is a unit root (non-stationarity), and the KPSS test, whose null hypothesis is stationarity. Applying both provides a cross-check, since each test alone has limited power.
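
A short sketch using the adfuller and kpss functions from statsmodels on a simulated random walk (the data are synthetic, invented for the example):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

rng = np.random.default_rng(1)
random_walk = np.cumsum(rng.normal(size=500))  # non-stationary by construction

adf_stat, adf_p, *_ = adfuller(random_walk)
kpss_stat, kpss_p, *_ = kpss(random_walk, regression="c", nlags="auto")
print(f"ADF p-value:  {adf_p:.3f}")   # large p: cannot reject unit root
print(f"KPSS p-value: {kpss_p:.3f}")  # small p: reject stationarity
```
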
Differencing to Achieve Stationarity

Non-stationary series can often be made stationary through differencing:

Differencing Operators
First difference: ∇Yt = Yt - Yt-1

d-th difference: ∇^d Yt = (1 - B)^d Yt
Where B is the backshift operator: BYt = Yt-1
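
In pandas, these operators map directly onto Series.diff; a brief sketch on a synthetic random walk:

```python
import numpy as np
import pandas as pd

y = pd.Series(np.cumsum(np.random.default_rng(2).normal(size=200)))  # random walk

first_diff = y.diff()          # first difference: Yt - Yt-1 (first value is NaN)
second_diff = y.diff().diff()  # second difference: (1 - B)^2 Yt
seasonal_diff = y.diff(12)     # seasonal difference: Yt - Yt-12
```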

4. Time Series Decomposition

Classical Decomposition

The classical method uses moving averages to estimate trend, then extracts seasonal indices:

  1. Estimate trend using centered moving average of order s
  2. De-trend: Compute Yt - Tt (additive) or Yt/Tt (multiplicative)
  3. Average de-trended values by season to get seasonal indices
  4. Normalize seasonal indices
  5. Remainder = Original - Trend - Seasonal (or Original / (Trend × Seasonal) in the multiplicative case)
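
statsmodels implements this classical procedure as seasonal_decompose; a sketch on a synthetic monthly series with trend and annual seasonality:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly series with trend and annual seasonality
idx = pd.date_range("2015-01", periods=96, freq="MS")
t = np.arange(96)
y = pd.Series(20 + 0.3 * t + 8 * np.sin(2 * np.pi * t / 12)
              + np.random.default_rng(3).normal(0, 1.5, 96), index=idx)

result = seasonal_decompose(y, model="additive", period=12)
# result.trend, result.seasonal, result.resid correspond to steps 1-5 above
print(result.seasonal.iloc[:12].round(2))  # one full cycle of seasonal indices
```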

STL Decomposition

Seasonal and Trend decomposition using Loess (STL) is a robust method that:

  • Handles seasonal periods of any length
  • Allows the seasonal component to change over time
  • Gives the user control over the smoothness of the trend
  • Can be made robust to outliers, so occasional unusual observations do not distort the trend and seasonal estimates
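
A sketch using the STL class from statsmodels on the same kind of synthetic series as above; robust=True enables the outlier-resistant fitting just mentioned:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

idx = pd.date_range("2015-01", periods=96, freq="MS")
t = np.arange(96)
y = pd.Series(20 + 0.3 * t + 8 * np.sin(2 * np.pi * t / 12)
              + np.random.default_rng(4).normal(0, 1.5, 96), index=idx)

stl = STL(y, period=12, robust=True).fit()  # robust=True downweights outliers
components = pd.DataFrame({"trend": stl.trend, "seasonal": stl.seasonal,
                           "remainder": stl.resid})
print(components.head(3).round(2))
```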

5. Exponential Smoothing Methods

Simple Exponential Smoothing (SES)

For series with no trend or seasonality:

Simple Exponential Smoothing
St = αYt + (1 - α)St-1
Where 0 < α ≤ 1 is the smoothing parameter. Larger α gives more weight to recent observations.
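
A direct implementation of this recursion, assuming the common initialization S0 = Y0 (an initialization choice, not part of the formula):

```python
import numpy as np

def ses(y, alpha):
    """Simple exponential smoothing: St = alpha*Yt + (1 - alpha)*St-1."""
    s = np.empty(len(y))
    s[0] = y[0]  # initialize with the first observation
    for t in range(1, len(y)):
        s[t] = alpha * y[t] + (1 - alpha) * s[t - 1]
    return s

y = np.array([3.0, 5.0, 9.0, 20.0, 12.0, 17.0])
print(ses(y, alpha=0.3).round(2))  # the h-step forecast is the last smoothed value
```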

Holt's Linear Method

Extends SES to capture linear trend:

Holt's Method Equations
Level: lt = αYt + (1 - α)(lt-1 + bt-1)

Trend: bt = β(lt - lt-1) + (1 - β)bt-1

Forecast: Ft+h = lt + hbt

Holt-Winters Method

Extends Holt's method to include seasonality:

Holt-Winters Additive
Level: lt = α(Yt - st-m) + (1 - α)(lt-1 + bt-1)

Trend: bt = β(lt - lt-1) + (1 - β)bt-1

Seasonal: st = γ(Yt - lt) + (1 - γ)st-m

Forecast: Ft+h = lt + hbt + st+h-m (for h ≤ m; beyond that, the most recent seasonal indices are reused)
Where m is the seasonal period and γ is the seasonal smoothing parameter.
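
A sketch using ExponentialSmoothing from statsmodels on a synthetic monthly series; dropping the seasonal arguments would give Holt's linear method from the previous subsection:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

idx = pd.date_range("2018-01", periods=72, freq="MS")
t = np.arange(72)
y = pd.Series(50 + 0.4 * t + 10 * np.sin(2 * np.pi * t / 12)
              + np.random.default_rng(5).normal(0, 2, 72), index=idx)

# trend="add" alone gives Holt's linear method; the seasonal arguments
# extend it to additive Holt-Winters with period m = 12
fit = ExponentialSmoothing(y, trend="add", seasonal="add",
                           seasonal_periods=12).fit()
print(fit.forecast(12).round(1))  # Ft+h = lt + h*bt + seasonal index
```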

6. Autoregressive (AR) Models

An AR(p) model expresses the current value as a linear combination of p past values plus white noise:

AR(p) Model
Yt = c + φ1Yt-1 + φ2Yt-2 + ... + φpYt-p + εt
Where φi are the autoregressive coefficients, c is a constant, and εt ~ WN(0, σ²) is white noise.

AR(1) Properties

For AR(1), Yt = c + φ1Yt-1 + εt. The process is stationary when |φ1| < 1, with mean μ = c/(1 - φ1), variance γ(0) = σ²/(1 - φ1²), and autocorrelations ρ(h) = φ1^h, which decay exponentially in h.

Partial Autocorrelation Function (PACF)

The PACF measures the correlation between Yt and Yt-h after removing the linear effect of intermediate lags. For AR(p), the PACF cuts off after lag p.
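
A quick simulation check on a synthetic AR(1) with φ1 = 0.7: the estimated PACF should be large at lag 1 and near zero afterwards:

```python
import numpy as np
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(6)
n, phi = 1000, 0.7
y = np.zeros(n)
for t in range(1, n):          # AR(1): Yt = 0.7*Yt-1 + eps_t
    y[t] = phi * y[t - 1] + rng.normal()

print(pacf(y, nlags=5).round(2))  # cuts off after lag p = 1
```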

7. Moving Average (MA) Models

An MA(q) model expresses the current value as a linear combination of current and past white noise terms:

MA(q) Model
Yt = μ + εt + θ1εt-1 + θ2εt-2 + ... + θqεt-q
Where θi are the moving average coefficients.
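
A simulation sketch of a synthetic MA(1) with θ1 = 0.6: the sample ACF should cut off after lag 1:

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(7)
theta = 0.6
eps = rng.normal(size=1001)
y = eps[1:] + theta * eps[:-1]    # MA(1): Yt = eps_t + 0.6*eps_{t-1}

print(acf(y, nlags=5).round(2))   # ~0.44 at lag 1, near zero afterwards
```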

MA(1) Properties

An MA(1) process is stationary for any θ1, with variance γ(0) = σ²(1 + θ1²), ρ(1) = θ1/(1 + θ1²), and ρ(h) = 0 for h > 1. It is invertible when |θ1| < 1, in which case it has an equivalent AR(∞) representation.

8. ARIMA Models

ARIMA (AutoRegressive Integrated Moving Average) combines AR and MA components with differencing to handle non-stationary data.

ARIMA(p, d, q) Model
φ(B)(1 - B)^d Yt = c + θ(B)εt
Where:
p = order of AR component
d = degree of differencing
q = order of MA component
φ(B) = 1 - φ1B - ... - φpB^p
θ(B) = 1 + θ1B + ... + θqB^q

Box-Jenkins Methodology

  1. Identification: Use ACF, PACF, and stationarity tests to determine p, d, q
  2. Estimation: Estimate parameters using maximum likelihood
  3. Diagnostic Checking: Verify residuals are white noise
  4. Forecasting: Generate predictions with confidence intervals
Model      | ACF Pattern                  | PACF Pattern
AR(p)      | Exponential/sinusoidal decay | Cuts off after lag p
MA(q)      | Cuts off after lag q         | Exponential/sinusoidal decay
ARMA(p,q)  | Decay after lag q            | Decay after lag p
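
A compact sketch of steps 1-4 using the ARIMA class from statsmodels, on a synthetic series constructed to behave roughly like ARIMA(1,1,0):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic series: cumulative sum of AR(1) steps, i.e. roughly ARIMA(1,1,0)
rng = np.random.default_rng(8)
steps = np.zeros(300)
for t in range(1, 300):
    steps[t] = 0.5 * steps[t - 1] + rng.normal()
y = pd.Series(np.cumsum(steps))

res = ARIMA(y, order=(1, 1, 0)).fit()  # estimation by maximum likelihood
print(res.summary().tables[1])         # parameter estimates (step 2)
print("AIC:", round(res.aic, 1))       # basis for model comparison
```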

9. Seasonal ARIMA (SARIMA)

SARIMA extends ARIMA to handle seasonality by including seasonal AR, MA, and differencing terms:

SARIMA(p,d,q)(P,D,Q)m
φ(B)Φ(B^m)(1 - B)^d(1 - B^m)^D Yt = c + θ(B)Θ(B^m)εt
Where:
(p,d,q) = non-seasonal orders
(P,D,Q) = seasonal orders
m = seasonal period
Φ, Θ = seasonal AR and MA polynomials

Example: Monthly Sales Data

For monthly data with annual seasonality (m=12), a SARIMA(1,1,1)(1,1,1)12 model might be appropriate if:

  • First differencing removes trend
  • Seasonal differencing removes annual pattern
  • ACF/PACF show significant spikes at lags 1 and 12
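
A sketch of fitting this specification with SARIMAX from statsmodels on synthetic monthly sales data (the series and seed are invented for the example):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

idx = pd.date_range("2012-01", periods=120, freq="MS")
t = np.arange(120)
sales = pd.Series(100 + 0.8 * t + 15 * np.sin(2 * np.pi * t / 12)
                  + np.random.default_rng(9).normal(0, 3, 120), index=idx)

model = SARIMAX(sales, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
res = model.fit(disp=False)
print(res.forecast(steps=12).round(1))  # forecasts for the next year
```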

10. Model Diagnostics

Residual Analysis

A well-specified model should have residuals that are:

  • Uncorrelated (resembling white noise)
  • Zero-mean with constant variance
  • Approximately normally distributed (required for valid prediction intervals)

Ljung-Box Test

Ljung-Box Q Statistic
Q = n(n + 2) ∑_{k=1}^{h} rk² / (n - k)
Tests whether the residual autocorrelations as a group differ significantly from zero, where rk is the residual autocorrelation at lag k and n is the sample size. Under H0 (no autocorrelation), Q ~ χ²(h - p - q).
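
statsmodels exposes this test as acorr_ljungbox; a sketch on simulated stand-in residuals (recent versions return a DataFrame, an assumption worth checking against your installed version):

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(10)
resid = rng.normal(size=400)  # stand-in for the residuals of a fitted model

lb = acorr_ljungbox(resid, lags=[10, 20])  # DataFrame: lb_stat, lb_pvalue
print(lb)  # large p-values: no evidence of residual autocorrelation
```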

Information Criteria

For model selection, the two most common criteria are:

AIC = -2 ln(L) + 2k
BIC = -2 ln(L) + k ln(n)
Where L is the maximized likelihood, k is the number of estimated parameters, and n is the sample size. The model with the smallest criterion value is preferred; BIC penalizes complexity more heavily than AIC.
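
A minimal order-selection sketch: fit a small grid of ARIMA(p, 1, q) models to a synthetic series and keep the lowest AIC (convergence warnings on some orders are harmless here):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

y = pd.Series(np.cumsum(np.random.default_rng(11).normal(size=300)))

# Fit a small grid of ARIMA(p, 1, q) models and keep the lowest AIC
results = {(p, q): ARIMA(y, order=(p, 1, q)).fit().aic
           for p in range(3) for q in range(3)}
best_order = min(results, key=results.get)
print("best (p, q):", best_order, "AIC:", round(results[best_order], 1))
```
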
11. Forecasting

Point Forecasts

The minimum mean squared error forecast is the conditional expectation:

Ŷt+h|t = E[Yt+h | Yt, Yt-1, ...]

Prediction Intervals

For Gaussian models, the (1-α) prediction interval is:

Ŷt+h|t ± zα/2 σh
Where σh is the forecast standard error at horizon h, which increases with h.
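
With a fitted statsmodels ARIMA model, get_forecast returns both point forecasts and interval bounds; a sketch on synthetic data:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

y = pd.Series(np.cumsum(np.random.default_rng(12).normal(size=300)))
res = ARIMA(y, order=(1, 1, 0)).fit()

fc = res.get_forecast(steps=10)
out = pd.concat([fc.predicted_mean, fc.conf_int(alpha=0.05)], axis=1)
print(out.round(2))  # interval width grows with the horizon h
```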

Forecast Accuracy Measures

Accuracy is best judged on a hold-out sample. Common measures, with forecast error et = Yt - Ŷt, include:

  • MAE = mean(|et|) (mean absolute error)
  • RMSE = √(mean(et²)) (root mean squared error)
  • MAPE = mean(|et / Yt|) × 100% (undefined when Yt = 0)
  • MASE = MAE scaled by the in-sample MAE of a naive forecast
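
Minimal NumPy implementations of the first three measures, with invented actual and forecast values:

```python
import numpy as np

def mae(y, yhat):  return np.mean(np.abs(y - yhat))
def rmse(y, yhat): return np.sqrt(np.mean((y - yhat) ** 2))
def mape(y, yhat): return np.mean(np.abs((y - yhat) / y)) * 100  # requires y != 0

y    = np.array([112.0, 118.0, 132.0, 129.0])   # invented actuals
yhat = np.array([110.0, 120.0, 128.0, 131.0])   # invented forecasts
print(mae(y, yhat), round(rmse(y, yhat), 2), round(mape(y, yhat), 2))
```
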
References and Further Reading

  1. Box, G.E.P., Jenkins, G.M., Reinsel, G.C., & Ljung, G.M. (2015). Time Series Analysis: Forecasting and Control, 5th Edition. Wiley.
  2. Hyndman, R.J. & Athanasopoulos, G. (2021). Forecasting: Principles and Practice, 3rd Edition. OTexts.
  3. Hamilton, J.D. (1994). Time Series Analysis. Princeton University Press.
  4. Brockwell, P.J. & Davis, R.A. (2016). Introduction to Time Series and Forecasting, 3rd Edition. Springer.
  5. Shumway, R.H. & Stoffer, D.S. (2017). Time Series Analysis and Its Applications, 4th Edition. Springer.