State Space Models for Semi-Continuous Precipitation Data

Jiaye Xu

School of Economics, Nankai University

Wendy Meiring

Department of Statistics and Applied Probability, University of California, Santa Barbara

Introduction

Scientific Motivations

  • Highly skewed data with large proportion of zeros

  • Simultaneous inferences on the amount and the probability of occurrence, for example:

    • Precipitation data in semi-arid areas in climatology

    • Claims for a covered risk in actuarial science

Statistical Motivations

  • Distribution: Why Tweedie?

    • Skewed semi-continuous distribution with a spike of point mass at zero for certain parameter values

    • Compound Poisson-Gamma within the Tweedie class

  • Non-Stationary Process: Why state space model (SSM)?

    SSM vs. ARIMA (Campagnoli, Petrone, and Petris 2009; Wilson 2016)

    • non-stationary processes with missing data

    • interpretable state vector components

    • dynamic coefficient estimates

Statistical Background: Tweedie Distributions

Tweedie EDMs

The probability density function of the Tweedie exponential dispersion model (EDM) family has the general form (Dunn and Smyth 2018)

\[ f(y; \theta ,\phi) = a(y, \phi, p) \exp{\left[\frac{1}{\phi} \{y\theta - b(\theta)\} \right]} \qquad(1)\]

where

  • \(\theta\) is the canonical parameter,

  • \(\phi >0\) is the dispersion parameter,

  • \(p\) is the index parameter (or shape parameter) for any real-valued \(p \in (-\infty , 0] \cup [1, \infty )\),

  • \(a(y,\phi,p)\) is a normalizing function.

Tweedie Class of EDMs

The Tweedie distribution (or Tweedie EDM) is a special class of EDMs with a variance function of the form \[V(\mu) = \mu ^ p.\]

If \(Y\) has a Tweedie distribution, \(Y \sim Tw(\mu, \phi,p)\), the density of \(Y\) has the form in Equation 1 with

\[ \theta=\left\{\begin{array}{ll} {\displaystyle\frac{\mu^{1-p}}{1-p}} \text{, for } p \neq 1 \\ \ \\ \log \mu \text {, for } p=1 \end{array}\right. \hspace{0.7in} \quad b(\theta)= \left\{\begin{array}{l} {\displaystyle \frac{\mu^{2-p}}{2-p}} \text {, for } p \neq 2 \\ \ \\ {\displaystyle \log \mu } \text {, for } p=2 \end{array}\right. \]
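As a consistency check on these expressions (for \(p \notin \{1,2\}\)), differentiating the cumulant function recovers the mean and the Tweedie variance function:

\[ \mathbb{E}(Y) = b^{\prime}(\theta) = \frac{db}{d\mu}\frac{d\mu}{d\theta} = \mu^{1-p}\cdot\mu^{p} = \mu, \qquad \operatorname{Var}(Y) = \phi\, b^{\prime\prime}(\theta) = \phi\,\frac{d\mu}{d\theta} = \phi\,\mu^{p}, \]

since \(\theta = \mu^{1-p}/(1-p)\) gives \(d\theta/d\mu = \mu^{-p}\).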

Compound Poisson-Gamma in Tweedie Class

For \(Y \sim Tw(\mu,\phi,p)\) where \(1<p<2\), \(Y\) is a random Poisson sum of i.i.d. Gamma random variables, i.e.

\[ \begin{align} & Y = \sum_{i=1}^{C} X_i, \text{ where}\quad C \sim Pois(\lambda), \quad X_i \stackrel{iid}{\sim} Ga(\alpha,\gamma),\nonumber\\ &\quad \quad C \perp\!\!\!\perp X_i, \quad \text{and } C =0 \implies Y=0, \nonumber \end{align} \]

  • When \(C>0\), by summation of i.i.d. Gamma random variables, \(Y|C\sim Ga(C\alpha, \gamma)\).
  • When \(C=0\), \(Y=0\).
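This representation is easy to check by simulation; a minimal sketch (the values \(\lambda = 2\), \(\alpha = 3\), and rate \(\gamma = 1.5\) are arbitrary illustration choices, and numpy parameterizes the Gamma by shape and scale, so scale \(=1/\gamma\)):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, alpha, rate = 2.0, 3.0, 1.5   # arbitrary illustration values
n = 200_000

# C ~ Pois(lam); given C > 0, Y | C ~ Ga(C * alpha, rate)
C = rng.poisson(lam, size=n)
Y = np.zeros(n)
pos = C > 0
Y[pos] = rng.gamma(shape=C[pos] * alpha, scale=1.0 / rate)

# Point mass at zero: P(Y = 0) = P(C = 0) = exp(-lam)
print(np.mean(Y == 0))   # close to exp(-2) ~ 0.135
# Mean of the compound sum: lam * alpha / rate
print(Y.mean())          # close to 4.0
```

The empirical zero proportion matches \(P(C=0)=e^{-\lambda}\), which is exactly the spike of point mass at zero described above.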

Tweedie Hierarchical State Space Model (HSSM)

Review State Space Models - Linear Gaussian Case

The linear Gaussian state space model (SSM), also called the dynamic linear model (DLM), has

\[ \begin{aligned} \text { Observation Eqn}&: \mathbf{Y}_t = \mathbf{F}_t\boldsymbol{x}_t+\mathbf{v}_t, \quad \mathbf{v}_t\sim\mathcal{N}_n(\textbf{0},\mathbf{V}_t) \nonumber \\ \text { Evolution Eqn}&: \boldsymbol{x}_t =\mathbf{G}_t\boldsymbol{x}_{t-1} +\mathbf{w}_t, \quad \mathbf{w}_t\sim\mathcal{N}_m(\mathbf{0},\mathbf{W}_t), \\ \text {Prior}&: \boldsymbol{x}_0 \sim \mathcal{N}_m (\textbf{m}_0, \textbf{C}_0). \end{aligned} \]

where

  • \(\mathbf{Y}_t\) is an \(n\)-variate time series,

  • \(\boldsymbol{x}_t\) is the state vector of length \(m\).

  • \(\mathbf{F}_t\) is a known \(n \times m\) observation matrix. \(\textbf{G}_t\) is a known \(m \times m\) evolution matrix.

  • \(\{\textbf{v}_t: t = 1, \ldots, T\}\) and \(\{\textbf{w}_t: t = 1, \ldots, T\}\) are the sequences of observation errors and evolution errors (also called disturbances).

  • \(\boldsymbol{x}_t\) is Markov: depends on \(\boldsymbol{x}_{t-1}\), but not on earlier history.

  • \(\boldsymbol{x}_t\) is independent of the observation errors.

  • \(\textbf{Y}_t\)’s are conditionally independent given \(\boldsymbol{x}_t\)’s.
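The filtering recursions implied by this model can be sketched in minimal form (a sketch assuming a univariate observation, \(n=1\), and time-invariant \(\mathbf{F}\), \(\mathbf{G}\), \(V\), \(\mathbf{W}\); variable names mirror the notation above):

```python
import numpy as np

def kalman_filter(y, F, G, V, W, m0, C0):
    """One forward pass: filtering means m_t and covariances C_t."""
    m, C = m0, C0
    ms, Cs = [], []
    for yt in y:
        a = G @ m                     # one-step-ahead state mean
        R = G @ C @ G.T + W           # one-step-ahead state covariance
        f = (F @ a).item()            # forecast mean of y_t
        Q = (F @ R @ F.T).item() + V  # forecast variance of y_t
        A = (R @ F.T).ravel() / Q     # Kalman gain
        m = a + A * (yt - f)          # filtered state mean
        C = R - np.outer(A, A) * Q    # filtered state covariance
        ms.append(m)
        Cs.append(C)
    return np.array(ms), np.array(Cs)

# Local level model (m = 1) with a diffuse prior: with W = 0 the
# filtered mean after all observations approaches the sample mean.
F = np.array([[1.0]]); G = np.array([[1.0]]); W0 = np.array([[0.0]])
ms, Cs = kalman_filter([1.0, 2.0, 3.0], F, G, 1.0, W0,
                       np.zeros(1), np.array([[1e6]]))
print(ms[-1][0])   # close to 2.0
```
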

Model Specification

In a local linear trend SSM with two seasonal components:

  • the \(m=6\)-dimensional state vector at time \(t\), with \(t\) in units of months,

\[\boldsymbol{x}_t = ({\mu_x}_t,\beta_t,S_{1t},S^{*}_{1t},S_{2t},S^{*}_{2t})^{\prime},\]

where \({\mu_x}_t\) and \(\beta_t\) represent the local level and the local growth rate.

  • For \(j \in \{1,2\}\), Fourier harmonics \(S_{jt} = a_j\cos(t\omega_j) + b_j\sin(t\omega_j)\) and conjugate harmonics \(S^{*}_{jt} = -a_j\sin(t\omega_j) + b_j\cos(t\omega_j)\), where \(\omega_j = 2\pi j/s\) for period \(s\).

Monthly data: \(s=12\).

  • The observation matrix \(\mathbf{F}_{1\times m} = [1\ 0\ 1\ 0\ 1\ 0].\)

  • \(\mathbf{G}\) is a \(m\times m\) block diagonal matrix, with \(j^{th}\) block

\[ \mathbf{G}_j = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix} \qquad\qquad\qquad \text{for } j=1 \text{,} \]

\[ \mathbf{G}_j = \begin{bmatrix} \ \cos(\omega_{j-1}) & \sin(\omega_{j-1})\\ -\sin(\omega_{j-1}) & \cos(\omega_{j-1}) \end{bmatrix} \qquad\qquad \text{for } j\in \{2,3\} \text{,} \]

so that block \(j\) propagates the harmonic pair \((S_{j-1,t}, S^{*}_{j-1,t})\).

  • Model the observation and evolution variances either as time-varying \(V_t\) and \(\mathbf{W}_t\), or as constants \(V\) and \(\mathbf{W}\).
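For concreteness, \(\mathbf{F}\) and \(\mathbf{G}\) can be assembled directly; a sketch for monthly data with \(s = 12\) (the use of `scipy.linalg.block_diag` is an implementation choice):

```python
import numpy as np
from scipy.linalg import block_diag

s = 12                                       # monthly period
omega = [2 * np.pi * j / s for j in (1, 2)]  # harmonic frequencies

# Local level + growth block
trend = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

def rotation(w):
    """Harmonic block: advances (S_jt, S*_jt) by angle w each step."""
    return np.array([[ np.cos(w), np.sin(w)],
                     [-np.sin(w), np.cos(w)]])

G = block_diag(trend, rotation(omega[0]), rotation(omega[1]))  # 6 x 6
F = np.array([[1.0, 0.0, 1.0, 0.0, 1.0, 0.0]])                 # 1 x 6

# Each harmonic block cycles back to the identity after s/j steps:
print(np.allclose(np.linalg.matrix_power(rotation(omega[0]), 12),
                  np.eye(2)))   # True
```
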

Tweedie Hierarchical State Space Models (HSSMs)

Let \(z_{t}\) denote the precipitation accumulated over a period of time, such as a month or a week, for \(t\in\{1,\dots,T\}\), at a specific monitoring location.

The proposed Tweedie HSSM for a univariate time series has four levels:

\[\begin{align} \text { Observation }& : \quad z_{ t}\sim Tw(\mu_t = e^{y_t}, \phi_t, p) \\ \text { Latent observation }&: \quad {y}_{t} =\mathbf{F} \boldsymbol{x}_{t}+v_{t}, \quad \ \ \ \ \ \ {v}_{t} \stackrel{indep.}{\sim} \mathcal{N}\left(0, V \right)\\ \text { Evolution}&:\quad \boldsymbol{x}_{t}=\mathbf{G} \boldsymbol{x}_{t-1}+ \boldsymbol{w}_{t}, \quad \boldsymbol{w}_{t} \stackrel{indep.}{\sim} \mathcal{N}_{m}\left(\mathbf{0}, \mathbf{W}\right) \\ \text { Prior Distributions}&: \quad V \sim \mathcal{IG}(\alpha, \beta), \ \mathbf{W} \sim\mathcal{IW}_m(\nu, \mathbf{S}), \text{ and } \\&\quad \quad \boldsymbol{x_0}\sim \mathcal{N}_m(\mathbf{m}_0, \mathbf{C}_0). \end{align}\]
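At the observation level, draws from \(Tw(\mu, \phi, p)\) with \(1 < p < 2\) can be generated through the compound Poisson-Gamma representation, using the standard reparameterization \(\lambda = \mu^{2-p}/\{\phi(2-p)\}\), Gamma shape \(\alpha = (2-p)/(p-1)\), and Gamma scale \(\phi(p-1)\mu^{p-1}\). A minimal sketch (the function name `rtweedie` and the parameter values are illustrative):

```python
import numpy as np

def rtweedie(n, mu, phi, p, rng):
    """Draw n values from Tw(mu, phi, p), 1 < p < 2, via Poisson-Gamma."""
    lam = mu ** (2 - p) / (phi * (2 - p))   # Poisson rate
    alpha = (2 - p) / (p - 1)               # Gamma shape per jump
    scale = phi * (p - 1) * mu ** (p - 1)   # Gamma scale per jump
    C = rng.poisson(lam, size=n)
    z = np.zeros(n)
    pos = C > 0
    z[pos] = rng.gamma(shape=C[pos] * alpha, scale=scale)
    return z

rng = np.random.default_rng(7)
z = rtweedie(100_000, mu=3.0, phi=1.0, p=1.5, rng=rng)
print(z.mean())          # close to mu = 3.0
print(np.mean(z == 0))   # close to exp(-lam)
```

This mapping is what makes the Tweedie observation level compatible with the exponentiated latent state, \(\mu_t = e^{y_t}\).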

Bayesian Inference for Tweedie HSSM

Joint Posterior Distribution

  • Derive the joint posterior distribution given \(z_{t}\), for \(t=1,\dots,T\),

\[ \begin{align} & p(\boldsymbol{x}_{0:T} ,{y}_{1:T},{V}, \mathbf{W}|{z}_{1:T}, \hat{\phi},\hat{p}) \\ &\propto p(\boldsymbol{x}_{0:T} ,{y}_{1:T},{V}, \mathbf{W}, {z}_{1:T}|\hat{\phi},\hat{p})\\ &\propto f({z}_{1:T}|{y}_{1:T}, \hat{\phi},\hat{p}) \pi({y}_{1:T}|\boldsymbol{x}_{0:T}, {V}) \pi(\boldsymbol{x}_{0:T}| \mathbf{W} ) p( {V}) p(\mathbf{W}) \nonumber \\ & \propto \underbrace{f({z}_{1:T}|{y}_{1:T}, \hat{\phi},\hat{p})}_\text{Tweedie} \underbrace{\pi( {y}_{1:T}|\boldsymbol{x}_{1:T}, {V})}_\text{Observation} \underbrace{\left( \prod_{t=1}^T \pi (\boldsymbol{x}_t|\boldsymbol{x}_{t-1},\mathbf{W})\right)}_\text{Evolution} \underbrace{ p(\boldsymbol{x}_0) p( {V})p(\mathbf{W})}_\text{Priors}\\ & \propto \prod_{t=1}^{T} Tw(z_t | \mu_t =e^{y_{t}},\hat{\phi},\hat{p}) \prod_{t=1}^T \pi( {y}_{t}|\boldsymbol{x}_{t},{V}) \prod_{t=1}^T \pi (\boldsymbol{x}_t|\boldsymbol{x}_{t-1},\mathbf{W}) p(\boldsymbol{x}_0) p( {V})p(\mathbf{W}) \end{align} \]

where \(\hat{p}\) and \(\hat{\phi}\) are the profile maximum likelihood estimates (MLEs) of \(p\) and \(\phi\) (Dunn and Smyth 2005; Zhang 2012).

  • In practice, we use a time-varying \(\hat{\phi}_t\) estimated using generalized additive models for location, scale and shape (GAMLSS) in mgcv (Wood 2017).

HSSMs with Unknown W and V

Algorithm: Forward Filtering Backward Sampling (FFBS) within a Gibbs sampler (Frühwirth-Schnatter 1994; Campagnoli, Petrone, and Petris 2009).

Initialize \(y_{t, g}, V_g, \mathbf W_g\) and \(\boldsymbol{x}_{t,g}\) for \(g=1\) and \(t = 1, \ldots, T\); then run the following MCMC steps, for \(g=2, \ldots, G\), until convergence (for simplicity we omit the subscript \(g\) below):

  1. FFBS for state vector \(\boldsymbol{x}_t\) for \(t = 1, \ldots, T\):

    1. Forward Filtering: run Kalman Filter and draw \(\boldsymbol{x}_{T}\) from \(p(\boldsymbol{x}_{T}|y_{1:T}, V, \mathbf{W})\).
    2. Backward Sampling: For \(t = T-1, \ldots, 0\), draw \(\boldsymbol{x}_{t}\) from \(\pi(\boldsymbol{x}_{t}|\boldsymbol{x}_{t+1}, y_{1:t}, V, \mathbf{W})\).
  2. Metropolis step to draw \({y}_{t}\) from the full conditional distribution \(p\left(y_{t} |\boldsymbol{x}_{t}, V, z_{t} \right)\), which is proportional to
    \[ Tw(z_t;\mu_t=e^{y_{t}} ,\hat{\phi}_t,\hat{p})\, \mathcal{N}( {y}_{t};\mathbf{F}{\boldsymbol{x}}_{t}, V)\]

  3. Gibbs sampling to draw \(V\) from the full conditional distribution \[ \mathcal{IG}\left(\frac{T}{2}+\alpha, \ \frac{1}{2}\sum_{t=1}^T\left[y_{t}- \mathbf{F}\boldsymbol{x}_t\right]^{2}+\beta\right),\] given the prior \(\mathcal{IG}(\alpha, \beta).\)

  4. Gibbs sampling to draw \(\mathbf{W}\) from the full conditional distribution \[\mathcal{IW}_m\left(\nu +T, \ \sum_{t=1}^T (\boldsymbol{x}_t - \mathbf{G}\boldsymbol{x}_{t-1})(\boldsymbol{x}_t - \mathbf{G}\boldsymbol{x}_{t-1})^{\prime}+\mathbf{S}\right),\] given the prior \(\mathcal{IW}_m(\nu, \mathbf{S}).\)
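Steps 3 and 4 are standard conjugate updates; step 3 can be sketched as follows (prior values and the simulated residuals are purely illustrative; `scipy.stats.invgamma` uses its `scale` argument for the rate parameter \(\beta\)):

```python
import numpy as np
from scipy.stats import invgamma

rng = np.random.default_rng(0)
alpha0, beta0 = 2.0, 1.0   # illustrative IG(alpha, beta) prior values

# Residuals y_t - F x_t from the current Gibbs iteration; simulated
# here with true observation variance V = 0.25 purely for illustration.
T = 2_000
resid = rng.normal(0.0, 0.5, size=T)

# Full conditional: IG(T/2 + alpha, sum(resid^2)/2 + beta)
a_post = T / 2 + alpha0
b_post = 0.5 * np.sum(resid ** 2) + beta0
V_draw = invgamma.rvs(a_post, scale=b_post, random_state=rng)
print(V_draw)   # concentrates near 0.25 for large T
```

The \(\mathcal{IW}\) update for \(\mathbf{W}\) in step 4 has the same conjugate structure, with the state-evolution residuals \(\boldsymbol{x}_t - \mathbf{G}\boldsymbol{x}_{t-1}\) in place of the observation residuals.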

Implementation and Evaluation

Data Description

Time series of daily precipitation at Pietermaritzburg (PMB) in South Africa.

Source: World Meteorological Organization’s Global Telecommunication System (GTS) gauge data.

Aggregated to weekly data in Figure 1.

Figure 1: Weekly Precipitation at PMB

Posterior Predictive Precipitation

Figure 2: Posterior predictive medians with 95% credible intervals (CI) for the weekly precipitation at PMB. Symbols: black triangles represent the observed precipitation, blue dots the medians of the posterior predictive distribution, and the light blue area the 95% CI of the (empirical) posterior predictive distribution.

Posterior Predictive Probability of Occurrence (Precipitation)

Figure 3: Posterior predictive medians with 95% credible intervals (CI) for the precipitation occurrence at PMB.

Future Work

Explore the association with the ENSO indices:

  • sites in South Africa’s corn production area in Figure 4

  • index selection and information combination, e.g., regression components in the SSM

    Figure 4: Corn production area in South Africa. Source: https://ipad.fas.usda.gov/rssiws/al/crop_production_maps/safrica/SouthAfrica_Corn.png

References

Campagnoli, Patrizia, Sonia Petrone, and Giovanni Petris. 2009. Dynamic Linear Models with R. Springer New York. https://doi.org/10.1007/b135794.
Dunn, Peter K., and Gordon K. Smyth. 2005. “Series Evaluation of Tweedie Exponential Dispersion Model Densities.” Statistics and Computing 15 (4): 267–80. https://doi.org/10.1007/s11222-005-4070-y.
———. 2018. Generalized Linear Models with Examples in R. Springer New York. https://doi.org/10.1007/978-1-4419-0118-7.
Frühwirth-Schnatter, Sylvia. 1994. “Data Augmentation and Dynamic Linear Models.” Journal of Time Series Analysis 15 (2): 183–202. https://doi.org/10.1111/j.1467-9892.1994.tb00184.x.
Wilson, Granville Tunnicliffe. 2016. “Time Series Analysis: Forecasting and Control, 5th Edition, by George E. P. Box, Gwilym M. Jenkins, Gregory C. Reinsel and Greta M. Ljung, 2015. Published by John Wiley and Sons Inc., Hoboken, New Jersey, Pp. 712. ISBN: 978-1-118-67502-1.” Journal of Time Series Analysis 37 (5): 709–11. https://doi.org/10.1111/jtsa.12194.
Wood, Simon N. 2017. Generalized Additive Models. Chapman; Hall/CRC. https://doi.org/10.1201/9781315370279.
Zhang, Yanwei. 2012. “Likelihood-Based and Bayesian Methods for Tweedie Compound Poisson Linear Mixed Models.” Statistics and Computing 23 (6): 743–57. https://doi.org/10.1007/s11222-012-9343-7.