Department of Statistics and Applied Probability, University of California, Santa Barbara
Highly skewed data with a large proportion of zeros
Simultaneous inferences on the amount and the probability of occurrence, for example:
Precipitation data in semi-arid areas in climatology
Claims for a covered risk in actuarial science
Distribution: Why Tweedie?
Skewed semi-continuous distribution with a spike of point mass at zero for certain parameter values
Compound Poisson-Gamma within Tweedie class
Non-Stationary Process: Why state space model (SSM)?
SSM vs. ARIMA (Campagnoli, Petrone, and Petris 2009; Wilson 2016)
non-stationary processes with missing data
interpretable state vector components
dynamic coefficient estimates
The probability density function of the Tweedie exponential dispersion model (EDM) family has the general form (Dunn and Smyth 2018)
\[ f(y; \theta ,\phi) = a(y, \phi, p) \exp{\left[\frac{1}{\phi} \{y\theta - b(\theta)\} \right]} \qquad(1)\]
where
\(\theta\) is the canonical parameter,
\(\phi >0\) is the dispersion parameter,
\(p\) is the index parameter (or shape parameter) for any real-valued \(p \in (-\infty , 0] \cup [1, \infty )\),
\(a(y,\phi,p)\) is a normalizing function.
The Tweedie distribution (or Tweedie EDM) is a special class of EDMs with the variance function of the form \[V(\mu) = \mu ^ p.\]
If \(Y\) has a Tweedie distribution, \(Y \sim Tw(\mu, \phi,p)\), the density of \(Y\) has the form in Equation 1 with
\[ \theta=\left\{\begin{array}{ll} {\displaystyle\frac{\mu^{1-p}}{1-p}} \text{, for } p \neq 1 \\ \ \\ \log \mu \text {, for } p=1 \end{array}\right. \hspace{0.7in} \quad b(\theta)= \left\{\begin{array}{l} {\displaystyle \frac{\mu^{2-p}}{2-p}} \text {, for } p \neq 2 \\ \ \\ {\displaystyle \log \mu } \text {, for } p=2 \end{array}\right. \]
For \(Y \sim Tw(\mu,\phi,p)\) where \(1<p<2\), \(Y\) is a random Poisson sum of i.i.d. Gamma random variables, i.e.
\[ \begin{align} & Y = \sum_{i=1}^{C} X_i, \text{ where}\quad C \sim Pois(\lambda), \quad X_i \stackrel{iid}{\sim} Ga(\alpha,\gamma),\nonumber\\ &\quad \quad C \perp\!\!\!\perp X_i, \quad \text{and } C =0 \implies Y=0. \nonumber \end{align} \]
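The compound Poisson-Gamma representation above suggests a direct simulator. A minimal sketch, assuming the mapping \(\lambda = \mu^{2-p}/\{\phi(2-p)\}\), \(\alpha = (2-p)/(p-1)\), \(\gamma = \phi(p-1)\mu^{p-1}\) from Dunn and Smyth, with \(Ga(\alpha,\gamma)\) read as shape \(\alpha\) and scale \(\gamma\) (so that \(E[Y] = \lambda\alpha\gamma = \mu\)):

```python
import numpy as np

def rtweedie(n, mu, phi, p, rng=None):
    """Simulate Tw(mu, phi, p) for 1 < p < 2 via its compound
    Poisson-Gamma representation: Y = sum_{i=1}^C X_i with
    C ~ Pois(lam) and X_i iid Ga(alpha, scale=gam)."""
    assert 1 < p < 2
    rng = np.random.default_rng(rng)
    # Mapping (mu, phi, p) -> (lam, alpha, gam); E[Y] = lam*alpha*gam = mu.
    lam = mu ** (2 - p) / (phi * (2 - p))      # Poisson rate
    alpha = (2 - p) / (p - 1)                  # Gamma shape
    gam = phi * (p - 1) * mu ** (p - 1)        # Gamma scale
    c = rng.poisson(lam, size=n)               # number of Gamma summands
    # Sum of c iid Ga(alpha, gam) draws is Ga(c*alpha, gam); c = 0 gives Y = 0,
    # producing the point mass at zero.
    return np.where(c > 0, rng.gamma(np.maximum(c, 1) * alpha, gam), 0.0)
```

The exact zeros occur with probability \(e^{-\lambda}\), which is how the spike at zero arises within a single continuous-looking family.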
The linear Gaussian state space model (SSM), also called the dynamic linear model (DLM), has
\[ \begin{aligned} \text { Observation Eqn}&: \mathbf{Y}_t = \mathbf{F}_t\boldsymbol{x}_t+\mathbf{v}_t, \quad \mathbf{v}_t\sim\mathcal{N}_n(\textbf{0},\mathbf{V}_t) \nonumber \\ \text { Evolution Eqn}&: \boldsymbol{x}_t =\mathbf{G}_t\boldsymbol{x}_{t-1} +\mathbf{w}_t, \quad \mathbf{w}_t\sim\mathcal{N}_m(\mathbf{0},\mathbf{W}_t), \\ \text {Prior}&: \boldsymbol{x}_0 \sim \mathcal{N}_m (\textbf{m}_0, \textbf{C}_0). \end{aligned} \]
where
\(\mathbf{Y}_t\) is an \(n\)-variate time series,
\(\boldsymbol{x}_t\) is the state vector of length \(m\).
\(\mathbf{F}_t\) is a known \(n \times m\) observation matrix. \(\textbf{G}_t\) is a known \(m \times m\) evolution matrix.
\(\{\textbf{v}_t: t = 1, \ldots, T\}\) and \(\{\textbf{w}_t: t = 1, \ldots, T\}\) are the sequences of observation errors and evolution errors (also called disturbances).
\(\boldsymbol{x}_t\) is Markov: depends on \(\boldsymbol{x}_{t-1}\), but not on earlier history.
\(\boldsymbol{x}_t\) is independent of the observation errors.
\(\textbf{Y}_t\)’s are conditionally independent given \(\boldsymbol{x}_t\)’s.
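Under these assumptions the filtered state moments follow the standard Kalman recursions. A minimal sketch for a univariate observation series (an illustration, not the exact implementation used in the analysis):

```python
import numpy as np

def kalman_filter(y, F, G, V, W, m0, C0):
    """Forward (Kalman) filter for the DLM above with scalar Y_t.
    Returns filtered means m_t and covariances C_t for t = 1..T."""
    m, C = m0, C0
    ms, Cs = [], []
    for yt in y:
        # Evolution step: one-step-ahead state moments
        a = G @ m
        R = G @ C @ G.T + W
        # Observation step: one-step-ahead forecast moments
        f = float(F @ a)
        Q = float(F @ R @ F.T) + V
        # Update with Kalman gain K = R F' / Q
        K = (R @ F.T).ravel() / Q
        m = a + K * (yt - f)
        C = R - np.outer(K, (F @ R).ravel())
        ms.append(m)
        Cs.append(C)
    return np.array(ms), np.array(Cs)
```

For example, the local level model takes \(\mathbf{F} = \mathbf{G} = [1]\); the filtered mean then tracks the latent level with smaller error than the raw observations.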
In a local linear SSM with two seasonal components:
\[\boldsymbol{x}_t = ({\mu_x}_t,\beta_t,S_{1t},S^{*}_{1t},S_{2t},S^{*}_{2t})^{\prime},\]
\({\mu_x}_t, \beta_t\) represent the local level and the local growth rate.
Monthly data: \(s=12\).
The observation matrix \(\mathbf{F}_{1\times m} = [1\ 0\ 1\ 0\ 1\ 0].\)
\(\mathbf{G}\) is an \(m\times m\) block diagonal matrix, with \(j^{th}\) block
\[ \mathbf{G}_j = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix} \qquad\qquad\qquad \text{for } j=1 \text{,} \]
\[ \mathbf{G}_j = \begin{bmatrix} \ \cos(\omega_j) & \sin(\omega_j)\\ -\sin(\omega_j) & \cos(\omega_j) \end{bmatrix} \qquad\qquad \text{for } j\in \{2,3\} \text{.} \]
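Assembling \(\mathbf{F}\) and \(\mathbf{G}\) for this structure is mechanical. A sketch, assuming the two seasonal blocks are the first two harmonics of the period, \(\omega_j = 2\pi h/s\) with \(h = 1, 2\) and \(s = 12\) (the poster does not spell out \(\omega_j\), so this choice is an assumption):

```python
import numpy as np

def trend_plus_harmonics(s=12, harmonics=(1, 2)):
    """Build F (1 x m) and block-diagonal G for a local linear trend
    plus one 2x2 rotation block per harmonic; m = 2 + 2*len(harmonics)."""
    blocks = [np.array([[1.0, 1.0],
                        [0.0, 1.0]])]          # local level + growth block
    for h in harmonics:
        w = 2 * np.pi * h / s                  # assumed harmonic frequency
        blocks.append(np.array([[np.cos(w), np.sin(w)],
                                [-np.sin(w), np.cos(w)]]))
    m = 2 * len(blocks)
    G = np.zeros((m, m))
    for j, B in enumerate(blocks):             # place blocks on the diagonal
        G[2*j:2*j+2, 2*j:2*j+2] = B
    F = np.tile([1.0, 0.0], len(blocks)).reshape(1, m)
    return F, G
```

Each seasonal block is a planar rotation, so applying it \(s/h\) times returns the component to its starting phase; only the first entry of each pair enters the observation, matching \(\mathbf{F} = [1\ 0\ 1\ 0\ 1\ 0]\).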
Let \(z_{t}\) = precipitation over a period of time, such as a month or a week, \(t\in\{1,\dots,T\}\), at a specific monitoring location.
The Tweedie hierarchical SSM (HSSM) we propose for a univariate time series has four levels,
\[\begin{align} \text { Observation }& : \quad z_{ t}\sim Tw(\mu_t = e^{y_t}, \phi, p) \\ \text { Latent observation }&: \quad {y}_{t} =\mathbf{F} \boldsymbol{x}_{t}+v_{t}, \quad \ \ \ \ \ \ {v}_{t} \stackrel{indep.}{\sim} \mathcal{N}\left(0, V \right)\\ \text { Evolution}&:\quad \boldsymbol{x}_{t}=\mathbf{G} \boldsymbol{x}_{t-1}+ \boldsymbol{w}_{t}, \quad \boldsymbol{w}_{t} \stackrel{indep.}{\sim} \mathcal{N}_{m}\left(\mathbf{0}, \mathbf{W}\right) \\ \text { Prior Distributions}&: \quad V \sim \mathcal{IG}(\alpha, \beta) , \mathbf{W} \sim\mathcal{IW}_m(\nu, \mathbf{S}) \text{ and } \\&\quad \quad \boldsymbol{x_0}\sim \mathcal{N}_m(\mathbf{m}_0, \mathbf{C}_0), \end{align}\]
\[ \begin{align} & p(\boldsymbol{x}_{0:T} ,{y}_{1:T},{V}, \mathbf{W}|{z}_{1:T}, \hat{\phi},\hat{p}) \\ &\propto p(\boldsymbol{x}_{0:T} ,{y}_{1:T},{V}, \mathbf{W}, {z}_{1:T}|\hat{\phi},\hat{p})\\ &\propto f({z}_{1:T}|{y}_{1:T}, \hat{\phi},\hat{p}) \pi({y}_{1:T}|\boldsymbol{x}_{1:T}, {V}) \pi(\boldsymbol{x}_{0:T}| \mathbf{W} ) p( {V}) p(\mathbf{W}) \nonumber \\ & \propto \underbrace{f({z}_{1:T}|{y}_{1:T}, \hat{\phi},\hat{p})}_\text{Tweedie} \underbrace{\pi( {y}_{1:T}|\boldsymbol{x}_{1:T}, {V})}_\text{Observation} \underbrace{\left( \prod_{t=1}^T \pi (\boldsymbol{x}_t|\boldsymbol{x}_{t-1},\mathbf{W})\right)}_\text{Evolution} \underbrace{ p(\boldsymbol{x}_0) p( {V})p(\mathbf{W})}_\text{Priors}\\ & \propto \prod_{t=1}^T Tw(z_t | \mu_t =e^{y_{t}},\hat{\phi},\hat{p}) \prod_{t=1}^T \pi( {y}_{t}|\boldsymbol{x}_{t},{V}) \prod_{t=1}^T \pi (\boldsymbol{x}_t|\boldsymbol{x}_{t-1},\mathbf{W}) p(\boldsymbol{x}_0) p( {V})p(\mathbf{W}) \end{align} \]
where \(\hat{p}\) and \(\hat{\phi}\) are the profile MLEs of \(p\) and \(\phi\) (Dunn and Smyth 2005; Zhang 2012), computed with the mgcv package (Wood 2017).
Algorithm: Forward Filtering Backward Sampling (FFBS) in a Gibbs Sampler (Frühwirth-Schnatter 1994; Campagnoli, Petrone, and Petris 2009).
Initialize \(y_{t, g}, V_g, \mathbf W_g\) and \(\boldsymbol{x}_{t,g}\) for \(g=1\) and \(t = 1, \ldots, T\); then run the following MCMC steps, for \(g=2, \ldots, G\), until convergence (for simplicity we omit subscripts \(g\) below):
FFBS for state vector \(\boldsymbol{x}_t\) for \(t = 1, \ldots, T\):
Metropolis step to draw \({y}_{t}\) from the full conditional distribution \(p\left(y_{t} |\boldsymbol{x}_{t}, V, z_{t} \right)\), which is proportional to
\[ Tw(z_t;\mu_t=e^{y_{t}} ,\hat{\phi},\hat{p})\ \mathcal{N}( {y}_{t};\mathbf{F}{\boldsymbol{x}}_{t}, V)\]
Gibbs sampling to draw \(V\) from the full conditional distribution \[ \mathcal{IG}\left(\frac{T}{2}+\alpha, \ \frac{1}{2}\sum_{t=1}^T\left[y_{t}- \mathbf{F}\boldsymbol{x}_t\right]^{2}+\beta\right),\] given the prior \(\mathcal{IG}(\alpha, \beta).\)
Gibbs sampling to draw \(\mathbf{W}\) from the full conditional distribution \[\mathcal{IW}_m\left(\nu +T, \ \sum_{t=1}^T (\boldsymbol{x}_t - \mathbf{G}\boldsymbol{x}_{t-1})(\boldsymbol{x}_t - \mathbf{G}\boldsymbol{x}_{t-1})^{\prime}+\mathbf{S}\right),\] given the prior \(\mathcal{IW}_m(\nu, \mathbf{S}).\)
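The two conjugate draws above translate directly into code. A sketch using scipy's parameterizations (`invgamma(a, scale=b)` has density \(\propto x^{-a-1}e^{-b/x}\); `invwishart(df, scale)` matches \(\mathcal{IW}_m(\nu, \mathbf{S})\)); function names and argument layout are illustrative:

```python
import numpy as np
from scipy.stats import invgamma, invwishart

def draw_V(y, x, F, alpha, beta, rng=None):
    """Gibbs draw of V from IG(T/2 + alpha, 0.5*sum(resid^2) + beta)."""
    resid = y - (x @ F.T).ravel()          # y_t - F x_t, t = 1..T
    a = len(y) / 2 + alpha
    b = 0.5 * np.sum(resid ** 2) + beta
    return invgamma.rvs(a, scale=b, random_state=rng)

def draw_W(x, x0, G, nu, S, rng=None):
    """Gibbs draw of W from IW_m(nu + T, S + sum of evolution
    residual outer products); x has rows x_1..x_T, x0 is x_0."""
    xprev = np.vstack([x0, x[:-1]])        # x_0, ..., x_{T-1}
    E = x - xprev @ G.T                    # rows: x_t - G x_{t-1}
    return invwishart.rvs(df=nu + len(x), scale=S + E.T @ E,
                          random_state=rng)
```

`E.T @ E` computes \(\sum_t (\boldsymbol{x}_t - \mathbf{G}\boldsymbol{x}_{t-1})(\boldsymbol{x}_t - \mathbf{G}\boldsymbol{x}_{t-1})^{\prime}\) in one matrix product, which keeps the update vectorized.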
Time series of daily precipitation at Pietermaritzburg (PMB) in South Africa.
Source: World Meteorological Organization’s Global Telecommunication System (GTS) gauge data.
Aggregated to weekly data in Figure 1
Figure 1: Weekly Precipitation at PMB
Figure 2: Posterior predictive medians with 95% credible interval (CI) for the weekly precipitation at PMB. Symbols: black triangles represent the observed precipitation, blue dots the medians of the posterior predictive distribution, and the light blue area the 95% CI of the (empirical) posterior predictive distribution.
Figure 3: Posterior predictive medians with 95% credible interval (CI) for the precipitation occurrence at PMB
Explore the association with the ENSO indices:
sites in South Africa’s corn production area in Figure 4
index selection and information combination, e.g., regression components in the SSM