\documentclass{article}
\usepackage[latin1]{inputenc}
%\usepackage[brazil,brazilian]{babel}
\title{Modelling fish abundance with a negative binomial geostatistical model}
\author{Ernesto Jardim \& Paulo Justiniano Ribeiro Jr}
\date{\today}
\begin{document}
\maketitle
\section{Background}
Consider the generalised linear geostatistical model (GLGM) as defined by Diggle et al.\ (1998) and
Diggle and Ribeiro (2007). A particular member of this class is the
\textit{negative binomial geostatistical model} (NBGM). Considering a finite set of locations $x$:
\begin{eqnarray*}
[Y(x)|S(x)] &\stackrel{ind}{\sim}& {\rm NB}(\mu(x))\\
g(\mu(x)) & = & \nu(x) = F(x) \beta + S(x) \\
S(x) & \sim & {\rm MVN}(0, \Sigma)
\end{eqnarray*}
where $S(\cdot)$ is assumed to be a stationary and isotropic Gaussian process with
zero mean and covariance
function $C[S(x_i), S(x_j)] = \sigma^2 \rho(u_{ij})$, with $u_{ij} = \|x_i - x_j\|$ the distance between
a pair of points, so that $\Sigma$ has elements $\sigma^2 \rho(u_{ij})$. The assumed correlation function is $\rho(u_{ij}) = \exp(-u_{ij}/\phi)$.
We group the model parameters into the mean parameters $\beta$ and the covariance parameters $\theta = (\sigma^2, \phi)$.
The assumed link function is $g(\cdot) = \log(\cdot)$.
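The notation ${\rm NB}(\mu(x))$ leaves the dispersion of the negative binomial distribution unspecified. As an illustration only, and assuming a common dispersion parameter $\kappa > 0$ (a symbol introduced here, not part of the model statement above), one standard parametrisation by the mean is
\begin{eqnarray*}
P[Y(x) = y | S(x)] & = & \frac{\Gamma(y + \kappa)}{\Gamma(\kappa)\, y!}
  \left(\frac{\kappa}{\kappa + \mu(x)}\right)^{\kappa}
  \left(\frac{\mu(x)}{\kappa + \mu(x)}\right)^{y}, \quad y = 0, 1, \ldots \\
E[Y(x)|S(x)] & = & \mu(x) \\
{\rm Var}[Y(x)|S(x)] & = & \mu(x) + \mu(x)^2/\kappa
\end{eqnarray*}
Under this parametrisation the Poisson model is recovered in the limit $\kappa \to \infty$.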
A simulation from this model can be obtained as follows:
\begin{enumerate}
\item define the model parameters $(\beta, \theta)$,
\item simulate a realisation of the Gaussian random field $S(x)$ parametrised by $\theta$,
\item compute the linear predictor $\nu(x) = F(x) \beta + S(x)$
and apply the inverse link function to obtain the mean $\mu(x) = \exp\left\{\nu(x)\right\}$,
\item sample $Y(x)$ by drawing independent values from the negative binomial distribution with mean $\mu(x)$.
\end{enumerate}
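The second step can be carried out in several ways. A minimal sketch, assuming $n$ locations $x_1, \ldots, x_n$ and using the Cholesky factorisation (one common choice, not necessarily the one adopted in this work; the symbols $L$ and $z$ are introduced here for illustration), is
\begin{eqnarray*}
\Sigma_{ij} & = & \sigma^2 \exp(-u_{ij}/\phi), \quad i, j = 1, \ldots, n \\
\Sigma & = & L L^{\top}, \quad L \mbox{ lower triangular} \\
z & \sim & {\rm MVN}(0, I_n) \\
S & = & L z
\end{eqnarray*}
so that ${\rm Var}(S) = L I_n L^{\top} = \Sigma$, as required.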
Notice this is a conditional independence model where the $Y$'s are independent given $S$. The spatial dependence in the marginal distribution of $Y(x)$ is induced by the spatially dependent mean $\mu(x)$.
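To make the induced dependence explicit, the following sketch uses the standard moments of the log-normal distribution under the log link, with $m(x) = E[Y(x)]$ a shorthand introduced here for illustration. For two distinct locations $x_i \neq x_j$,
\begin{eqnarray*}
E[Y(x_i)] & = & E[\mu(x_i)] = \exp\left\{F(x_i)\beta + \sigma^2/2\right\} = m(x_i) \\
{\rm Cov}[Y(x_i), Y(x_j)] & = & {\rm Cov}[\mu(x_i), \mu(x_j)]
  = m(x_i)\, m(x_j) \left[\exp\left\{\sigma^2 \rho(u_{ij})\right\} - 1\right]
\end{eqnarray*}
The first equality in the second line follows from the conditional independence of the $Y$'s given $S$. With the exponential correlation function the covariance between counts decays to zero as $u_{ij}$ increases, at a rate governed by $\phi$.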
\section{The NBGM for the fish abundance}
Within the general approach proposed here we need to produce simulations of $Y(\cdot)$
conditional on the observed data, which will be combined with simulations of compositional data
to produce inference by simulation of the abundance at age.
The model defined in the previous section is adapted to the fish abundance data considering that:
\begin{enumerate}
\item data on fish abundance are available and simulations are conditional on the observed data,
\item data consist of observations from different years $t = 1, \ldots, T$,
and we assume independence between observations from different years,
\item model parameters need to be estimated from the data,
\item relevant covariates need to be accounted for,
\item results are produced for pre-specified covariate levels.
\end{enumerate}
We combine approximate inference methods for the mean parameters with Bayesian inference
on the spatial term as follows.
The first step is to combine data from all years and fit a negative binomial model assuming independence
between all observations, thereby relying on the asymptotic consistency of the estimates.
This produces estimates $\hat{\beta}$ and ${\rm Var}(\hat{\beta})$.
Next we extract Pearson residuals from the above model, splitting them by year. For each year we
perform Bayesian inference on an assumed Gaussian geostatistical model
(Ribeiro and Diggle, 2001; Diggle and Ribeiro, 2007) for such residuals.
This model assumes the same covariates as before,
with the mean parameters adjusting the previous estimates for the individual years.
This procedure is repeated for each year.
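For reference, a Pearson residual standardises the raw residual by the estimated standard deviation under the fitted model. A sketch, assuming fitted means $\hat{\mu}_i = \exp\{F(x_i)\hat{\beta}\}$ and a negative binomial variance function with a dispersion parameter $\kappa$ (a symbol introduced here; the text does not specify the parametrisation), is
\begin{eqnarray*}
r_i & = & \frac{y_i - \hat{\mu}_i}{\sqrt{V(\hat{\mu}_i)}} \\
V(\mu) & = & \mu + \mu^2/\kappa
\end{eqnarray*}
The residuals $r_i$ from year $t$ then play the role of the response in the Gaussian geostatistical model for that year.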
From the above we can write, for year $t$, $\nu_t(x) = F(x) \beta^{GLM} + F(x) \beta^{GEO} + S(x)$.
From the Bayesian analysis in the previous step, conditional simulations of $F(x) \beta^{GEO} + S(x)$
can be obtained for each year.
We define a reference set of covariate levels $F_r(x)$ and
compute $\nu_t(x)$ by adding $F_r(x)\beta^{GLM}$ to each simulation.
Finally we obtain simulations of $Y_t(x)$ for each year by drawing independent samples
from the negative binomial distribution with means $\mu_t(x) = \exp\left\{\nu_t(x)\right\}$; the spatially dependent means induce the spatial correlation.
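
The last two steps can be summarised as follows, writing $W_t^{(k)}(x)$ for the $k$-th conditional simulation of $F(x)\beta^{GEO} + S(x)$ in year $t$ and $\hat{\beta}^{GLM}$ for the estimate obtained in the first step. The symbol $W$ and the index $k = 1, \ldots, K$ are notation introduced here for illustration only.
\begin{eqnarray*}
\nu_t^{(k)}(x) & = & F_r(x) \hat{\beta}^{GLM} + W_t^{(k)}(x) \\
\mu_t^{(k)}(x) & = & \exp\left\{\nu_t^{(k)}(x)\right\} \\
\left[Y_t^{(k)}(x) | \mu_t^{(k)}(x)\right] & \stackrel{ind}{\sim} & {\rm NB}(\mu_t^{(k)}(x))
\end{eqnarray*}
Repeating this for $k = 1, \ldots, K$ and $t = 1, \ldots, T$ gives, under these assumptions, a set of simulated abundance surfaces for each year at the reference covariate levels.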
\end{document}