families.Rmd
Family objects are important building blocks in the design of BAMLSS models. They specify the distribution by collecting functions of the density, respective log-likelihood, first-order derivatives of the log-likelihood w.r.t. predictors (the score function), and (optionally) second-order derivatives of the log-likelihood w.r.t. predictors or their expectation (the Hessian).
The bamlss package can be easily extended by constructing families for specific tasks, i.e. problems for which a likelihood can be formulated. However, the most important distributions are shipped with bamlss. Finally, it is also possible to employ gamlss families.
We illustrate how to build a bamlss family by hand along the Gaussian distribution, with density \[ f(y\,|\,\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma} \cdot \exp \left( \frac{-(y-\mu)^2}{2\sigma^2} \right), \] and log-likelihood function \[ \ell(\mu,\sigma\,|\,y) = - \frac{1}{2} \log(2\pi) - \log(\sigma) - \frac{(y-\mu)^2}{2\sigma^2}, \] for an individual observation. The sum of the log-likelihood function over all observations is the target function of the optimization problem.
In the distributional regression framework the parameters are linked to predictors by link functions, \[ \mu = \eta_\mu, \qquad \log(\sigma) = \eta_\sigma. \] For the Gaussian \(\mu\) and \(\sigma\) are linked to \(\eta_\mu\) and \(\eta_\sigma\) by the identity function and the logarithm, respectively.
The score functions in bamlss are the first derivatives of the log-likelihood w.r.t. the predictors: \[ s_\mu = \frac{\partial\ell}{\partial\eta_\mu} = \frac{\partial\ell}{\partial\mu} \cdot \frac{\partial\mu}{\partial\eta_\mu} = \frac{y-\mu}{\sigma^2}, \] and \[ s_\sigma = \frac{\partial\ell}{\partial\eta_\sigma} = \frac{\partial\ell}{\partial\sigma} \cdot \frac{\partial\sigma}{\partial\eta_\sigma} = -1 + \frac{(y-\mu)^2}{\sigma^2}. \]
For the second derivative of the log-likelihood we are able to obtain the negative expectation, \[ \mathsf{E}(-\partial^2\ell / \partial\eta_{\mu}^{2} ) = \sigma^{-2}, \] and \[ \mathsf{E}(-\partial^2\ell / \partial\eta_{\sigma}^{2} ) = 2. \]
Now we have to write a function that returns a family.bamlss
object (S3) which encapsulates functions for density, score and Hessian, and the names of the family, parameter and link functions:
Name of element | Value |
---|---|
family |
character string with the name of the family |
names |
vector of character strings with the names of the parameters |
links |
vector of character strings with the names of the link functions |
d |
a function returning the density with arguments y, par, log = FALSE (see below) |
score |
a list with functions (one for each parameter) returning the first derivatives of the log-likelihood w.r.t. predictors |
hess |
a list with functions (one for each parameter) returning the negative second derivatives of the log-likelihood w.r.t. predictors |
Nearly all functions take as first argument the response y
and as second argument a named list holding the evaluated parameters par
of the distribution on the scale of the predictors \(\eta_{\star}\).
The implementation looks like this:
mygauss <- function(...) {
f <- list(
"family" = "mygauss",
"names" = c("mu", "sigma"),
"links" = c(mu = "identity", sigma = "log"),
"d" = function(y, par, log = FALSE) {
dnorm(y, mean = par$mu, sd = par$sigma, log = log)
},
"score" = list(
mu = function(y, par, ...) {
drop((y - par$mu) / (par$sigma^2))
},
sigma = function(y, par, ...) {
drop(-1 + (y - par$mu)^2 / (par$sigma^2))
}
),
"hess" = list(
mu = function(y, par, ...) {
drop(1 / (par$sigma^2))
},
sigma = function(y, par, ...) {
rep(2, length(y))
}
)
)
class(f) <- "family.bamlss"
return(f)
}
library("bamlss")
## Simulate some data.
d <- GAMart()
## Specify formula.
f <- ~ s(x1) + s(x2) + s(x3) + s(lon, lat)
f <- list(mu = update(f, num ~ .), sigma = f)
## Model.
b <- bamlss(f, data = d, family = mygauss)
Optionally, the family.bamlss
object can be extended by functions for
p(y, par, ...)
,q(p, par)
,r(n, par)
,loglik(y, par)
,mu(par, ...)
,which can help to speed up optimization, or be convenient for predictions and simulations.
There are some families implemented:
Or normal distribution.
Density: \[ f(y\,|\,\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma} \cdot \exp \left( \frac{-(y-\mu)^2}{2\sigma^2} \right), \] with \(\sigma > 0\).
Links:
Function: gaussian_bamlss(...)
Alternative parameterizations:
gaussian2_bamlss(...)
models inverse sigma
with a log link.Gaussian_bamlss(...)
one parametric family modelling only mu
.Or skewed-logistic distribution.
Density: \[ f(y\,|\,\mu,\sigma,\alpha)=\frac{\alpha\cdot\exp\left(-\frac{y-\mu}{\sigma}\right)} {\sigma\cdot\left(1+\exp\left(-\frac{y-\mu}{\sigma}\right)\right)^{2\alpha}}, \] with \(\sigma, \alpha > 0\).
Links:
Function: glogis_bamlss(...)
Two-parameter distribution with \(\mu > 0\) for the expectation of the distribution and \(\sigma > 0\) for its shape.
Density: \[ f(y\,|\,\mu,\sigma)=\frac{y^{\sigma-1}\cdot\exp\left(-\frac{\sigma\cdot y}{\mu}\right)} {\left(\frac{\mu}{\sigma}\right)^\sigma\cdot\Gamma(\sigma)}. \]
Links:
Function: gamma_bamlss(...)
Density: \[ f(y\,|\,\xi,\sigma)=\frac{1}{\sigma}\cdot\left(1+\frac{\xi\cdot y}{\sigma} \right)^{-\left(1+1/\xi\right)}, \] with \(\xi, \sigma > 0\).
Links:
Function: gpareto_bamlss(...)
Density: \[ f(y\,|\,\lambda,\alpha)=\frac{\alpha}{\lambda}\cdot\left(\frac{y}{\lambda}\right)^{\alpha-1} \cdot\exp\left(-\left(\frac{y}{\lambda}\right)^\alpha\right), \] with \(\lambda, \alpha > 0\).
Links:
Function: weibull_bamlss(...)
Density: \[ f(y\,|\,\mu,\sigma)= \begin{cases} \frac{1}{\sigma} \phi\left(\frac{y-\mu}{\sigma} \right) & y > 0 \\ \Phi\left(\frac{-\mu}{\sigma} \right) & \mbox{else,} \end{cases} \] where \(\phi\) is the probability density and \(\Phi\) the cumulative distribution function of the standard normal distribution.
Links:
Function: cnorm_bamlss(...)
The beta distribution, with density \[ f_{Beta}(y\,|\,\alpha,\beta)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\cdot\Gamma(\beta)} \cdot y^{\alpha-1} \cdot (1-y)^{\beta-1} \] is implemented using the Balding-Nichols parameterization (Balding and Nichols 1995).
Density: \[ f(y\,|\,\mu,\sigma)=f_{Beta}\left(y\,\middle|\,\alpha = \mu \cdot \frac{1-\sigma}{\sigma}, \beta = (1-\mu) \cdot \frac{1-\sigma}{\sigma} \right), \] with \(0 < \mu < 1\) and \(0 < \sigma < 1\).
Links:
Function: beta_bamlss(...)
Often called binomial, though the Bernoulli is a special case of the binomial distribution.
Density: \[ f(y\,|\,\pi) = \pi^y \cdot (1-\pi)^{1-y}, \] where \(\pi\) is a probability.
Links:
Function: binomial_bamlss(link = "logit", ...)
Say, \(y\) is a categorial response with \(c+1\) nomial categories, and \(y_{1}^{\star}, \dots, y_{c+1}^{\star}\) are associated dummy variables. \(p_1, \dots, p_{c+1}\) are probabilities for an observation falling in one of the categories. Category \(c+1\) serves as reference, i.e. \(p_{c+1} = 1 - p_1 - \dots - p_{c}\).
Density: \[ f(y\,|\,\pi_1,\dots,\pi_c) = \prod_{s=1}^{c+1} p_{s}^{y_{s}^{\star}}, \quad \text{where} \quad p_r = \frac{\pi_r}{1 + \pi_1 + \dots + \pi_c}, \]
Links:
where \(\eta_r\) specifies the log-odds between category \(r\) and the reference category \(c+1\).
Function: multinomial_bamlss(...)
Density: \[ f(y\,|\,\lambda) = \frac{\lambda^y \cdot \exp(-\lambda)}{y!}, \] with \(\lambda > 0\).
Link:
Function: poisson_bamlss(...)
Density: \[ f(y\, |\, \mu, \theta) = \frac{\Gamma(\theta + y)}{\Gamma({\theta}) \cdot y!} \cdot \frac{\mu^y \cdot \theta^\theta}{(\mu + \theta)^{\theta + y}}, \] with \(\mu,\theta > 0\).
Links:
Function: nbinom_bamlss(...)
Density: \[ f(y\,|\,\mu,\theta)=\frac{f_{\mathrm{NB}}(y\,|\,\mu,\theta)} {1-f_{\mathrm{NB}}(0\,|\,\mu,\theta)}, \] where \(f_{\mathrm{NB}}\) is the density of the negative binomial.
Links:
Function: ztnbinom_bamlss(...)
Say \(y\) is a \(k\) dimensional observation.
Density: \[ f(y\,|\,\mu,\Sigma) = \frac{1}{\sqrt{(2\pi)^{k} |\Sigma|}} \cdot \exp \left(-\frac{1}{2} (y-\mu)^\top \Sigma^{-1} (y-\mu) \right), \] where \(\Sigma\) can be decomposed into \(\Sigma = D \Omega D\) with \[ D= \begin{pmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_k \end{pmatrix}, \qquad \text{and} \qquad \Omega= \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1k} \\ \rho_{12} & 1 & \cdots & \rho_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{1k} & \rho_{2k} & \cdots & 1 \end{pmatrix}. \]
Links:
Function: mvnorm_bamlss(k = 2, ...)
Alternative parameterizations:
mvnormAR1_bamlss(k = 2, ...)
with \[
\Omega=
\begin{pmatrix}
1 & \rho & \rho^2 & \cdots & \rho^{k-1} \\
\rho & 1 & \rho & \cdots & \rho^{k-2} \\
\rho^2 & \rho & 1 & \cdots & \rho^{k-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho^{k-1} & \rho^{k-2} & \rho^{k-3} & \cdots & 1
\end{pmatrix},
\] and \(\mathrm{rhogit}(\rho) = \eta_{\rho}\).The bamlss package supports gamlss families via a transfer function that builds the bamlss.family
object from a GAMLSS distribution funciton (Stasinopoulos and Rigby 2007), e.g. NO()
, internally:
## Load the gamlss families.
library("gamlss.dist")
## Estimate model.
b <- bamlss(f, data = d, family = NO)
Balding, David J., and Richard A. Nichols. 1995. “A Method for Quantifying Differentiation Between Populations at Multi-Allelic Loci and Its Implications for Investigating Identity and Paternity.” Genetica 96 (1): 3–12. https://doi.org/10.1007/BF01441146.
Stasinopoulos, D. Mikis, and Robert A. Rigby. 2007. “Generalized Additive Models for Location Scale and Shape (Gamlss) in R.” Journal of Statistical Software 23 (7): 1–46. https://doi.org/10.18637/jss.v023.i07.
Umlauf, Nikolaus, Nadja Klein, Achim Zeileis, and Thorsten Simon. 2024. bamlss: Bayesian Additive Models for Location Scale and Shape (and Beyond). https://CRAN.R-project.org/package=bamlss.