Batchwise backfitting estimation engine for GAMLSS using very large data sets.

## Batchwise backfitting engine.
bbfit(x, y, family, shuffle = TRUE, start = NULL, offset = NULL,
  epochs = 1, nbatch = 10, verbose = TRUE, ...)
## Parallel version.
bbfitp(x, y, family, mc.cores = 1, ...)

## Arguments

x |
For function `bbfit()`, the `x` list as returned from function
`bamlss.frame`, holding all model matrices and other information that is used for
fitting the model. For the updating functions, an object as returned from function
`smooth.construct` or `smoothCon`. |

y |
The model response, as returned from function `bamlss.frame`. |

family |
A bamlss family object, see `family.bamlss`. |

shuffle |
Should observations be shuffled? |

start |
A named numeric vector of possible starting values. The names are based on
function `parameters`. |

offset |
Model offsets to be used in fitting, as
returned from function `bamlss.frame`. |

epochs |
For how many epochs should the algorithm run? |

nbatch |
Number of batches. Can also be a number between 0 and 1, in which case it determines
the fraction of observations that should be used for fitting. |

verbose |
Print information during runtime of the algorithm. |

mc.cores |
On how many cores should estimation be started? |

… |
For `bbfitp()`, all arguments to be passed to `bbfit()`. |
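
To make the interplay of `shuffle`, `nbatch`, and `epochs` more concrete, the following base-R sketch builds batch index sets under assumed semantics: a value of `nbatch` of 1 or more splits the observations into that many roughly equal batches, while a value between 0 and 1 is read as the fraction of observations per batch. The helper `make_batches()` is hypothetical and not part of bamlss; the internals of `bbfit()` may well differ.

```r
## Hypothetical helper, not part of bamlss: build batch index sets.
make_batches <- function(n, nbatch = 10, shuffle = TRUE, epochs = 1) {
  out <- list()
  for (e in seq_len(epochs)) {
    ## Optionally permute the observation indices once per epoch.
    id <- if (shuffle) sample.int(n) else seq_len(n)
    if (nbatch < 1) {
      ## Assumed reading: each batch holds this fraction of the observations.
      size <- max(1L, floor(n * nbatch))
      k <- ceiling(n / size)
      grp <- rep(seq_len(k), each = size)[seq_len(n)]
    } else {
      ## Assumed reading: nbatch roughly equal-sized batches.
      grp <- sort(rep(seq_len(nbatch), length.out = n))
    }
    ## Append this epoch's batches to the overall list.
    out <- c(out, unname(split(id, grp)))
  }
  out
}

set.seed(1)
batches <- make_batches(n = 27000, nbatch = 10, epochs = 2)
length(batches)          # number of batches over all epochs
range(lengths(batches))  # smallest and largest batch size
```

Under these assumptions, every observation is visited once per epoch, and reshuffling per epoch changes which observations end up in the same batch.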

## Details

The algorithm uses batch-wise estimation of smoothing variances, which are estimated on a
hold-out batch. This way, models for very large data sets can be estimated. Note that the algorithm
only works in combination with the ff and ffbase packages. The data needs to be stored
as a comma-separated file on disc, see the example.
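
The hold-out idea can be illustrated with a deliberately simplified toy in base R. This is not the `bbfit()` algorithm: it swaps in ridge-penalized least squares with a grid search over the penalty, re-estimates the coefficients on one batch, and picks the penalty by prediction error on the next batch. All object names (`lambdas`, `beta`, `lambda_hat`, and so on) are made up for illustration.

```r
## Toy illustration only: penalized least squares, not the bbfit() algorithm.
set.seed(123)
n <- 2000
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

## Polynomial basis standing in for a smooth term's design matrix.
X <- cbind(1, poly(x, degree = 10))
P <- diag(c(0, rep(1, 10)))  # ridge penalty, intercept unpenalized
lambdas <- 10^seq(-4, 2, length.out = 13)

## Split the shuffled observations into 10 batches.
batches <- split(sample.int(n), rep(1:10, length.out = n))
beta <- rep(0, ncol(X))
for (i in seq_len(length(batches) - 1)) {
  fit <- batches[[i]]       # batch used to estimate coefficients
  hold <- batches[[i + 1]]  # next batch serves as hold-out
  XtX <- crossprod(X[fit, ])
  Xty <- crossprod(X[fit, ], y[fit])
  ## Candidate coefficients, one set per penalty value.
  cand <- lapply(lambdas, function(l) solve(XtX + l * P, Xty))
  ## Keep the candidate with the smallest hold-out squared error.
  err <- sapply(cand, function(b) mean((y[hold] - X[hold, ] %*% b)^2))
  beta <- cand[[which.min(err)]]
}
lambda_hat <- lambdas[which.min(err)]  # penalty chosen in the last step
mse <- mean((y - X %*% beta)^2)        # fit quality on all observations
```

Only one batch and its hold-out neighbour are touched per step, which is what keeps the memory footprint small when the full data set lives on disc.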

## Value

For function `bbfit()`, a list containing the following objects:

fitted.values |
A named list of the fitted values of the modeled parameters
of the selected distribution. |

parameters |
The estimated set of regression coefficients and smoothing variances. |

shuffle |
Logical. |

runtime |
The runtime of the algorithm. |

## Examples

# NOT RUN {
## Load the package.
library("bamlss")
## Simulate data.
set.seed(123)
d <- GAMart(n = 27000, sd = -1)
## Write data to disc.
tf <- tempdir()
write.table(d, file.path(tf, "d.raw"), quote = FALSE, row.names = FALSE, sep = ",")
## Estimation using batch-wise backfitting.
f <- list(
  num ~ s(x1,k=40) + s(x2,k=40) + s(x3,k=40) + te(lon,lat,k=10),
  sigma ~ s(x1,k=40) + s(x2,k=40) + s(x3,k=40) + te(lon,lat,k=10)
)
b <- bamlss(f, data = file.path(tf, "d.raw"), optimizer = bbfit,
  sampler = FALSE, nbatch = 10, epochs = 2, loglik = TRUE)
## Show estimated effects.
plot(b)
# }
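
The example stores the data as a comma-separated file because estimation works on pieces of the data rather than on one in-memory data frame. As a rough base-R sketch of that idea (it does not use ff or ffbase, and it is not how bamlss reads the file), the following reads such a file block by block through an open connection. The file name `d_small.raw` and the block size of 250 rows are arbitrary choices for illustration.

```r
## Write a small comma-separated file, in the same way as the example.
set.seed(1)
d <- data.frame(num = rnorm(1000), x1 = runif(1000))
path <- file.path(tempdir(), "d_small.raw")
write.table(d, path, quote = FALSE, row.names = FALSE, sep = ",")

## Read the file in blocks of 250 rows through an open connection.
con <- file(path, open = "r")
header <- strsplit(readLines(con, n = 1), ",")[[1]]
nrows <- 0
total <- 0
repeat {
  block <- tryCatch(
    read.csv(con, header = FALSE, col.names = header, nrows = 250),
    error = function(e) NULL  # read.csv() errors once no lines are left
  )
  if (is.null(block)) break
  ## Accumulate summaries without keeping earlier blocks in memory.
  nrows <- nrows + nrow(block)
  total <- total + sum(block$num)
  if (nrow(block) < 250) break
}
close(con)
```

Because the connection stays open between calls, each `read.csv()` continues where the previous one stopped, so only one block is held in memory at a time.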