biglm.big.matrix, bigglm.big.matrix {bigmemory}    R Documentation

Use Thomas Lumley's “biglm” package with a “big.matrix”

Description

These functions are wrappers for biglm and bigglm from Thomas Lumley's biglm package, allowing models to be fitted to data stored in big.matrix objects.

Usage

biglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL, 
  getNextChunkFunc=NULL)
bigglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL,
  getNextChunkFunc=NULL)

Arguments

formula             a model formula.
data                a big.matrix or data.frame object.
chunksize           an integer giving the maximum number of rows in each chunk of data processed iteratively; if this argument is not given, a default is supplied (see Details).
fc                  a character vector naming the variables in data that should be treated as factors.
getNextChunkFunc    a function which generates the set of row indices for the next chunk; if this argument is not given, a suitable default is supplied.
...                 other parameters, passed on to biglm and bigglm; see the documentation of those functions for what is supported.
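
The idea behind getNextChunkFunc can be sketched in base R as a closure that hands out consecutive blocks of row indices. This is only an illustration: the helper name make_chunk_indices and its reset argument are assumptions (the latter borrowed from the convention bigglm uses for data-supplying functions), and the exact signature expected by getNextChunkFunc is not stated on this page.

```r
# Hypothetical helper, not part of bigmemory or biglm: returns a function
# that yields at most `chunksize` consecutive row indices per call.
make_chunk_indices <- function(n, chunksize) {
  start <- 1L
  function(reset = FALSE) {
    if (reset) {
      start <<- 1L                 # rewind to the first row
      return(NULL)
    }
    if (start > n) return(NULL)    # all rows have been handed out
    idx <- seq(start, min(start + chunksize - 1L, n))
    start <<- start + chunksize
    idx
  }
}

next_chunk <- make_chunk_indices(n = 10L, chunksize = 4L)
chunks <- list()
while (!is.null(idx <- next_chunk())) {
  chunks[[length(chunks) + 1L]] <- idx
}
chunks   # three chunks: rows 1-4, 5-8 and 9-10
```

Returning NULL to signal the end of the data keeps the calling loop simple, and the reset flag lets a fitting routine make several passes over the rows.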

Details

See the biglm package for more information on the underlying fitting functions. If chunksize is NULL, it defaults to
floor(nrow(data)/ncol(data)^2).
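
As a quick check of what this default implies, the formula can be evaluated directly. The helper name default_chunksize is hypothetical and simply restates the expression above; it is not a function exported by bigmemory.

```r
# Hypothetical helper restating the documented default:
# floor(nrow(data) / ncol(data)^2)
default_chunksize <- function(nr, nc) floor(nr / nc^2)

small <- default_chunksize(nrow(iris), ncol(iris))  # 150 rows, 5 columns
large <- default_chunksize(1e6, 10)                 # 1e6 rows, 10 columns
small   # 6
large   # 10000
```

Taken literally, the formula yields very small chunks for short, wide data (and 0 when nrow(data) is less than ncol(data)^2), so passing chunksize explicitly may be prudent in such cases.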

Value

An object of class biglm.

Author(s)

Michael J. Kane

References

Algorithm AS274. Applied Statistics (1992), Vol. 41, No. 2.

Thomas Lumley (2005). biglm: bounded memory linear and generalized linear models. R package version 0.7.

See Also

biglm, big.matrix

Examples

# This example is deliberately simple and uses the iris data, which is far
# too small to need a big.matrix. It shows that our wrapper to Lumley's
# biglm() function produces the same answer as the plain old lm() function.

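## Not run: 
library(bigmemory)
library(biglm)

# A big.matrix holds a single type, so the Species factor is stored
# as its integer codes (1, 2, 3) alongside the numeric columns.
x <- matrix(unlist(iris), ncol=5)
colnames(x) <- names(iris)
x <- as.big.matrix(x)
head(x)

# fc tells the wrapper to treat the Species codes as a factor.
silly.biglm <- biglm.big.matrix(Sepal.Length ~ Sepal.Width + Species, data=x, fc="Species")
summary(silly.biglm)

# Same data as an ordinary data.frame, for comparison with lm().
y <- data.frame(x[,])
y$Species <- as.factor(y$Species)
head(y)

silly.lm <- lm(Sepal.Length ~ Sepal.Width + Species, data=y)
summary(silly.lm)
## End(Not run)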
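
# The following base-R sketch illustrates why chunk-wise processing can
# reproduce the lm() answer. It is an assumption-laden illustration, not the
# method used by biglm: it accumulates the normal equations (X'X and X'y)
# chunk by chunk, whereas biglm uses the incremental QR updates of AS274.
# The chunk size of 40 is arbitrary.

```r
# Build the design matrix once; only one chunk of rows is used at a time.
X <- model.matrix(Sepal.Length ~ Sepal.Width + Species, data = iris)
y <- iris$Sepal.Length

chunksize <- 40L
starts <- seq(1L, nrow(X), by = chunksize)
XtX <- matrix(0, ncol(X), ncol(X))
Xty <- numeric(ncol(X))

for (s in starts) {
  rows <- s:min(s + chunksize - 1L, nrow(X))
  Xc <- X[rows, , drop = FALSE]
  XtX <- XtX + crossprod(Xc)            # running sum of X'X
  Xty <- Xty + crossprod(Xc, y[rows])   # running sum of X'y
}

# Solve the accumulated normal equations and compare with lm().
beta_chunked <- drop(solve(XtX, Xty))
names(beta_chunked) <- colnames(X)
beta_lm <- coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
all.equal(beta_chunked, beta_lm)
```

# Only the small p-by-p and p-by-1 summaries are kept between chunks, which
# is what bounds the memory use; the QR approach in biglm achieves the same
# bound and is generally more stable numerically.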

[Package bigmemory version 3.10 Index]