% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/future_by.R
\name{future_by}
\alias{future_by}
\title{Apply a Function to a Data Frame Split by Factors via Futures}
\usage{
future_by(
  data,
  INDICES,
  FUN,
  ...,
  simplify = TRUE,
  future.envir = parent.frame()
)
}
\arguments{
\item{data}{An \R object, normally a data frame, possibly a matrix.}

\item{INDICES}{A factor or a list of factors, each of length \code{nrow(data)}.}

\item{FUN}{A function to be applied to (usually data-frame) subsets of \code{data}.}

\item{simplify}{logical: see \link[base:tapply]{base::tapply}.}

\item{future.envir}{An \link{environment} passed as argument \code{envir} to
\code{\link[future:future]{future::future()}} as-is.}

\item{\ldots}{Additional arguments passed to \code{\link[=future_lapply]{future_lapply()}} and
then to \code{FUN()}.}
}
\value{
An object of class "by", giving the results for each subset.
This is always a list if \code{simplify} is \code{FALSE}, otherwise a list
or array (see \link[base:tapply]{base::tapply}).
See also \code{\link[base:by]{base::by()}} for details.
}
\description{
Apply a Function to a Data Frame Split by Factors via Futures
}
\details{
Internally, \code{data} is grouped by \code{INDICES} into a list of \code{data}
subset elements which is then processed by \code{\link[=future_lapply]{future_lapply()}}.
When the groups differ significantly in size, the processing time
may differ significantly between the groups.
To correct for processing-time imbalances, adjust the amount of chunking
via arguments \code{future.scheduling} and \code{future.chunk.size}.
}
\section{Note on 'stringsAsFactors'}{

The \code{future_by()} function is modeled as closely as possible on the
behavior of \code{base::by()}.  Both functions have "default" S3 methods that
call \code{data <- as.data.frame(data)} internally.  This call may in turn call
an S3 method for \code{as.data.frame()} that coerces strings to factors or not
depending on whether it has a \code{stringsAsFactors} argument and what its
default is.
For example, the S3 method of \code{as.data.frame()} for lists changed its
(effective) default from \code{stringsAsFactors = TRUE} to
\code{stringsAsFactors = FALSE} in R 4.0.0.
}

\examples{
## ---------------------------------------------------------
## by()
## ---------------------------------------------------------
library(datasets) ## warpbreaks
library(stats)    ## lm()

y0 <- by(warpbreaks, warpbreaks[,"tension"],
         function(x) lm(breaks ~ wool, data = x))

plan(multisession)
y1 <- future_by(warpbreaks, warpbreaks[,"tension"],
                function(x) lm(breaks ~ wool, data = x))

plan(sequential)
y2 <- future_by(warpbreaks, warpbreaks[,"tension"],
                function(x) lm(breaks ~ wool, data = x))
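
## ---------------------------------------------------------
## Adjusting the chunking (illustrative sketch)
## ---------------------------------------------------------
## Assumption: 'future.chunk.size' is forwarded via '...' to
## future_lapply(), as described in the Details section.
## A chunk size of 1 should place each group in its own future,
## which may help when the groups differ a lot in size.
y3 <- future_by(warpbreaks, warpbreaks[,"tension"],
                function(x) lm(breaks ~ wool, data = x),
                future.chunk.size = 1L)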
}
