Title: | Bayesian Regression Models using 'Stan' |
---|---|
Description: | Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>. |
Authors: | Paul-Christian Bürkner [aut, cre], Jonah Gabry [ctb], Sebastian Weber [ctb], Andrew Johnson [ctb], Martin Modrak [ctb], Hamada S. Badr [ctb], Frank Weber [ctb], Aki Vehtari [ctb], Mattan S. Ben-Shachar [ctb], Hayden Rabel [ctb], Simon C. Mills [ctb], Stephen Wild [ctb], Ven Popov [ctb], Ioannis Kosmidis [ctb] |
Maintainer: | Paul-Christian Bürkner <[email protected]> |
License: | GPL-2 |
Version: | 2.22.8 |
Built: | 2024-12-17 22:28:09 UTC |
Source: | https://github.com/paul-buerkner/brms |
The brms package provides an interface to fit Bayesian generalized multivariate (non-)linear multilevel models using Stan, which is a C++ package for obtaining full Bayesian inference (see https://mc-stan.org/). The formula syntax is an extended version of the syntax applied in the lme4 package to provide a familiar and simple interface for performing regression analyses.
The main function of brms is brm, which uses formula syntax to specify a wide range of complex Bayesian models (see brmsformula for details). Based on the supplied formulas, data, and additional information, it writes the Stan code on the fly via stancode, prepares the data via standata, and fits the model using Stan.
Subsequently, a large number of post-processing methods can be applied: To get an overview of the estimated parameters, summary or conditional_effects are well suited. Detailed visual analyses can be performed by applying the pp_check and mcmc_plot (formerly stanplot) methods, which both rely on the bayesplot package. Model comparisons can be done via loo and waic, which make use of the loo package, as well as via bayes_factor, which relies on the bridgesampling package. For a full list of methods to apply, type methods(class = "brmsfit").
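The code-generation and data-preparation steps described above can be inspected without compiling or fitting anything. The following is a minimal sketch, assuming brms (version 2.21 or later, where stancode and standata accept a formula) is installed; the formula is only an illustration based on the epilepsy data shipped with the package.

```r
library(brms)

# Illustrative model: Poisson regression with a varying intercept per patient
f <- count ~ zAge + zBase * Trt + (1 | patient)

# Generate the Stan program that brm() would compile (nothing is compiled here)
code <- stancode(f, data = epilepsy, family = poisson())

# Prepare the data list that would be passed to Stan
sdata <- standata(f, data = epilepsy, family = poisson())

# Peek at the start of the generated Stan code and at the number of observations
cat(substr(as.character(code), 1, 300), "\n")
sdata$N
```

Inspecting these two objects before calling brm is a cheap way to check that the formula was interpreted as intended.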
Because brms is based on Stan, a C++ compiler is required. The program Rtools (available on https://cran.r-project.org/bin/windows/Rtools/) comes with a C++ compiler for Windows. On Mac, you should use Xcode. For further instructions on how to get the compilers running, see the prerequisites section at the RStan-Getting-Started page.
When comparing other packages fitting multilevel models to brms, keep in mind that the latter needs to compile models before actually fitting them, which will require between 20 and 40 seconds depending on your machine, operating system and overall model complexity.
Thus, fitting smaller models may be relatively slow as compilation time makes up the majority of the whole running time. For larger or more complex models, however, fitting may take several minutes or even hours, so the compilation time won't make much of a difference for these models.
See vignette("brms_overview") and vignette("brms_multilevel") for a general introduction and overview of brms. For a full list of available vignettes, type vignette(package = "brms").
Maintainer: Paul-Christian Bürkner [email protected]
Other contributors:
Jonah Gabry [contributor]
Sebastian Weber [contributor]
Andrew Johnson [contributor]
Martin Modrak [contributor]
Hamada S. Badr [contributor]
Frank Weber [contributor]
Aki Vehtari [contributor]
Mattan S. Ben-Shachar [contributor]
Hayden Rabel [contributor]
Simon C. Mills [contributor]
Stephen Wild [contributor]
Ven Popov [contributor]
Ioannis Kosmidis [contributor]
Paul-Christian Buerkner (2017). brms: An R Package for Bayesian Multilevel
Models Using Stan. Journal of Statistical Software, 80(1), 1-28.
doi:10.18637/jss.v080.i01
Paul-Christian Buerkner (2018). Advanced Bayesian Multilevel Modeling
with the R Package brms. The R Journal. 10(1), 395–411.
doi:10.32614/RJ-2018-017
The Stan Development Team. Stan Modeling Language User's Guide and Reference Manual. https://mc-stan.org/users/documentation/.
Stan Development Team (2020). RStan: the R interface to Stan. R package version 2.21.2. https://mc-stan.org/
brm, brmsformula, brmsfamily, brmsfit
Add model fit criteria to model objects
add_criterion(x, ...)

## S3 method for class 'brmsfit'
add_criterion(
  x,
  criterion,
  model_name = NULL,
  overwrite = FALSE,
  file = NULL,
  force_save = FALSE,
  ...
)
x |
An R object typically of class |
... |
Further arguments passed to the underlying
functions computing the model fit criteria. If you are recomputing
an already stored criterion with other |
criterion |
Names of model fit criteria
to compute. Currently supported are |
model_name |
Optional name of the model. If |
overwrite |
Logical; indicates whether already stored fit indices should be overwritten. Defaults to FALSE. |
file |
Either |
force_save |
Logical; only relevant if |
Functions add_loo and add_waic are aliases of add_criterion with fixed values for the criterion argument.
An object of the same class as x, but with model fit criteria added for later usage.
## Not run:
fit <- brm(count ~ Trt, data = epilepsy)
# add both LOO and WAIC at once
fit <- add_criterion(fit, c("loo", "waic"))
print(fit$criteria$loo)
print(fit$criteria$waic)

## End(Not run)
Deprecated aliases of add_criterion.
add_loo(x, model_name = NULL, ...)

add_waic(x, model_name = NULL, ...)

add_ic(x, ...)

## S3 method for class 'brmsfit'
add_ic(x, ic = "loo", model_name = NULL, ...)

add_ic(x, ...) <- value
x |
An R object typically of class |
model_name |
Optional name of the model. If |
... |
Further arguments passed to the underlying
functions computing the model fit criteria. If you are recomputing
an already stored criterion with other |
ic , value
|
Names of model fit criteria
to compute. Currently supported are |
An object of the same class as x, but with model fit criteria added for later usage. Previously computed criterion objects will be overwritten.
Compile a stanmodel and add it to a brmsfit object. This enables some advanced functionality of rstan, most notably log_prob and friends, to be used with brms models fitted with other Stan backends.
add_rstan_model(x, overwrite = FALSE)
x |
A |
overwrite |
Logical. If |
A (possibly updated) brmsfit object.
Provide additional information on the response variable in brms models, such as censoring, truncation, or known measurement error. Detailed documentation on the use of each of these functions can be found in the Details section of brmsformula (under "Additional response information").
resp_se(x, sigma = FALSE)
resp_weights(x, scale = FALSE)
resp_trials(x)
resp_thres(x, gr = NA)
resp_cat(x)
resp_dec(x)
resp_bhaz(gr = NA, df = 5, ...)
resp_cens(x, y2 = NA)
resp_trunc(lb = -Inf, ub = Inf)
resp_mi(sdy = NA)
resp_index(x)
resp_rate(denom)
resp_subset(x)
resp_vreal(...)
resp_vint(...)
x |
A vector; Ideally a single variable defined in the data (see
Details). Allowed values depend on the function: |
sigma |
Logical; Indicates whether the residual standard deviation
parameter |
scale |
Logical; indicates whether weights should be scaled so that the average weight equals one. Defaults to FALSE. |
gr |
A vector of grouping indicators. |
df |
Degrees of freedom of baseline hazard splines for Cox models. |
... |
For |
y2 |
A vector specifying the upper bounds in interval censoring.
Will be ignored for non-interval censored observations. However, it
should NOT be |
lb |
A numeric vector or single numeric value specifying the lower truncation bound. |
ub |
A numeric vector or single numeric value specifying the upper truncation bound. |
sdy |
Optional known measurement error of the response treated as standard deviation. If specified, handles measurement error and (completely) missing values at the same time using the plausible-values-technique. |
denom |
A vector of positive numeric values specifying the denominator values from which the response rates are computed. |
These functions are almost solely useful when called in formulas passed to the brms package. Within formulas, the resp_ prefix may be omitted. More information is given in the 'Details' section of brmsformula (under "Additional response information").
It is highly recommended to use a single data variable as input for x (instead of a more complicated expression) to make sure all post-processing functions work as expected.
A list of additional response information to be processed further by brms.
## Not run:
## Random effects meta-analysis
nstudies <- 20
true_effects <- rnorm(nstudies, 0.5, 0.2)
sei <- runif(nstudies, 0.05, 0.3)
outcomes <- rnorm(nstudies, true_effects, sei)
data1 <- data.frame(outcomes, sei)
fit1 <- brm(outcomes | se(sei, sigma = TRUE) ~ 1,
            data = data1)
summary(fit1)

## Probit regression using the binomial family
n <- sample(1:10, 100, TRUE)  # number of trials
success <- rbinom(100, size = n, prob = 0.4)
x <- rnorm(100)
data2 <- data.frame(n, success, x)
fit2 <- brm(success | trials(n) ~ x, data = data2,
            family = binomial("probit"))
summary(fit2)

## Survival regression modeling the time between the first
## and second recurrence of an infection in kidney patients.
fit3 <- brm(time | cens(censored) ~ age * sex + disease + (1|patient),
            data = kidney, family = lognormal())
summary(fit3)

## Poisson model with truncated counts
fit4 <- brm(count | trunc(ub = 104) ~ zBase * Trt,
            data = epilepsy, family = poisson())
summary(fit4)

## End(Not run)
Set up an autoregressive (AR) term of order p in brms. The function does not evaluate its arguments – it exists purely to help set up a model with AR terms.
ar(time = NA, gr = NA, p = 1, cov = FALSE)
time |
An optional time variable specifying the time ordering of the observations. By default, the existing order of the observations in the data is used. |
gr |
An optional grouping variable. If specified, the correlation structure is assumed to apply only to observations within the same grouping level. |
p |
A non-negative integer specifying the autoregressive (AR) order of the ARMA structure. Default is 1. |
cov |
A flag indicating whether ARMA effects should be estimated by
means of residual covariance matrices. This is currently only possible for
stationary ARMA effects of order 1. If the model family does not have
natural residuals, latent residuals are added automatically. If
|
An object of class 'arma_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
## Not run:
data("LakeHuron")
LakeHuron <- as.data.frame(LakeHuron)
fit <- brm(x ~ ar(p = 2), data = LakeHuron)
summary(fit)

## End(Not run)
Set up an autoregressive moving average (ARMA) term of order (p, q) in brms. The function does not evaluate its arguments – it exists purely to help set up a model with ARMA terms.
arma(time = NA, gr = NA, p = 1, q = 1, cov = FALSE)
time |
An optional time variable specifying the time ordering of the observations. By default, the existing order of the observations in the data is used. |
gr |
An optional grouping variable. If specified, the correlation structure is assumed to apply only to observations within the same grouping level. |
p |
A non-negative integer specifying the autoregressive (AR) order of the ARMA structure. Default is 1. |
q |
A non-negative integer specifying the moving average (MA) order of the ARMA structure. Default is 1. |
cov |
A flag indicating whether ARMA effects should be estimated by
means of residual covariance matrices. This is currently only possible for
stationary ARMA effects of order 1. If the model family does not have
natural residuals, latent residuals are added automatically. If
|
An object of class 'arma_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
autocor-terms, ar, ma
## Not run:
data("LakeHuron")
LakeHuron <- as.data.frame(LakeHuron)
fit <- brm(x ~ arma(p = 2, q = 1), data = LakeHuron)
summary(fit)

## End(Not run)
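To get a feel for the kind of serial dependence an arma(p = 1, q = 1) term is meant to capture, one can simulate a series with known AR and MA coefficients. The sketch below uses only base R; the maximum-likelihood fit via stats::arima is a non-Bayesian stand-in for illustration, not what brms does internally, and the coefficient values are arbitrary.

```r
set.seed(123)

# Simulate a stationary ARMA(1, 1) series with known coefficients
y <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 500)

# Maximum-likelihood fit with base R (stand-in, not brms)
fit_ml <- arima(y, order = c(1, 0, 1))
coef(fit_ml)

# Data frame layout that could be passed on to brm(), for example as
# brm(y ~ arma(time = time, p = 1, q = 1), data = dat)
dat <- data.frame(y = as.numeric(y), time = seq_along(y))
head(dat)
```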
Try to transform an object into a brmsprior object.
as.brmsprior(x)
x |
An object to be transformed. |
A brmsprior object if the transformation was possible.
Extract posterior draws in conventional formats as data.frames, matrices, or arrays.
## S3 method for class 'brmsfit'
as.data.frame(
  x,
  row.names = NULL,
  optional = TRUE,
  pars = NA,
  variable = NULL,
  draw = NULL,
  subset = NULL,
  ...
)

## S3 method for class 'brmsfit'
as.matrix(x, pars = NA, variable = NULL, draw = NULL, subset = NULL, ...)

## S3 method for class 'brmsfit'
as.array(x, pars = NA, variable = NULL, draw = NULL, subset = NULL, ...)
x |
A |
row.names , optional
|
Unused and only added for consistency with
the |
pars |
Deprecated alias of |
variable |
A character vector providing the variables to extract. By default, all variables are extracted. |
draw |
The draw indices to be selected. Subsetting draw indices will lead to an automatic merging of chains. |
subset |
Deprecated alias of |
... |
Further arguments to be passed to the corresponding
|
A data.frame, matrix, or array containing the posterior draws.
The as.mcmc method is deprecated. We recommend using the more modern and consistent as_draws_* extractor functions of the posterior package instead.
## S3 method for class 'brmsfit'
as.mcmc(
  x,
  pars = NA,
  fixed = FALSE,
  combine_chains = FALSE,
  inc_warmup = FALSE,
  ...
)
x |
An |
pars |
Names of parameters for which posterior samples should be returned, as given by a character vector or regular expressions. By default, all posterior samples of all parameters are extracted. |
fixed |
Indicates whether parameter names
should be matched exactly ( |
combine_chains |
Indicates whether chains should be combined. |
inc_warmup |
Indicates if the warmup samples should be included. Default is FALSE. |
... |
currently unused |
If combine_chains = TRUE an mcmc object is returned. If combine_chains = FALSE an mcmc.list object is returned.
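As a rough illustration of the as_draws_* extractors recommended in place of as.mcmc, the sketch below works on the example draws shipped with the posterior package, so it runs without a fitted model. With a fitted brms model, the fit object would take the place of `draws`; that substitution is an assumption not shown here.

```r
library(posterior)

# Example draws shipped with posterior; a stand-in for draws from a brmsfit
draws <- example_draws()

# Data frame format: one row per draw, plus .chain/.iteration/.draw columns
df <- as_draws_df(draws)

# Matrix format: draws in rows, variables in columns (chains are merged)
mat <- as_draws_matrix(draws)

# Restrict to selected variables and summarise them
mu_tau <- subset_draws(draws, variable = c("mu", "tau"))
summarise_draws(mu_tau)
```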
Density, distribution function, quantile function and random generation for the asymmetric Laplace distribution with location mu, scale sigma and asymmetry parameter quantile.
dasym_laplace(x, mu = 0, sigma = 1, quantile = 0.5, log = FALSE)

pasym_laplace(
  q,
  mu = 0,
  sigma = 1,
  quantile = 0.5,
  lower.tail = TRUE,
  log.p = FALSE
)

qasym_laplace(
  p,
  mu = 0,
  sigma = 1,
  quantile = 0.5,
  lower.tail = TRUE,
  log.p = FALSE
)

rasym_laplace(n, mu = 0, sigma = 1, quantile = 0.5)
x , q
|
Vector of quantiles. |
mu |
Vector of locations. |
sigma |
Vector of scales. |
quantile |
Asymmetry parameter corresponding to quantiles in quantile regression (hence the name). |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
p |
Vector of probabilities. |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families") for details on the parameterization.
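To illustrate the role of the quantile parameter, the following base R sketch re-implements the density under the parameterization commonly used for quantile regression. Both the helper name asym_laplace_density and the assumption that this matches the brms parameterization are illustrative; the vignette remains the authoritative reference.

```r
# Hypothetical helper: asymmetric Laplace density with location mu,
# scale sigma and asymmetry parameter quantile (assumed parameterization)
asym_laplace_density <- function(x, mu = 0, sigma = 1, quantile = 0.5) {
  z <- (x - mu) / sigma
  # check function of quantile regression: rho(z) = z * (quantile - I(z < 0))
  rho <- z * (quantile - (z < 0))
  quantile * (1 - quantile) / sigma * exp(-rho)
}

# Probability mass below and above the location parameter
below_mu <- integrate(asym_laplace_density, -Inf, 1,
                      mu = 1, sigma = 2, quantile = 0.25)$value
above_mu <- integrate(asym_laplace_density, 1, Inf,
                      mu = 1, sigma = 2, quantile = 0.25)$value

# The mass below mu equals the quantile parameter, and the total mass is one
c(below_mu = below_mu, total = below_mu + above_mu)
```

Under this parameterization, mu is the quantile of the distribution at probability `quantile`, which is why the distribution serves as a working likelihood for quantile regression.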
Specify autocorrelation terms in brms models. Currently supported terms are arma, ar, ma, cosy, unstr, sar, car, and fcor. Terms can be directly specified within the formula, or passed to the autocor argument of brmsformula in the form of a one-sided formula. For deprecated ways of specifying autocorrelation terms, see cor_brms.
The autocor term functions are almost solely useful when called in formulas passed to the brms package. They do not evaluate their arguments – they exist purely to help set up a model with autocorrelation terms.
brmsformula, acformula, arma, ar, ma, cosy, unstr, sar, car, fcor
# specify autocor terms within the formula
y ~ x + arma(p = 1, q = 1) + car(M)

# specify autocor terms in the 'autocor' argument
bf(y ~ x, autocor = ~ arma(p = 1, q = 1) + car(M))

# specify autocor terms via 'acformula'
bf(y ~ x) + acformula(~ arma(p = 1, q = 1) + car(M))
(Deprecated) Extract Autocorrelation Objects
## S3 method for class 'brmsfit'
autocor(object, resp = NULL, ...)

autocor(object, ...)
object |
An object of class |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
... |
Currently unused. |
A cor_brms object or a list of such objects for multivariate models. Not supported for models fitted with brms 2.11.1 or higher.
Compute Bayes factors from marginal likelihoods.
## S3 method for class 'brmsfit' bayes_factor(x1, x2, log = FALSE, ...)
x1 |
A |
x2 |
Another |
log |
Report Bayes factors on the log-scale? |
... |
Additional arguments passed to
|
Computing the marginal likelihood requires samples of all variables defined in Stan's parameters block to be saved. Otherwise bayes_factor cannot be computed. Thus, please set save_pars = save_pars(all = TRUE) in the call to brm if you are planning to apply bayes_factor to your models (the older save_all_pars argument is deprecated).
The computation of Bayes factors based on bridge sampling requires a lot more posterior samples than usual. A good conservative rule of thumb is perhaps 10-fold more samples (read: the default of 4000 samples may not be enough in many cases). If not enough posterior samples are provided, the bridge sampling algorithm tends to be unstable, leading to considerably different results each time it is run. We thus recommend running bayes_factor multiple times to check the stability of the results.
More details are provided under bridgesampling::bayes_factor.
## Not run:
# model with the treatment effect
fit1 <- brm(
  count ~ zAge + zBase + Trt,
  data = epilepsy, family = negbinomial(),
  prior = prior(normal(0, 1), class = b),
  save_pars = save_pars(all = TRUE)
)
summary(fit1)

# model without the treatment effect
fit2 <- brm(
  count ~ zAge + zBase,
  data = epilepsy, family = negbinomial(),
  prior = prior(normal(0, 1), class = b),
  save_pars = save_pars(all = TRUE)
)
summary(fit2)

# compute the bayes factor
bayes_factor(fit1, fit2)

## End(Not run)
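The relation between marginal likelihoods, the Bayes factor, and the log argument can be sketched with plain arithmetic. The log marginal likelihood values below are made up for illustration; in practice they would come from bridge sampling of the two fitted models.

```r
# Hypothetical log marginal likelihoods of two models (made-up numbers)
logml1 <- -652.3
logml2 <- -655.1

# Bayes factor of model 1 over model 2, on the log scale and the natural scale
log_bf <- logml1 - logml2
bf <- exp(log_bf)

# Posterior model probabilities under equal prior model probabilities;
# subtracting the maximum first avoids numerical underflow
post_prob <- exp(c(logml1, logml2) - max(logml1, logml2))
post_prob <- post_prob / sum(post_prob)

round(c(log_bf = log_bf, bf = bf), 2)
round(post_prob, 3)
```

Working on the log scale is the safer choice when marginal likelihoods differ by many orders of magnitude.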
Compute a Bayesian version of R-squared for regression models
## S3 method for class 'brmsfit'
bayes_R2(
  object,
  resp = NULL,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)
object |
An object of class |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
summary |
Should summary statistics be returned instead of the raw values? Default is TRUE. |
robust |
If |
probs |
The percentiles to be computed by the |
... |
Further arguments passed to
|
For an introduction to the approach, see Gelman et al. (2019) and https://github.com/jgabry/bayes_R2/.
If summary = TRUE, an M x C matrix is returned (M = number of response variables and C = length(probs) + 2) containing summary statistics of the Bayesian R-squared values. If summary = FALSE, the posterior draws of the Bayesian R-squared values are returned in an S x M matrix (S is the number of draws).
Andrew Gelman, Ben Goodrich, Jonah Gabry & Aki Vehtari (2019). R-squared for Bayesian regression models. The American Statistician, 73(3), 307-309. doi:10.1080/00031305.2018.1549100 (Preprint available at https://stat.columbia.edu/~gelman/research/published/bayes_R2_v3.pdf)
## Not run:
fit <- brm(mpg ~ wt + cyl, data = mtcars)
summary(fit)
bayes_R2(fit)

# compute R2 with new data
nd <- data.frame(mpg = c(10, 20, 30), wt = c(4, 3, 2), cyl = c(8, 6, 4))
bayes_R2(fit, newdata = nd)

## End(Not run)
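The idea behind the measure can be sketched in base R following the residual-based definition in Gelman et al. (2019): for each posterior draw, the variance of the fitted values is divided by the sum of that variance and the residual variance. The helper bayes_r2_draws is hypothetical, and the "draws" are least-squares coefficients with added noise, not real posterior draws; whether brms computes the quantity in exactly this way is an assumption.

```r
# Hypothetical helper: residual-based Bayesian R-squared per draw.
# `ypred` is an S x N matrix of expected values (draws in rows),
# `y` is the observed response of length N.
bayes_r2_draws <- function(y, ypred) {
  resid <- sweep(ypred, 2, y)  # ypred - y; the sign does not affect the variance
  var_fit <- apply(ypred, 1, var)
  var_res <- apply(resid, 1, var)
  var_fit / (var_fit + var_res)
}

set.seed(1)
fit_lm <- lm(mpg ~ wt + cyl, data = mtcars)
X <- model.matrix(fit_lm)

# Stand-in for posterior draws: coefficients perturbed around the
# least-squares estimate (NOT real posterior draws, illustration only)
S <- 1000
beta <- matrix(coef(fit_lm), nrow = S, ncol = ncol(X), byrow = TRUE) +
  matrix(rnorm(S * ncol(X), sd = 0.05), nrow = S)
ypred <- beta %*% t(X)  # S x N matrix of fitted values

r2 <- bayes_r2_draws(mtcars$mpg, ypred)

# Summary with an estimate, an error, and one quantile per entry of probs
c(Estimate = mean(r2), Est.Error = sd(r2), quantile(r2, c(0.025, 0.975)))
```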
Cumulative density & mass functions, and random number generation for the Beta-binomial distribution using the following re-parameterisation of the Stan Beta-binomial definition:
alpha = mu * phi and beta = (1 - mu) * phi, where
mu is the mean probability of trial success and
phi is the precision (over-dispersion) component.
dbeta_binomial(x, size, mu, phi, log = FALSE)

pbeta_binomial(q, size, mu, phi, lower.tail = TRUE, log.p = FALSE)

rbeta_binomial(n, size, mu, phi)
x , q
|
Vector of quantiles. |
size |
Vector of number of trials (zero or more). |
mu |
Vector of means. |
phi |
Vector of precisions. |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
n |
Number of draws to sample from the distribution. |
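A mean/precision parameterization of the Beta-binomial can be checked numerically in base R. The sketch below assumes the shape parameters are alpha = mu * phi and beta = (1 - mu) * phi; the helper beta_binomial_pmf is hypothetical and only meant to show that mu then acts as the mean success probability.

```r
# Hypothetical helper: beta-binomial pmf in terms of mean mu and precision phi,
# assuming shape parameters alpha = mu * phi and beta = (1 - mu) * phi
beta_binomial_pmf <- function(x, size, mu, phi) {
  a <- mu * phi
  b <- (1 - mu) * phi
  choose(size, x) * beta(x + a, size - x + b) / beta(a, b)
}

size <- 10
x <- 0:size
p <- beta_binomial_pmf(x, size, mu = 0.3, phi = 5)

sum(p)      # probabilities sum to one
sum(x * p)  # mean equals size * mu
```

Larger values of phi pull the distribution towards a plain binomial with success probability mu, while smaller values add over-dispersion.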
Computes log marginal likelihood via bridge sampling, which can be used in the computation of Bayes factors and posterior model probabilities. The brmsfit method is just a thin wrapper around the corresponding method for stanfit objects.
## S3 method for class 'brmsfit'
bridge_sampler(samples, recompile = FALSE, ...)
samples |
A |
recompile |
Logical, indicating whether the Stan model should be recompiled. This may be necessary if you are running bridge sampling on another machine than the one used to fit the model. No recompilation is done by default. |
... |
Additional arguments passed to
|
Computing the marginal likelihood requires samples of all variables defined in Stan's parameters block to be saved. Otherwise bridge_sampler cannot be computed. Thus, please set save_pars = save_pars(all = TRUE) in the call to brm, if you are planning to apply bridge_sampler to your models.
The computation of marginal likelihoods based on bridge sampling requires a lot more posterior draws than usual. A good conservative rule of thumb is perhaps 10-fold more draws (read: the default of 4000 draws may not be enough in many cases). If not enough posterior draws are provided, the bridge sampling algorithm tends to be unstable, leading to considerably different results each time it is run. We thus recommend running bridge_sampler multiple times to check the stability of the results.
More details are provided under bridgesampling::bridge_sampler.
## Not run:
# model with the treatment effect
fit1 <- brm(
  count ~ zAge + zBase + Trt,
  data = epilepsy, family = negbinomial(),
  prior = prior(normal(0, 1), class = b),
  save_pars = save_pars(all = TRUE)
)
summary(fit1)
bridge_sampler(fit1)

# model without the treatment effect
fit2 <- brm(
  count ~ zAge + zBase,
  data = epilepsy, family = negbinomial(),
  prior = prior(normal(0, 1), class = b),
  save_pars = save_pars(all = TRUE)
)
summary(fit2)
bridge_sampler(fit2)

## End(Not run)
Fit Bayesian generalized (non-)linear multivariate multilevel models using Stan for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit – among others – linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, extended-support beta regression, and even self-defined mixture models all in a multilevel context. Further modeling options include non-linear and smooth terms, auto-correlation structures, censored data, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distributions can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. In addition, model fit can easily be assessed and compared with posterior predictive checks and leave-one-out cross-validation.
brm(
  formula,
  data,
  family = gaussian(),
  prior = NULL,
  autocor = NULL,
  data2 = NULL,
  cov_ranef = NULL,
  sample_prior = "no",
  sparse = NULL,
  knots = NULL,
  drop_unused_levels = TRUE,
  stanvars = NULL,
  stan_funs = NULL,
  fit = NA,
  save_pars = getOption("brms.save_pars", NULL),
  save_ranef = NULL,
  save_mevars = NULL,
  save_all_pars = NULL,
  init = NULL,
  inits = NULL,
  chains = 4,
  iter = getOption("brms.iter", 2000),
  warmup = floor(iter/2),
  thin = 1,
  cores = getOption("mc.cores", 1),
  threads = getOption("brms.threads", NULL),
  opencl = getOption("brms.opencl", NULL),
  normalize = getOption("brms.normalize", TRUE),
  control = NULL,
  algorithm = getOption("brms.algorithm", "sampling"),
  backend = getOption("brms.backend", "rstan"),
  future = getOption("future", FALSE),
  silent = 1,
  seed = NA,
  save_model = NULL,
  stan_model_args = list(),
  file = NULL,
  file_compress = TRUE,
  file_refit = getOption("brms.file_refit", "never"),
  empty = FALSE,
  rename = TRUE,
  ...
)
formula |
An object of class |
data |
An object of class |
family |
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a |
prior |
One or more |
autocor |
(Deprecated) An optional |
data2 |
A named |
cov_ranef |
(Deprecated) A list of matrices that are proportional to the
(within) covariance structure of the group-level effects. The names of the
matrices should correspond to columns in |
sample_prior |
Indicate if draws from priors should be drawn
additionally to the posterior draws. Options are |
sparse |
(Deprecated) Logical; indicates whether the population-level
design matrices should be treated as sparse (defaults to |
knots |
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See
|
drop_unused_levels |
Should unused factor levels in the data be dropped? Defaults to TRUE. |
stanvars |
An optional |
stan_funs |
(Deprecated) An optional character string containing
self-defined Stan functions, which will be included in the functions
block of the generated Stan code. It is now recommended to use the
|
fit |
An instance of S3 class |
save_pars |
An object generated by |
save_ranef |
(Deprecated) A flag to indicate if group-level effects for
each level of the grouping factor(s) should be saved (default is
|
save_mevars |
(Deprecated) A flag to indicate if draws of latent
noise-free variables obtained by using |
save_all_pars |
(Deprecated) A flag to indicate if draws from all
variables defined in Stan's |
init |
Initial values for the sampler. If |
inits |
(Deprecated) Alias of |
chains |
Number of Markov chains (defaults to 4). |
iter |
Number of total iterations per chain (including warmup; defaults
to 2000). Can be set globally for the current R session via the
|
warmup |
A positive integer specifying the number of warmup (aka burnin)
iterations. This also specifies the number of iterations used for stepsize
adaptation, so warmup draws should not be used for inference. The number
of warmup iterations should not be larger than |
thin |
Thinning rate. Must be a positive integer. Set |
cores |
Number of cores to use when executing the chains in parallel,
which defaults to 1 but we recommend setting the |
threads |
Number of threads to use in within-chain parallelization. For
more control over the threading process, |
opencl |
The platform and device IDs of the OpenCL device to use for
fitting using GPU support. If you don't know the IDs of your OpenCL device,
|
normalize |
Logical. Indicates whether normalization constants should
be included in the Stan code (defaults to |
control |
A named |
algorithm |
Character string naming the estimation approach to use.
Options are |
backend |
Character string naming the package to use as the backend for
fitting the Stan model. Options are |
future |
Logical; If |
silent |
Verbosity level between |
seed |
The seed for random number generation to make results
reproducible. If |
save_model |
Either |
stan_model_args |
A |
file |
Either |
file_compress |
Logical or a character string, specifying one of the
compression algorithms supported by |
file_refit |
Modifies when the fit stored via the |
empty |
Logical. If |
rename |
For internal use only. |
... |
Further arguments passed to Stan.
For |
Fit a generalized (non-)linear multivariate multilevel model via
full Bayesian inference using Stan. A general overview is provided in the
vignettes vignette("brms_overview") and vignette("brms_multilevel"). For a
full list of available vignettes see vignette(package = "brms").
Formula syntax of brms models
Details of the formula syntax applied in brms can be found in brmsformula.
Families and link functions
Details of families supported by brms can be found in brmsfamily.
Prior distributions
Priors should be specified using the set_prior function. Its documentation
contains detailed information on how to correctly specify priors. To find
out on which parameters or parameter classes priors can be defined, use
default_prior. Default priors are chosen to be non-informative or very
weakly informative so that their influence on the results will be
negligible and you usually don't have to worry about them. However, after
getting more familiar with Bayesian statistics, I recommend that you start
thinking about reasonable informative priors for your model parameters:
Nearly always, there is at least some prior information available that can
be used to improve your inference.
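The workflow described above can be sketched as follows. This is a minimal illustration, assuming the epilepsy data shipped with brms; the specific prior distributions are hypothetical placeholders rather than recommendations. Neither call compiles a Stan model.

```r
library(brms)

# List every parameter (class) that can receive a prior, together with the
# defaults brms would use for this model. Nothing is compiled at this step.
default_priors <- default_prior(
  count ~ zBase * Trt + (1 | patient),
  data = epilepsy, family = poisson()
)
print(default_priors)

# Hypothetical weakly informative choices; replace them with priors that
# reflect your own knowledge about the parameters.
my_priors <- c(
  set_prior("normal(0, 1)", class = "b"),
  set_prior("exponential(1)", class = "sd")
)
print(my_priors)
```

The resulting object can then be passed to brm via its prior argument.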
Adjusting the sampling behavior of Stan
In addition to choosing the number of iterations, warmup draws, and
chains, users can control the behavior of the NUTS sampler by using the
control argument. The most important reason to use control is to decrease
(or eliminate at best) the number of divergent transitions that cause a
bias in the obtained posterior draws. Whenever you see the warning "There
were x divergent transitions after warmup." you should really think about
increasing adapt_delta. To do this, write control = list(adapt_delta = <x>),
where <x> should usually be a value between 0.8 (current default) and 1.
Increasing adapt_delta will slow down the sampler but will decrease the
number of divergent transitions threatening the validity of your posterior
draws.
Another problem arises when the depth of the tree being evaluated in each
iteration is exceeded. This is less common than having divergent
transitions, but may also bias the posterior draws. When it happens, Stan
will issue a warning suggesting that you increase max_treedepth, which can
be accomplished by writing control = list(max_treedepth = <x>) with a
positive integer <x> that should usually be larger than the current default
of 10. For more details on the control argument see stan.
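A minimal sketch of passing both settings at once, assuming the epilepsy model from the examples. The values 0.95 and 12 are illustrative only, and fit_with_control is a hypothetical helper name, not part of brms. The fit is wrapped in a function because calling it compiles and samples the model.

```r
library(brms)

# Stricter NUTS settings than the defaults mentioned above
# (adapt_delta = 0.8, max_treedepth = 10); the values are illustrative.
nuts_control <- list(adapt_delta = 0.95, max_treedepth = 12)

# Hypothetical helper: fits the epilepsy example with the given settings.
# Calling it compiles the Stan model and runs the sampler.
fit_with_control <- function(control = nuts_control) {
  brm(
    count ~ zBase * Trt + (1 | patient),
    data = epilepsy, family = poisson(),
    control = control
  )
}
```

If divergent transitions persist after raising adapt_delta, reparameterizing the model or revisiting the priors is usually the next thing to try.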
An object of class brmsfit, which contains the posterior draws along with
much other useful information about the model. Use
methods(class = "brmsfit") for an overview of available methods.
Paul-Christian Buerkner
Paul-Christian Buerkner (2017). brms: An R Package for Bayesian Multilevel
Models Using Stan. Journal of Statistical Software, 80(1), 1-28.
doi:10.18637/jss.v080.i01
Paul-Christian Buerkner (2018). Advanced Bayesian Multilevel Modeling
with the R Package brms. The R Journal. 10(1), 395–411.
doi:10.32614/RJ-2018-017
brms, brmsformula, brmsfamily, brmsfit
## Not run:
# Poisson regression for the number of seizures in epileptic patients
fit1 <- brm(
  count ~ zBase * Trt + (1|patient),
  data = epilepsy, family = poisson(),
  prior = prior(normal(0, 10), class = b) +
    prior(cauchy(0, 2), class = sd)
)

# generate a summary of the results
summary(fit1)

# plot the MCMC chains as well as the posterior distributions
plot(fit1)

# predict responses based on the fitted model
head(predict(fit1))

# plot conditional effects for each predictor
plot(conditional_effects(fit1), ask = FALSE)

# investigate model fit
loo(fit1)
pp_check(fit1)

# Ordinal regression modeling patient's rating of inhaler instructions
# category specific effects are estimated for variable 'treat'
fit2 <- brm(rating ~ period + carry + cs(treat),
            data = inhaler, family = sratio("logit"),
            prior = set_prior("normal(0,5)"), chains = 2)
summary(fit2)
plot(fit2, ask = FALSE)
WAIC(fit2)

# Survival regression modeling the time between the first
# and second recurrence of an infection in kidney patients.
fit3 <- brm(time | cens(censored) ~ age * sex + disease + (1|patient),
            data = kidney, family = lognormal())
summary(fit3)
plot(fit3, ask = FALSE)
plot(conditional_effects(fit3), ask = FALSE)

# Probit regression using the binomial family
ntrials <- sample(1:10, 100, TRUE)
success <- rbinom(100, size = ntrials, prob = 0.4)
x <- rnorm(100)
data4 <- data.frame(ntrials, success, x)
fit4 <- brm(success | trials(ntrials) ~ x, data = data4,
            family = binomial("probit"))
summary(fit4)

# Non-linear Gaussian model
fit5 <- brm(
  bf(cum ~ ult * (1 - exp(-(dev/theta)^omega)),
     ult ~ 1 + (1|AY), omega ~ 1, theta ~ 1,
     nl = TRUE),
  data = loss, family = gaussian(),
  prior = c(
    prior(normal(5000, 1000), nlpar = "ult"),
    prior(normal(1, 2), nlpar = "omega"),
    prior(normal(45, 10), nlpar = "theta")
  ),
  control = list(adapt_delta = 0.9)
)
summary(fit5)
conditional_effects(fit5)

# Normal model with heterogeneous variances
data_het <- data.frame(
  y = c(rnorm(50), rnorm(50, 1, 2)),
  x = factor(rep(c("a", "b"), each = 50))
)
fit6 <- brm(bf(y ~ x, sigma ~ 0 + x), data = data_het)
summary(fit6)
plot(fit6)
conditional_effects(fit6)

# extract estimated residual SDs of both groups
sigmas <- exp(as.data.frame(fit6, variable = "^b_sigma_", regex = TRUE))
# ggplot2 is imported but not attached by brms, so attach it for the plot
library(ggplot2)
ggplot(stack(sigmas), aes(values)) +
  geom_density(aes(fill = ind))

# Quantile regression predicting the 25%-quantile
fit7 <- brm(bf(y ~ x, quantile = 0.25), data = data_het,
            family = asym_laplace())
summary(fit7)
conditional_effects(fit7)

# use the future package for more flexible parallelization
library(future)
plan(multisession, workers = 4)
fit7 <- update(fit7, future = TRUE)

# fit a model manually via rstan
scode <- stancode(count ~ Trt, data = epilepsy)
sdata <- standata(count ~ Trt, data = epilepsy)
stanfit <- rstan::stan(model_code = scode, data = sdata)

# feed the Stan model back into brms
fit8 <- brm(count ~ Trt, data = epilepsy, empty = TRUE)
fit8$fit <- stanfit
fit8 <- rename_pars(fit8)
summary(fit8)
## End(Not run)
Run the same brms model on multiple datasets and then combine the results into one fitted model object. This is useful in particular for multiple missing value imputation, where the same model is fitted on multiple imputed data sets. Models can be run in parallel using the future package.
brm_multiple( formula, data, family = gaussian(), prior = NULL, data2 = NULL, autocor = NULL, cov_ranef = NULL, sample_prior = c("no", "yes", "only"), sparse = NULL, knots = NULL, stanvars = NULL, stan_funs = NULL, silent = 1, recompile = FALSE, combine = TRUE, fit = NA, algorithm = getOption("brms.algorithm", "sampling"), seed = NA, file = NULL, file_compress = TRUE, file_refit = getOption("brms.file_refit", "never"), ... )
formula |
An object of class |
data |
A list of data.frames each of which will be used to fit a
separate model. Alternatively, a |
family |
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a |
prior |
One or more |
data2 |
A list of named lists each of which will be used to fit a
separate model. Each of the named lists contains objects representing data
which cannot be passed via argument |
autocor |
(Deprecated) An optional |
cov_ranef |
(Deprecated) A list of matrices that are proportional to the
(within) covariance structure of the group-level effects. The names of the
matrices should correspond to columns in |
sample_prior |
Indicate if draws from priors should be drawn
additionally to the posterior draws. Options are |
sparse |
(Deprecated) Logical; indicates whether the population-level
design matrices should be treated as sparse (defaults to |
knots |
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See
|
stanvars |
An optional |
stan_funs |
(Deprecated) An optional character string containing
self-defined Stan functions, which will be included in the functions
block of the generated Stan code. It is now recommended to use the
|
silent |
Verbosity level between |
recompile |
Logical, indicating whether the Stan model should be
recompiled for every imputed data set. Defaults to |
combine |
Logical; Indicates if the fitted models should be combined
into a single fitted model object via |
fit |
An instance of S3 class |
algorithm |
Character string naming the estimation approach to use.
Options are |
seed |
The seed for random number generation to make results
reproducible. If |
file |
Either |
file_compress |
Logical or a character string, specifying one of the
compression algorithms supported by |
file_refit |
Modifies when the fit stored via the |
... |
Further arguments passed to |
The combined model may issue false positive convergence warnings, as the MCMC chains corresponding to different datasets may not necessarily overlap, even if each of the original models did converge. To find out whether each of the original models converged, subset the draws belonging to the individual models and then run convergence diagnostics. See Examples below for details.
If combine = TRUE a brmsfit_multiple object, which inherits from class
brmsfit and behaves essentially the same. If combine = FALSE a list of
brmsfit objects.
## Not run:
library(mice)
m <- 5
imp <- mice(nhanes2, m = m)

# fit the model using mice and lm
fit_imp1 <- with(lm(bmi ~ age + hyp + chl), data = imp)
summary(pool(fit_imp1))

# fit the model using brms
fit_imp2 <- brm_multiple(bmi ~ age + hyp + chl, data = imp, chains = 1)
summary(fit_imp2)
plot(fit_imp2, variable = "^b_", regex = TRUE)

# investigate convergence of the original models
library(posterior)
draws <- as_draws_array(fit_imp2)
# every dataset has just one chain here
draws_per_dat <- lapply(1:m, \(i) subset_draws(draws, chain = i))
lapply(draws_per_dat, summarise_draws, default_convergence_measures())

# use the future package for parallelization
library(future)
plan(multisession, workers = 4)
fit_imp3 <- brm_multiple(bmi ~ age + hyp + chl, data = imp, chains = 1)
summary(fit_imp3)
## End(Not run)
Family objects provide a convenient way to specify the details of the models
used by many model fitting functions. The family functions presented here are
for use with brms only and will **not** work with other model
fitting functions such as glm or glmer.
However, the standard family functions as described in family will work
with brms.
You can also specify custom families for use in brms with the custom_family
function.
brmsfamily( family, link = NULL, link_sigma = "log", link_shape = "log", link_nu = "logm1", link_phi = "log", link_kappa = "log", link_beta = "log", link_zi = "logit", link_hu = "logit", link_zoi = "logit", link_coi = "logit", link_disc = "log", link_bs = "log", link_ndt = "log", link_bias = "logit", link_xi = "log1p", link_alpha = "identity", link_quantile = "logit", threshold = "flexible", refcat = NULL ) student(link = "identity", link_sigma = "log", link_nu = "logm1") bernoulli(link = "logit") beta_binomial(link = "logit", link_phi = "log") negbinomial(link = "log", link_shape = "log") geometric(link = "log") lognormal(link = "identity", link_sigma = "log") shifted_lognormal(link = "identity", link_sigma = "log", link_ndt = "log") skew_normal(link = "identity", link_sigma = "log", link_alpha = "identity") exponential(link = "log") weibull(link = "log", link_shape = "log") frechet(link = "log", link_nu = "logm1") gen_extreme_value(link = "identity", link_sigma = "log", link_xi = "log1p") exgaussian(link = "identity", link_sigma = "log", link_beta = "log") wiener( link = "identity", link_bs = "log", link_ndt = "log", link_bias = "logit" ) Beta(link = "logit", link_phi = "log") xbeta(link = "logit", link_phi = "log", link_kappa = "log") dirichlet(link = "logit", link_phi = "log", refcat = NULL) logistic_normal(link = "identity", link_sigma = "log", refcat = NULL) von_mises(link = "tan_half", link_kappa = "log") asym_laplace(link = "identity", link_sigma = "log", link_quantile = "logit") cox(link = "log") hurdle_poisson(link = "log", link_hu = "logit") hurdle_negbinomial(link = "log", link_shape = "log", link_hu = "logit") hurdle_gamma(link = "log", link_shape = "log", link_hu = "logit") hurdle_lognormal(link = "identity", link_sigma = "log", link_hu = "logit") hurdle_cumulative( link = "logit", link_hu = "logit", link_disc = "log", threshold = "flexible" ) zero_inflated_beta(link = "logit", link_phi = "log", link_zi = "logit") zero_one_inflated_beta( link = 
"logit", link_phi = "log", link_zoi = "logit", link_coi = "logit" ) zero_inflated_poisson(link = "log", link_zi = "logit") zero_inflated_negbinomial(link = "log", link_shape = "log", link_zi = "logit") zero_inflated_binomial(link = "logit", link_zi = "logit") zero_inflated_beta_binomial( link = "logit", link_phi = "log", link_zi = "logit" ) categorical(link = "logit", refcat = NULL) multinomial(link = "logit", refcat = NULL) cumulative(link = "logit", link_disc = "log", threshold = "flexible") sratio(link = "logit", link_disc = "log", threshold = "flexible") cratio(link = "logit", link_disc = "log", threshold = "flexible") acat(link = "logit", link_disc = "log", threshold = "flexible")
family |
A character string naming the distribution family of the response
variable to be used in the model. Currently, the following families are
supported: |
link |
A specification for the model link function. This can be a name/expression or character string. See the 'Details' section for more information on link functions supported by each family. |
link_sigma |
Link of auxiliary parameter |
link_shape |
Link of auxiliary parameter |
link_nu |
Link of auxiliary parameter |
link_phi |
Link of auxiliary parameter |
link_kappa |
Link of auxiliary parameter |
link_beta |
Link of auxiliary parameter |
link_zi |
Link of auxiliary parameter |
link_hu |
Link of auxiliary parameter |
link_zoi |
Link of auxiliary parameter |
link_coi |
Link of auxiliary parameter |
link_disc |
Link of auxiliary parameter |
link_bs |
Link of auxiliary parameter |
link_ndt |
Link of auxiliary parameter |
link_bias |
Link of auxiliary parameter |
link_xi |
Link of auxiliary parameter |
link_alpha |
Link of auxiliary parameter |
link_quantile |
Link of auxiliary parameter |
threshold |
A character string indicating the type
of thresholds (i.e. intercepts) used in an ordinal model.
|
refcat |
Optional name of the reference response category used in
|
Below, we list common use cases for the different families. This list is not meant to be exhaustive.
Family gaussian can be used for linear regression.
Family student can be used for robust linear regression that is less influenced by outliers.
Family skew_normal can handle skewed responses in linear regression.
Families poisson, negbinomial, and geometric can be used for regression of unbounded count data.
Families bernoulli, binomial, and beta_binomial can be used for binary regression (i.e., most commonly logistic regression).
Families categorical and multinomial can be used for multi-logistic regression when there are more than two possible outcomes.
Families cumulative, cratio ('continuation ratio'), sratio ('stopping ratio'), and acat ('adjacent category') lead to ordinal regression.
Families Gamma, weibull, exponential, lognormal, frechet, inverse.gaussian, and cox (Cox proportional hazards model) can be used (among others) for time-to-event regression, also known as survival regression.
Families weibull, frechet, and gen_extreme_value ('generalized extreme value') allow for modeling extremes.
Families beta, dirichlet, and logistic_normal can be used to model responses representing rates or probabilities.
Family xbeta extends the beta family to support [0, 1] responses with exact 0s and/or 1s, when the values 0, 1, and those in (0, 1) all arise from a single process. If there is merit in assuming that 0 and 1 values arise from different processes than (0, 1) values, then the zero_inflated_beta and zero_one_inflated_beta families provide more flexibility. For details see Kosmidis & Zeileis (2024).
Family asym_laplace allows for quantile regression when fixing the auxiliary quantile parameter to the quantile of interest.
Families exgaussian ('exponentially modified Gaussian') and shifted_lognormal are especially suited to modeling reaction times.
Family wiener provides an implementation of the Wiener diffusion model. For this family, the main formula predicts the drift parameter 'delta' and all other parameters are modeled as auxiliary parameters (see brmsformula for details).
Families hurdle_poisson, hurdle_negbinomial, hurdle_gamma, hurdle_lognormal, zero_inflated_poisson, zero_inflated_negbinomial, zero_inflated_binomial, zero_inflated_beta_binomial, zero_inflated_beta, zero_one_inflated_beta, and hurdle_cumulative allow estimating zero-inflated and hurdle models. These models can be very helpful when there are many zeros in the data (or ones in case of one-inflated models) that cannot be explained by the primary distribution of the response.
Below, we list all possible links for each family. The first link mentioned for each family is the default.
Families gaussian, student, skew_normal, exgaussian, asym_laplace, and gen_extreme_value support the links (as names) identity, log, inverse, and softplus.
Families poisson, negbinomial, geometric, zero_inflated_poisson, zero_inflated_negbinomial, hurdle_poisson, and hurdle_negbinomial support log, identity, sqrt, and softplus.
Families binomial, bernoulli, beta_binomial, zero_inflated_binomial, zero_inflated_beta_binomial, Beta, zero_inflated_beta, zero_one_inflated_beta, and xbeta support logit, probit, probit_approx, cloglog, cauchit, identity, and log.
Families cumulative, cratio, sratio, acat, and hurdle_cumulative support logit, probit, probit_approx, cloglog, and cauchit.
Families categorical, multinomial, and dirichlet support logit.
Families Gamma, weibull, exponential, frechet, and hurdle_gamma support log, identity, inverse, and softplus.
Families lognormal and hurdle_lognormal support identity and inverse.
Family logistic_normal supports identity.
Family inverse.gaussian supports 1/mu^2, inverse, identity, log, and softplus.
Family von_mises supports tan_half and identity.
Family cox supports log, identity, and softplus for the proportional hazards parameter.
Family wiener supports identity, log, and softplus for the main parameter which represents the drift rate.
Please note that when calling the Gamma family function of the stats package, the default link will be inverse instead of log although the latter is the default in brms. Also, when using the family functions gaussian, binomial, poisson, and Gamma of the stats package (see family), special link functions such as softplus or cauchit won't work. In this case, you have to use brmsfamily to specify the family with corresponding link function.
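The difference in default links noted above can be checked directly on the family objects. This is a small sketch that only constructs family objects and fits no model; the variable names are illustrative.

```r
library(brms)

# Default link of the stats family function versus the brms family
stats_gamma <- stats::Gamma()
brms_gamma <- brmsfamily("Gamma")
c(stats = stats_gamma$link, brms = brms_gamma$link)

# softplus is one of the special links, so the family is built via brmsfamily()
gauss_softplus <- brmsfamily("gaussian", link = "softplus")
gauss_softplus$link
```

Either brmsfamily object can then be supplied as the family argument of brm.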
Kosmidis I, Zeileis A (2024). Extended-Support Beta Regression for [0, 1] Responses. arXiv Preprint. doi:10.48550/arXiv.2409.07233
# create a family object
(fam1 <- student("log"))
# alternatively use the brmsfamily function
(fam2 <- brmsfamily("student", "log"))
# both lead to the same object
identical(fam1, fam2)
Class brmsfit of models fitted with the brms package
Models fitted with the brms package are represented as a brmsfit object,
which contains the posterior draws (samples), model formula, Stan code,
relevant data, and other information.
See methods(class = "brmsfit") for an overview of available methods.
formula
A brmsformula object.
data
A data.frame containing all variables used in the model.
data2
A list of data objects which cannot be passed via data.
prior
A brmsprior object containing information on the priors used in the model.
stanvars
A stanvars object.
model
The model code in Stan language.
exclude
The names of the parameters for which draws are not saved.
algorithm
The name of the algorithm used to fit the model.
backend
The name of the backend used to fit the model.
threads
An object of class 'brmsthreads' created by threading.
opencl
An object of class 'brmsopencl' created by opencl.
stan_args
Named list of additional control arguments that were passed to the Stan backend directly.
fit
An object of class stanfit among others containing the posterior draws.
basis
An object that contains a small subset of the Stan data created at fitting time, which is needed to process new data correctly.
criteria
An empty list for adding model fit criteria after estimation of the model.
file
Optional name of a file in which the model object was stored or from which it was loaded.
version
The versions of brms and rstan with which the model was fitted.
family
(Deprecated) A brmsfamily object.
autocor
(Deprecated) A cor_brms object containing the autocorrelation structure if specified.
ranef
(Deprecated) A data.frame containing the group-level structure.
cov_ranef
(Deprecated) A list of customized group-level covariance matrices.
stan_funs
(Deprecated) A character string of length one or NULL.
data.name
(Deprecated) The name of data as specified by the user.
brms, brm, brmsformula, brmsfamily
Set up a model formula for use in the brms package, allowing users to define (potentially non-linear) additive multilevel models for all parameters of the assumed response distribution.
brmsformula( formula, ..., flist = NULL, family = NULL, autocor = NULL, nl = NULL, loop = NULL, center = NULL, cmc = NULL, sparse = NULL, decomp = NULL, unused = NULL )
formula |
An object of class |
... |
Additional |
flist |
Optional list of formulas, which are treated in the
same way as formulas passed via the |
family |
Same argument as in |
autocor |
An optional |
nl |
Logical; Indicates whether |
loop |
Logical; Only used in non-linear models.
Indicates if the computation of the non-linear formula should be
done inside ( |
center |
Logical; Indicates if the population-level design
matrix should be centered, which usually increases sampling efficiency.
See the 'Details' section for more information.
Defaults to |
cmc |
Logical; Indicates whether automatic cell-mean coding
should be enabled when removing the intercept by adding |
sparse |
Logical; indicates whether the population-level design matrices
should be treated as sparse (defaults to |
decomp |
Optional name of the decomposition used for the
population-level design matrix. Defaults to |
unused |
An optional |
General formula structure
The formula
argument accepts formulas of the following syntax:
response | aterms ~ pterms + (gterms | group)
The pterms
part contains effects that are assumed to be the same
across observations. We call them 'population-level' or 'overall' effects,
or (adopting frequentist vocabulary) 'fixed' effects. The optional
gterms
part may contain effects that are assumed to vary across
grouping variables specified in group
. We call them 'group-level' or
'varying' effects, or (adopting frequentist vocabulary) 'random' effects,
although the latter name is misleading in a Bayesian context. For more
details type vignette("brms_overview")
and
vignette("brms_multilevel")
.
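As a minimal sketch of the structure above, the following builds a formula with one population-level predictor and one group-level term. The variable names y, x, and g are hypothetical placeholders, and the brms package is assumed to be installed. Constructing the formula object does not fit a model.

```r
library(brms)

# 'x' enters as a population-level effect; the intercept and the slope of 'x'
# are additionally allowed to vary across the levels of the grouping factor 'g'
f <- bf(y ~ 1 + x + (1 + x | g))

# the result is a 'brmsformula' object that can later be passed to brm()
class(f)
```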
Group-level terms
Multiple grouping factors each with multiple group-level effects are
possible. (Of course we can also run models without any group-level
effects.) Instead of |
you may use ||
in grouping terms to
prevent correlations from being modeled. Equivalently, the cor
argument of the gr
function can be used for this purpose,
for example, (1 + x || g)
is equivalent to
(1 + x | gr(g, cor = FALSE))
.
It is also possible to model different group-level terms of the same
grouping factor as correlated (even across different formulas, e.g., in
non-linear models) by using |<ID>|
instead of |
. All
group-level terms sharing the same ID will be modeled as correlated. If,
for instance, one specifies the terms (1+x|i|g)
and (1+z|i|g)
somewhere in the formulas passed to brmsformula
, correlations
between the corresponding group-level effects will be estimated. In the
above example, i
is not a variable in the data but just a symbol to
indicate correlations between multiple group-level terms. Equivalently, the
id
argument of the gr
function can be used as well,
for example, (1 + x | gr(g, id = "i"))
.
If levels of the grouping factor belong to different sub-populations,
it may be reasonable to assume a different covariance matrix for each
of the sub-populations. For instance, the variation within the
treatment group and within the control group in a randomized controlled
trial might differ. Suppose that y
is the outcome, and
x
is the factor indicating the treatment and control group.
Then, we could estimate different hyper-parameters of the varying
effects (in this case a varying intercept) for treatment and control
group via y ~ x + (1 | gr(subject, by = x))
.
You can specify multi-membership terms using the mm
function. For instance, a multi-membership term with two members
could be (1 | mm(g1, g2))
, where g1
and g2
specify the first and second member, respectively. Moreover,
if a covariate x
varies across the levels of the grouping factors
g1
and g2
, we can save the respective covariate values
in the variables x1
and x2
and then model the varying
effect as (1 + mmc(x1, x2) | mm(g1, g2))
.
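The group-level syntax variants described in this section can be written side by side as follows. This is only a sketch: all variable names (y, x, x1, x2, g, g1, g2, subject) are hypothetical, brms is assumed to be installed, and the formulas are constructed but not fitted.

```r
library(brms)

# uncorrelated varying intercept and slope: two equivalent spellings
f1 <- bf(y ~ x + (1 + x || g))
f2 <- bf(y ~ x + (1 + x | gr(g, cor = FALSE)))

# the shared ID 'i' requests correlations between the group-level effects
# of the main formula and those of 'sigma'
f3 <- bf(y ~ x + (1 + x | i | g), sigma ~ (1 | i | g))

# separate hyper-parameters of the varying intercept per level of 'x'
f4 <- bf(y ~ x + (1 | gr(subject, by = x)))

# multi-membership term with member-specific covariate values 'x1' and 'x2'
f5 <- bf(y ~ 1 + (1 + mmc(x1, x2) | mm(g1, g2)))
```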
Special predictor terms
Flexible non-linear smooth terms can be modeled using the s
and t2
functions in the pterms
part
of the model formula. This makes it possible to fit generalized additive mixed
models (GAMMs) with brms. The implementation is similar to that
used in the gamm4 package. For more details on this model class
see gam
and gamm
.
Gaussian process terms can be fitted using the gp
function in the pterms
part of the model formula. Similar to
smooth terms, Gaussian processes can be used to model complex non-linear
relationships, for instance temporal or spatial autocorrelation.
However, they are computationally demanding and are thus not recommended
for very large datasets unless approximations are used.
The pterms
and gterms
parts may contain four non-standard
effect types, namely monotonic, measurement error, missing value, and
category specific effects, which can be specified using terms of the
form mo(predictor)
, me(predictor, sd_predictor)
,
mi(predictor)
, and cs(<predictors>)
, respectively.
Category specific effects can only be estimated in
ordinal models and are explained in more detail in the package's
main vignette (type vignette("brms_overview")
).
The other three effect types are explained in the following.
A monotonic predictor must either be integer valued or an ordered factor,
which is the first difference to an ordinary continuous predictor.
More importantly, predictor categories (or integers) are not assumed to be
equidistant with respect to their effect on the response variable.
Instead, the distance between adjacent predictor categories (or integers)
is estimated from the data and may vary across categories.
This is realized by parameterizing as follows:
One parameter takes care of the direction and size of the effect similar
to an ordinary regression parameter, while an additional parameter vector
estimates the normalized distances between consecutive predictor categories.
Monotonic effects are mainly used for ordinal predictors, which
can in this way be modeled without (falsely) treating them as continuous
or as unordered categorical predictors. For more details and examples
see vignette("brms_monotonic")
.
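The parameterization described above can be illustrated numerically in base R. This is a sketch under assumptions: the values of b and zeta are hypothetical, and the scaling by the number of category steps D is my reading of the approach described in vignette("brms_monotonic"), not a copy of the package internals.

```r
# Hypothetical values for an ordinal predictor with categories 0, 1, 2, 3
b    <- 1.5               # direction and size: average effect per category step
zeta <- c(0.6, 0.3, 0.1)  # normalized distances between adjacent categories

# monotonic effect: cumulative share of the distances up to category x,
# scaled by the number of steps D so that 'b' is an average per-step effect
mo_effect <- function(x, b, zeta) {
  D <- length(zeta)
  b * D * vapply(x, function(i) sum(zeta[seq_len(i)]), numeric(1))
}

effects <- mo_effect(0:3, b, zeta)
effects        # contribution to the linear predictor per category
diff(effects)  # unequal steps between adjacent categories
```

The steps between adjacent categories differ because zeta is not uniform, while the total effect from the lowest to the highest category is b * D.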
Quite often, predictors are measured and as such naturally contain
measurement error. Although most researchers are well aware of this problem,
measurement error in predictors is ignored in most
regression analyses, possibly because only a few packages allow
for modeling it. Notably, measurement error can be handled in
structural equation models, but many more general regression models
(such as those featured by brms) cannot be transferred
to the SEM framework. In brms, effects of noise-free predictors
can be modeled using the me
(for 'measurement error') function.
If, say, y
is the response variable and
x
is a measured predictor with known measurement error
sdx
, we can simply include it on the right-hand side of the
model formula via y ~ me(x, sdx)
.
This can easily be extended to more general formulas.
If x2
is another measured predictor with corresponding error
sdx2
and z
is a predictor without error
(e.g., an experimental setting), we can model all main effects
and interactions of the three predictors in the well known manner:
y ~ me(x, sdx) * me(x2, sdx2) * z
.
The me
function is soft deprecated in favor of the more flexible
and consistent mi
function (see below).
When a variable contains missing values, the corresponding rows will
be excluded from the data by default (row-wise exclusion). However,
quite often we want to keep these rows and instead estimate the missing values.
There are two approaches for this: (a) Impute missing values before
the model fitting for instance via multiple imputation (see
brm_multiple
for a way to handle multiple imputed datasets).
(b) Impute missing values on the fly during model fitting. The latter
approach is explained in the following. Using a variable with missing
values as a predictor requires two things. First, we need to specify that
the predictor contains missings that should be imputed.
If, say, y
is the primary response, x
is a
predictor with missings and z
is a predictor without missings,
we go for y ~ mi(x) + z
. Second, we need to model x
as an additional response with corresponding predictors and the
addition term mi()
. In our example, we could write
x | mi() ~ z
. Measurement error may be included via
the sdy
argument, say, x | mi(sdy = se) ~ z
.
See mi
for examples with real data.
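Put together, the two steps for imputation during model fitting might look like the sketch below. The variables y, x, z, and se are hypothetical, brms is assumed to be installed, and residual correlations are switched off via set_rescor(FALSE) as in the package's own example for missing values.

```r
library(brms)

# step 1: mark 'x' as a predictor with missing values in the model for 'y'
# step 2: model 'x' as an additional response, with the mi() addition term
f_mi <- bf(y ~ mi(x) + z) +
  bf(x | mi() ~ z) +
  set_rescor(FALSE)

# same idea, with known measurement error 'se' on the observed values of 'x'
f_mi_se <- bf(y ~ mi(x) + z) +
  bf(x | mi(sdy = se) ~ z) +
  set_rescor(FALSE)
```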
Autocorrelation terms
Autocorrelation terms can be directly specified inside the pterms
part as well. Details can be found in autocor-terms
.
Additional response information
Another special feature of the brms formula syntax is the optional
aterms
part, which may contain multiple terms of the form
fun(<variable>)
separated by +
each providing special
information on the response variable. fun
can be replaced with
either se
, weights
, subset
, cens
, trunc
,
trials
, cat
, dec
, rate
, vreal
, or
vint
. Their meanings are explained below
(see also addition-terms
).
For families gaussian
, student
and skew_normal
, it is
possible to specify standard errors of the observations, which allows
meta-analyses to be performed. Suppose that the variable yi
contains
the effect sizes from the studies and sei
the corresponding
standard errors. Then, fixed and random effects meta-analyses can
be conducted using the formulas yi | se(sei) ~ 1
and
yi | se(sei) ~ 1 + (1|study)
, respectively, where
study
is a variable uniquely identifying every study.
If desired, meta-regression can be performed via
yi | se(sei) ~ 1 + mod1 + mod2 + (1|study)
or yi | se(sei) ~ 1 + mod1 + mod2 + (1 + mod1 + mod2|study)
,
where mod1
and mod2
represent moderator variables.
By default, the standard errors replace the parameter sigma
.
To model sigma
in addition to the known standard errors,
set argument sigma
in function se
to TRUE
,
for instance, yi | se(sei, sigma = TRUE) ~ 1
.
For all families, weighted regression may be performed using
weights
in the aterms
part. Internally, this is
implemented by multiplying the log-posterior values of each
observation by their corresponding weights.
Suppose that variable wei
contains the weights
and that yi
is the response variable.
Then, formula yi | weights(wei) ~ predictors
implements a weighted regression.
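What multiplying per-observation log-likelihood contributions by weights means can be shown in base R. This is only an illustration with hypothetical data and a gaussian likelihood; it does not reproduce the generated Stan code.

```r
# hypothetical data, weights, and parameter values
yi    <- c(1.2, 0.7, 2.1)
wei   <- c(1, 2, 3)
mu    <- 1
sigma <- 1

# log-likelihood contribution of each observation
ll <- dnorm(yi, mean = mu, sd = sigma, log = TRUE)

# weighted regression: each contribution is multiplied by its weight
weighted_ll <- sum(wei * ll)

# with integer weights, this equals repeating each observation 'wei' times
repeated_ll <- sum(dnorm(rep(yi, times = wei), mean = mu, sd = sigma, log = TRUE))
all.equal(weighted_ll, repeated_ll)
```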
For multivariate models, subset
may be used in the aterms
part, to use different subsets of the data in different univariate
models. For instance, if sub
is a logical variable and
y
is the response of one of the univariate models, we may
write y | subset(sub) ~ predictors
so that y
is
predicted only for those observations for which sub
evaluates
to TRUE
.
For log-linear models such as poisson models, rate
may be used
in the aterms
part to specify the denominator of a response that
is expressed as a rate. The numerator is given by the actual response
variable and has a distribution according to the family as usual. Using
rate(denom)
is equivalent to adding offset(log(denom))
to
the linear predictor of the main parameter but the former is arguably
more convenient and explicit.
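The equivalence with an offset follows from the log link, as this base R sketch with hypothetical numbers shows: adding log(denom) to the linear predictor multiplies the expected count by denom.

```r
# hypothetical denominators (e.g., exposure) and linear predictor values
denom <- c(10, 50, 200)
eta   <- c(-2.0, -1.5, -1.0)

# offset formulation: log(denom) is added on the scale of the linear predictor
mu_offset <- exp(eta + log(denom))

# rate formulation: exp(eta) is a rate that is scaled by the denominator
mu_rate <- denom * exp(eta)

all.equal(mu_offset, mu_rate)
```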
With the exception of categorical and ordinal families,
left, right, and interval censoring can be modeled through
y | cens(censored) ~ predictors
. The censoring variable
(named censored
in this example) should contain the values
'left'
, 'none'
, 'right'
, and 'interval'
(or equivalently -1
, 0
, 1
, and 2
) to indicate that
the corresponding observation is left censored, not censored, right censored,
or interval censored. For interval censored data, a second variable
(let's call it y2
) has to be passed to cens
. In this case,
the formula has the structure y | cens(censored, y2) ~ predictors
.
While the lower bounds are given in y
, the upper bounds are given
in y2
for interval censored data. Intervals are assumed to be open
on the left and closed on the right: (y, y2]
.
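How the four censoring types enter a likelihood can be sketched in base R. The function below is a hypothetical helper using the textbook censored likelihood for a gaussian response; it is not the code brms generates.

```r
# log-likelihood contribution of one observation under a gaussian model
# (hypothetical helper; 'censored' uses the labels described above)
cens_loglik <- function(y, censored, y2 = NA, mu = 0, sigma = 1) {
  switch(censored,
    none     = dnorm(y, mu, sigma, log = TRUE),
    # left censored: the true value is at most y
    left     = pnorm(y, mu, sigma, log.p = TRUE),
    # right censored: the true value is greater than y
    right    = pnorm(y, mu, sigma, lower.tail = FALSE, log.p = TRUE),
    # interval censored: the true value lies in (y, y2]
    interval = log(pnorm(y2, mu, sigma) - pnorm(y, mu, sigma)),
    stop("unknown censoring type: ", censored)
  )
}

cens_loglik(0, "left")
cens_loglik(qnorm(0.25), "interval", y2 = qnorm(0.75))
```

For continuous responses, whether an interval endpoint is open or closed does not change the probability; the distinction matters for discrete families.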
With the exception of categorical and ordinal families,
the response distribution can be truncated using the trunc
function in the addition part. If the response variable is truncated
between, say, 0 and 100, we can specify this via
yi | trunc(lb = 0, ub = 100) ~ predictors
.
Instead of numbers, variables in the data set can also be passed allowing
for varying truncation points across observations. Defining only one of
the two arguments in trunc
leads to one-sided truncation.
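Truncation rescales the density so that it integrates to one over the allowed range. The base R sketch below shows this for a gaussian response truncated to [0, 100]. The function dtrunc_norm and the parameter values are hypothetical and only illustrate the idea.

```r
# density of a normal distribution truncated to [lb, ub] (hypothetical helper)
dtrunc_norm <- function(y, mu, sigma, lb = -Inf, ub = Inf) {
  inside <- y >= lb & y <= ub
  # probability mass of the untruncated distribution inside the bounds
  norm_const <- pnorm(ub, mu, sigma) - pnorm(lb, mu, sigma)
  ifelse(inside, dnorm(y, mu, sigma) / norm_const, 0)
}

# the truncated density integrates to (approximately) one over [0, 100]
total_mass <- integrate(dtrunc_norm, lower = 0, upper = 100,
                        mu = 50, sigma = 30, lb = 0, ub = 100)$value
total_mass
```

Leaving lb or ub at its infinite default in this helper corresponds to one-sided truncation.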
For all continuous families, missing values in the responses can be imputed
within Stan by using the addition term mi
. This is mostly
useful in combination with mi
predictor terms as explained
above under 'Special predictor terms'.
For families binomial
and zero_inflated_binomial
,
addition should contain a variable indicating the number of trials
underlying each observation. In lme4
syntax, we may write for instance
cbind(success, n - success)
, which is equivalent
to success | trials(n)
in brms syntax. If the number of trials
is constant across all observations, say 10
,
we may also write success | trials(10)
.
Please note that the cbind()
syntax will not work
in brms in the expected way because this syntax is reserved
for other purposes.
For all ordinal families, aterms
may contain a term
thres(number)
to specify the number of thresholds (e.g.,
thres(6)
), which should be equal to the total number of response
categories - 1. If not given, the number of thresholds is calculated from
the data. If different threshold vectors should be used for different
subsets of the data, the gr
argument can be used to provide the
grouping variable (e.g., thres(6, gr = item)
, if item
is the
grouping variable). In this case, the number of thresholds can also be a
variable in the data with different values per group.
A deprecated quasi alias of thres()
is cat()
with which the
total number of response categories (i.e., number of thresholds + 1) can be
specified.
In Wiener diffusion models (family wiener
) the addition term
dec
is mandatory to specify the (vector of) binary decisions
corresponding to the reaction times. Non-zero values will be treated
as a response on the upper boundary of the diffusion process and zeros
will be treated as a response on the lower boundary. Alternatively,
the variable passed to dec
might also be a character vector
consisting of 'lower'
and 'upper'
.
All families support the index
addition term to uniquely identify
each observation of the corresponding response variable. Currently,
index
is primarily useful in combination with the subset
addition and mi
terms.
For custom families, it is possible to pass an arbitrary number of real and
integer vectors via the addition terms vreal
and vint
,
respectively. An example is provided in
vignette('brms_customfamilies')
. To pass multiple vectors of the
same data type, provide them separated by commas inside a single
vreal
or vint
statement.
Multiple addition terms of different types may be specified at the same
time using the +
operator. For example, the formula
formula = yi | se(sei) + cens(censored) ~ 1
implies a censored
meta-analytic model.
The addition argument disp
(short for dispersion)
has been removed in version 2.0. You may instead use the
distributional regression approach by specifying
sigma ~ 1 + offset(log(xdisp))
or
shape ~ 1 + offset(log(xdisp))
, where xdisp
is
the variable previously passed to disp
.
Parameterization of the population-level intercept
By default, the population-level intercept (if incorporated) is estimated
separately and not as part of population-level parameter vector b
. As
a result, priors on the intercept also have to be specified separately.
Furthermore, to increase sampling efficiency, the population-level design
matrix X
is centered around its column means X_means
if the
intercept is incorporated. This leads to a temporary bias in the intercept
equal to <X_means, b>
, where <,>
is the scalar product. The
bias is corrected after fitting the model, but be aware that you are
effectively defining a prior on the intercept of the centered design matrix,
not on the real intercept. You can turn off this special handling of the
intercept by setting argument center
to FALSE
. For more
details on setting priors on population-level intercepts, see
set_prior
.
This behavior can be avoided by using the reserved
(and internally generated) variable Intercept
.
Instead of y ~ x
, you may write
y ~ 0 + Intercept + x
. This way, priors can be
defined on the real intercept, directly. In addition,
the intercept is just treated as an ordinary population-level effect
and thus priors defined on b
will also apply to it.
Note that this parameterization may be less efficient
than the default parameterization discussed above.
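The algebra behind the temporary intercept can be checked in base R. The sketch below uses simulated data and ordinary least squares via lm() instead of Stan, purely to illustrate the correction by the scalar product of X_means and b. All values are hypothetical.

```r
set.seed(1)
n <- 200
X <- cbind(x1 = rnorm(n, mean = 5), x2 = rnorm(n, mean = -3))
y <- drop(2 + X %*% c(0.5, -1) + rnorm(n))

# center the design matrix around its column means
X_means <- colMeans(X)
Xc <- sweep(X, 2, X_means)

fit_raw      <- lm(y ~ X)   # intercept on the original scale
fit_centered <- lm(y ~ Xc)  # temporary intercept of the centered design

b <- coef(fit_centered)[-1]
temp_intercept <- coef(fit_centered)[1]

# subtracting <X_means, b> recovers the real intercept
real_intercept <- temp_intercept - sum(X_means * b)
all.equal(unname(real_intercept), unname(coef(fit_raw)[1]))
```

A prior placed on the temporary intercept therefore refers to the expected response at the predictor means, not at zero.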
Formula syntax for non-linear models
In brms, it is possible to specify non-linear models
of arbitrary complexity.
The non-linear model can just be specified within the formula
argument. Suppose that we want to predict the response y
through the predictor x
, where x
is linked to y
through y = alpha - beta * lambda^x
, with parameters
alpha
, beta
, and lambda
. This is certainly a
non-linear model being defined via
formula = y ~ alpha - beta * lambda^x
(addition arguments
can be added in the same way as for ordinary formulas).
To tell brms that this is a non-linear model,
we set argument nl
to TRUE
.
Now we have to specify a model for each of the non-linear parameters.
Let's say we just want to estimate those three parameters
with no further covariates or random effects. Then we can pass
alpha + beta + lambda ~ 1
or equivalently
(and more flexibly) alpha ~ 1, beta ~ 1, lambda ~ 1
to the ...
argument.
This can, of course, be extended. If we have another predictor z
and
observations nested within the grouping factor g
, we may write for
instance alpha ~ 1, beta ~ 1 + z + (1|g), lambda ~ 1
.
The formula syntax described above applies here as well.
In this example, we are using z
and g
only for the
prediction of beta
, but we might also use them for the other
non-linear parameters (provided that the resulting model is still
scientifically reasonable).
By default, non-linear covariates are treated as real vectors in Stan. However, if the data of the covariates is of type 'integer' in R (which can be enforced by the 'as.integer' function), the Stan type will be changed to an integer array. That way, covariates can also be used for indexing purposes in Stan.
Non-linear models may not be uniquely identified and / or show bad convergence.
For this reason it is mandatory to specify priors on the non-linear parameters.
For instructions on how to do that, see set_prior
.
For some examples of non-linear models, see vignette("brms_nonlinear")
.
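The non-linear example from this section, together with priors on its parameters, could be set up as sketched below. The prior distributions and bounds are hypothetical choices for illustration only and should be replaced by ones reflecting actual prior knowledge. brms is assumed to be installed.

```r
library(brms)

# non-linear formula with one intercept-only model per non-linear parameter
f_nl <- bf(
  y ~ alpha - beta * lambda^x,
  alpha + beta + lambda ~ 1,
  nl = TRUE
)

# hypothetical priors, one per non-linear parameter
priors <- prior(normal(0, 5), nlpar = "alpha") +
  prior(normal(0, 5), nlpar = "beta") +
  prior(beta(2, 2), nlpar = "lambda", lb = 0, ub = 1)

priors
```

Both objects would then be passed to brm() via its formula and prior arguments.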
Formula syntax for predicting distributional parameters
It is also possible to predict parameters of the response distribution such
as the residual standard deviation sigma
in gaussian models or the
hurdle probability hu
in hurdle models. The syntax closely resembles
that of a non-linear parameter, for instance sigma ~ x + s(z) +
(1+x|g)
. For some examples of distributional models, see
vignette("brms_distreg")
.
Parameter mu
exists for every family and can be used as an
alternative to specifying terms in formula
. If both mu
and
formula
are given, the right-hand side of formula
is ignored.
Accordingly, specifying terms on the right-hand side of both formula
and mu
at the same time is deprecated. In future versions,
formula
might be updated by mu
.
The following are
distributional parameters of specific families (all other parameters are
treated as non-linear parameters): sigma
(residual standard
deviation or scale of the gaussian
, student
,
skew_normal
, lognormal
, exgaussian
, and
asym_laplace
families); shape
(shape parameter of the
Gamma
, weibull
, negbinomial
, and related zero-inflated
/ hurdle families); nu
(degrees of freedom parameter of the
student
and frechet
families); phi
(precision
parameter of the beta
, zero_inflated_beta
, and xbeta
families);
kappa
(precision parameter of the von_mises
family);
beta
(mean parameter of the exponential component of the
exgaussian
family); quantile
(quantile parameter of the
asym_laplace
family); zi
(zero-inflation probability);
hu
(hurdle probability); zoi
(zero-one-inflation
probability); coi
(conditional one-inflation probability);
disc
(discrimination) for ordinal models; bs
, ndt
, and
bias
(boundary separation, non-decision time, and initial bias of
the wiener
diffusion model). By default, distributional parameters
are modeled on the log scale if they can be positive only or on the logit
scale if they can only be within the unit interval.
Alternatively, one may fix distributional parameters to certain values.
However, this is mainly useful when models become too
complicated and otherwise have convergence issues.
We thus suggest being generally careful when making use of this option.
The quantile
parameter of the asym_laplace
distribution
is a good example where it is useful. By fixing quantile
,
one can perform quantile regression for the specified quantile.
For instance, quantile = 0.25
allows predicting the 25%-quantile.
Furthermore, the bias
parameter in drift-diffusion models
is assumed to be 0.5
(i.e. no bias) in many applications.
To achieve this, simply write bias = 0.5
.
Other possible applications are the Cauchy distribution as a
special case of the Student-t distribution with
nu = 1
, or the geometric distribution as a special case of
the negative binomial distribution with shape = 1
.
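The fixed-parameter cases mentioned above can be written as follows. This is a sketch with hypothetical variables (y, x, rt, decision), assuming brms is installed. The formulas are constructed but not fitted.

```r
library(brms)

# quantile regression for the 25%-quantile
f_q25 <- bf(y ~ x, quantile = 0.25, family = asym_laplace())

# Cauchy as a special case of the Student-t distribution
f_cauchy <- bf(y ~ x, nu = 1, family = student())

# geometric as a special case of the negative binomial distribution
f_geom <- bf(y ~ x, shape = 1, family = negbinomial())

# drift-diffusion model without initial bias
f_wiener <- bf(rt | dec(decision) ~ x, bias = 0.5, family = wiener())
```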
Furthermore, the parameter disc
('discrimination') in ordinal
models is fixed to 1
by default and not estimated,
but may be modeled as any other distributional parameter if desired
(see examples). For reasons of identification, 'disc'
can only be positive, which is achieved by applying the log-link.
In categorical models, distributional parameters do not have
fixed names. Instead, they are named after the response categories
(excluding the first one, which serves as the reference category),
with the prefix 'mu'
. If, for instance, categories are named
cat1
, cat2
, and cat3
, the distributional parameters
will be named mucat2
and mucat3
.
Some distributional parameters currently supported by brmsformula
have to be positive (a negative standard deviation or precision parameter
does not make any sense) or are bounded between 0 and 1 (for zero-inflated /
hurdle probabilities, quantiles, or the initial bias parameter of
drift-diffusion models).
However, linear predictors can be positive or negative, and thus the log link
(for positive parameters) or logit link (for probability parameters) are used
by default to ensure that distributional parameters are within their valid intervals.
This implies that, by default, effects for such distributional parameters are
estimated on the log / logit scale and one has to apply the inverse link
function to get to the effects on the original scale.
Alternatively, it is possible to use the identity link to predict parameters
on their original scale, directly. However, this is much more likely to lead
to problems in the model fitting, if the parameter actually has a restricted range.
See also brmsfamily
for an overview of valid link functions.
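Applying the inverse link by hand looks like this in base R. The coefficient values are hypothetical stand-ins for estimates of sigma ~ x (log link) and zi ~ x (logit link).

```r
# hypothetical coefficients on the link scale
b_sigma <- c(Intercept = 0.2, x = 0.5)   # log scale
b_zi    <- c(Intercept = -1.0, x = 0.8)  # logit scale
x <- c(0, 1, 2)

# inverse log link: exp() maps the linear predictor to positive values
sigma <- exp(b_sigma[["Intercept"]] + b_sigma[["x"]] * x)

# inverse logit link: plogis() maps the linear predictor into (0, 1)
zi <- plogis(b_zi[["Intercept"]] + b_zi[["x"]] * x)

sigma
zi
```

On the log scale a coefficient acts multiplicatively: each unit increase in x multiplies sigma by exp(0.5) in this example.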
Formula syntax for mixture models
The specification of mixture models closely resembles that
of non-mixture models. If not specified otherwise (see below),
all mean parameters of the mixture components are predicted
using the right-hand side of formula
. All types of predictor
terms allowed in non-mixture models are allowed in mixture models
as well.
Distributional parameters of mixture distributions have the same
name as those of the corresponding ordinary distributions, but with
a number at the end to indicate the mixture component. For instance, if
you use family mixture(gaussian, gaussian)
, the distributional
parameters are sigma1
and sigma2
.
Distributional parameters of the same class can be fixed to the same value.
For the above example, we could write sigma2 = "sigma1"
to make
sure that both components have the same residual standard deviation,
which is in turn estimated from the data.
In addition, there are two types of special distributional parameters.
The first are named mu<ID>
, that allow for modeling different
predictors for the mean parameters of different mixture components.
For instance, if you want to predict the mean of the first component
using predictor x
and the mean of the second component using
predictor z
, you can write mu1 ~ x
as well as mu2 ~ z
.
The second are named theta<ID>
, which constitute the mixing
proportions. If the mixing proportions are fixed to certain values,
they are internally normalized to form a probability vector.
If one seeks to predict the mixing proportions, all but
one of them have to be predicted, while the remaining one is used
as the reference category to identify the model. The so-called 'softmax'
transformation is applied on the linear predictor terms to form a
probability vector.
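The softmax step can be sketched in base R as follows. The linear predictor values are hypothetical, and treating the first component as the reference (fixed at zero) is an assumption made for this illustration.

```r
# softmax: exponentiate and normalize; subtracting the maximum avoids overflow
softmax <- function(eta) {
  z <- exp(eta - max(eta))
  z / sum(z)
}

# linear predictors of three mixture components; the reference is fixed at 0
eta <- c(0, 0.4, -1.2)

theta <- softmax(eta)
theta       # mixing proportions
sum(theta)  # they form a probability vector
```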
For more information on mixture models, see
the documentation of mixture
.
Formula syntax for multivariate models
Multivariate models may be specified using mvbind
notation
or with help of the mvbf
function.
Suppose that y1
and y2
are response variables
and x
is a predictor. Then mvbind(y1, y2) ~ x
specifies a multivariate model.
The effects of all terms specified on the right-hand side (RHS) of the formula
are assumed to vary across response variables.
For instance, two parameters will be estimated for x
,
one for the effect on y1
and another for the effect on y2
.
This is also true for group-level effects. When writing, for instance,
mvbind(y1, y2) ~ x + (1+x|g)
, group-level effects will be
estimated separately for each response. To model these effects
as correlated across responses, use the ID syntax (see above).
For the present example, this would look as follows:
mvbind(y1, y2) ~ x + (1+x|2|g)
. Of course, you could also use
any value other than 2
as ID.
It is also possible to specify different formulas for different responses.
If, for instance, y1
should be predicted by x
and y2
should be predicted by z
, we could write mvbf(y1 ~ x, y2 ~ z)
.
Alternatively, multiple brmsformula
objects can be added to
specify a joint multivariate model (see 'Examples').
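The three ways of specifying a multivariate model described in this section are sketched below with hypothetical variables (y1, y2, x, z, g), assuming brms is installed. Only the formula objects are built.

```r
library(brms)

# same predictors for both responses; the ID '2' requests correlations of
# group-level effects across responses
f_mv1 <- bf(mvbind(y1, y2) ~ x + (1 + x | 2 | g))

# different formulas per response via mvbf()
f_mv2 <- mvbf(y1 ~ x, y2 ~ z)

# the same model built with the '+' operator, without residual correlations
f_mv3 <- bf(y1 ~ x) + bf(y2 ~ z) + set_rescor(FALSE)
```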
An object of class brmsformula
, which
is essentially a list
containing all model
formulas as well as some additional information.
mvbrmsformula
, brmsformula-helpers
# multilevel model with smoothing terms
brmsformula(y ~ x1*x2 + s(z) + (1+x1|1) + (1|g2))

# additionally predict 'sigma'
brmsformula(y ~ x1*x2 + s(z) + (1+x1|1) + (1|g2),
            sigma ~ x1 + (1|g2))

# use the shorter alias 'bf'
(formula1 <- brmsformula(y ~ x + (x|g)))
(formula2 <- bf(y ~ x + (x|g)))
# will be TRUE
identical(formula1, formula2)

# incorporate censoring
bf(y | cens(censor_variable) ~ predictors)

# define a simple non-linear model
bf(y ~ a1 - a2^x, a1 + a2 ~ 1, nl = TRUE)

# predict a1 and a2 differently
bf(y ~ a1 - a2^x, a1 ~ 1, a2 ~ x + (x|g), nl = TRUE)

# correlated group-level effects across parameters
bf(y ~ a1 - a2^x, a1 ~ 1 + (1 |2| g), a2 ~ x + (x |2| g), nl = TRUE)
# alternative but equivalent way to specify the above model
bf(y ~ a1 - a2^x, a1 ~ 1 + (1 | gr(g, id = 2)),
   a2 ~ x + (x | gr(g, id = 2)), nl = TRUE)

# define a multivariate model
bf(mvbind(y1, y2) ~ x * z + (1|g))

# define a zero-inflated model
# also predicting the zero-inflation part
bf(y ~ x * z + (1+x|ID1|g), zi ~ x + (1|ID1|g))

# specify a predictor as monotonic
bf(y ~ mo(x) + more_predictors)

# for ordinal models only
# specify a predictor as category specific
bf(y ~ cs(x) + more_predictors)
# add a category specific group-level intercept
bf(y ~ cs(x) + (cs(1)|g))
# specify parameter 'disc'
bf(y ~ person + item, disc ~ item)

# specify variables containing measurement error
bf(y ~ me(x, sdx))

# specify predictors on all parameters of the wiener diffusion model
# the main formula models the drift rate 'delta'
bf(rt | dec(decision) ~ x, bs ~ x, ndt ~ x, bias ~ x)

# fix the bias parameter to 0.5
bf(rt | dec(decision) ~ x, bias = 0.5)

# specify different predictors for different mixture components
mix <- mixture(gaussian, gaussian)
bf(y ~ 1, mu1 ~ x, mu2 ~ z, family = mix)

# fix both residual standard deviations to the same value
bf(y ~ x, sigma2 = "sigma1", family = mix)

# use the '+' operator to specify models
bf(y ~ 1) +
  nlf(sigma ~ a * exp(b * x), a ~ x) +
  lf(b ~ z + (1|g), dpar = "sigma") +
  gaussian()

# specify a multivariate model using the '+' operator
bf(y1 ~ x + (1|g)) +
  gaussian() + cor_ar(~1|g) +
  bf(y2 ~ z) + poisson()

# specify correlated residuals of a gaussian and a poisson model
form1 <- bf(y1 ~ 1 + x + (1|c|obs), sigma = 1) + gaussian()
form2 <- bf(y2 ~ 1 + x + (1|c|obs)) + poisson()

# model missing values in predictors
bf(bmi ~ age * mi(chl)) +
  bf(chl | mi() ~ age) +
  set_rescor(FALSE)

# model sigma as a function of the mean
bf(y ~ eta, nl = TRUE) +
  lf(eta ~ 1 + x) +
  nlf(sigma ~ tau * sqrt(eta)) +
  lf(tau ~ 1)
Helper functions to specify linear and non-linear
formulas for use with brmsformula
.
nlf(formula, ..., flist = NULL, dpar = NULL, resp = NULL, loop = NULL)

lf(
  ...,
  flist = NULL,
  dpar = NULL,
  resp = NULL,
  center = NULL,
  cmc = NULL,
  sparse = NULL,
  decomp = NULL
)

acformula(autocor, resp = NULL)

set_nl(nl = TRUE, dpar = NULL, resp = NULL)

set_rescor(rescor = TRUE)

set_mecor(mecor = TRUE)
formula |
Non-linear formula for a distributional parameter.
The name of the distributional parameter can either be specified
on the left-hand side of |
... |
Additional |
flist |
Optional list of formulas, which are treated in the
same way as formulas passed via the |
dpar |
Optional character string specifying the distributional
parameter to which the formulas passed via |
resp |
Optional character string specifying the response
variable to which the formulas passed via |
loop |
Logical; Only used in non-linear models.
Indicates if the computation of the non-linear formula should be
done inside ( |
center |
Logical; Indicates if the population-level design
matrix should be centered, which usually increases sampling efficiency.
See the 'Details' section for more information.
Defaults to |
cmc |
Logical; Indicates whether automatic cell-mean coding
should be enabled when removing the intercept by adding |
sparse |
Logical; indicates whether the population-level design matrices
should be treated as sparse (defaults to |
decomp |
Optional name of the decomposition used for the
population-level design matrix. Defaults to |
autocor |
A one-sided formula containing autocorrelation
terms. All non-autocorrelation terms in |
nl |
Logical; Indicates whether |
rescor |
Logical; Indicates if residual correlation between
the response variables should be modeled. Currently this is only
possible in multivariate |
mecor |
Logical; Indicates if correlations between latent variables
defined by |
For lf and nlf, a list that can be passed to brmsformula or added to an existing brmsformula or mvbrmsformula object.
For set_nl and set_rescor, a logical value that can be added to an existing brmsformula or mvbrmsformula object.
# add more formulas to the model
bf(y ~ 1) +
  nlf(sigma ~ a * exp(b * x)) +
  lf(a ~ x, b ~ z + (1|g)) +
  gaussian()

# specify 'nl' later on
bf(y ~ a * inv_logit(x * b)) +
  lf(a + b ~ z) +
  set_nl(TRUE)

# specify a multivariate model
bf(y1 ~ x + (1|g)) +
  bf(y2 ~ z) +
  set_rescor(TRUE)

# add autocorrelation terms
bf(y ~ x) + acformula(~ arma(p = 1, q = 1) + car(W))
brmshypothesis Objects
A brmshypothesis object contains posterior draws as well as summary statistics of non-linear hypotheses as returned by hypothesis.
## S3 method for class 'brmshypothesis'
print(x, digits = 2, chars = 20, ...)

## S3 method for class 'brmshypothesis'
plot(
  x,
  nvariables = 5,
  N = NULL,
  ignore_prior = FALSE,
  chars = 40,
  colors = NULL,
  theme = NULL,
  ask = TRUE,
  plot = TRUE,
  ...
)
x |
An object of class |
digits |
Minimal number of significant digits,
see |
chars |
Maximum number of characters of each hypothesis
to print or plot. If |
... |
Currently ignored. |
nvariables |
The number of variables (parameters) plotted per page. |
N |
Deprecated alias of |
ignore_prior |
A flag indicating if prior distributions should also be plotted. Only used if priors were specified on the relevant parameters. |
colors |
Two values specifying the colors of the posterior
and prior density respectively. If |
theme |
A |
ask |
Logical; indicates if the user is prompted
before a new page is plotted.
Only used if |
plot |
Logical; indicates if plots should be
plotted directly in the active graphic device.
Defaults to |
The two most important elements of a brmshypothesis object are hypothesis, which is a data.frame containing the summary estimates of the hypotheses, and samples, which is a data.frame containing the corresponding posterior draws.
Parse formula objects for use in brms.
brmsterms(formula, ...)

## Default S3 method:
brmsterms(formula, ...)

## S3 method for class 'brmsformula'
brmsterms(formula, check_response = TRUE, resp_rhs_all = TRUE, ...)

## S3 method for class 'mvbrmsformula'
brmsterms(formula, ...)
formula |
An object of class |
... |
Further arguments passed to or from other methods. |
check_response |
Logical; Indicates whether the left-hand side
of |
resp_rhs_all |
Logical; Indicates whether to also include response
variables on the right-hand side of formula |
This is the main formula parsing function of brms. It should usually not be called directly, but is exported to allow package developers to make use of the formula syntax implemented in brms. As long as no other packages depend on this function, it may be changed without deprecation warnings when new features make this necessary.
An object of class brmsterms or mvbrmsterms (for multivariate models), which is a list containing all required information initially stored in formula in an easier to use format, basically a list of formulas (not an abstract syntax tree).
brm, brmsformula, mvbrmsformula
Set up a spatial conditional autoregressive (CAR) term in brms. The function does not evaluate its arguments; it exists purely to help set up a model with CAR terms.
car(M, gr = NA, type = "escar")
M |
Adjacency matrix of locations. All non-zero entries are treated as
if the two locations are adjacent. If |
gr |
An optional grouping factor mapping observations to spatial locations. If not specified, each observation is treated as a separate location. It is recommended to always specify a grouping factor to allow for handling of new data in post-processing methods. |
type |
Type of the CAR structure. Currently implemented are
|
The escar and esicar types are implemented based on the case study of Max Joseph (https://github.com/mbjoseph/CARstan). The icar and bym2 types are implemented based on the case study of Mitzi Morris (https://mc-stan.org/users/documentation/case-studies/icar_stan.html).
An object of class 'car_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
## Not run:
# generate some spatial data
east <- north <- 1:10
Grid <- expand.grid(east, north)
K <- nrow(Grid)

# set up distance and neighbourhood matrices
distance <- as.matrix(dist(Grid))
W <- array(0, c(K, K))
W[distance == 1] <- 1
rownames(W) <- 1:nrow(W)

# generate the covariates and response data
x1 <- rnorm(K)
x2 <- rnorm(K)
theta <- rnorm(K, sd = 0.05)
phi <- rmulti_normal(
  1, mu = rep(0, K), Sigma = 0.4 * exp(-0.1 * distance)
)
eta <- x1 + x2 + phi
prob <- exp(eta) / (1 + exp(eta))
size <- rep(50, K)
y <- rbinom(n = K, size = size, prob = prob)
g <- 1:length(y)
dat <- data.frame(y, size, x1, x2, g)

# fit a CAR model
fit <- brm(y | trials(size) ~ x1 + x2 + car(W, gr = g),
           data = dat, data2 = list(W = W),
           family = binomial())
summary(fit)

## End(Not run)
Extract model coefficients, which are the sum of population-level effects and corresponding group-level effects.
## S3 method for class 'brmsfit'
coef(object, summary = TRUE, robust = FALSE, probs = c(0.025, 0.975), ...)
object |
An object of class |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
... |
Further arguments passed to |
A list of 3D arrays (one per grouping factor). If summary is TRUE, the 1st dimension contains the factor levels, the 2nd dimension contains the summary statistics (see posterior_summary), and the 3rd dimension contains the group-level effects. If summary is FALSE, the 1st dimension contains the posterior draws, the 2nd dimension contains the factor levels, and the 3rd dimension contains the group-level effects.
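The array layout described above can be sketched in base R. This is only an illustration with simulated numbers, not output from a fitted model; the objects b_intercept, r_intercept and the helper summarise_level are hypothetical names, and the statistic labels are assumed for the sake of the example.

```r
# Simulated stand-ins for posterior draws; not output from a fitted model.
set.seed(1)
n_draws <- 1000
group_levels <- c("g1", "g2", "g3")

# Population-level intercept: one value per posterior draw.
b_intercept <- rnorm(n_draws, mean = 2, sd = 0.1)

# Group-level deviations: one column per level of the grouping factor.
r_intercept <- matrix(
  rnorm(n_draws * length(group_levels), sd = 0.5),
  nrow = n_draws,
  dimnames = list(NULL, group_levels)
)

# Combined coefficients: population-level plus group-level effect, per draw.
coef_draws <- b_intercept + r_intercept

# Summarise the draws of one level (hypothetical helper, assumed labels).
summarise_level <- function(x) {
  q <- quantile(x, probs = c(0.025, 0.975), names = FALSE)
  c(Estimate = mean(x), Est.Error = sd(x), Q2.5 = q[1], Q97.5 = q[2])
}
coef_summary <- t(apply(coef_draws, 2, summarise_level))

# Arrange as levels x statistics x effects, mirroring the summary = TRUE layout.
coef_array <- array(
  coef_summary,
  dim = c(nrow(coef_summary), ncol(coef_summary), 1),
  dimnames = list(rownames(coef_summary), colnames(coef_summary), "Intercept")
)
```

With more than one group-level effect (say, an intercept and a slope), the third dimension would simply grow to one slice per effect.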
## Not run:
fit <- brm(count ~ zAge + zBase * Trt + (1+Trt|visit),
           data = epilepsy, family = gaussian(), chains = 2)

## extract population and group-level coefficients separately
fixef(fit)
ranef(fit)

## extract combined coefficients
coef(fit)

## End(Not run)
Combine multiple brmsfit objects, which fitted the same model. This is useful, for instance, when models have been run manually in parallel.
combine_models(..., mlist = NULL, check_data = TRUE)
... |
One or more |
mlist |
Optional list of one or more |
check_data |
Logical; indicates if the data should be checked
for being the same across models (defaults to |
This function just takes the first model and replaces its stanfit object (slot fit) by the combined stanfit objects of all models.
A brmsfit object.
Compare information criteria of different models fitted with waic or loo. Deprecated and will be removed in the future. Please use loo_compare instead.
compare_ic(..., x = NULL, ic = c("loo", "waic", "kfold"))
... |
At least two objects returned by
|
x |
A |
ic |
The name of the information criterion to be extracted
from |
See loo_compare for the recommended way of comparing models with the loo package.
An object of class iclist.
loo, loo_compare, add_criterion
## Not run:
# model with population-level effects only
fit1 <- brm(rating ~ treat + period + carry,
            data = inhaler)
waic1 <- waic(fit1)

# model with an additional varying intercept for subjects
fit2 <- brm(rating ~ treat + period + carry + (1|subject),
            data = inhaler)
waic2 <- waic(fit2)

# compare both models
compare_ic(waic1, waic2)

## End(Not run)
Display conditional effects of one or more numeric and/or categorical predictors including two-way interaction effects.
## S3 method for class 'brmsfit'
conditional_effects(
  x,
  effects = NULL,
  conditions = NULL,
  int_conditions = NULL,
  re_formula = NA,
  prob = 0.95,
  robust = TRUE,
  method = "posterior_epred",
  spaghetti = FALSE,
  surface = FALSE,
  categorical = FALSE,
  ordinal = FALSE,
  transform = NULL,
  resolution = 100,
  select_points = 0,
  too_far = 0,
  probs = NULL,
  ...
)

conditional_effects(x, ...)

## S3 method for class 'brms_conditional_effects'
plot(
  x,
  ncol = NULL,
  points = getOption("brms.plot_points", FALSE),
  rug = getOption("brms.plot_rug", FALSE),
  mean = TRUE,
  jitter_width = 0,
  stype = c("contour", "raster"),
  line_args = list(),
  cat_args = list(),
  errorbar_args = list(),
  surface_args = list(),
  spaghetti_args = list(),
  point_args = list(),
  rug_args = list(),
  facet_args = list(),
  theme = NULL,
  ask = TRUE,
  plot = TRUE,
  ...
)
x |
An object of class |
effects |
An optional character vector naming effects (main effects or
interactions) for which to compute conditional plots. Interactions are
specified by a |
conditions |
An optional |
int_conditions |
An optional named |
re_formula |
A formula containing group-level effects to be considered
in the conditional predictions. If |
prob |
A value between 0 and 1 indicating the desired probability to be covered by the uncertainty intervals. The default is 0.95. |
robust |
If |
method |
Method used to obtain predictions. Can be set to
|
spaghetti |
Logical. Indicates if predictions should
be visualized via spaghetti plots. Only applied for numeric
predictors. If |
surface |
Logical. Indicates if interactions or
two-dimensional smooths should be visualized as a surface.
Defaults to |
categorical |
Logical. Indicates if effects of categorical
or ordinal models should be shown in terms of probabilities
of response categories. Defaults to |
ordinal |
(Deprecated) Please use argument |
transform |
A function or a character string naming
a function to be applied on the predicted responses
before summary statistics are computed. Only allowed
if |
resolution |
Number of support points used to generate
the plots. Higher resolution leads to smoother plots.
Defaults to |
select_points |
Positive number.
Only relevant if |
too_far |
Positive number.
For surface plots only: Grid points that are too
far away from the actual data points can be excluded from the plot.
|
probs |
(Deprecated) The quantiles to be used in the computation of
uncertainty intervals. Please use argument |
... |
Further arguments such as |
ncol |
Number of plots to display per column for each effect.
If |
points |
Logical. Indicates if the original data points should be added
via |
rug |
Logical. Indicates if a rug representation of predictor values
should be added via |
mean |
Logical. Only relevant for spaghetti plots.
If |
jitter_width |
Only used if |
stype |
Indicates how surface plots should be displayed.
Either |
line_args |
Only used in plots of continuous predictors:
A named list of arguments passed to
|
cat_args |
Only used in plots of categorical predictors:
A named list of arguments passed to
|
errorbar_args |
Only used in plots of categorical predictors:
A named list of arguments passed to
|
surface_args |
Only used in surface plots:
A named list of arguments passed to
|
spaghetti_args |
Only used in spaghetti plots:
A named list of arguments passed to
|
point_args |
Only used if |
rug_args |
Only used if |
facet_args |
Only used if multiple conditions are provided:
A named list of arguments passed to
|
theme |
A |
ask |
Logical; indicates if the user is prompted
before a new page is plotted.
Only used if |
plot |
Logical; indicates if plots should be
plotted directly in the active graphic device.
Defaults to |
When creating conditional_effects for a particular predictor (or interaction of two predictors), one has to choose the values of all other predictors to condition on. By default, the mean is used for continuous variables and the reference category is used for factors, but you may change these values via argument conditions. This also has an implication for the points argument: In the created plots, only those points will be shown that correspond to the factor levels actually used in the conditioning, in order not to create the false impression of bad model fit, where it is just due to conditioning on certain factor levels.
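The default conditioning rule described above (mean for continuous variables, reference category for factors) can be sketched in base R. This is a simplified illustration on toy data, not the internal brms implementation; default_condition and all variable names are hypothetical.

```r
# Toy data; the variable names are illustrative only.
dat <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  z = c(10, 20, 30, 40, 50, 60),
  trt = factor(c("a", "b", "a", "b", "a", "b"))
)

# Hypothetical helper: mean for numeric variables,
# reference (first) level for factors.
default_condition <- function(v) {
  if (is.factor(v)) factor(levels(v)[1], levels = levels(v)) else mean(v)
}

# Focal predictor whose effect is to be displayed.
effect <- "x"
others <- setdiff(names(dat), effect)

# One-row data frame holding the values to condition on.
conditions <- as.data.frame(lapply(dat[others], default_condition))

# Grid over the focal predictor (cf. the 'resolution' argument).
grid <- data.frame(x = seq(min(dat$x), max(dat$x), length.out = 100))

# Repeat the conditioning row for every grid point.
newdata <- cbind(grid, conditions[rep(1, nrow(grid)), , drop = FALSE])
rownames(newdata) <- NULL
```

Predictions evaluated on a data frame like newdata vary only in x, which is why observed points belonging to other factor levels would look misplaced if they were plotted alongside.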
To fully change colors of the created plots, one has to amend both scale_colour and scale_fill. See scale_colour_grey or scale_colour_gradient for more details.
An object of class 'brms_conditional_effects' which is a named list with one data.frame per effect containing all information required to generate conditional effects plots. Among others, these data.frames contain some special variables, namely estimate__ (predicted values of the response), se__ (standard error of the predicted response), lower__ and upper__ (lower and upper bounds of the uncertainty interval of the response), as well as cond__ (used in faceting when conditions contains multiple rows).
The corresponding plot method returns a named list of ggplot objects, which can be further customized using the ggplot2 package.
## Not run:
fit <- brm(count ~ zAge + zBase * Trt + (1 | patient),
           data = epilepsy, family = poisson())

## plot all conditional effects
plot(conditional_effects(fit), ask = FALSE)

## change colours to grey scale
library(ggplot2)
ce <- conditional_effects(fit, "zBase:Trt")
plot(ce, plot = FALSE)[[1]] +
  scale_color_grey() +
  scale_fill_grey()

## only plot the conditional interaction effect of 'zBase:Trt'
## for different values for 'zAge'
conditions <- data.frame(zAge = c(-1, 0, 1))
plot(conditional_effects(fit, effects = "zBase:Trt",
                         conditions = conditions))

## also incorporate group-level effects variance over patients
## also add data points and a rug representation of predictor values
plot(conditional_effects(fit, effects = "zBase:Trt",
                         conditions = conditions, re_formula = NULL),
     points = TRUE, rug = TRUE)

## change handling of two-way interactions
int_conditions <- list(
  zBase = setNames(c(-2, 1, 0), c("b", "c", "a"))
)
conditional_effects(fit, effects = "Trt:zBase",
                    int_conditions = int_conditions)
conditional_effects(fit, effects = "Trt:zBase",
                    int_conditions = list(zBase = quantile))

## fit a model to illustrate how to plot 3-way interactions
fit3way <- brm(count ~ zAge * zBase * Trt, data = epilepsy)
conditions <- make_conditions(fit3way, "zAge")
conditional_effects(fit3way, "zBase:Trt", conditions = conditions)
## only include points close to the specified values of zAge
ce <- conditional_effects(
  fit3way, "zBase:Trt", conditions = conditions,
  select_points = 0.1
)
plot(ce, points = TRUE)

## End(Not run)
Display smooth s and t2 terms of models fitted with brms.
## S3 method for class 'brmsfit'
conditional_smooths(
  x,
  smooths = NULL,
  int_conditions = NULL,
  prob = 0.95,
  spaghetti = FALSE,
  surface = TRUE,
  resolution = 100,
  too_far = 0,
  ndraws = NULL,
  draw_ids = NULL,
  nsamples = NULL,
  subset = NULL,
  probs = NULL,
  ...
)

conditional_smooths(x, ...)
x |
An object of class |
smooths |
Optional character vector of smooth terms
to display. If |
int_conditions |
An optional named |
prob |
A value between 0 and 1 indicating the desired probability to be covered by the uncertainty intervals. The default is 0.95. |
spaghetti |
Logical. Indicates if predictions should
be visualized via spaghetti plots. Only applied for numeric
predictors. If |
surface |
Logical. Indicates if interactions or
two-dimensional smooths should be visualized as a surface.
Defaults to |
resolution |
Number of support points used to generate
the plots. Higher resolution leads to smoother plots.
Defaults to |
too_far |
Positive number.
For surface plots only: Grid points that are too
far away from the actual data points can be excluded from the plot.
|
ndraws |
Positive integer indicating how many
posterior draws should be used.
If |
draw_ids |
An integer vector specifying
the posterior draws to be used.
If |
nsamples |
Deprecated alias of |
subset |
Deprecated alias of |
probs |
(Deprecated) The quantiles to be used in the computation of
uncertainty intervals. Please use argument |
... |
Currently ignored. |
Two-dimensional smooth terms will be visualized using either contour or raster plots.
For the brmsfit method, an object of class brms_conditional_effects. See conditional_effects for more details and documentation of the related plotting function.
## Not run:
set.seed(0)
dat <- mgcv::gamSim(1, n = 200, scale = 2)
fit <- brm(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)

# show all smooth terms
plot(conditional_smooths(fit), rug = TRUE, ask = FALSE)

# show only the smooth term s(x2)
plot(conditional_smooths(fit, smooths = "s(x2)"), ask = FALSE)

# fit and plot a two-dimensional smooth term
fit2 <- brm(y ~ t2(x0, x2), data = dat)
ms <- conditional_smooths(fit2)
plot(ms, stype = "contour")
plot(ms, stype = "raster")

## End(Not run)
Function used to set up constant priors in brms. The function does not evaluate its arguments – it exists purely to help set up the model.
constant(const, broadcast = TRUE)
const |
Numeric value, vector, or matrix of values to which the parameters should be fixed. Can also be a valid Stan variable in the model. |
broadcast |
Should |
A named list with elements const and broadcast.
stancode(count ~ Base + Age, data = epilepsy,
         prior = prior(constant(1), class = "b"))

# will fail parsing because brms will try to broadcast a vector into a vector
stancode(count ~ Base + Age, data = epilepsy,
         prior = prior(constant(alpha), class = "b"),
         stanvars = stanvar(c(1, 0), name = "alpha"))

stancode(count ~ Base + Age, data = epilepsy,
         prior = prior(constant(alpha, broadcast = FALSE), class = "b"),
         stanvars = stanvar(c(1, 0), name = "alpha"))
Extract control parameters of the NUTS sampler such as adapt_delta or max_treedepth.
control_params(x, ...)

## S3 method for class 'brmsfit'
control_params(x, pars = NULL, ...)
x |
An R object |
... |
Currently ignored. |
pars |
Optional names of the control parameters to be returned.
If |
A named list with control parameter values.
This function is deprecated. Please see ar for the new syntax. This function is a constructor for the cor_arma class, allowing for autoregression terms only.
cor_ar(formula = ~1, p = 1, cov = FALSE)
formula |
A one sided formula of the form |
p |
A non-negative integer specifying the autoregressive (AR) order of the ARMA structure. Default is 1. |
cov |
A flag indicating whether ARMA effects should be estimated by
means of residual covariance matrices. This is currently only possible for
stationary ARMA effects of order 1. If the model family does not have
natural residuals, latent residuals are added automatically. If
|
AR refers to autoregressive effects of residuals, which is what is typically understood as autoregressive effects. However, one may also model autoregressive effects of the response variable, which is called ARR in brms.
An object of class cor_arma containing solely autoregression terms.
cor_ar(~visit|patient, p = 2)
This function is deprecated. Please see arma for the new syntax. This function is a constructor for the cor_arma class, representing an autoregression-moving average correlation structure of order (p, q).
cor_arma(formula = ~1, p = 0, q = 0, r = 0, cov = FALSE)
formula |
A one sided formula of the form |
p |
A non-negative integer specifying the autoregressive (AR) order of the ARMA structure. Default is 0. |
q |
A non-negative integer specifying the moving average (MA) order of the ARMA structure. Default is 0. |
r |
No longer supported. |
cov |
A flag indicating whether ARMA effects should be estimated by
means of residual covariance matrices. This is currently only possible for
stationary ARMA effects of order 1. If the model family does not have
natural residuals, latent residuals are added automatically. If
|
An object of class cor_arma, representing an autoregression-moving-average correlation structure.
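To make the meaning of the orders p and q concrete, the following sketch simulates a series with ARMA(1, 1) dependence and refits it. It uses only the base R stats package rather than brms, and the coefficient values 0.6 and 0.3 are arbitrary choices for illustration.

```r
set.seed(123)

# Simulate 500 values from a stationary ARMA process with
# one autoregressive term (p = 1) and one moving average term (q = 1).
res <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 500)

# Refit a zero-mean ARMA(1, 1) model; order = c(p, d, q) with d = 0.
fit_arma <- arima(res, order = c(1, 0, 1), include.mean = FALSE)

# One estimated coefficient per AR and MA term.
arma_coefs <- coef(fit_arma)
```

Raising p or q adds further lagged residual terms, and hence further coefficients of the same kind.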
cor_arma(~ visit | patient, p = 2, q = 2)
Classes of correlation structures available in the brms package. cor_brms is not a correlation structure itself, but the class common to all correlation structures implemented in brms.
cor_arma: autoregressive-moving average (ARMA) structure, with arbitrary orders for the autoregressive and moving average components
cor_ar: autoregressive (AR) structure of arbitrary order
cor_ma: moving average (MA) structure of arbitrary order
cor_car: spatial conditional autoregressive (CAR) structure
cor_sar: spatial simultaneous autoregressive (SAR) structure
cor_fixed: fixed user-defined covariance structure
cor_arma, cor_ar, cor_ma, cor_car, cor_sar, cor_fixed
These functions are deprecated. Please see car for the new syntax. These functions are constructors for the cor_car class implementing spatial conditional autoregressive structures.
cor_car(W, formula = ~1, type = "escar")

cor_icar(W, formula = ~1)
W |
Adjacency matrix of locations.
All non-zero entries are treated as if the two locations
are adjacent. If |
formula |
An optional one-sided formula of the form
|
type |
Type of the CAR structure. Currently implemented
are |
The escar and esicar types are implemented based on the case study of Max Joseph (https://github.com/mbjoseph/CARstan). The icar and bym2 types are implemented based on the case study of Mitzi Morris (https://mc-stan.org/users/documentation/case-studies/icar_stan.html).
## Not run:
# generate some spatial data
east <- north <- 1:10
Grid <- expand.grid(east, north)
K <- nrow(Grid)

# set up distance and neighbourhood matrices
distance <- as.matrix(dist(Grid))
W <- array(0, c(K, K))
W[distance == 1] <- 1

# generate the covariates and response data
x1 <- rnorm(K)
x2 <- rnorm(K)
theta <- rnorm(K, sd = 0.05)
phi <- rmulti_normal(
  1, mu = rep(0, K), Sigma = 0.4 * exp(-0.1 * distance)
)
eta <- x1 + x2 + phi
prob <- exp(eta) / (1 + exp(eta))
size <- rep(50, K)
y <- rbinom(n = K, size = size, prob = prob)
dat <- data.frame(y, size, x1, x2)

# fit a CAR model
fit <- brm(y | trials(size) ~ x1 + x2, data = dat,
           family = binomial(), autocor = cor_car(W))
summary(fit)

## End(Not run)
This function is deprecated. Please see cosy for the new syntax. This function is a constructor for the cor_cosy class, representing a compound symmetry structure corresponding to uniform correlation.
cor_cosy(formula = ~1)
formula |
A one sided formula of the form |
An object of class cor_cosy, representing a compound symmetry correlation structure.
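For readers unfamiliar with the term, a compound symmetry structure means that every pair of observations within a group shares the same correlation. The base R sketch below builds such a matrix; cosy_matrix is a hypothetical helper written for this illustration and is not part of brms.

```r
# Hypothetical helper: n x n correlation matrix with unit diagonal
# and one common correlation rho in every off-diagonal cell.
cosy_matrix <- function(n, rho) {
  # rho must lie in (-1 / (n - 1), 1) for the matrix to be positive definite.
  stopifnot(n >= 1, rho < 1, n == 1 || rho > -1 / (n - 1))
  R <- matrix(rho, nrow = n, ncol = n)
  diag(R) <- 1
  R
}

# Four observations per group, uniform correlation of 0.3.
R <- cosy_matrix(4, 0.3)
```

The whole structure is governed by the single parameter rho, regardless of how many observations a group contains.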
cor_cosy(~ visit | patient)
This function is deprecated. Please see fcor for the new syntax. Define a fixed covariance matrix of the response variable, for instance to model multivariate effect sizes in meta-analysis.
cor_fixed(V)
V |
Known covariance matrix of the response variable. If a vector is passed, it will be used as diagonal entries (variances) and covariances will be set to zero. |
An object of class cor_fixed.
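The handling of a vector-valued V described above (variances on the diagonal, covariances set to zero) amounts to the following base R operation. The helper to_covariance is hypothetical and only mimics the documented behaviour; it is not the brms implementation.

```r
# Hypothetical helper: return a matrix unchanged, and expand a
# vector of variances into a diagonal covariance matrix.
to_covariance <- function(V) {
  if (is.matrix(V)) V else diag(V, nrow = length(V))
}

# Three known variances, no covariances.
V_vec <- c(0.5, 1, 0.2)
V_mat <- to_covariance(V_vec)
```

Passing nrow explicitly matters for a single variance, because diag() applied to a lone number would otherwise build an identity matrix of that size.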
## Not run:
dat <- data.frame(y = rnorm(3))
V <- cbind(c(0.5, 0.3, 0.2), c(0.3, 1, 0.1), c(0.2, 0.1, 0.2))
fit <- brm(y~1, data = dat, autocor = cor_fixed(V))

## End(Not run)
This function is deprecated. Please see ma for the new syntax. This function is a constructor for the cor_arma class, allowing for moving average terms only.
cor_ma(formula = ~1, q = 1, cov = FALSE)
formula |
A one sided formula of the form |
q |
A non-negative integer specifying the moving average (MA) order of the ARMA structure. Default is 1. |
cov |
A flag indicating whether ARMA effects should be estimated by
means of residual covariance matrices. This is currently only possible for
stationary ARMA effects of order 1. If the model family does not have
natural residuals, latent residuals are added automatically. If
|
An object of class cor_arma containing solely moving average terms.
cor_ma(~visit|patient, q = 2)
These functions are deprecated. Please see sar for the new syntax. These functions are constructors for the cor_sar class implementing spatial simultaneous autoregressive structures.
The lagsar structure implements SAR of the response values: $y = \rho W y + \eta + e$
The errorsar structure implements SAR of the residuals: $y = \eta + u, \quad u = \rho W u + e$
In the above equations, $\eta$ is the predictor term and $e$ are independent normally or t-distributed residuals.
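The difference between the two structures can be made concrete by simulating from each in base R. The sketch assumes the usual SAR forms, y = rho W y + eta + e for the lag structure and y = eta + u with u = rho W u + e for the error structure; the grid, the row-standardised weights and the value rho = 0.5 are arbitrary choices for illustration.

```r
set.seed(42)

# Rook-neighbour weights on a 5 x 5 grid, row-standardised.
coords <- expand.grid(east = 1:5, north = 1:5)
n <- nrow(coords)
W <- (as.matrix(dist(coords)) == 1) * 1
W <- W / rowSums(W)

rho <- 0.5                   # spatial autoregression parameter (assumed)
eta <- 1 + 0.8 * rnorm(n)    # predictor term
e <- rnorm(n, sd = 0.3)      # independent normal residuals
I <- diag(n)

# Lag SAR: y = rho W y + eta + e, hence y = (I - rho W)^{-1} (eta + e).
y_lag <- solve(I - rho * W, eta + e)

# Error SAR: u = rho W u + e, hence u = (I - rho W)^{-1} e, and y = eta + u.
u <- solve(I - rho * W, e)
y_err <- eta + u
```

In the lag structure the predictor term itself is propagated to neighbouring locations, whereas in the error structure only the residual part is spatially smoothed.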
cor_sar(W, type = c("lag", "error"))

cor_lagsar(W)

cor_errorsar(W)
W |
An object specifying the spatial weighting matrix.
Can be either the spatial weight matrix itself or an
object of class |
type |
Type of the SAR structure. Either |
Currently, only families gaussian and student support SAR structures.
An object of class cor_sar to be used in calls to brm.
## Not run:
data(oldcol, package = "spdep")
fit1 <- brm(CRIME ~ INC + HOVAL, data = COL.OLD,
            autocor = cor_lagsar(COL.nb),
            chains = 2, cores = 2)
summary(fit1)
plot(fit1)

fit2 <- brm(CRIME ~ INC + HOVAL, data = COL.OLD,
            autocor = cor_errorsar(COL.nb),
            chains = 2, cores = 2)
summary(fit2)
plot(fit2)

## End(Not run)
Set up a compound symmetry (COSY) term in brms. The function does not evaluate its arguments; it exists purely to help set up a model with COSY terms.
cosy(time = NA, gr = NA)
time |
An optional time variable specifying the time ordering of the observations. By default, the existing order of the observations in the data is used. |
gr |
An optional grouping variable. If specified, the correlation structure is assumed to apply only to observations within the same grouping level. |
An object of class 'cosy_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
## Not run:
data("lh")
lh <- as.data.frame(lh)
fit <- brm(x ~ cosy(), data = lh)
summary(fit)
## End(Not run)
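For intuition, compound symmetry means that every pair of observations within the same group shares one common correlation. The base-R sketch below builds such a correlation matrix; the helper name cosy_matrix and the restriction of the correlation to [0, 1) are assumptions made for illustration, not part of the brms API.

```r
# Build a compound symmetry correlation matrix (illustrative helper)
cosy_matrix <- function(n, rho) {
  stopifnot(n >= 1, rho >= 0, rho < 1)
  R <- matrix(rho, nrow = n, ncol = n)  # same correlation for every pair
  diag(R) <- 1                          # unit diagonal
  R
}

R <- cosy_matrix(4, 0.3)
R
```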
The create_priorsense_data.brmsfit method can be used to create the data structure needed by the priorsense package for performing power-scaling sensitivity analysis. This method is called automatically when performing power-scaling via powerscale or other related functions, so you will rarely need to call it manually yourself.
create_priorsense_data.brmsfit(x, ...)
x |
A |
... |
Currently unused. |
A priorsense_data object to be used in conjunction with the priorsense package.
## Not run:
# fit a model with non-uniform priors
fit <- brm(rating ~ treat + period + carry,
           data = inhaler, family = sratio(),
           prior = set_prior("normal(0, 0.5)"))
summary(fit)

# The following code requires the 'priorsense' package to be installed:
library(priorsense)

# perform power-scaling of the prior
powerscale(fit, alpha = 1.5, component = "prior")

# perform power-scaling sensitivity checks
powerscale_sensitivity(fit)

# create power-scaling sensitivity plots (for one variable)
powerscale_plot_dens(fit, variable = "b_treat")
## End(Not run)
Category Specific Predictors in brms Models
cs(expr)
expr |
Expression containing predictors, for which category specific effects should be estimated. For evaluation, R formula syntax is applied. |
For detailed documentation see help(brmsformula) as well as vignette("brms_overview").
This function is almost solely useful when called in formulas passed to the brms package.
## Not run:
fit <- brm(rating ~ period + carry + cs(treat),
           data = inhaler, family = sratio("cloglog"),
           prior = set_prior("normal(0,5)"), chains = 2)
summary(fit)
plot(fit, ask = FALSE)
## End(Not run)
Define custom families (i.e., response distributions) for use in brms models. This allows users to benefit from the modeling flexibility of brms while applying their self-defined likelihood functions. All of the post-processing methods for brmsfit objects can be made compatible with custom families. See vignette("brms_customfamilies") for more details. For a list of built-in families see brmsfamily.
custom_family(
  name,
  dpars = "mu",
  links = "identity",
  type = c("real", "int"),
  lb = NA,
  ub = NA,
  vars = NULL,
  loop = TRUE,
  specials = NULL,
  threshold = "flexible",
  log_lik = NULL,
  posterior_predict = NULL,
  posterior_epred = NULL,
  predict = NULL,
  fitted = NULL,
  env = parent.frame()
)
name |
Name of the custom family. |
dpars |
Names of the distributional parameters of
the family. One parameter must be named |
links |
Names of the link functions of the distributional parameters. |
type |
Indicates if the response distribution is
continuous ( |
lb |
Vector of lower bounds of the distributional
parameters. Defaults to |
ub |
Vector of upper bounds of the distributional
parameters. Defaults to |
vars |
Names of variables that are part of the likelihood function
without being distributional parameters. That is, |
loop |
Logical; Should the likelihood be evaluated via a loop
( |
specials |
A character vector of special options to enable for this custom family. Currently for internal use only. |
threshold |
Optional threshold type for custom ordinal families. Ignored for non-ordinal families. |
log_lik |
Optional function to compute log-likelihood values of
the model in R. This is only relevant if one wants to ensure
compatibility with method |
posterior_predict |
Optional function to compute posterior predictions of
the model in R. This is only relevant if one wants to ensure compatibility
with method |
posterior_epred |
Optional function to compute expected values of the
posterior predictive distribution of the model in R. This is only relevant
if one wants to ensure compatibility with method
|
predict |
Deprecated alias of 'posterior_predict'. |
fitted |
Deprecated alias of 'posterior_epred'. |
env |
An |
The corresponding probability density or mass Stan functions need to have the same name as the custom family. That is, if a family is called myfamily, then the Stan functions should be called myfamily_lpdf or myfamily_lpmf depending on whether it defines a continuous or discrete distribution.
An object of class customfamily inheriting from class brmsfamily.
brmsfamily, brmsformula, stanvar
## Not run:
## demonstrate how to fit a beta-binomial model
## generate some fake data
phi <- 0.7
n <- 300
z <- rnorm(n, sd = 0.2)
ntrials <- sample(1:10, n, replace = TRUE)
eta <- 1 + z
mu <- exp(eta) / (1 + exp(eta))
a <- mu * phi
b <- (1 - mu) * phi
p <- rbeta(n, a, b)
y <- rbinom(n, ntrials, p)
dat <- data.frame(y, z, ntrials)

# define a custom family
beta_binomial2 <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1[n]"
)

# define the corresponding Stan density function
stan_density <- "
  real beta_binomial2_lpmf(int y, real mu, real phi, int N) {
    return beta_binomial_lpmf(y | N, mu * phi, (1 - mu) * phi);
  }
"
stanvars <- stanvar(scode = stan_density, block = "functions")

# fit the model
fit <- brm(y | vint(ntrials) ~ z, data = dat,
           family = beta_binomial2, stanvars = stanvars)
summary(fit)

# define a *vectorized* custom family (no loop over observations)
# notice also that 'vint' no longer has an observation index
beta_binomial2_vec <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1", loop = FALSE
)

# define the corresponding Stan density function
stan_density_vec <- "
  real beta_binomial2_lpmf(array[] int y, vector mu, real phi, array[] int N) {
    return beta_binomial_lpmf(y | N, mu * phi, (1 - mu) * phi);
  }
"
stanvars_vec <- stanvar(scode = stan_density_vec, block = "functions")

# fit the model
fit_vec <- brm(y | vint(ntrials) ~ z, data = dat,
               family = beta_binomial2_vec,
               stanvars = stanvars_vec)
summary(fit_vec)
## End(Not run)
default_prior is a generic function that can be used to get default priors for Bayesian models. Its original use is within the brms package, but new methods for use with objects from other packages can be registered to the same generic.
default_prior(object, ...)
get_prior(formula, ...)
object |
An object whose class will determine which method will be used. A symbolic description of the model to be fitted. |
... |
Further arguments passed to the specific method. |
formula |
Synonym of |
See default_prior.default for the default method applied for brms models. You can view the available methods by typing methods(default_prior).
Usually, a brmsprior object. See default_prior.default for more details.
set_prior, default_prior.default
## get all parameters and parameter classes to define priors on
(prior <- default_prior(count ~ zAge + zBase * Trt + (1|patient) + (1|obs),
                        data = epilepsy, family = poisson()))
Get information on all parameters (and parameter classes) for which priors may be specified including default priors.
## Default S3 method:
default_prior(
  object,
  data,
  family = gaussian(),
  autocor = NULL,
  data2 = NULL,
  knots = NULL,
  drop_unused_levels = TRUE,
  sparse = NULL,
  ...
)
object |
An object of class |
data |
An object of class |
family |
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a |
autocor |
(Deprecated) An optional |
data2 |
A named |
knots |
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See
|
drop_unused_levels |
Should unused factor levels in the data be
dropped? Defaults to |
sparse |
(Deprecated) Logical; indicates whether the population-level
design matrices should be treated as sparse (defaults to |
... |
Other arguments for internal usage only. |
A brmsprior object. That is, a data.frame with specific columns including prior, class, coef, and group and several rows, each providing information on a parameter (or parameter class) on which priors can be specified. The prior column is empty except for internal default priors.
# get all parameters and parameter classes to define priors on
(prior <- default_prior(count ~ zAge + zBase * Trt + (1|patient) + (1|obs),
                        data = epilepsy, family = poisson()))

# define a prior on all population-level effects at once
prior$prior[1] <- "normal(0,10)"

# define a specific prior on the population-level effect of Trt
prior$prior[5] <- "student_t(10, 0, 5)"

# verify that the priors indeed found their way into Stan's model code
stancode(count ~ zAge + zBase * Trt + (1|patient) + (1|obs),
         data = epilepsy, family = poisson(),
         prior = prior)
Compute the ratio of two densities at given points based on draws of the corresponding distributions.
density_ratio(x, y = NULL, point = 0, n = 4096, ...)
x |
Vector of draws from the first distribution, usually the posterior distribution of the quantity of interest. |
y |
Optional vector of draws from the second distribution, usually the
prior distribution of the quantity of interest. If |
point |
Numeric values at which to evaluate and compare the densities.
Defaults to |
n |
Single numeric value. Influences the accuracy of the density
estimation. See |
... |
Further arguments passed to |
In order to achieve sufficient accuracy in the density estimation, more draws than usual are required. That is, you may need an effective sample size of 10,000 or more to reliably estimate the densities.
A vector of length equal to length(point). If y is provided, the density ratio of x against y is returned. Else, only the density of x is returned.
x <- rnorm(10000)
y <- rnorm(10000, mean = 1)
density_ratio(x, y, point = c(0, 1))
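The idea of a draw-based density ratio can be sketched in base R with kernel density estimates. This is not the brms implementation; the helper names kde_at and density_ratio_sketch are hypothetical, and the use of stats::density with linear interpolation is an assumption made for illustration.

```r
# Kernel density estimate of 'draws' evaluated at 'point' (illustrative helper)
kde_at <- function(draws, point, n = 4096) {
  d <- stats::density(draws, n = n)
  # interpolate the estimated density on the grid at the requested points
  stats::approx(d$x, d$y, xout = point)$y
}

# Ratio of the estimated densities of x and y; only the density of x if y is NULL
density_ratio_sketch <- function(x, y = NULL, point = 0, n = 4096) {
  dx <- kde_at(x, point, n)
  if (is.null(y)) {
    return(dx)
  }
  dx / kde_at(y, point, n)
}

set.seed(123)
x <- rnorm(10000)
y <- rnorm(10000, mean = 1)
ratio <- density_ratio_sketch(x, y, point = c(0, 1))
ratio
```

For these two normal distributions the true ratios are exp(0.5) at 0 and exp(-0.5) at 1, so the estimates should land close to those values, with some noise from the finite number of draws.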
Extract quantities that can be used to diagnose sampling behavior of the algorithms applied by Stan at the back-end of brms.
## S3 method for class 'brmsfit'
log_posterior(object, ...)

## S3 method for class 'brmsfit'
nuts_params(object, pars = NULL, ...)

## S3 method for class 'brmsfit'
rhat(x, pars = NULL, ...)

## S3 method for class 'brmsfit'
neff_ratio(object, pars = NULL, ...)
object , x
|
A |
... |
Arguments passed to individual methods. |
pars |
An optional character vector of parameter names.
For |
For more details see bayesplot-extractors.
The exact form of the output depends on the method.
## Not run:
fit <- brm(time ~ age * sex, data = kidney)

lp <- log_posterior(fit)
head(lp)

np <- nuts_params(fit)
str(np)
# extract the number of divergent transitions
sum(subset(np, Parameter == "divergent__")$Value)

head(rhat(fit))
head(neff_ratio(fit))
## End(Not run)
Density function and random number generation for the Dirichlet distribution with shape parameter vector alpha.
ddirichlet(x, alpha, log = FALSE)
rdirichlet(n, alpha)
x |
Matrix of quantiles. Each row corresponds to one probability vector. |
alpha |
Matrix of positive shape parameters. Each row corresponds to one probability vector. |
log |
Logical; If |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families") for details on the parameterization.
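As a rough illustration of what these functions compute, the sketch below draws from a Dirichlet distribution by normalizing independent gamma variates and evaluates the standard Dirichlet density. It is simplified to a single shape vector alpha rather than a matrix, and the names rdirichlet_sketch and ddirichlet_sketch are hypothetical, not the brms functions.

```r
# Draw n probability vectors from Dirichlet(alpha) via normalized gamma variates
rdirichlet_sketch <- function(n, alpha) {
  k <- length(alpha)
  # column j holds n draws from Gamma(shape = alpha[j], rate = 1)
  g <- matrix(rgamma(n * k, shape = rep(alpha, each = n)), nrow = n, ncol = k)
  g / rowSums(g)
}

# Density of a single probability vector x under Dirichlet(alpha)
ddirichlet_sketch <- function(x, alpha, log = FALSE) {
  out <- lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(x))
  if (log) out else exp(out)
}

set.seed(2)
draws <- rdirichlet_sketch(5, alpha = c(2, 3, 5))
rowSums(draws)  # each row is a probability vector
dens_flat <- ddirichlet_sketch(c(0.2, 0.3, 0.5), alpha = c(1, 1, 1))
```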
Transform brmsfit to draws objects
Transform a brmsfit object to a format supported by the posterior package.
## S3 method for class 'brmsfit'
as_draws(x, variable = NULL, regex = FALSE, inc_warmup = FALSE, ...)

## S3 method for class 'brmsfit'
as_draws_matrix(x, variable = NULL, regex = FALSE, inc_warmup = FALSE, ...)

## S3 method for class 'brmsfit'
as_draws_array(x, variable = NULL, regex = FALSE, inc_warmup = FALSE, ...)

## S3 method for class 'brmsfit'
as_draws_df(x, variable = NULL, regex = FALSE, inc_warmup = FALSE, ...)

## S3 method for class 'brmsfit'
as_draws_list(x, variable = NULL, regex = FALSE, inc_warmup = FALSE, ...)

## S3 method for class 'brmsfit'
as_draws_rvars(x, variable = NULL, regex = FALSE, inc_warmup = FALSE, ...)
x |
A |
variable |
A character vector providing the variables to extract. By default, all variables are extracted. |
regex |
Logical; Should variable be treated as a (vector of)
regular expressions? Any variable in |
inc_warmup |
Should warmup draws be included? Defaults to |
... |
Arguments passed to individual methods (if applicable). |
To subset iterations, chains, or draws, use the subset_draws method after transforming the brmsfit to a draws object.
## Not run:
fit <- brm(count ~ zAge + zBase * Trt + (1|patient),
           data = epilepsy, family = poisson())

# extract posterior draws in an array format
(draws_fit <- as_draws_array(fit))
posterior::summarize_draws(draws_fit)

# extract only certain variables
as_draws_array(fit, variable = "r_patient")
as_draws_array(fit, variable = "^b_", regex = TRUE)

# extract posterior draws in a random variables format
as_draws_rvars(fit)
## End(Not run)
Index brmsfit objects
Index brmsfit objects
## S3 method for class 'brmsfit'
variables(x, ...)

## S3 method for class 'brmsfit'
nvariables(x, ...)

## S3 method for class 'brmsfit'
niterations(x)

## S3 method for class 'brmsfit'
nchains(x)

## S3 method for class 'brmsfit'
ndraws(x)
x |
A |
... |
Arguments passed to individual methods (if applicable). |
Functions required for compatibility of brms with emmeans. Users are not required to call these functions themselves. Instead, they will be called automatically by the emmeans function of the emmeans package.
recover_data.brmsfit(
  object,
  data,
  resp = NULL,
  dpar = NULL,
  nlpar = NULL,
  re_formula = NA,
  epred = FALSE,
  ...
)

emm_basis.brmsfit(
  object,
  trms,
  xlev,
  grid,
  vcov.,
  resp = NULL,
  dpar = NULL,
  nlpar = NULL,
  re_formula = NA,
  epred = FALSE,
  ...
)
object |
An object of class |
data , trms , xlev , grid , vcov.
|
Arguments required by emmeans. |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
dpar |
Optional name of a predicted distributional parameter. If specified, expected predictions of this parameter are returned. |
nlpar |
Optional name of a predicted non-linear parameter. If specified, expected predictions of this parameter are returned. |
re_formula |
Optional formula containing group-level effects to be
considered in the prediction. If |
epred |
Logical. If |
... |
Additional arguments passed to emmeans. |
In order to ensure compatibility of most brms models with emmeans, predictions are not generated 'manually' via a design matrix and coefficient vector, but rather via posterior_linpred.brmsfit. This appears to generally work well, but note that it produces an '.@linfct' slot that contains the computed predictions as columns instead of the coefficients.
## Not run:
fit1 <- brm(time | cens(censored) ~ age * sex + disease + (1|patient),
            data = kidney, family = lognormal())
summary(fit1)

# summarize via 'emmeans'
library(emmeans)
rg <- ref_grid(fit1)
em <- emmeans(rg, "disease")
summary(em, point.est = mean)

# obtain estimates for the posterior predictive distribution's mean
epred <- emmeans(fit1, "disease", epred = TRUE)
summary(epred, point.est = mean)

# model with transformed response variable
fit2 <- brm(log(mpg) ~ factor(cyl), data = mtcars)
summary(fit2)

# results will be on the log scale by default
emmeans(fit2, ~ cyl)
# log transform is detected and can be adjusted automatically
emmeans(fit2, ~ cyl, epred = TRUE, type = "response")
## End(Not run)
Breslow and Clayton (1993) analyze data initially provided by Thall and Vail (1990) concerning seizure counts in a randomized trial of anti-convulsant therapy in epilepsy. Covariates are treatment, 8-week baseline seizure counts, and age of the patients in years.
epilepsy
A data frame of 236 observations containing information on the following 9 variables.
Age: The age of the patients in years
Base: The seizure count at 8-weeks baseline
Trt: Either 0 or 1 indicating if the patient received anti-convulsant therapy
patient: The patient number
visit: The session number from 1 (first visit) to 4 (last visit)
count: The seizure count between two visits
obs: The observation number, that is a unique identifier for each observation
zAge: Standardized Age
zBase: Standardized Base
Thall, P. F., & Vail, S. C. (1990). Some covariance models for longitudinal count data with overdispersion. Biometrics, 46(2), 657-671.
Breslow, N. E., & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88(421), 9-25.
## Not run:
## poisson regression without random effects.
fit1 <- brm(count ~ zAge + zBase * Trt,
            data = epilepsy, family = poisson())
summary(fit1)
plot(fit1)

## poisson regression with varying intercepts of patients
## as well as normal priors for overall effects parameters.
fit2 <- brm(count ~ zAge + zBase * Trt + (1|patient),
            data = epilepsy, family = poisson(),
            prior = set_prior("normal(0,5)"))
summary(fit2)
plot(fit2)
## End(Not run)
Density, distribution function, and random generation for the exponentially modified Gaussian distribution with mean mu and standard deviation sigma of the Gaussian component, as well as scale beta of the exponential component.
dexgaussian(x, mu, sigma, beta, log = FALSE)
pexgaussian(q, mu, sigma, beta, lower.tail = TRUE, log.p = FALSE)
rexgaussian(n, mu, sigma, beta)
x , q
|
Vector of quantiles. |
mu |
Vector of means of the combined distribution. |
sigma |
Vector of standard deviations of the Gaussian component. |
beta |
Vector of scales of the exponential component. |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families") for details on the parameterization.
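An exponentially modified Gaussian variate is the sum of a Gaussian and an independent exponential variate. The sketch below uses that fact to generate draws in base R. It assumes, following the argument table above, that mu is the mean of the combined distribution, so the Gaussian component is given mean mu - beta; the function name rexgaussian_sketch is hypothetical, and the vignette remains the authoritative source for the parameterization.

```r
# Draw from an ex-Gaussian as Gaussian + exponential (illustrative sketch)
rexgaussian_sketch <- function(n, mu, sigma, beta) {
  # assumption: mu is the mean of the sum, so the Gaussian part has mean mu - beta
  rnorm(n, mean = mu - beta, sd = sigma) + rexp(n, rate = 1 / beta)
}

set.seed(42)
draws <- rexgaussian_sketch(1e5, mu = 2, sigma = 0.5, beta = 1.5)
mean(draws)  # should be close to mu
var(draws)   # should be close to sigma^2 + beta^2
```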
Export user-defined Stan functions and optionally vectorize them. For more details see expose_stan_functions.
## S3 method for class 'brmsfit'
expose_functions(x, vectorize = FALSE, env = globalenv(), ...)

expose_functions(x, ...)
x |
An object of class |
vectorize |
Logical; Indicates if the exposed functions
should be vectorized via |
env |
Environment where the functions should be made available. Defaults to the global environment. |
... |
Further arguments passed to
|
Computes exp(x) + 1.
expp1(x)
x |
A numeric or complex vector. |
Extract Model Family Objects
## S3 method for class 'brmsfit' family(object, resp = NULL, ...)
object |
An object of class |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
... |
Currently unused. |
A brmsfamily object or a list of such objects for multivariate models.
Set up a fixed residual correlation (FCOR) term in brms. The function does not evaluate its arguments; it exists purely to help set up a model with FCOR terms.
fcor(M)
M |
Known correlation/covariance matrix of the response variable.
If a vector is passed, it will be used as diagonal entries
(variances) and correlations/covariances will be set to zero.
The actual covariance matrix used in the likelihood is obtained
by multiplying |
An object of class 'fcor_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
## Not run:
dat <- data.frame(y = rnorm(3))
V <- cbind(c(0.5, 0.3, 0.2), c(0.3, 1, 0.1), c(0.2, 0.1, 0.2))
fit <- brm(y ~ 1 + fcor(V), data = dat, data2 = list(V = V))
## End(Not run)
This method is an alias of posterior_epred.brmsfit
with additional arguments for obtaining summaries of the computed draws.
## S3 method for class 'brmsfit'
fitted(
  object,
  newdata = NULL,
  re_formula = NULL,
  scale = c("response", "linear"),
  resp = NULL,
  dpar = NULL,
  nlpar = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
scale |
Either |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
dpar |
Optional name of a predicted distributional parameter. If specified, expected predictions of this parameter are returned. |
nlpar |
Optional name of a predicted non-linear parameter. If specified, expected predictions of this parameter are returned. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
sort |
Logical. Only relevant for time series models.
Indicating whether to return predicted values in the original
order ( |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
... |
Further arguments passed to |
An array of predicted mean response values. If summary = FALSE the output resembles that of posterior_epred.brmsfit.
If summary = TRUE the output depends on the family: For categorical and ordinal families, the output is an N x E x C array, where N is the number of observations, E is the number of summary statistics, and C is the number of categories. For all other families, the output is an N x E matrix. The number of summary statistics E is equal to 2 + length(probs): The Estimate column contains point estimates (either mean or median depending on argument robust), while the Est.Error column contains uncertainty estimates (either standard deviation or median absolute deviation depending on argument robust). The remaining columns starting with Q contain quantile estimates as specified via argument probs.
In multivariate models, an additional dimension is added to the output which indexes along the different response variables.
## Not run:
## fit a model
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler)

## compute expected predictions
fitted_values <- fitted(fit)
head(fitted_values)

## plot expected predictions against actual response
library(ggplot2)
dat <- as.data.frame(cbind(Y = standata(fit)$Y, fitted_values))
ggplot(dat) + geom_point(aes(x = Estimate, y = Y))
## End(Not run)
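The N x E summary layout described for summary = TRUE can be mimicked on any matrix of draws. The base-R sketch below does so for simulated draws; the function summarise_draws_sketch and the fake data are hypothetical, and the use of stats::mad (with its default scaling constant) for the robust case is an assumption rather than a statement about brms internals.

```r
# Summarise an S x N matrix of draws (S draws, N observations) into an N x E matrix
summarise_draws_sketch <- function(draws, probs = c(0.025, 0.975), robust = FALSE) {
  est <- if (robust) apply(draws, 2, median) else colMeans(draws)
  err <- if (robust) apply(draws, 2, mad) else apply(draws, 2, sd)
  # one row per observation, one column per requested quantile
  qs <- t(matrix(apply(draws, 2, quantile, probs = probs), nrow = length(probs)))
  out <- cbind(est, err, qs)
  colnames(out) <- c("Estimate", "Est.Error", paste0("Q", probs * 100))
  out
}

set.seed(1)
# fake draws for 3 observations with true means 0, 1 and 2
fake_draws <- matrix(rnorm(4000 * 3, mean = rep(c(0, 1, 2), each = 4000)), ncol = 3)
res <- summarise_draws_sketch(fake_draws)
res
```

With the default probs, E equals 2 + length(probs) = 4, so the result has one row per observation and the columns Estimate, Est.Error, Q2.5, and Q97.5.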
Extract the population-level ('fixed') effects from a brmsfit object.
## S3 method for class 'brmsfit'
fixef(
  object,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  pars = NULL,
  ...
)
object |
An object of class |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
pars |
Optional names of coefficients to extract. By default, all coefficients are extracted. |
... |
Currently ignored. |
If summary is TRUE, a matrix returned by posterior_summary for the population-level effects. If summary is FALSE, a matrix with one row per posterior draw and one column per population-level effect.
## Not run:
fit <- brm(time | cens(censored) ~ age + sex + disease,
           data = kidney, family = "exponential")
fixef(fit)
# extract only some coefficients
fixef(fit, pars = c("age", "sex"))
## End(Not run)
Density, distribution function, quantile function and random generation for the Frechet distribution with location loc, scale scale, and shape shape.
dfrechet(x, loc = 0, scale = 1, shape = 1, log = FALSE)

pfrechet(q, loc = 0, scale = 1, shape = 1, lower.tail = TRUE, log.p = FALSE)

qfrechet(p, loc = 0, scale = 1, shape = 1, lower.tail = TRUE, log.p = FALSE)

rfrechet(n, loc = 0, scale = 1, shape = 1)
x, q |
Vector of quantiles. |
loc |
Vector of locations. |
scale |
Vector of scales. |
shape |
Vector of shapes. |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
p |
Vector of probabilities. |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families") for details on the parameterization.
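For orientation, the textbook location-scale-shape form of the Frechet distribution can be written out in base R. This is a sketch under the assumption that the package follows the standard form; the vignette remains the authoritative source, and the *_sketch functions below are hypothetical names, not package functions.

```r
# Textbook Frechet distribution with z = (x - loc) / scale:
#   density:  (shape / scale) * z^(-1 - shape) * exp(-z^(-shape))  for z > 0
#   CDF:      exp(-z^(-shape))                                     for z > 0
frechet_pdf_sketch <- function(x, loc = 0, scale = 1, shape = 1) {
  z <- (x - loc) / scale
  ifelse(z > 0, (shape / scale) * z^(-1 - shape) * exp(-z^(-shape)), 0)
}

frechet_cdf_sketch <- function(q, loc = 0, scale = 1, shape = 1) {
  z <- (q - loc) / scale
  ifelse(z > 0, exp(-z^(-shape)), 0)
}

# Quantile function obtained by inverting the CDF.
frechet_quantile_sketch <- function(p, loc = 0, scale = 1, shape = 1) {
  loc + scale * (-log(p))^(-1 / shape)
}

# Round trip: the CDF evaluated at a quantile recovers the probability.
p <- c(0.1, 0.5, 0.9)
q <- frechet_quantile_sketch(p, loc = 1, scale = 2, shape = 3)
print(frechet_cdf_sketch(q, loc = 1, scale = 2, shape = 3))
```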
Density, distribution function, quantile function, and random generation for the generalized extreme value distribution with location mu, scale sigma and shape xi.
dgen_extreme_value(x, mu = 0, sigma = 1, xi = 0, log = FALSE)

pgen_extreme_value(
  q,
  mu = 0,
  sigma = 1,
  xi = 0,
  lower.tail = TRUE,
  log.p = FALSE
)

qgen_extreme_value(
  p,
  mu = 0,
  sigma = 1,
  xi = 0,
  lower.tail = TRUE,
  log.p = FALSE
)

rgen_extreme_value(n, mu = 0, sigma = 1, xi = 0)
x, q |
Vector of quantiles. |
mu |
Vector of locations. |
sigma |
Vector of scales. |
xi |
Vector of shapes. |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
p |
Vector of probabilities. |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families") for details on the parameterization.
Get draws of a distributional parameter from a brmsprep or mvbrmsprep object. This function is primarily useful when developing custom families or packages depending on brms. It lets callers handle, through a single interface, the cases in which the distributional parameter is predicted directly, predicted via a (non-)linear predictor, or fixed to a constant. See the vignette vignette("brms_customfamilies") for an example use case.
get_dpar(prep, dpar, i = NULL, inv_link = NULL)
prep |
A 'brmsprep' or 'mvbrmsprep' object created by
|
dpar |
Name of the distributional parameter. |
i |
The observation numbers for which predictions shall be extracted.
If |
inv_link |
Should the inverse link function be applied?
If |
If the parameter is predicted and i is NULL or length(i) > 1, an S x N matrix. If the parameter is not predicted or length(i) == 1, a vector of length S. Here S is the number of draws and N is the number of observations or length of i if specified.
## Not run: 
posterior_predict_my_dist <- function(i, prep, ...) {
  mu <- brms::get_dpar(prep, "mu", i = i)
  mypar <- brms::get_dpar(prep, "mypar", i = i)
  my_rng(mu, mypar)
}

## End(Not run)
The get_refmodel.brmsfit method can be used to create the reference model structure which is needed by the projpred package for performing a projection predictive variable selection. This method is called automatically when performing variable selection via varsel or cv_varsel, so you will rarely need to call it yourself.
get_refmodel.brmsfit(
  object,
  newdata = NULL,
  resp = NULL,
  cvfun = NULL,
  dis = NULL,
  latent = FALSE,
  brms_seed = NULL,
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
cvfun |
Optional cross-validation function
(see |
dis |
Passed to argument |
latent |
See argument |
brms_seed |
A seed used to infer seeds for |
... |
Further arguments passed to
|
The extract_model_data function used internally by get_refmodel.brmsfit ignores arguments wrhs and orhs (a warning is thrown if these are non-NULL). For example, arguments weightsnew and offsetnew of proj_linpred, proj_predict, and predict.refmodel are passed to wrhs and orhs, respectively.
A refmodel object to be used in conjunction with the projpred package.
## Not run: 
# fit a simple model
fit <- brm(count ~ zAge + zBase * Trt,
           data = epilepsy, family = poisson())
summary(fit)

# The following code requires the 'projpred' package to be installed:
library(projpred)

# perform variable selection without cross-validation
vs <- varsel(fit)
summary(vs)
plot(vs)

# perform variable selection with cross-validation
cv_vs <- cv_varsel(fit)
summary(cv_vs)
plot(cv_vs)

## End(Not run)
Set up a Gaussian process (GP) term in brms. The function does not evaluate its arguments – it exists purely to help set up a model with GP terms.
gp(
  ...,
  by = NA,
  k = NA,
  cov = "exp_quad",
  iso = TRUE,
  gr = TRUE,
  cmc = TRUE,
  scale = TRUE,
  c = 5/4
)
... |
One or more predictors for the GP. |
by |
A numeric or factor variable of the same length as each predictor. In the numeric vector case, the elements multiply the values returned by the GP. In the factor variable case, a separate GP is fitted for each factor level. |
k |
Optional number of basis functions for computing Hilbert-space
approximate GPs. If |
cov |
Name of the covariance kernel. Currently supported are
|
iso |
A flag to indicate whether an isotropic ( |
gr |
Logical; Indicates if auto-grouping should be used (defaults
to |
cmc |
Logical; Only relevant if |
scale |
Logical; If |
c |
Numeric value only used in approximate GPs. Defines the
multiplicative constant of the predictors' range over which
predictions should be computed. A good default could be |
A GP is a stochastic process that describes the relation between one or more predictors $x = (x_1, \ldots, x_d)$ and a response $f(x)$, where $d$ is the number of predictors. A GP is the generalization of the multivariate normal distribution to an infinite number of dimensions. Thus, it can be interpreted as a prior over functions. The values of $f$ at any finite set of locations are jointly multivariate normal, with a covariance matrix defined by the covariance kernel $k_p(x_i, x_j)$, where $p$ is the vector of parameters of the GP:
$$(f(x_1), \ldots, f(x_n)) \sim \mathrm{MVN}\left(0, \, (k_p(x_i, x_j))_{i,j=1}^{n}\right)$$
The smoothness and general behavior of the function $f$ depends only on the choice of covariance kernel.
For a more detailed introduction to Gaussian processes,
see https://en.wikipedia.org/wiki/Gaussian_process.
For mathematical details on the supported kernels, please see the Stan manual: https://mc-stan.org/docs/functions-reference/matrix_operations.html under "Gaussian Process Covariance Functions".
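The "prior over functions" view can be made concrete with a few lines of base R that evaluate a kernel at a finite set of locations and draw from the resulting multivariate normal. This is a sketch only: it assumes the common exponentiated-quadratic form $k(x_i, x_j) = \mathrm{sdgp}^2 \exp(-(x_i - x_j)^2 / (2\,\mathrm{lscale}^2))$ with a single predictor, the parameter names sdgp and lscale are used here for illustration, and exp_quad_kernel is a hypothetical helper rather than a package function.

```r
# Exponentiated-quadratic kernel evaluated on all pairs of locations in x.
exp_quad_kernel <- function(x, sdgp = 1, lscale = 0.3) {
  d2 <- outer(x, x, function(a, b) (a - b)^2)
  sdgp^2 * exp(-d2 / (2 * lscale^2))
}

set.seed(1)
x <- seq(0, 1, length.out = 25)   # finite set of locations
K <- exp_quad_kernel(x, sdgp = 1, lscale = 0.3)

# Draw (f(x_1), ..., f(x_n)) ~ MVN(0, K) via the Cholesky factor.
# A small diagonal jitter keeps the factorization numerically stable.
U <- chol(K + diag(1e-6, length(x)))
z <- matrix(rnorm(length(x) * 3), ncol = 3)
f_draws <- t(U) %*% z             # each column is one draw from the prior

matplot(x, f_draws, type = "l", lty = 1, ylab = "f(x)")
```

A smaller lscale yields wigglier draws and a larger sdgp yields draws with larger amplitude, which illustrates how the kernel governs the behavior of $f$.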
An object of class 'gp_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
## Not run: 
# simulate data using the mgcv package
dat <- mgcv::gamSim(1, n = 30, scale = 2)

# fit a simple GP model
fit1 <- brm(y ~ gp(x2), dat, chains = 2)
summary(fit1)
me1 <- conditional_effects(fit1, ndraws = 200, spaghetti = TRUE)
plot(me1, ask = FALSE, points = TRUE)

# fit a more complicated GP model and use an approximate GP for x2
fit2 <- brm(y ~ gp(x0) + x1 + gp(x2, k = 10) + x3, dat, chains = 2)
summary(fit2)
me2 <- conditional_effects(fit2, ndraws = 200, spaghetti = TRUE)
plot(me2, ask = FALSE, points = TRUE)

# fit a multivariate GP model with Matern 3/2 kernel
fit3 <- brm(y ~ gp(x1, x2, cov = "matern32"), dat, chains = 2)
summary(fit3)
me3 <- conditional_effects(fit3, ndraws = 200, spaghetti = TRUE)
plot(me3, ask = FALSE, points = TRUE)

# compare model fit
loo(fit1, fit2, fit3)

# simulate data with a factor covariate
dat2 <- mgcv::gamSim(4, n = 90, scale = 2)

# fit separate gaussian processes for different levels of 'fac'
fit4 <- brm(y ~ gp(x2, by = fac), dat2, chains = 2)
summary(fit4)
plot(conditional_effects(fit4), points = TRUE)

## End(Not run)
Function used to set up a basic grouping term in brms. The function does not evaluate its arguments; it exists purely to help set up a model with grouping terms. gr is called implicitly inside the package and there is usually no need to call it directly.
gr(..., by = NULL, cor = TRUE, id = NA, cov = NULL, dist = "gaussian")
... |
One or more terms containing grouping factors. |
by |
An optional factor variable, specifying sub-populations of the
groups. For each level of the |
cor |
Logical. If |
id |
Optional character string. All group-level terms across the model
with the same |
cov |
An optional matrix which is proportional to the within-group
covariance matrix of the group-level effects. All levels of the grouping
factor should appear as rownames of the corresponding matrix. This argument
can be used, among others, to model pedigrees and phylogenetic effects. See
|
dist |
Name of the distribution of the group-level effects.
Currently |
## Not run: 
# model using basic lme4-style formula
fit1 <- brm(count ~ Trt + (1|patient), data = epilepsy)
summary(fit1)

# equivalent model using 'gr' which is called anyway internally
fit2 <- brm(count ~ Trt + (1|gr(patient)), data = epilepsy)
summary(fit2)

# include Trt as a by variable
fit3 <- brm(count ~ Trt + (1|gr(patient, by = Trt)), data = epilepsy)
summary(fit3)

## End(Not run)
Function used to set up regularized horseshoe priors and related hierarchical shrinkage priors in brms. The function does not evaluate its arguments – it exists purely to help set up the model.
horseshoe(
  df = 1,
  scale_global = 1,
  df_global = 1,
  scale_slab = 2,
  df_slab = 4,
  par_ratio = NULL,
  autoscale = TRUE,
  main = FALSE
)
df |
Degrees of freedom of the Student-t prior of the local shrinkage parameters. Defaults to 1. |
scale_global |
Scale of the Student-t prior of the global shrinkage parameter. Defaults to 1. |
df_global |
Degrees of freedom of the Student-t prior of the global shrinkage parameter. Defaults to 1. |
scale_slab |
Scale of the Student-t slab. Defaults to 2. |
df_slab |
Degrees of freedom of the Student-t slab. Defaults to 4. |
par_ratio |
Ratio of the expected number of non-zero coefficients
to the expected number of zero coefficients. If specified,
|
autoscale |
Logical; indicating whether the horseshoe
prior should be scaled using the residual standard deviation
|
main |
Logical (defaults to |
The horseshoe prior is a special shrinkage prior initially proposed by Carvalho et al. (2009). It is symmetric around zero with fat tails and an infinitely large spike at zero. This makes it ideal for sparse models that have many regression coefficients, although only a minority of them are non-zero.
The horseshoe prior can be applied on all population-level effects at once (excluding the intercept) by using set_prior("horseshoe(1)"). The 1 implies that the Student-t prior of the local shrinkage parameters has 1 degree of freedom. This may, however, lead to an increased number of divergent transitions in Stan. Accordingly, increasing the degrees of freedom to slightly higher values (e.g., 3) may often be a better option, although the prior no longer resembles a horseshoe in this case.
Further, the scale of the global shrinkage parameter plays an important role in the amount of shrinkage applied. It defaults to 1, but this may result in too little shrinkage (Piironen & Vehtari, 2016). It is thus possible to change the scale using argument scale_global of the horseshoe prior, for instance horseshoe(1, scale_global = 0.5). In linear models, scale_global will internally be multiplied by the residual standard deviation parameter sigma. See Piironen and Vehtari (2016) for recommendations on how to properly set the global scale. The degrees of freedom of the global shrinkage prior may also be adjusted via argument df_global. Piironen and Vehtari (2017) recommend specifying the ratio of the expected number of non-zero coefficients to the expected number of zero coefficients par_ratio rather than scale_global directly.
As proposed by Piironen and Vehtari (2017), an additional regularization is applied that only affects non-zero coefficients. The amount of regularization can be controlled via scale_slab and df_slab.
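The effect of the slab regularization can be illustrated by simulating coefficients from a regularized horseshoe prior in base R. This is a sketch of the formulation in Piironen and Vehtari (2017) as commonly stated, not the Stan code generated by brms; the global scale tau is held fixed for simplicity, and D and tau are hypothetical values.

```r
set.seed(123)
D <- 1000          # number of coefficients (hypothetical)
tau <- 0.1         # global shrinkage parameter, fixed here for illustration
scale_slab <- 2
df_slab <- 4

# Local shrinkage parameters: half-Student-t with 1 degree of freedom.
lambda <- abs(rt(D, df = 1))

# Slab variance c^2 ~ Inv-Gamma(df_slab / 2, df_slab * scale_slab^2 / 2),
# which corresponds to a Student-t slab with df_slab degrees of freedom.
c2 <- scale_slab^2 / rgamma(1, shape = df_slab / 2, rate = df_slab / 2)

# Regularized local scales: small lambda values are left almost unchanged,
# while tau * lambda_tilde can never exceed the slab scale sqrt(c2).
lambda_tilde <- sqrt(c2 * lambda^2 / (c2 + tau^2 * lambda^2))

beta <- rnorm(D, mean = 0, sd = tau * lambda_tilde)

# Most coefficients sit near zero; a few escape the shrinkage.
print(quantile(abs(beta), probs = c(0.5, 0.9, 0.99)))
```

Coefficients that the plain horseshoe would leave essentially unshrunk are instead given a prior standard deviation of at most sqrt(c2), which is the sense in which the regularization only affects non-zero coefficients.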
To make sure that shrinkage can equally affect all coefficients, predictors should be on the same scale. Generally, models with horseshoe priors are more likely than other models to have divergent transitions, so that increasing adapt_delta from 0.8 to values closer to 1 will often be necessary. See the documentation of brm for instructions on how to increase adapt_delta.
The prior does not account for scale differences of the terms it is applied on. Accordingly, please make sure that all these terms have a comparable scale to ensure that shrinkage is applied properly.
Currently, the following classes support the horseshoe prior: b (overall regression coefficients), sds (SDs of smoothing splines), sdgp (SDs of Gaussian processes), ar (autoregressive coefficients), ma (moving average coefficients), sderr (SD of latent residuals), sdcar (SD of spatial CAR structures), sd (SD of varying coefficients).
A character string obtained by match.call() with additional arguments.
Carvalho, C. M., Polson, N. G., & Scott, J. G. (2009). Handling sparsity via the horseshoe. Artificial Intelligence and Statistics. http://proceedings.mlr.press/v5/carvalho09a
Piironen, J., & Vehtari, A. (2017). On the Hyperprior Choice for the Global Shrinkage Parameter in the Horseshoe Prior. Artificial Intelligence and Statistics. https://arxiv.org/pdf/1610.05559v1
Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics. https://arxiv.org/abs/1707.01694
set_prior(horseshoe(df = 3, par_ratio = 0.1))

# specify the horseshoe prior across multiple parameter classes
set_prior(horseshoe(df = 3, par_ratio = 0.1, main = TRUE), class = "b") +
  set_prior(horseshoe(), class = "sd")
Density and distribution functions for hurdle distributions.
dhurdle_poisson(x, lambda, hu, log = FALSE)

phurdle_poisson(q, lambda, hu, lower.tail = TRUE, log.p = FALSE)

dhurdle_negbinomial(x, mu, shape, hu, log = FALSE)

phurdle_negbinomial(q, mu, shape, hu, lower.tail = TRUE, log.p = FALSE)

dhurdle_gamma(x, shape, scale, hu, log = FALSE)

phurdle_gamma(q, shape, scale, hu, lower.tail = TRUE, log.p = FALSE)

dhurdle_lognormal(x, mu, sigma, hu, log = FALSE)

phurdle_lognormal(q, mu, sigma, hu, lower.tail = TRUE, log.p = FALSE)
x |
Vector of quantiles. |
hu |
hurdle probability |
log |
Logical; If |
q |
Vector of quantiles. |
lower.tail |
Logical; If |
log.p |
Logical; If |
mu, lambda |
location parameter |
shape |
shape parameter |
sigma, scale |
scale parameter |
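The density of a hurdle distribution can be specified as follows. If $x = 0$ set $f(x) = \theta$. Else set $f(x) = (1 - \theta) \, g(x) / (1 - G(0))$, where $\theta$ is the hurdle probability hu, and $g(x)$ and $G(x)$ are the density and distribution function of the non-hurdle part, respectively.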
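The hurdle construction (probability hu at zero, and the non-hurdle density truncated at zero and rescaled by 1 - hu elsewhere) can be checked numerically for the Poisson case with base R. The function hurdle_poisson_pmf_sketch is a hypothetical stand-in written for this illustration, not the package's dhurdle_poisson.

```r
# Hurdle Poisson probability mass function:
#   f(0) = hu
#   f(x) = (1 - hu) * dpois(x, lambda) / (1 - ppois(0, lambda))  for x > 0
hurdle_poisson_pmf_sketch <- function(x, lambda, hu) {
  ifelse(
    x == 0,
    hu,
    (1 - hu) * dpois(x, lambda) / (1 - ppois(0, lambda))
  )
}

p <- hurdle_poisson_pmf_sketch(0:200, lambda = 3, hu = 0.25)
print(p[1:4])
print(sum(p))   # the probabilities sum to one up to floating point error
```

Dividing by 1 - G(0) removes the mass that the plain Poisson would place at zero, so the positive counts carry exactly 1 - hu in total.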
Perform non-linear hypothesis testing for all model parameters.
## S3 method for class 'brmsfit'
hypothesis(
  x,
  hypothesis,
  class = "b",
  group = "",
  scope = c("standard", "ranef", "coef"),
  alpha = 0.05,
  robust = FALSE,
  seed = NULL,
  ...
)

hypothesis(x, ...)

## Default S3 method:
hypothesis(x, hypothesis, alpha = 0.05, robust = FALSE, ...)
x |
An |
hypothesis |
A character vector specifying one or more non-linear hypothesis concerning parameters of the model. |
class |
A string specifying the class of parameters being tested.
Default is "b" for population-level effects.
Other typical options are "sd" or "cor".
If |
group |
Name of a grouping factor to evaluate only group-level effects parameters related to this grouping factor. |
scope |
Indicates where to look for the variables specified in
|
alpha |
The alpha-level of the tests (default is 0.05; see 'Details' for more information). |
robust |
If |
seed |
A single numeric value passed to |
... |
Currently ignored. |
Among others, hypothesis computes an evidence ratio (Evid.Ratio) for each hypothesis. For a one-sided hypothesis, this is just the posterior probability (Post.Prob) under the hypothesis divided by the posterior probability under its alternative. That is, when the hypothesis is of the form a > b, the evidence ratio is the ratio of the posterior probability of a > b and the posterior probability of a < b. In this example, values greater than one indicate that the evidence in favor of a > b is larger than evidence in favor of a < b. For a two-sided (point) hypothesis, the evidence ratio is a Bayes factor between the hypothesis and its alternative computed via the Savage-Dickey density ratio method, that is, the posterior density at the point of interest divided by the prior density at that point. Values greater than one indicate that evidence in favor of the point hypothesis has increased after seeing the data. In order to calculate this Bayes factor, all parameters related to the hypothesis must have proper priors and argument sample_prior of function brm must be set to "yes". Otherwise Evid.Ratio (and Post.Prob) will be NA.
Please note that, for technical reasons, we cannot sample from priors of certain parameter classes. Most notably, these include overall intercept parameters (prior class "Intercept") as well as group-level coefficients. When interpreting Bayes factors, make sure that your priors are reasonable and carefully chosen, as the result will depend heavily on the priors. In particular, avoid using default priors.
The Evid.Ratio may sometimes be 0 or Inf, implying very small or large evidence, respectively, in favor of the tested hypothesis. For pairs of one-sided hypotheses, this basically means that all posterior draws are on the same side of the value dividing the two hypotheses. In that sense, instead of 0 or Inf, you may rather read it as Evid.Ratio smaller than 1 / S or greater than S, respectively, where S denotes the number of posterior draws used in the computations.
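For a one-sided hypothesis, the computation described above reduces to counting posterior draws. The following base R sketch uses simulated draws for two hypothetical quantities a and b; evidence_ratio_sketch is an illustrative helper, not the package's internal code.

```r
set.seed(2024)
a <- rnorm(4000, mean = 0.3, sd = 0.2)   # hypothetical posterior draws
b <- rnorm(4000, mean = 0.0, sd = 0.2)

# One-sided hypothesis a > b: posterior probability and evidence ratio.
evidence_ratio_sketch <- function(a, b) {
  post_prob <- mean(a > b)
  c(Evid.Ratio = post_prob / (1 - post_prob), Post.Prob = post_prob)
}

res <- evidence_ratio_sketch(a, b)
print(res)

# If every draw falls on the same side, the ratio is Inf; with S draws this
# is better read as "greater than S" than as literally infinite evidence.
res_inf <- evidence_ratio_sketch(a = rep(1, 4000), b = rep(0, 4000))
print(res_inf)
```

The two quantities carry the same information, since Post.Prob = Evid.Ratio / (1 + Evid.Ratio) whenever the ratio is finite.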
The argument alpha specifies the size of the credible interval (i.e., Bayesian confidence interval). For instance, if we tested a two-sided hypothesis and set alpha = 0.05 (5%), the credible interval will contain 1 - alpha = 0.95 (95%) of the posterior values. Hence, alpha * 100% of the posterior values will lie outside of the credible interval. Although this allows testing of hypotheses in a similar manner as in the frequentist null-hypothesis testing framework, we strongly argue against using arbitrary cutoffs (e.g., p < .05) to determine the 'existence' of an effect.
A brmshypothesis object.
Paul-Christian Buerkner
## Not run: 
## define priors
prior <- c(set_prior("normal(0,2)", class = "b"),
           set_prior("student_t(10,0,1)", class = "sigma"),
           set_prior("student_t(10,0,1)", class = "sd"))

## fit a linear mixed effects model
fit <- brm(time ~ age + sex + disease + (1 + age|patient),
           data = kidney, family = lognormal(),
           prior = prior, sample_prior = "yes",
           control = list(adapt_delta = 0.95))

## perform two-sided hypothesis testing
(hyp1 <- hypothesis(fit, "sexfemale = age + diseasePKD"))
plot(hyp1)
hypothesis(fit, "exp(age) - 3 = 0", alpha = 0.01)

## perform one-sided hypothesis testing
hypothesis(fit, "diseasePKD + diseaseGN - 3 < 0")

hypothesis(fit, "age < Intercept",
           class = "sd", group = "patient")

## test the amount of random intercept variance on all variance
h <- paste("sd_patient__Intercept^2 / (sd_patient__Intercept^2 +",
           "sd_patient__age^2 + sigma^2) = 0")
(hyp2 <- hypothesis(fit, h, class = NULL))
plot(hyp2)

## test more than one hypothesis at once
h <- c("diseaseGN = diseaseAN", "2 * diseaseGN - diseasePKD = 0")
(hyp3 <- hypothesis(fit, h))
plot(hyp3, ignore_prior = TRUE)

## compute hypotheses for all levels of a grouping factor
hypothesis(fit, "age = 0", scope = "coef", group = "patient")

## use the default method
dat <- as.data.frame(fit)
str(dat)
hypothesis(dat, "b_age > 0")

## End(Not run)
Ezzet and Whitehead (1991) analyze data from a two-treatment, two-period crossover trial to compare 2 inhalation devices for delivering the drug salbutamol in 286 asthma patients. Patients were asked to rate the clarity of leaflet instructions accompanying each device, using a 4-point ordinal scale.
inhaler
A data frame of 572 observations containing information on the following 5 variables.
subject: The subject number
rating: The rating of the inhaler instructions on a scale ranging from 1 to 4
treat: A contrast to indicate which of the two inhaler devices was used
period: A contrast to indicate the time of administration
carry: A contrast to indicate possible carry over effects
Ezzet, F., & Whitehead, J. (1991). A random effects model for ordinal responses from a crossover trial. Statistics in Medicine, 10(6), 901-907.
## Not run: 
## ordinal regression with family "sratio"
fit1 <- brm(rating ~ treat + period + carry,
            data = inhaler, family = sratio(),
            prior = set_prior("normal(0,5)"))
summary(fit1)
plot(fit1)

## ordinal regression with family "cumulative"
## and random intercept over subjects
fit2 <- brm(rating ~ treat + period + carry + (1|subject),
            data = inhaler, family = cumulative(),
            prior = set_prior("normal(0,5)"))
summary(fit2)
plot(fit2)

## End(Not run)
Computes inv_logit(x) * (ub - lb) + lb
inv_logit_scaled(x, lb = 0, ub = 1)
x |
A numeric or complex vector. |
lb |
Lower bound defaulting to 0. |
ub |
Upper bound defaulting to 1. |
A numeric or complex vector between lb and ub.
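The stated formula, inv_logit(x) * (ub - lb) + lb, can be reproduced in base R for real-valued input. The function inv_logit_scaled_sketch below is a hypothetical stand-in that relies on plogis() and therefore, unlike the documented function, does not cover complex vectors.

```r
# Scaled inverse logit for real x: maps the real line onto (lb, ub).
inv_logit_scaled_sketch <- function(x, lb = 0, ub = 1) {
  plogis(x) * (ub - lb) + lb
}

print(inv_logit_scaled_sketch(0))                     # midpoint of (0, 1)
print(inv_logit_scaled_sketch(0, lb = 2, ub = 6))     # midpoint of (2, 6)
print(inv_logit_scaled_sketch(c(-3, 0, 3), lb = -1, ub = 1))
```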
Density, distribution function, and random generation for the inverse Gaussian distribution with location mu and shape shape.
dinv_gaussian(x, mu = 1, shape = 1, log = FALSE)

pinv_gaussian(q, mu = 1, shape = 1, lower.tail = TRUE, log.p = FALSE)

rinv_gaussian(n, mu = 1, shape = 1)
x, q |
Vector of quantiles. |
mu |
Vector of locations. |
shape |
Vector of shapes. |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families") for details on the parameterization.
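As a point of reference, the textbook mean-shape form of the inverse Gaussian density can be written in base R and checked numerically. This sketch assumes the package uses the standard parameterization, in which mu is the mean; the vignette is the authoritative source, and inv_gaussian_pdf_sketch is a hypothetical name.

```r
# Textbook inverse Gaussian density with mean 'mu' and shape 'shape':
#   sqrt(shape / (2 * pi * x^3)) * exp(-shape * (x - mu)^2 / (2 * mu^2 * x))
inv_gaussian_pdf_sketch <- function(x, mu = 1, shape = 1) {
  ifelse(
    x > 0,
    sqrt(shape / (2 * pi * x^3)) *
      exp(-shape * (x - mu)^2 / (2 * mu^2 * x)),
    0
  )
}

# Numerical checks: total probability and mean under mu = 2, shape = 3.
total <- integrate(inv_gaussian_pdf_sketch, lower = 0, upper = Inf,
                   mu = 2, shape = 3)$value
avg <- integrate(function(x) x * inv_gaussian_pdf_sketch(x, mu = 2, shape = 3),
                 lower = 0, upper = Inf)$value
print(c(total = total, mean = avg))
```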
Checks if argument is a brmsfit object.
is.brmsfit(x)
x |
An R object |
Checks if argument is a brmsfit_multiple object.
is.brmsfit_multiple(x)
x |
An R object |
Checks if argument is a brmsformula object.
is.brmsformula(x)
x |
An R object |
Checks if argument is a brmsprior object.
is.brmsprior(x)
x |
An R object |
Checks if argument is a brmsterms object.
is.brmsterms(x)
x |
An R object |
Check if argument is one of the correlation structures used in brms.
is.cor_brms(x)

is.cor_arma(x)

is.cor_cosy(x)

is.cor_sar(x)

is.cor_car(x)

is.cor_fixed(x)
x |
An R object. |
Checks if argument is a mvbrmsformula object.
is.mvbrmsformula(x)
x |
An R object |
Checks if argument is a mvbrmsterms object.
is.mvbrmsterms(x)
x |
An R object |
Compute and evaluate predictions after performing K-fold cross-validation via kfold.
kfold_predict(x, method = "posterior_predict", resp = NULL, ...)
x |
Object of class |
method |
Method used to obtain predictions. Can be set to
|
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
... |
Further arguments passed to |
A list with two slots named 'y' and 'yrep'. Slot y contains the vector of observed responses. Slot yrep contains the matrix of predicted responses, with rows being posterior draws and columns being observations.
## Not run: 
fit <- brm(count ~ zBase * Trt + (1|patient),
           data = epilepsy, family = poisson())

# perform k-fold cross validation
(kf <- kfold(fit, save_fits = TRUE, chains = 1))

# define a loss function
rmse <- function(y, yrep) {
  yrep_mean <- colMeans(yrep)
  sqrt(mean((yrep_mean - y)^2))
}

# predict responses and evaluate the loss
kfp <- kfold_predict(kf)
rmse(y = kfp$y, yrep = kfp$yrep)

## End(Not run)
Perform exact K-fold cross-validation by refitting the model K times, each time leaving out one-Kth of the original data. Folds can be run in parallel using the future package.
## S3 method for class 'brmsfit'
kfold(
  x,
  ...,
  K = 10,
  Ksub = NULL,
  folds = NULL,
  group = NULL,
  joint = FALSE,
  compare = TRUE,
  resp = NULL,
  model_names = NULL,
  save_fits = FALSE,
  recompile = NULL,
  future_args = list()
)
x |
A |
... |
Further arguments passed to |
K |
The number of subsets of equal (if possible) size
into which the data will be partitioned for performing
|
Ksub |
Optional number of subsets (of those subsets defined by |
folds |
Determines how the subsets are being constructed.
Possible values are |
group |
Optional name of a grouping variable or factor in the model.
What exactly is done with this variable depends on argument |
joint |
Indicates which observations' log likelihoods shall be
considered jointly in the ELPD computation. If |
compare |
A flag indicating if the information criteria
of the models should be compared to each other
via |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
model_names |
If |
save_fits |
If |
recompile |
Logical, indicating whether the Stan model should be
recompiled. This may be necessary if you are running |
future_args |
A list of further arguments passed to
|
The kfold function performs exact K-fold cross-validation. First the data are partitioned into K folds (i.e. subsets) of equal (or as close to equal as possible) size by default. Then the model is refit K times, each time leaving out one of the K subsets. If K is equal to the total number of observations in the data then K-fold cross-validation is equivalent to exact leave-one-out cross-validation (to which loo is an efficient approximation). The compare_ic function is also compatible with the objects returned by kfold.
The subsets can be constructed in several ways:
If both folds and group are NULL, the subsets are randomly chosen so that they have equal (or as close to equal as possible) size.
If folds is NULL but group is specified, the data are split up into subsets, each time omitting all observations of one of the factor levels, while ignoring argument K.
If folds = "stratified", the subsets are stratified by group using loo::kfold_split_stratified.
If folds = "grouped", the subsets are split by group using loo::kfold_split_grouped.
If folds = "loo", exact leave-one-out cross-validation is performed and K is ignored. Further, if group is specified, all observations corresponding to the factor level of the currently predicted single value are omitted. Thus, in this case, the predicted values are only a subset of the omitted ones.
If folds is a numeric vector, it must contain one element per observation in the data. Each element of the vector is an integer in 1:K indicating to which of the K folds the corresponding observation belongs. There are some convenience functions available in the loo package that create integer vectors to use for this purpose (see the Examples section below and also the kfold-helpers page).
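The numeric-vector form of folds described above can be sketched in base R. This is a minimal illustration only: N and K are made-up values, and the shuffling approach is an assumption rather than the method used by brms or the loo helpers.

```r
# Hypothetical sizes chosen purely for illustration.
set.seed(1)
N <- 23  # number of observations in the data
K <- 5   # number of folds

# Repeat 1:K until there are N labels, then shuffle so that
# fold membership is random.
folds <- sample(rep(seq_len(K), length.out = N))

table(folds)  # fold sizes differ by at most one
```

A vector built this way has one integer in 1:K per observation, which is the shape the folds argument expects when it is given as a numeric vector.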
When running kfold on a brmsfit created with the cmdstanr backend in a different R session, several recompilations will be triggered because by default, cmdstanr writes the model executable to a temporary directory. To avoid that, set option "cmdstanr_write_stan_file_dir" to a nontemporary path of your choice before creating the original brmsfit (see section 'Examples' below).
kfold returns an object with a structure similar to that of the objects returned by the loo and waic methods, and it can be used with the same post-processing functions.
## Not run: 
fit1 <- brm(count ~ zAge + zBase * Trt + (1|patient) + (1|obs),
            data = epilepsy, family = poisson())
# throws warning about some pareto k estimates being too high
(loo1 <- loo(fit1))
# perform 10-fold cross validation
(kfold1 <- kfold(fit1, chains = 1))

# use joint likelihoods per fold for ELPD evaluation
kfold(fit1, chains = 1, joint = "fold")

# use the future package for parallelization of models
# that is to fit models belonging to different folds in parallel
library(future)
plan(multisession, workers = 4)
kfold(fit1, chains = 1)
plan(sequential)

## to avoid recompilations when running kfold() on a 'cmdstanr'-backend fit
## in a fresh R session, set option 'cmdstanr_write_stan_file_dir' before
## creating the initial 'brmsfit'
## CAUTION: the following code creates some files in the current working
## directory: two 'model_<hash>.stan' files, one 'model_<hash>(.exe)'
## executable, and one 'fit_cmdstanr_<some_number>.rds' file
set.seed(7)
fname <- paste0("fit_cmdstanr_", sample.int(.Machine$integer.max, 1))
options(cmdstanr_write_stan_file_dir = getwd())
fit_cmdstanr <- brm(rate ~ conc + state,
                    data = Puromycin,
                    backend = "cmdstanr",
                    file = fname)

# now restart the R session and run the following (after attaching 'brms')
set.seed(7)
fname <- paste0("fit_cmdstanr_", sample.int(.Machine$integer.max, 1))
fit_cmdstanr <- brm(rate ~ conc + state,
                    data = Puromycin,
                    backend = "cmdstanr",
                    file = fname)
kfold_cmdstanr <- kfold(fit_cmdstanr, K = 2)

## End(Not run)
This dataset, originally discussed in McGilchrist and Aisbett (1991), describes the first and second (possibly right censored) recurrence time of infection in kidney patients using portable dialysis equipment. In addition, information on the risk variables age, sex and disease type is provided.
kidney
A data frame of 76 observations containing information on the following 7 variables.
The time to first or second recurrence of the infection, or the time of censoring
A factor of levels 1 or 2 indicating if the infection recurred for the first or second time for this patient
Either 0 or 1, where 0 indicates no censoring of recurrence time and 1 indicates right censoring
The patient number
The age of the patient
The sex of the patient
A factor of levels other, GN, AN, and PKD specifying the type of disease
McGilchrist, C. A., & Aisbett, C. W. (1991). Regression with frailty in survival analysis. Biometrics, 47(2), 461-466.
## Not run: 
## performing survival analysis using the "weibull" family
fit1 <- brm(time | cens(censored) ~ age + sex + disease,
            data = kidney, family = weibull, init = "0")
summary(fit1)
plot(fit1)

## adding random intercepts over patients
fit2 <- brm(time | cens(censored) ~ age + sex + disease + (1|patient),
            data = kidney, family = weibull(), init = "0",
            prior = set_prior("cauchy(0,2)", class = "sd"))
summary(fit2)
plot(fit2)

## End(Not run)
This functionality is no longer supported as of brms version 2.19.2. Please use the horseshoe or R2D2 shrinkage priors instead.
lasso(df = 1, scale = 1)
df |
Degrees of freedom of the chi-square prior of the inverse tuning
parameter. Defaults to |
scale |
Scale of the lasso prior. Defaults to |
An error indicating that the lasso prior is no longer supported.
Park, T., & Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association, 103(482), 681-686.
Provide an interface to shinystan for models fitted with brms
launch_shinystan.brmsfit(object, rstudio = getOption("shinystan.rstudio"), ...)
object |
A fitted model object typically of class |
rstudio |
Only relevant for RStudio users.
The default ( |
... |
Optional arguments to pass to |
An S4 shinystan object
## Not run: 
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler, family = "gaussian")
launch_shinystan(fit)

## End(Not run)
Compute the Pointwise Log-Likelihood
## S3 method for class 'brmsfit'
log_lik(
  object,
  newdata = NULL,
  re_formula = NULL,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  pointwise = FALSE,
  combine = TRUE,
  add_point_estimate = FALSE,
  cores = NULL,
  ...
)
object |
A fitted model object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
pointwise |
A flag indicating whether to compute the full
log-likelihood matrix at once (the default), or just return
the likelihood function along with all data and draws
required to compute the log-likelihood separately for each
observation. The latter option is rarely useful when
calling |
combine |
Only relevant in multivariate models. Indicates if the log-likelihoods of the submodels should be combined per observation (i.e. added together; the default) or if the log-likelihoods should be returned separately. |
add_point_estimate |
For internal use only. Ensures compatibility
with the |
cores |
Number of cores (defaults to |
... |
Further arguments passed to |
NA values within factors in newdata are interpreted as if all dummy variables of this factor are zero. This allows, for instance, predictions of the grand mean when using sum coding.
In multilevel models, it is possible to allow new levels of grouping factors to be used in the predictions. This can be controlled via argument allow_new_levels. New levels can be sampled in multiple ways, which can be controlled via argument sample_new_levels. Both of these arguments are documented in prepare_predictions along with several other useful arguments to control specific aspects of the predictions.
Usually, an S x N matrix containing the pointwise log-likelihood draws, where S is the number of draws and N is the number of observations in the data. For multivariate models and if combine is FALSE, an S x N x R array is returned, where R is the number of response variables. If pointwise = TRUE, the output is a function with a draws attribute containing all relevant data and posterior draws.
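To make the S x N layout concrete, the following base R sketch builds a pointwise log-likelihood matrix by hand for a Gaussian model. Everything here is assumed for illustration: the observations, the stand-in "posterior draws" of mu and sigma, and the sizes S and N are made up and do not come from a fitted brmsfit.

```r
set.seed(2)
y <- c(1.2, 0.4, -0.3, 2.1)   # N = 4 made-up observations
S <- 3                        # made-up number of posterior draws
mu <- rnorm(S)                # stand-in draws of the mean
sigma <- abs(rnorm(S)) + 0.5  # stand-in draws of the residual SD

# ll[s, n] is the log density of observation n under draw s.
# vapply() returns an N x S matrix, so it is transposed to S x N.
ll <- t(vapply(
  seq_len(S),
  function(s) dnorm(y, mean = mu[s], sd = sigma[s], log = TRUE),
  numeric(length(y))
))

dim(ll)  # rows are draws, columns are observations
```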
Density function and random generation for the (multivariate) logistic normal distribution with latent mean vector mu and covariance matrix Sigma.
dlogistic_normal(x, mu, Sigma, refcat = 1, log = FALSE, check = FALSE)

rlogistic_normal(n, mu, Sigma, refcat = 1, check = FALSE)
x |
Vector or matrix of quantiles. If |
mu |
Mean vector with length equal to the number of dimensions. |
Sigma |
Covariance matrix. |
refcat |
A single integer indicating the reference category.
Defaults to |
log |
Logical; If |
check |
Logical; Indicates whether several input checks
should be performed. Defaults to |
n |
Number of draws to sample from the distribution. |
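As a rough illustration of how draws from a logistic normal distribution can be generated, the sketch below samples latent multivariate normal vectors, inserts a zero for the reference category, and applies a softmax. This is not the brms implementation; the function name rlogistic_normal_sketch is hypothetical, and it assumes that mu has one element fewer than the number of categories.

```r
# Sketch only: a base R stand-in, not brms::rlogistic_normal().
rlogistic_normal_sketch <- function(n, mu, Sigma, refcat = 1) {
  d <- length(mu)
  # Draw n latent multivariate normal vectors via the Cholesky factor.
  z <- matrix(rnorm(n * d), nrow = n) %*% chol(Sigma)
  z <- sweep(z, 2, mu, "+")
  # Insert a zero column for the reference category.
  eta <- matrix(0, nrow = n, ncol = d + 1)
  eta[, -refcat] <- z
  # Row-wise softmax (shifted by the row maximum for stability).
  w <- exp(eta - apply(eta, 1, max))
  w / rowSums(w)
}

set.seed(3)
x <- rlogistic_normal_sketch(5, mu = c(0, 1), Sigma = diag(2))
rowSums(x)  # each draw lies on the simplex
```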
Computes logit((x - lb) / (ub - lb))
logit_scaled(x, lb = 0, ub = 1)
x |
A numeric or complex vector. |
lb |
Lower bound defaulting to |
ub |
Upper bound defaulting to |
A numeric or complex vector.
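The formula logit((x - lb) / (ub - lb)) can be written out in a few lines of base R. The function name logit_scaled_sketch is hypothetical and only mirrors the documented formula for numeric input; it is not the brms source.

```r
logit_scaled_sketch <- function(x, lb = 0, ub = 1) {
  p <- (x - lb) / (ub - lb)  # rescale x to the unit interval
  log(p / (1 - p))           # logit of the rescaled value
}

# The midpoint of the interval [0, 10] maps to 0 on the logit scale.
logit_scaled_sketch(5, lb = 0, ub = 10)
```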
Computes log(x - 1).
logm1(x, base = exp(1))
x |
A numeric or complex vector. |
base |
A positive or complex number: the base with respect to which
logarithms are computed. Defaults to e = |
For more details see loo_compare.
## S3 method for class 'brmsfit'
loo_compare(x, ..., criterion = c("loo", "waic", "kfold"), model_names = NULL)
x |
A |
... |
More |
criterion |
The name of the criterion to be extracted
from |
model_names |
If |
All brmsfit objects should contain precomputed criterion objects. See add_criterion for more help.
An object of class "compare.loo".
## Not run: 
# model with population-level effects only
fit1 <- brm(rating ~ treat + period + carry,
            data = inhaler)
fit1 <- add_criterion(fit1, "waic")

# model with an additional varying intercept for subjects
fit2 <- brm(rating ~ treat + period + carry + (1|subject),
            data = inhaler)
fit2 <- add_criterion(fit2, "waic")

# compare both models
loo_compare(fit1, fit2, criterion = "waic")

## End(Not run)
Compute model weights for brmsfit objects via stacking or pseudo-BMA weighting. For more details, see loo::loo_model_weights.
## S3 method for class 'brmsfit'
loo_model_weights(x, ..., model_names = NULL)
x |
A |
... |
More |
model_names |
If |
A named vector of model weights.
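To give a feel for what a vector of model weights looks like, the sketch below computes pseudo-BMA weights in their plain form, where each weight is proportional to exp(ELPD). The ELPD values are made up, stacking is not shown because it requires an optimization step, and any Bayesian bootstrap adjustment that the loo package may apply is left out.

```r
# Made-up ELPD estimates for two hypothetical models.
elpd <- c(fit1 = -250.3, fit2 = -245.1)

# Subtract the maximum before exponentiating for numerical stability,
# then normalize so that the weights sum to one.
w <- exp(elpd - max(elpd))
w <- w / sum(w)
w
```

The model with the higher ELPD receives the larger weight, and the names of the input carry over to the result.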
## Not run: 
# model with population-level effects only
fit1 <- brm(rating ~ treat + period + carry,
            data = inhaler, family = "gaussian")
# model with an additional varying intercept for subjects
fit2 <- brm(rating ~ treat + period + carry + (1|subject),
            data = inhaler, family = "gaussian")
loo_model_weights(fit1, fit2)

## End(Not run)
Moment matching for efficient approximate leave-one-out cross-validation (LOO-CV). See loo_moment_match for more details.
## S3 method for class 'brmsfit'
loo_moment_match(
  x,
  loo = NULL,
  k_threshold = 0.7,
  newdata = NULL,
  resp = NULL,
  check = TRUE,
  recompile = FALSE,
  ...
)

## S3 method for class 'loo'
loo_moment_match(x, fit, ...)
x |
An R object of class |
loo |
An R object of class |
k_threshold |
The Pareto |
newdata |
An optional data.frame for which to evaluate predictions. If
|
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
check |
Logical; If |
recompile |
Logical, indicating whether the Stan model should be recompiled. This may be necessary if you are running moment matching on another machine than the one used to fit the model. No recompilation is done by default. |
... |
Further arguments passed to the underlying methods.
Additional arguments initially passed to |
fit |
An R object of class |
The moment matching algorithm requires draws of all variables defined in Stan's parameters block to be saved. Otherwise loo_moment_match cannot be computed. Thus, please set save_pars = save_pars(all = TRUE) in the call to brm if you are planning to apply loo_moment_match to your models.
An updated object of class loo.
Paananen, T., Piironen, J., Buerkner, P.-C., Vehtari, A. (2021). Implicitly Adaptive Importance Sampling. Statistics and Computing.
## Not run: 
fit1 <- brm(count ~ zAge + zBase * Trt + (1|patient),
            data = epilepsy, family = poisson(),
            save_pars = save_pars(all = TRUE))

# throws warning about some pareto k estimates being too high
(loo1 <- loo(fit1))

# no more warnings after moment matching
(mmloo1 <- loo_moment_match(fit1, loo = loo1))

## End(Not run)
These functions are wrappers around the E_loo function of the loo package.
## S3 method for class 'brmsfit'
loo_predict(
  object,
  type = c("mean", "var", "quantile"),
  probs = 0.5,
  psis_object = NULL,
  resp = NULL,
  ...
)

## S3 method for class 'brmsfit'
loo_epred(
  object,
  type = c("mean", "var", "quantile"),
  probs = 0.5,
  psis_object = NULL,
  resp = NULL,
  ...
)

loo_epred(object, ...)

## S3 method for class 'brmsfit'
loo_linpred(
  object,
  type = c("mean", "var", "quantile"),
  probs = 0.5,
  psis_object = NULL,
  resp = NULL,
  ...
)

## S3 method for class 'brmsfit'
loo_predictive_interval(object, prob = 0.9, psis_object = NULL, ...)
object |
An object of class |
type |
The statistic to be computed on the results.
Can be either |
probs |
A vector of quantiles to compute.
Only used if |
psis_object |
An optional object returned by |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
... |
Optional arguments passed to the underlying methods that is
|
prob |
For |
loo_predict, loo_epred, loo_linpred, and loo_predictive_interval all return a matrix with one row per observation and one column per summary statistic as specified by arguments type and probs. In multivariate or categorical models a third dimension is added to represent the response variables or categories, respectively.
loo_predictive_interval(..., prob = p) is equivalent to loo_predict(..., type = "quantile", probs = c(a, 1-a)) with a = (1 - p)/2.
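The mapping from prob to the two quantile levels can be checked with simple arithmetic. The value p = 0.8 below is just an example.

```r
p <- 0.8              # desired central interval probability
a <- (1 - p) / 2      # probability mass left in each tail
probs <- c(a, 1 - a)  # lower and upper quantile levels
probs
```

For p = 0.8 the tail mass is 0.1 on each side, so the interval runs between the 10% and 90% quantiles.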
## Not run: 
## data from help("lm")
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
d <- data.frame(
  weight = c(ctl, trt),
  group = gl(2, 10, 20, labels = c("Ctl", "Trt"))
)
fit <- brm(weight ~ group, data = d)
loo_predictive_interval(fit, prob = 0.8)

## optionally log-weights can be pre-computed and reused
psis <- loo::psis(-log_lik(fit), cores = 2)
loo_predictive_interval(fit, prob = 0.8, psis_object = psis)
loo_predict(fit, type = "var", psis_object = psis)
loo_epred(fit, type = "var", psis_object = psis)

## End(Not run)
Compute a LOO-adjusted R-squared for regression models
## S3 method for class 'brmsfit'
loo_R2(
  object,
  resp = NULL,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  seed = NULL,
  ...
)
object |
An object of class |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
seed |
Optional integer used to initialize the random number generator. |
... |
Further arguments passed to
|
If summary = TRUE, an M x C matrix is returned (M = number of response variables and C = length(probs) + 2) containing Bayesian bootstrap based summary statistics of the LOO-adjusted R-squared values. If summary = FALSE, the Bayesian bootstrap draws of the LOO-adjusted R-squared values are returned in an S x M matrix (S is the number of draws).
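LOO-R2 uses LOO residuals and is defined as $1 - \mathrm{SSE}_{\mathrm{loo}} / \mathrm{SST}$, with
$$\mathrm{SSE}_{\mathrm{loo}} = \sum_{n=1}^{N} (y_n - \hat{y}_{\mathrm{loo},n})^2, \qquad \mathrm{SST} = \sum_{n=1}^{N} (y_n - \bar{y})^2,$$
where $\hat{y}_{\mathrm{loo},n}$ is the leave-one-out prediction for observation $n$ and $\bar{y}$ is the mean of the observed responses.
Bayesian bootstrap is used to draw from the approximated uncertainty distribution as described by Vehtari and Lampinen (2002).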
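The idea of an R-squared built from leave-one-out residuals can be illustrated with a classical analogue. The sketch below assumes the usual form 1 - SSE_loo / SST and uses exact leave-one-out refits of lm() on mtcars; it involves neither PSIS-LOO nor the Bayesian bootstrap, and it is not how brms computes loo_R2.

```r
# Classical analogue only: exact LOO refits of lm(), not brms.
y <- mtcars$mpg
yhat_loo <- vapply(seq_len(nrow(mtcars)), function(i) {
  fit_i <- lm(mpg ~ wt + cyl, data = mtcars[-i, ])   # fit without row i
  unname(predict(fit_i, newdata = mtcars[i, ]))      # predict row i
}, numeric(1))

sse_loo <- sum((y - yhat_loo)^2)  # sum of squared LOO residuals
sst <- sum((y - mean(y))^2)       # total sum of squares
r2_loo <- 1 - sse_loo / sst

r2_in_sample <- summary(lm(mpg ~ wt + cyl, data = mtcars))$r.squared
c(loo = r2_loo, in_sample = r2_in_sample)
```

Because each prediction is made without the observation it targets, the LOO version comes out lower than the in-sample R-squared.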
Vehtari and Lampinen (2002). Bayesian model assessment and comparison using cross-validation predictive densities. Neural Computation, 14(10):2439-2468.
## Not run: 
fit <- brm(mpg ~ wt + cyl, data = mtcars)
summary(fit)
loo_R2(fit)

# compute R2 with new data
nd <- data.frame(mpg = c(10, 20, 30), wt = c(4, 3, 2), cyl = c(8, 6, 4))
loo_R2(fit, newdata = nd)

## End(Not run)
Efficient approximate leave-one-out cross-validation (LOO) using subsampling
## S3 method for class 'brmsfit'
loo_subsample(x, ..., compare = TRUE, resp = NULL, model_names = NULL)
x |
A |
... |
More |
compare |
A flag indicating if the information criteria
of the models should be compared to each other
via |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
model_names |
If |
More details can be found on loo_subsample.
## Not run: 
# model with population-level effects only
fit1 <- brm(rating ~ treat + period + carry,
            data = inhaler)
(loo1 <- loo_subsample(fit1))

# model with an additional varying intercept for subjects
fit2 <- brm(rating ~ treat + period + carry + (1|subject),
            data = inhaler)
(loo2 <- loo_subsample(fit2))

# compare both models
loo_compare(loo1, loo2)

## End(Not run)
Perform approximate leave-one-out cross-validation based on the posterior likelihood using the loo package. For more details see loo.
## S3 method for class 'brmsfit'
loo(
  x,
  ...,
  compare = TRUE,
  resp = NULL,
  pointwise = FALSE,
  moment_match = FALSE,
  reloo = FALSE,
  k_threshold = 0.7,
  save_psis = FALSE,
  moment_match_args = list(),
  reloo_args = list(),
  model_names = NULL
)
x |
A |
... |
More |
compare |
A flag indicating if the information criteria
of the models should be compared to each other
via |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
pointwise |
A flag indicating whether to compute the full
log-likelihood matrix at once or separately for each observation.
The latter approach is usually considerably slower but
requires much less working memory. Accordingly, if one runs
into memory issues, |
moment_match |
Logical; Indicate whether |
reloo |
Logical; Indicate whether |
k_threshold |
The Pareto |
save_psis |
Should the |
moment_match_args |
Optional named |
reloo_args |
Optional named |
model_names |
If |
See loo_compare for details on model comparisons. For brmsfit objects, LOO is an alias of loo. Use method add_criterion to store information criteria in the fitted model object for later usage.
If just one object is provided, an object of class loo. If multiple objects are provided, an object of class loolist.
Vehtari, A., Gelman, A., & Gabry J. (2016). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. In Statistics and Computing, doi:10.1007/s11222-016-9696-4. arXiv preprint arXiv:1507.04544.
Gelman, A., Hwang, J., & Vehtari, A. (2014). Understanding predictive information criteria for Bayesian models. Statistics and Computing, 24, 997-1016.
Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. The Journal of Machine Learning Research, 11, 3571-3594.
## Not run: 
# model with population-level effects only
fit1 <- brm(rating ~ treat + period + carry,
            data = inhaler)
(loo1 <- loo(fit1))

# model with an additional varying intercept for subjects
fit2 <- brm(rating ~ treat + period + carry + (1|subject),
            data = inhaler)
(loo2 <- loo(fit2))

# compare both models
loo_compare(loo1, loo2)

## End(Not run)
This dataset, discussed in Gesmann & Morris (2020), contains cumulative insurance loss payments over the course of ten years.
loss
A data frame of 55 observations containing information on the following 4 variables.
Origin year of the insurance (1991 to 2000)
Deviation from the origin year in months
Cumulative loss payments
Achieved premiums for the given origin year
Gesmann M. & Morris J. (2020). Hierarchical Compartmental Reserving Models. CAS Research Papers.
## Not run: 
# non-linear model to predict cumulative loss payments
fit_loss <- brm(
  bf(cum ~ ult * (1 - exp(-(dev/theta)^omega)),
     ult ~ 1 + (1|AY), omega ~ 1, theta ~ 1,
     nl = TRUE),
  data = loss, family = gaussian(),
  prior = c(
    prior(normal(5000, 1000), nlpar = "ult"),
    prior(normal(1, 2), nlpar = "omega"),
    prior(normal(45, 10), nlpar = "theta")
  ),
  control = list(adapt_delta = 0.9)
)

# basic summaries
summary(fit_loss)
conditional_effects(fit_loss)

# plot predictions per origin year
conditions <- data.frame(AY = unique(loss$AY))
rownames(conditions) <- unique(loss$AY)
me_loss <- conditional_effects(
  fit_loss, conditions = conditions,
  re_formula = NULL, method = "predict"
)
plot(me_loss, ncol = 5, points = TRUE)

## End(Not run)
Set up a moving average (MA) term of order q in brms. The function does not evaluate its arguments – it exists purely to help set up a model with MA terms.
ma(time = NA, gr = NA, q = 1, cov = FALSE)
time |
An optional time variable specifying the time ordering of the observations. By default, the existing order of the observations in the data is used. |
gr |
An optional grouping variable. If specified, the correlation structure is assumed to apply only to observations within the same grouping level. |
q |
A non-negative integer specifying the moving average (MA)
order of the ARMA structure. Default is |
cov |
A flag indicating whether ARMA effects should be estimated by
means of residual covariance matrices. This is currently only possible for
stationary ARMA effects of order 1. If the model family does not have
natural residuals, latent residuals are added automatically. If
|
An object of class 'arma_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
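For readers unfamiliar with the order q, the sketch below simulates a textbook moving average process of order 2 in base R. The coefficients in theta are made up, and the code only illustrates what an MA(q) structure means; it does not reproduce how brms parameterizes or estimates MA terms.

```r
set.seed(5)
n <- 200
theta <- c(0.6, 0.3)  # hypothetical MA coefficients, so q = 2
q <- length(theta)
e <- rnorm(n + q)     # white-noise innovations, with q start-up values

# y[t] = e[t] + theta[1] * e[t - 1] + theta[2] * e[t - 2]
y <- vapply(seq_len(n), function(t) {
  i <- t + q          # position of the current innovation in e
  e[i] + sum(theta * e[i - seq_len(q)])
}, numeric(1))

head(y)
```

Each value depends on the current innovation and the q previous ones, which is why the autocorrelation of such a series vanishes beyond lag q.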
## Not run: 
data("LakeHuron")
LakeHuron <- as.data.frame(LakeHuron)
# the order of the MA term is set via argument 'q'
fit <- brm(x ~ ma(q = 2), data = LakeHuron)
summary(fit)

## End(Not run)
This is a helper function to prepare fully crossed conditions primarily for use with the conditions argument of conditional_effects. Automatically creates labels for each row in the cond__ column.
make_conditions(x, vars, ...)
x |
An R object from which to extract the variables that should be part of the conditions. |
vars |
Names of the variables that should be part of the conditions. |
... |
Arguments passed to |
For factor-like variables, all levels are used as conditions. For numeric variables, mean + (-1:1) * SD are used as conditions.
A data.frame where each row indicates a condition.
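The rule for building fully crossed conditions can be mimicked in base R. This sketch assumes a small made-up data frame and uses expand.grid; it does not create the cond__ label column and is not the brms implementation.

```r
set.seed(4)
df <- data.frame(x = c("a", "b"), y = rnorm(10))

# Numeric variable: mean - SD, mean, and mean + SD.
y_vals <- mean(df$y) + (-1:1) * sd(df$y)

# Factor-like variable: all observed levels, crossed with the y values.
conds <- expand.grid(x = unique(df$x), y = y_vals, stringsAsFactors = FALSE)
conds
```

Two levels of x crossed with three values of y give six rows, one per condition.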
conditional_effects, rows2labels
df <- data.frame(x = c("a", "b"), y = rnorm(10))
make_conditions(df, vars = c("x", "y"))
Convenient way to call MCMC plotting functions implemented in the bayesplot package.
## S3 method for class 'brmsfit'
mcmc_plot(
  object,
  pars = NA,
  type = "intervals",
  variable = NULL,
  regex = FALSE,
  fixed = FALSE,
  ...
)

mcmc_plot(object, ...)
object |
An R object typically of class |
pars |
Deprecated alias of |
type |
The type of the plot.
Supported types are (as names) |
variable |
Names of the variables (parameters) to plot, as given by a
character vector or a regular expression (if |
regex |
Logical; Indicates whether |
fixed |
(Deprecated) Indicates whether parameter names
should be matched exactly ( |
... |
Additional arguments passed to the plotting functions.
See |
Also consider using the shinystan package available via method launch_shinystan in brms for flexible and interactive visual analysis.
A ggplot object that can be further customized using the ggplot2 package.
## Not run: 
model <- brm(count ~ zAge + zBase * Trt + (1|patient),
             data = epilepsy, family = "poisson")

# plot posterior intervals
mcmc_plot(model)

# only show population-level effects in the plots
mcmc_plot(model, variable = "^b_", regex = TRUE)

# show histograms of the posterior distributions
mcmc_plot(model, type = "hist")

# plot some diagnostics of the sampler
mcmc_plot(model, type = "neff")
mcmc_plot(model, type = "rhat")

# plot some diagnostics specific to the NUTS sampler
mcmc_plot(model, type = "nuts_acceptance")
mcmc_plot(model, type = "nuts_divergence")

## End(Not run)
(Soft deprecated) Specify predictors with measurement error. The function does not evaluate its arguments – it exists purely to help set up a model.
me(x, sdx, gr = NULL)
x |
The variable measured with error. |
sdx |
Known measurement error of |
gr |
Optional grouping factor to specify which
values of |
For detailed documentation see help(brmsformula)
.
me
terms are soft deprecated in favor of the more
general and consistent mi
terms.
By default, latent noise-free variables are assumed
to be correlated. To change that, add set_mecor(FALSE)
to your model formula object (see examples).
brmsformula
, brmsformula-helpers
## Not run:
# sample some data
N <- 100
dat <- data.frame(
  y = rnorm(N), x1 = rnorm(N),
  x2 = rnorm(N), sdx = abs(rnorm(N, 1))
)

# fit a simple error-in-variables model
fit1 <- brm(y ~ me(x1, sdx) + me(x2, sdx), data = dat,
            save_pars = save_pars(latent = TRUE))
summary(fit1)

# turn off modeling of correlations
bform <- bf(y ~ me(x1, sdx) + me(x2, sdx)) + set_mecor(FALSE)
fit2 <- brm(bform, data = dat, save_pars = save_pars(latent = TRUE))
summary(fit2)
## End(Not run)
Specify predictor term with missing values in brms. The function does
not evaluate its arguments – it exists purely to help set up a model.
For documentation on how to specify missing values in response variables,
see resp_mi
.
mi(x, idx = NA)
x |
The variable containing missing values. |
idx |
An optional variable containing indices of observations in 'x'
that are to be used in the model. This is mostly relevant in partially
subsetted models (via |
For detailed documentation see help(brmsformula)
.
## Not run:
data("nhanes", package = "mice")
N <- nrow(nhanes)

# simple model with missing data
bform1 <- bf(bmi | mi() ~ age * mi(chl)) +
  bf(chl | mi() ~ age) +
  set_rescor(FALSE)

fit1 <- brm(bform1, data = nhanes)

summary(fit1)
plot(conditional_effects(fit1, resp = "bmi"), ask = FALSE)
loo(fit1, newdata = na.omit(fit1$data))

# simulate some measurement noise
nhanes$se <- rexp(N, 2)

# measurement noise can be handled within 'mi' terms
# with or without the presence of missing values
bform2 <- bf(bmi | mi() ~ age * mi(chl)) +
  bf(chl | mi(se) ~ age) +
  set_rescor(FALSE)

fit2 <- brm(bform2, data = nhanes)

summary(fit2)
plot(conditional_effects(fit2, resp = "bmi"), ask = FALSE)

# 'mi' terms can also be used when some responses are subsetted
nhanes$sub <- TRUE
nhanes$sub[1:2] <- FALSE
nhanes$id <- 1:N
nhanes$idx <- sample(3:N, N, TRUE)

# this requires the addition term 'index' being specified
# in the subsetted part of the model
bform3 <- bf(bmi | mi() ~ age * mi(chl, idx)) +
  bf(chl | mi(se) + subset(sub) + index(id) ~ age) +
  set_rescor(FALSE)

fit3 <- brm(bform3, data = nhanes)

summary(fit3)
plot(conditional_effects(fit3, resp = "bmi"), ask = FALSE)
## End(Not run)
Set up a finite mixture family for use in brms.
mixture(..., flist = NULL, nmix = 1, order = NULL)
... |
One or more objects providing a description of the
response distributions to be combined in the mixture model.
These can be family functions, calls to family functions or
character strings naming the families. For details of supported
families see |
flist |
Optional list of objects, which are treated in the
same way as objects passed via the |
nmix |
Optional numeric vector specifying the number of times
each family is repeated. If specified, it must have the same length
as the number of families passed via |
order |
Ordering constraint to identify mixture components.
If |
Most families supported by brms can be used to form mixtures. The response variable has to be valid for all components of the mixture family. Currently, the number of mixture components has to be specified by the user. It is not yet possible to estimate the number of mixture components from the data.
Ordering intercepts in mixtures of ordinal families is not possible, as each family has its own vector of intercepts (i.e. ordinal thresholds). Instead, brms will fix the vector of intercepts across components in ordinal mixtures, if desired, so that users can try to identify the mixture model via selective inclusion of predictors.
For most mixture models, you may want to specify priors on the
population-level intercepts via set_prior
to improve
convergence. In addition, it is sometimes necessary to set init = 0
in the call to brm
to allow chains to initialize properly.
For more details on the specification of mixture
models, see brmsformula
.
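To make the structure of a finite mixture concrete, the following base R sketch evaluates the density of a two-component Gaussian mixture. The helper dmix_gaussian and its arguments are hypothetical and not part of brms; they only illustrate how mixing proportions weight the component densities, under the assumption that the proportions sum to one.

```r
# Hypothetical helper (not part of brms): density of a two-component
# Gaussian mixture with mixing proportions theta that sum to one.
dmix_gaussian <- function(y, theta, mu, sigma) {
  stopifnot(length(theta) == 2, abs(sum(theta) - 1) < 1e-8)
  theta[1] * dnorm(y, mu[1], sigma[1]) +
    theta[2] * dnorm(y, mu[2], sigma[2])
}

# Illustrative setting: two thirds of the mass around 0, one third around 6.
grid <- seq(-10, 16, by = 0.01)
dens <- dmix_gaussian(grid, theta = c(2 / 3, 1 / 3),
                      mu = c(0, 6), sigma = c(1, 1))

# A Riemann sum over the grid approximates the total probability mass.
mass <- sum(dens) * 0.01
```

Because each component is a proper density and the proportions sum to one, the mixture is again a proper density, which is why the approximate mass should be very close to one.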
An object of class mixfamily
.
## Not run:
## simulate some data
set.seed(1234)
dat <- data.frame(
  y = c(rnorm(200), rnorm(100, 6)),
  x = rnorm(300),
  z = sample(0:1, 300, TRUE)
)

## fit a simple normal mixture model
mix <- mixture(gaussian, gaussian)
prior <- c(
  prior(normal(0, 7), Intercept, dpar = mu1),
  prior(normal(5, 7), Intercept, dpar = mu2)
)
fit1 <- brm(bf(y ~ x + z), dat, family = mix,
            prior = prior, chains = 2)
summary(fit1)
pp_check(fit1)

## use different predictors for the components
fit2 <- brm(bf(y ~ 1, mu1 ~ x, mu2 ~ z), dat, family = mix,
            prior = prior, chains = 2)
summary(fit2)

## fix the mixing proportions
fit3 <- brm(bf(y ~ x + z, theta1 = 1, theta2 = 2),
            dat, family = mix, prior = prior,
            init = 0, chains = 2)
summary(fit3)
pp_check(fit3)

## predict the mixing proportions
fit4 <- brm(bf(y ~ x + z, theta2 ~ x),
            dat, family = mix, prior = prior,
            init = 0, chains = 2)
summary(fit4)
pp_check(fit4)

## compare model fit
loo(fit1, fit2, fit3, fit4)
## End(Not run)
Function to set up a multi-membership grouping term in brms. The function does not evaluate its arguments – it exists purely to help set up a model with grouping terms.
mm(
  ...,
  weights = NULL,
  scale = TRUE,
  by = NULL,
  cor = TRUE,
  id = NA,
  cov = NULL,
  dist = "gaussian"
)
... |
One or more terms containing grouping factors. |
weights |
A matrix specifying the weights of each member.
It should have as many columns as grouping terms specified in |
scale |
Logical; if |
by |
An optional factor matrix, specifying sub-populations of the
groups. It should have as many columns as grouping terms specified in
|
cor |
Logical. If |
id |
Optional character string. All group-level terms across the model
with the same |
cov |
An optional matrix which is proportional to the within-group
covariance matrix of the group-level effects. All levels of the grouping
factor should appear as rownames of the corresponding matrix. This argument
can be used, among others, to model pedigrees and phylogenetic effects. See
|
dist |
Name of the distribution of the group-level effects.
Currently |
## Not run:
# simulate some data
dat <- data.frame(
  y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100),
  g1 = sample(1:10, 100, TRUE), g2 = sample(1:10, 100, TRUE)
)

# multi-membership model with two members per group and equal weights
fit1 <- brm(y ~ x1 + (1|mm(g1, g2)), data = dat)
summary(fit1)

# weight the first member twice as much as the second member
dat$w1 <- rep(2, 100)
dat$w2 <- rep(1, 100)
fit2 <- brm(y ~ x1 + (1|mm(g1, g2, weights = cbind(w1, w2))), data = dat)
summary(fit2)

# multi-membership model with level specific covariate values
dat$xc <- (dat$x1 + dat$x2) / 2
fit3 <- brm(y ~ xc + (1 + mmc(x1, x2) | mm(g1, g2)), data = dat)
summary(fit3)
## End(Not run)
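As a rough illustration of what multi-membership weighting amounts to, the base R sketch below combines the group-level effects of two members into one contribution per observation. The names u, w_scaled, and mm_contribution are hypothetical, and rescaling the weights to sum to one per row is an assumption made here about the effect of scale = TRUE, not a statement about brms internals.

```r
set.seed(1)
n <- 5
g1 <- sample(1:10, n, TRUE)   # first member of each observation
g2 <- sample(1:10, n, TRUE)   # second member of each observation
u <- rnorm(10)                # hypothetical group-level effects, one per level

# raw weights: the first member counts twice as much as the second
w <- cbind(w1 = rep(2, n), w2 = rep(1, n))

# assumed scaling step: standardize weights to sum to one per row
w_scaled <- w / rowSums(w)

# weighted sum of the members' effects for each observation
mm_contribution <- w_scaled[, 1] * u[g1] + w_scaled[, 2] * u[g2]
```

With these weights, each observation receives two thirds of the first member's effect and one third of the second member's effect.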
Specify covariates that vary over different levels
of multi-membership grouping factors thus requiring
special treatment. This function is useful almost exclusively
when called in combination with mm.
Outside of multi-membership terms it will behave
very much like cbind
.
mmc(...)
... |
One or more terms containing covariates
corresponding to the grouping levels specified in |
A matrix with covariates as columns.
## Not run:
# simulate some data
dat <- data.frame(
  y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100),
  g1 = sample(1:10, 100, TRUE), g2 = sample(1:10, 100, TRUE)
)

# multi-membership model with level specific covariate values
dat$xc <- (dat$x1 + dat$x2) / 2
fit <- brm(y ~ xc + (1 + mmc(x1, x2) | mm(g1, g2)), data = dat)
summary(fit)
## End(Not run)
Specify a monotonic predictor term in brms. The function does not evaluate its arguments – it exists purely to help set up a model.
mo(x, id = NA)
x |
An integer variable or an ordered factor to be modeled as monotonic. |
id |
Optional character string. All monotonic terms
with the same |
See Bürkner and Charpentier (2020) for the underlying theory. For
detailed documentation of the formula syntax used for monotonic terms,
see help(brmsformula)
as well as vignette("brms_monotonic")
.
Bürkner P. C. & Charpentier E. (2020). Modeling Monotonic Effects of Ordinal Predictors in Regression Models. British Journal of Mathematical and Statistical Psychology. doi:10.1111/bmsp.12195
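The monotonic transform described by Bürkner and Charpentier (2020) can be sketched in base R as follows. The function mo_effect and its argument names are hypothetical; the sketch assumes the parameterization b * D * sum(zeta[1:x]), with a simplex zeta of length D and predictor categories coded 0 to D.

```r
# Hypothetical helper (not part of brms): monotonic effect of an ordinal
# predictor x coded 0, 1, ..., D, with scale b and simplex zeta of length D.
mo_effect <- function(x, b, zeta) {
  stopifnot(all(zeta >= 0), abs(sum(zeta) - 1) < 1e-8)
  D <- length(zeta)
  cum <- c(0, cumsum(zeta))   # cumulative share reached at each category
  b * D * cum[x + 1]
}

zeta <- c(0.6, 0.3, 0.1)      # most of the change happens at the first step
eff <- mo_effect(0:3, b = 15, zeta = zeta)
```

Under this parameterization, b is the average difference between adjacent categories, and zeta describes how the total change b * D is distributed across the steps, so the effect can never reverse direction.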
## Not run:
# generate some data
income_options <- c("below_20", "20_to_40", "40_to_100", "greater_100")
income <- factor(sample(income_options, 100, TRUE),
                 levels = income_options, ordered = TRUE)
mean_ls <- c(30, 60, 70, 75)
ls <- mean_ls[income] + rnorm(100, sd = 7)
dat <- data.frame(income, ls)

# fit a simple monotonic model
fit1 <- brm(ls ~ mo(income), data = dat)
summary(fit1)
plot(fit1, N = 6)
plot(conditional_effects(fit1), points = TRUE)

# model interaction with other variables
dat$x <- sample(c("a", "b", "c"), 100, TRUE)
fit2 <- brm(ls ~ mo(income)*x, data = dat)
summary(fit2)
plot(conditional_effects(fit2), points = TRUE)

# ensure conditional monotonicity
fit3 <- brm(ls ~ mo(income, id = "i")*x, data = dat)
summary(fit3)
plot(conditional_effects(fit3), points = TRUE)
## End(Not run)
Compute model weights in various ways, for instance, via stacking of posterior predictive distributions, Akaike weights, or marginal likelihoods.
## S3 method for class 'brmsfit'
model_weights(x, ..., weights = "stacking", model_names = NULL)

model_weights(x, ...)
x |
A |
... |
More |
weights |
Name of the criterion to compute weights from. Should be one
of |
model_names |
If |
A numeric vector of weights for the models.
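For intuition about Akaike-type weights, the base R sketch below converts information criterion values into normalized weights. The helper akaike_weights and the WAIC values are made up for illustration; the sketch assumes criteria on the deviance scale, where smaller values indicate better fit, and is not the brms implementation.

```r
# Hypothetical helper: turn information criteria (smaller is better,
# deviance scale) into Akaike-type weights.
akaike_weights <- function(ic) {
  delta <- ic - min(ic)       # differences to the best model
  w <- exp(-delta / 2)
  w / sum(w)
}

ic <- c(fit1 = 2410.3, fit2 = 2414.9)   # made-up WAIC values for illustration
wts <- akaike_weights(ic)
```

Subtracting the minimum before exponentiating leaves the weights unchanged but avoids numerical underflow for large criterion values.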
## Not run:
# model with 'treat' as predictor
fit1 <- brm(rating ~ treat + period + carry, data = inhaler)
summary(fit1)

# model without 'treat' as predictor
fit2 <- brm(rating ~ period + carry, data = inhaler)
summary(fit2)

# obtain Akaike weights based on the WAIC
model_weights(fit1, fit2, weights = "waic")
## End(Not run)
Density function and random generation for the multivariate normal
distribution with mean vector mu
and covariance matrix Sigma
.
dmulti_normal(x, mu, Sigma, log = FALSE, check = FALSE)

rmulti_normal(n, mu, Sigma, check = FALSE)
x |
Vector or matrix of quantiles. If |
mu |
Mean vector with length equal to the number of dimensions. |
Sigma |
Covariance matrix. |
log |
Logical; If |
check |
Logical; Indicates whether several input checks
should be performed. Defaults to |
n |
Number of draws to sample from the distribution. |
See the Stan user's manual https://mc-stan.org/documentation/ for details on the parameterization.
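The computations behind a multivariate normal density and sampler can be sketched in base R with a Cholesky factor of Sigma. The functions dmvn_sketch and rmvn_sketch are hypothetical re-implementations for illustration only; they are not the brms source and omit the input checks controlled by the check argument.

```r
# Hypothetical re-implementation (not the brms source) of the multivariate
# normal density and sampler using a Cholesky factor of Sigma.
dmvn_sketch <- function(x, mu, Sigma, log = FALSE) {
  p <- length(mu)
  x <- matrix(x, ncol = p)                 # a plain vector becomes one row
  R <- chol(Sigma)                         # upper triangular, Sigma = t(R) %*% R
  z <- backsolve(R, t(x) - mu, transpose = TRUE)  # solves t(R) %*% z = x - mu
  quad <- colSums(z^2)                     # squared Mahalanobis distances
  logdet <- 2 * sum(log(diag(R)))          # log determinant of Sigma
  out <- -0.5 * (p * log(2 * pi) + logdet + quad)
  if (log) out else exp(out)
}

rmvn_sketch <- function(n, mu, Sigma) {
  p <- length(mu)
  z <- matrix(rnorm(n * p), nrow = n)      # independent standard normals
  sweep(z %*% chol(Sigma), 2, mu, "+")     # rows have covariance Sigma
}

Sigma <- matrix(c(1, 0.5, 0.5, 2), 2, 2)
dens <- dmvn_sketch(c(0.5, 2), mu = c(0, 1), Sigma = Sigma)
draws <- rmvn_sketch(1000, mu = c(0, 1), Sigma = Sigma)
```

Working with the Cholesky factor avoids forming the inverse of Sigma explicitly, which tends to be more stable numerically.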
Density function and random generation for the multivariate Student-t
distribution with location vector mu
, covariance matrix Sigma
,
and degrees of freedom df
.
dmulti_student_t(x, df, mu, Sigma, log = FALSE, check = FALSE)

rmulti_student_t(n, df, mu, Sigma, check = FALSE)
x |
Vector or matrix of quantiles. If |
df |
Vector of degrees of freedom. |
mu |
Location vector with length equal to the number of dimensions. |
Sigma |
Covariance matrix. |
log |
Logical; If |
check |
Logical; Indicates whether several input checks
should be performed. Defaults to |
n |
Number of draws to sample from the distribution. |
See the Stan user's manual https://mc-stan.org/documentation/ for details on the parameterization.
Can be used to specify a multivariate brms model within a single
formula. Outside of brmsformula
, it just behaves like
cbind
.
mvbind(...)
... |
Same as in |
bf(mvbind(y1, y2) ~ x)
Set up a multivariate model formula for use in the brms package, which allows users to define (potentially non-linear) additive multilevel models for all parameters of the assumed response distributions.
mvbrmsformula(..., flist = NULL, rescor = NULL)
... |
Objects of class |
flist |
Optional list of formulas, which are treated in the
same way as formulas passed via the |
rescor |
Logical; Indicates if residual correlation between
the response variables should be modeled. Currently, this is only
possible in multivariate |
See vignette("brms_multivariate")
for a case study.
An object of class mvbrmsformula
, which
is essentially a list
containing all model formulas
as well as some additional information for multivariate models.
brmsformula
, brmsformula-helpers
bf1 <- bf(y1 ~ x + (1|g))
bf2 <- bf(y2 ~ s(z))
mvbf(bf1, bf2)
Extract the number of levels of one or more grouping factors.
## S3 method for class 'brmsfit'
ngrps(object, ...)

ngrps(object, ...)
object |
An R object. |
... |
Currently ignored. |
A named list containing the number of levels per grouping factor.
Extract the number of posterior samples (draws) stored in a fitted Bayesian
model. Method nsamples
is deprecated. Please use ndraws
instead.
## S3 method for class 'brmsfit'
nsamples(object, subset = NULL, incl_warmup = FALSE, ...)
object |
An object of class |
subset |
An optional integer vector defining a subset of samples to be considered. |
incl_warmup |
A flag indicating whether to also count warmup / burn-in samples. |
... |
Currently ignored. |
Use OpenCL for GPU support in Stan via the brms interface. Only some Stan functions can be run on a GPU at this point and so a lot of brms models won't benefit from OpenCL for now.
opencl(ids = NULL)
ids |
(integer vector of length 2) The platform and device IDs of the
OpenCL device to use for fitting. If you don't know the IDs of your OpenCL
device, |
For more details on OpenCL in Stan, check out https://mc-stan.org/docs/2_26/cmdstan-guide/parallelization.html#opencl as well as https://mc-stan.org/docs/2_26/stan-users-guide/opencl.html.
A brmsopencl
object which can be passed to the
opencl
argument of brm
and related functions.
## Not run:
# this model just serves as an illustration
# OpenCL may not actually speed things up here
fit <- brm(count ~ zAge + zBase * Trt + (1|patient),
           data = epilepsy, family = poisson(),
           chains = 2, cores = 2, opencl = opencl(c(0, 0)),
           backend = "cmdstanr")
summary(fit)
## End(Not run)
A pairs method for brmsfit objects that is customized for MCMC output.
## S3 method for class 'brmsfit'
pairs(x, pars = NA, variable = NULL, regex = FALSE, fixed = FALSE, ...)
x |
An object of class |
pars |
Deprecated alias of |
variable |
Names of the variables (parameters) to plot, as given by a
character vector or a regular expression (if |
regex |
Logical; Indicates whether |
fixed |
(Deprecated) Indicates whether parameter names
should be matched exactly ( |
... |
Further arguments to be passed to
|
For a detailed description see
mcmc_pairs
.
## Not run:
fit <- brm(count ~ zAge + zBase * Trt
           + (1|patient) + (1|visit),
           data = epilepsy, family = "poisson")
pairs(fit, variable = variables(fit)[1:3])
pairs(fit, variable = "^sd_", regex = TRUE)
## End(Not run)
Extract all parameter names of a given model.
parnames(x, ...)
x |
An R object |
... |
Further arguments passed to or from other methods. |
A character vector containing the parameter names of the model.
Trace and Density Plots for MCMC Draws
## S3 method for class 'brmsfit'
plot(
  x,
  pars = NA,
  combo = c("hist", "trace"),
  nvariables = 5,
  N = NULL,
  variable = NULL,
  regex = FALSE,
  fixed = FALSE,
  bins = 30,
  theme = NULL,
  plot = TRUE,
  ask = TRUE,
  newpage = TRUE,
  ...
)
x |
An object of class |
pars |
Deprecated alias of |
combo |
A character vector with at least two elements.
Each element of |
nvariables |
The number of variables (parameters) plotted per page. |
N |
Deprecated alias of |
variable |
Names of the variables (parameters) to plot, as given by a
character vector or a regular expression (if |
regex |
Logical; Indicates whether |
fixed |
(Deprecated) Indicates whether parameter names
should be matched exactly ( |
bins |
Number of bins used for posterior histograms (defaults to 30). |
theme |
A |
plot |
Logical; indicates if plots should be
plotted directly in the active graphic device.
Defaults to |
ask |
Logical; indicates if the user is prompted
before a new page is plotted.
Only used if |
newpage |
Logical; indicates if the first set of plots
should be plotted to a new page.
Only used if |
... |
Further arguments passed to
|
An invisible list of
gtable
objects.
## Not run:
fit <- brm(count ~ zAge + zBase * Trt
           + (1|patient) + (1|visit),
           data = epilepsy, family = "poisson")
plot(fit)

## plot population-level effects only
plot(fit, variable = "^b_", regex = TRUE)
## End(Not run)
Compute posterior model probabilities from marginal likelihoods.
The brmsfit
method is just a thin wrapper around
the corresponding method for bridge
objects.
## S3 method for class 'brmsfit'
post_prob(x, ..., prior_prob = NULL, model_names = NULL)
x |
A |
... |
More |
prior_prob |
Numeric vector with prior model probabilities.
If omitted, a uniform prior is used (i.e., all models are equally
likely a priori). The default |
model_names |
If |
Computing the marginal likelihood requires samples
of all variables defined in Stan's parameters
block
to be saved. Otherwise post_prob
cannot be computed.
Thus, please set save_all_pars = TRUE
in the call to brm
,
if you are planning to apply post_prob
to your models.
The computation of model probabilities based on bridge sampling requires
a lot more posterior samples than usual. A good conservative
rule of thumb is perhaps 10-fold more samples (read: the default of 4000
samples may not be enough in many cases). If not enough posterior
samples are provided, the bridge sampling algorithm tends to be
unstable leading to considerably different results each time it is run.
We thus recommend running post_prob
multiple times to check the stability of the results.
More details are provided under
bridgesampling::post_prob
.
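The step from marginal likelihoods to posterior model probabilities can be sketched in base R. The helper post_prob_sketch and the log marginal likelihood values are hypothetical and serve only as an illustration of Bayes' rule over models; in practice the log marginal likelihoods would come from bridge sampling.

```r
# Hypothetical helper: posterior model probabilities from log marginal
# likelihoods and prior model probabilities.
post_prob_sketch <- function(logml, prior_prob = NULL) {
  if (is.null(prior_prob)) {
    prior_prob <- rep(1 / length(logml), length(logml))  # uniform prior
  }
  stopifnot(length(prior_prob) == length(logml),
            abs(sum(prior_prob) - 1) < 1e-8)
  lp <- logml + log(prior_prob)
  lp <- lp - max(lp)            # subtract the maximum for numerical stability
  exp(lp) / sum(exp(lp))
}

logml <- c(fit1 = -650.2, fit2 = -652.0)   # made-up values for illustration
pp_uniform <- post_prob_sketch(logml)
pp_informed <- post_prob_sketch(logml, prior_prob = c(0.8, 0.2))
```

Because small changes in the log marginal likelihoods are exponentiated, unstable bridge sampling estimates can shift these probabilities noticeably, which is one reason to check stability across repeated runs.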
## Not run:
# model with the treatment effect
fit1 <- brm(
  count ~ zAge + zBase + Trt,
  data = epilepsy, family = negbinomial(),
  prior = prior(normal(0, 1), class = b),
  save_all_pars = TRUE
)
summary(fit1)

# model without the treatment effect
fit2 <- brm(
  count ~ zAge + zBase,
  data = epilepsy, family = negbinomial(),
  prior = prior(normal(0, 1), class = b),
  save_all_pars = TRUE
)
summary(fit2)

# compute the posterior model probabilities
post_prob(fit1, fit2)

# specify prior model probabilities
post_prob(fit1, fit2, prior_prob = c(0.8, 0.2))
## End(Not run)
Extract posterior draws of parameters averaged across models. Weighting can be done in various ways, for instance using Akaike weights based on information criteria or marginal likelihoods.
## S3 method for class 'brmsfit'
posterior_average(
  x,
  ...,
  variable = NULL,
  pars = NULL,
  weights = "stacking",
  ndraws = NULL,
  nsamples = NULL,
  missing = NULL,
  model_names = NULL,
  control = list(),
  seed = NULL
)

posterior_average(x, ...)
x |
A |
... |
More |
variable |
Names of variables (parameters) for which to average across models. Only those variables can be averaged that appear in every model. Defaults to all overlapping variables. |
pars |
Deprecated alias of |
weights |
Name of the criterion to compute weights from. Should be one
of |
ndraws |
Total number of posterior draws to use. |
nsamples |
Deprecated alias of |
missing |
An optional numeric value or a named list of numeric values
to use if a model does not contain a variable for which posterior draws
should be averaged. Defaults to |
model_names |
If |
control |
Optional |
seed |
A single numeric value passed to |
Weights are computed with the model_weights
method.
A data.frame
of posterior draws.
## Not run:
# model with 'treat' as predictor
fit1 <- brm(rating ~ treat + period + carry, data = inhaler)
summary(fit1)

# model without 'treat' as predictor
fit2 <- brm(rating ~ period + carry, data = inhaler)
summary(fit2)

# compute model-averaged posteriors of overlapping parameters
posterior_average(fit1, fit2, weights = "waic")
## End(Not run)
Compute posterior draws of the expected value of the posterior predictive
distribution. Can be performed for the data used to fit the model (posterior
predictive checks) or for new data. By definition, these predictions have
smaller variance than the posterior predictions performed by the
posterior_predict.brmsfit
method. This is because only the
uncertainty in the expected value of the posterior predictive distribution is
incorporated in the draws computed by posterior_epred
while the
residual error is ignored there. However, the estimated means of both methods
averaged across draws should be very similar.
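The difference between expected-value draws and full posterior predictions can be illustrated with simulated draws in base R. All quantities below (mu, sigma, and their distributions) are made up for illustration and assume a Gaussian response; no brms model is fitted.

```r
set.seed(123)
S <- 4000                                   # number of posterior draws
mu <- rnorm(S, mean = 10, sd = 0.5)         # made-up draws of the expected value
sigma <- abs(rnorm(S, mean = 2, sd = 0.1))  # made-up draws of the residual SD

epred <- mu                                 # uncertainty in the mean only
pred <- rnorm(S, mean = mu, sd = sigma)     # adds residual error on top

means <- c(epred = mean(epred), pred = mean(pred))
vars <- c(epred = var(epred), pred = var(pred))
```

In this sketch the two means nearly coincide, while the variance of pred is much larger because it also contains the residual error.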
## S3 method for class 'brmsfit'
posterior_epred(
  object,
  newdata = NULL,
  re_formula = NULL,
  re.form = NULL,
  resp = NULL,
  dpar = NULL,
  nlpar = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
re.form |
Alias of |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
dpar |
Optional name of a predicted distributional parameter. If specified, expected predictions of this parameter are returned. |
nlpar |
Optional name of a predicted non-linear parameter. If specified, expected predictions of this parameter are returned. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
sort |
Logical. Only relevant for time series models.
Indicating whether to return predicted values in the original
order ( |
... |
Further arguments passed to |
NA values within factors in newdata are interpreted as if all dummy
variables of this factor are zero. This allows, for instance, making
predictions of the grand mean when using sum coding.
In multilevel models, it is possible to
allow new levels of grouping factors to be used in the predictions.
This can be controlled via argument allow_new_levels
.
New levels can be sampled in multiple ways, which can be controlled
via argument sample_new_levels
. Both of these arguments are
documented in prepare_predictions
along with several
other useful arguments to control specific aspects of the predictions.
An array
of draws. For
categorical and ordinal models, the output is an S x N x C array.
Otherwise, the output is an S x N matrix, where S is the number of
posterior draws, N is the number of observations, and C is the number of
categories. In multivariate models, an additional dimension is added to the
output which indexes along the different response variables.
## Not run:
## fit a model
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler)

## compute expected predictions
ppe <- posterior_epred(fit)
str(ppe)
## End(Not run)
Compute posterior uncertainty intervals for brmsfit
objects.
## S3 method for class 'brmsfit'
posterior_interval(object, pars = NA, variable = NULL, prob = 0.95, ...)
object |
An object of class |
pars |
Deprecated alias of |
variable |
A character vector providing the variables to extract. By default, all variables are extracted. |
prob |
A value between 0 and 1 indicating the desired probability to be covered by the uncertainty intervals. The default is 0.95. |
... |
More arguments passed to |
A matrix
with lower and upper interval bounds
as columns and as many rows as selected variables.
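The shape of the returned matrix can be mimicked in base R from a matrix of draws. The helper interval_sketch and the simulated draws are hypothetical, and the sketch assumes central (equal-tailed) quantile intervals, which may differ in detail from what the method computes.

```r
# Hypothetical helper: central intervals from an S x P matrix of draws,
# returned with one row per variable and two columns for the bounds.
interval_sketch <- function(draws, prob = 0.95) {
  alpha <- (1 - prob) / 2
  t(apply(draws, 2, quantile, probs = c(alpha, 1 - alpha)))
}

set.seed(42)
draws <- cbind(
  b_Intercept = rnorm(2000, 1, 0.2),    # made-up posterior draws
  b_x = rnorm(2000, -0.5, 0.1)
)
ints <- interval_sketch(draws, prob = 0.95)
```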
## Not run:
fit <- brm(count ~ zAge + zBase * Trt,
           data = epilepsy, family = negbinomial())
posterior_interval(fit)
## End(Not run)
Compute posterior draws of the linear predictor, that is draws before applying any link functions or other transformations. Can be performed for the data used to fit the model (posterior predictive checks) or for new data.
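The difference between the linear predictor and the response scale can be sketched in base R without fitting a model. The coefficient draws and the design matrix below are hypothetical, and a log link is assumed purely for illustration.

```r
set.seed(9)
# Hypothetical posterior draws of an intercept and a slope (S = 1000)
beta <- cbind(
  Intercept = rnorm(1000, mean = 0.5, sd = 0.05),
  x = rnorm(1000, mean = 0.3, sd = 0.02)
)

# Design matrix for N = 3 observations
X <- cbind(Intercept = 1, x = c(-1, 0, 1))

# Linear predictor: S x N matrix on the link (here: log) scale
eta <- beta %*% t(X)

# Applying the inverse link gives values on the response scale
mu <- exp(eta)

dim(eta)
head(mu, 3)
```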
## S3 method for class 'brmsfit'
posterior_linpred(
  object,
  transform = FALSE,
  newdata = NULL,
  re_formula = NULL,
  re.form = NULL,
  resp = NULL,
  dpar = NULL,
  nlpar = NULL,
  incl_thres = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  ...
)
object |
An object of class |
transform |
Logical; if |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
re.form |
Alias of |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
dpar |
Name of a predicted distributional parameter
for which draws are to be returned. By default, draws
of the main distributional parameter(s) |
nlpar |
Optional name of a predicted non-linear parameter. If specified, expected predictions of this parameter are returned. |
incl_thres |
Logical; only relevant for ordinal models when
|
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
sort |
Logical. Only relevant for time series models.
Indicating whether to return predicted values in the original
order ( |
... |
Further arguments passed to |
## Not run:
## fit a model
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler)

## extract linear predictor values
pl <- posterior_linpred(fit)
str(pl)

## End(Not run)
Compute posterior draws of the posterior predictive distribution. Can be
performed for the data used to fit the model (posterior predictive checks) or
for new data. By definition, these draws have higher variance than draws
of the expected value of the posterior predictive distribution computed by
posterior_epred.brmsfit. This is because the residual error
is incorporated in posterior_predict. However, the estimated means of
both methods averaged across draws should be very similar.
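The variance claim above can be illustrated with a small base R simulation. It does not call brms. The draws of the mean and residual standard deviation are hypothetical, and a Gaussian response is assumed.

```r
set.seed(2)
S <- 2000  # number of posterior draws
N <- 5     # number of observations

# Hypothetical posterior draws of a mean and a residual SD
mu <- rnorm(S, mean = 10, sd = 0.2)
sigma <- abs(rnorm(S, mean = 2, sd = 0.1))

# Expected values: one row per draw, the same mean for every observation
epred <- matrix(mu, nrow = S, ncol = N)

# Predictions: expected value plus residual noise, using the SD of each draw
noise <- matrix(rnorm(S * N, sd = rep(sigma, times = N)), nrow = S, ncol = N)
ppred <- epred + noise

# Similar means, but a much larger spread for the predictions
colMeans(epred)
colMeans(ppred)
apply(epred, 2, sd)
apply(ppred, 2, sd)
```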
## S3 method for class 'brmsfit'
posterior_predict(
  object,
  newdata = NULL,
  re_formula = NULL,
  re.form = NULL,
  transform = NULL,
  resp = NULL,
  negative_rt = FALSE,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  ntrys = 5,
  cores = NULL,
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
re.form |
Alias of |
transform |
(Deprecated) A function or a character string naming a function to be applied on the predicted responses before summary statistics are computed. |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
negative_rt |
Only relevant for Wiener diffusion models.
A flag indicating whether response times of responses
on the lower boundary should be returned as negative values.
This makes it possible to distinguish responses on the upper and
lower boundary. Defaults to |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
sort |
Logical. Only relevant for time series models.
Indicating whether to return predicted values in the original
order ( |
ntrys |
Parameter used in rejection sampling
for truncated discrete models only
(defaults to |
cores |
Number of cores (defaults to |
... |
Further arguments passed to |
NA values within factors in newdata
are interpreted as if all dummy variables of this factor are
zero. This makes it possible, for instance, to predict the grand mean
when using sum coding.
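The sum-coding remark can be checked with ordinary least squares in base R. The sketch below uses lm rather than brms, and the data are simulated, so it only illustrates the coding logic: a design row in which all dummy variables of the factor are zero reduces to the intercept, which under sum coding is the unweighted mean of the group means.

```r
set.seed(3)
dat <- data.frame(g = factor(rep(c("a", "b", "c"), each = 20)))
group_means <- c(a = 1, b = 2, c = 6)
dat$y <- unname(group_means[as.character(dat$g)]) + rnorm(60, sd = 0.1)

# Sum coding: the intercept is the unweighted mean of the group means
fit_lm <- lm(y ~ g, data = dat, contrasts = list(g = "contr.sum"))

# Design row with the intercept set to 1 and all dummy variables of 'g' set to 0
x_na <- c(1, 0, 0)
pred_na <- sum(x_na * coef(fit_lm))

grand_mean <- mean(tapply(dat$y, dat$g, mean))
c(prediction = pred_na, grand_mean = grand_mean)
```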
In multilevel models, it is possible to
allow new levels of grouping factors to be used in the predictions.
This can be controlled via argument allow_new_levels.
New levels can be sampled in multiple ways, which can be controlled
via argument sample_new_levels. Both of these arguments are
documented in prepare_predictions along with several
other useful arguments to control specific aspects of the predictions.
For truncated discrete models only: In the absence of any general
algorithm to sample from truncated discrete distributions, rejection
sampling is applied in this special case. This means that values are
sampled until a value lies within the defined truncation boundaries. In
practice, this procedure may be rather slow (especially in R). Thus, we
try to do approximate rejection sampling by sampling each value
ntrys times and then selecting a valid value. If all values are
invalid, the closest boundary is used instead. If there are more than a
few of these pathological cases, a warning will occur suggesting to
increase argument ntrys.
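The approximate rejection sampling described above can be sketched in base R for a truncated Poisson distribution. The helper sample_trunc_pois is hypothetical and not part of brms. In particular, choosing the first valid candidate and using the first candidate to decide which boundary is closest are assumptions made for this illustration.

```r
# Hypothetical helper: approximate rejection sampling for a truncated Poisson
sample_trunc_pois <- function(n, lambda, lb, ub, ntrys = 5) {
  # Draw 'ntrys' candidates per requested value (rows = values, cols = tries)
  cand <- matrix(rpois(n * ntrys, lambda), nrow = n, ncol = ntrys)
  valid <- cand >= lb & cand <= ub
  out <- numeric(n)
  for (i in seq_len(n)) {
    if (any(valid[i, ])) {
      # Use the first candidate inside the truncation bounds
      out[i] <- cand[i, which(valid[i, ])[1]]
    } else {
      # No valid candidate: fall back to the boundary the first candidate missed
      out[i] <- if (cand[i, 1] < lb) lb else ub
    }
  }
  # Also report how many values had no valid candidate at all
  list(draws = out, n_invalid = sum(rowSums(valid) == 0))
}

set.seed(4)
res <- sample_trunc_pois(1000, lambda = 3, lb = 2, ub = 6, ntrys = 5)
range(res$draws)
res$n_invalid
```

A larger ntrys lowers the chance that every candidate is invalid, at the cost of drawing more candidates per value.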
An array of draws. In univariate models,
the output is an S x N matrix, where S is the number of posterior
draws and N is the number of observations. In multivariate models, an
additional dimension is added to the output which indexes along the
different response variables.
## Not run:
## fit a model
fit <- brm(time | cens(censored) ~ age + sex + (1 + age || patient),
           data = kidney, family = "exponential", init = "0")

## predicted responses
pp <- posterior_predict(fit)
str(pp)

## predicted responses excluding the group-level effect of age
pp <- posterior_predict(fit, re_formula = ~ (1 | patient))
str(pp)

## predicted responses of patient 1 for new data
newdata <- data.frame(
  sex = factor(c("male", "female")),
  age = c(20, 50),
  patient = c(1, 1)
)
pp <- posterior_predict(fit, newdata = newdata)
str(pp)

## End(Not run)
Extract posterior samples of specified parameters. The
posterior_samples method is deprecated. We recommend using the more
modern and consistent as_draws_* extractor
functions of the posterior package instead.
## S3 method for class 'brmsfit'
posterior_samples(
  x,
  pars = NA,
  fixed = FALSE,
  add_chain = FALSE,
  subset = NULL,
  as.matrix = FALSE,
  as.array = FALSE,
  ...
)

posterior_samples(x, pars = NA, ...)
x |
An |
pars |
Names of parameters for which posterior samples should be returned, as given by a character vector or regular expressions. By default, all posterior samples of all parameters are extracted. |
fixed |
Indicates whether parameter names
should be matched exactly ( |
add_chain |
A flag indicating if the returned |
subset |
A numeric vector indicating the rows
(i.e., posterior samples) to be returned.
If |
as.matrix |
Should the output be a |
as.array |
Should the output be an |
... |
Arguments passed to individual methods (if applicable). |
A data.frame (matrix or array) containing the posterior samples.
## Not run:
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler, family = "cumulative")

# extract posterior samples of population-level effects
samples1 <- posterior_samples(fit, pars = "^b")
head(samples1)

# extract posterior samples of group-level standard deviations
samples2 <- posterior_samples(fit, pars = "^sd_")
head(samples2)

## End(Not run)
Compute posterior predictions of smooth s and t2 terms of
models fitted with brms.
## S3 method for class 'brmsfit'
posterior_smooths(
  object,
  smooth,
  newdata = NULL,
  resp = NULL,
  dpar = NULL,
  nlpar = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  ...
)

posterior_smooths(object, ...)
object |
An object of class |
smooth |
Name of a single smooth term for which predictions should be computed. |
newdata |
An optional |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
dpar |
Optional name of a predicted distributional parameter. If specified, expected predictions of this parameter are returned. |
nlpar |
Optional name of a predicted non-linear parameter. If specified, expected predictions of this parameter are returned. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
... |
Currently ignored. |
An S x N matrix, where S is the number of posterior draws and N is the number of observations.
## Not run:
set.seed(0)
dat <- mgcv::gamSim(1, n = 200, scale = 2)
fit <- brm(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)
summary(fit)

# evaluate the smooth of x2 on a grid of 10 equally spaced values
newdata <- data.frame(x2 = seq(0, 1, length.out = 10))
str(posterior_smooths(fit, smooth = "s(x2)", newdata = newdata))

## End(Not run)
Summarizes posterior draws based on point estimates (mean or median), estimation errors (SD or MAD) and quantiles. This function mainly exists to retain backwards compatibility. It will eventually be replaced by functions of the posterior package (see examples below).
posterior_summary(x, ...)

## Default S3 method:
posterior_summary(x, probs = c(0.025, 0.975), robust = FALSE, ...)

## S3 method for class 'brmsfit'
posterior_summary(
  x,
  pars = NA,
  variable = NULL,
  probs = c(0.025, 0.975),
  robust = FALSE,
  ...
)
x |
An R object. |
... |
More arguments passed to or from other methods. |
probs |
The percentiles to be computed by the
|
robust |
If |
pars |
Deprecated alias of |
variable |
A character vector providing the variables to extract. By default, all variables are extracted. |
A matrix where rows indicate variables and columns indicate the summary estimates.
## Not run:
fit <- brm(time ~ age * sex, data = kidney)
posterior_summary(fit)

# recommended workflow using posterior
library(posterior)
draws <- as_draws_array(fit)
summarise_draws(draws, default_summary_measures())

## End(Not run)
Create a table for unique values of posterior draws. This is usually only useful when summarizing predictions of ordinal models.
posterior_table(x, levels = NULL)
x |
A matrix of posterior draws where rows indicate draws and columns indicate parameters. |
levels |
Optional values of possible posterior values.
Defaults to all unique values in |
A matrix where rows indicate parameters and columns indicate the unique values of posterior draws.
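The kind of table described here can be approximated in base R by tabulating each column of a draws matrix. The draws below are simulated category predictions. Returning relative frequencies and the column labels used here are assumptions about the output format, not a statement of what posterior_table does internally.

```r
set.seed(5)
# Hypothetical draws: 1000 draws (rows) of predicted categories for 3 observations
draws <- cbind(
  obs1 = sample(1:4, 1000, replace = TRUE, prob = c(0.6, 0.2, 0.1, 0.1)),
  obs2 = sample(1:4, 1000, replace = TRUE, prob = c(0.1, 0.2, 0.3, 0.4)),
  obs3 = sample(1:4, 1000, replace = TRUE, prob = c(0.25, 0.25, 0.25, 0.25))
)

# All unique values observed anywhere in the draws
lev <- sort(unique(as.vector(draws)))

# Relative frequency of each value, computed separately for every column
tab <- t(apply(draws, 2, function(x) {
  as.vector(table(factor(x, levels = lev))) / length(x)
}))
colnames(tab) <- paste0("P(Y = ", lev, ")")
tab
```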
## Not run:
fit <- brm(rating ~ period + carry + treat,
           data = inhaler, family = cumulative())
pr <- predict(fit, summary = FALSE)
posterior_table(pr)

## End(Not run)
Compute posterior predictive draws averaged across models. Weighting can be done in various ways, for instance using Akaike weights based on information criteria or marginal likelihoods.
## S3 method for class 'brmsfit'
pp_average(
  x,
  ...,
  weights = "stacking",
  method = "posterior_predict",
  ndraws = NULL,
  nsamples = NULL,
  summary = TRUE,
  probs = c(0.025, 0.975),
  robust = FALSE,
  model_names = NULL,
  control = list(),
  seed = NULL
)

pp_average(x, ...)
x |
A |
... |
More |
weights |
Name of the criterion to compute weights from. Should be one
of |
method |
Method used to obtain predictions to average over. Should be
one of |
ndraws |
Total number of posterior draws to use. |
nsamples |
Deprecated alias of |
summary |
Should summary statistics
(i.e. means, sds, and 95% intervals) be returned
instead of the raw values? Default is |
probs |
The percentiles to be computed by the |
robust |
If |
model_names |
If |
control |
Optional |
seed |
A single numeric value passed to |
Weights are computed with the model_weights method.
Same as the output of the method specified
in argument method.
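One way to picture model-averaged predictions is to draw from each model's predictive draws in proportion to its weight and stack the results. The sketch below does this in base R with simulated draws and made-up weights. It is an illustration of the idea only, and the exact procedure used by pp_average is not assumed to match it.

```r
set.seed(6)
ndraws <- 1000
weights <- c(fit1 = 0.7, fit2 = 0.3)  # hypothetical model weights

# Hypothetical S x N predictive draws from two models (N = 3 observations)
pred1 <- matrix(rnorm(4000 * 3, mean = 1), ncol = 3)
pred2 <- matrix(rnorm(4000 * 3, mean = 3), ncol = 3)

# Number of draws taken from each model, proportional to its weight
n_per_model <- round(weights * ndraws)

# Stack randomly selected draws from both models
averaged <- rbind(
  pred1[sample(nrow(pred1), n_per_model[["fit1"]]), , drop = FALSE],
  pred2[sample(nrow(pred2), n_per_model[["fit2"]]), , drop = FALSE]
)

dim(averaged)
colMeans(averaged)
```

With these weights the column means should land near 0.7 * 1 + 0.3 * 3 = 1.6.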
model_weights, posterior_average
## Not run:
# model with 'treat' as predictor
fit1 <- brm(rating ~ treat + period + carry, data = inhaler)
summary(fit1)

# model without 'treat' as predictor
fit2 <- brm(rating ~ period + carry, data = inhaler)
summary(fit2)

# compute model-averaged predicted values
(df <- unique(inhaler[, c("treat", "period", "carry")]))
pp_average(fit1, fit2, newdata = df)

# compute model-averaged fitted values
pp_average(fit1, fit2, method = "fitted", newdata = df)

## End(Not run)
brmsfit Objects
Perform posterior predictive checks with the help of the bayesplot package.
## S3 method for class 'brmsfit'
pp_check(
  object,
  type,
  ndraws = NULL,
  prefix = c("ppc", "ppd"),
  group = NULL,
  x = NULL,
  newdata = NULL,
  resp = NULL,
  draw_ids = NULL,
  nsamples = NULL,
  subset = NULL,
  ...
)
object |
An object of class |
type |
Type of the ppc plot as given by a character string.
See |
ndraws |
Positive integer indicating how many
posterior draws should be used.
If |
prefix |
The prefix of the bayesplot function to be applied. Either '"ppc"' (posterior predictive check; the default) or '"ppd"' (posterior predictive distribution), the latter being the same as the former except that the observed data is not shown for '"ppd"'. |
group |
Optional name of a factor variable in the model
by which to stratify the ppc plot. This argument is required for
ppc |
x |
Optional name of a variable in the model.
Only used for ppc types having an |
newdata |
An optional data.frame for which to evaluate predictions. If
|
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
nsamples |
Deprecated alias of |
subset |
Deprecated alias of |
... |
Further arguments passed to |
For a detailed explanation of each of the ppc functions,
see the PPC documentation of the bayesplot package.
A ggplot object that can be further customized using the ggplot2 package.
## Not run:
fit <- brm(count ~ zAge + zBase * Trt + (1|patient) + (1|obs),
           data = epilepsy, family = poisson())

pp_check(fit)  # shows dens_overlay plot by default
pp_check(fit, type = "error_hist", ndraws = 11)
pp_check(fit, type = "scatter_avg", ndraws = 100)
pp_check(fit, type = "stat_2d")
pp_check(fit, type = "rootogram")
pp_check(fit, type = "loo_pit")

## get an overview of all valid types
pp_check(fit, type = "xyz")

## get a plot without the observed data
pp_check(fit, prefix = "ppd")

## End(Not run)
Compute the posterior probabilities of mixture component memberships for each observation including uncertainty estimates.
## S3 method for class 'brmsfit'
pp_mixture(
  x,
  newdata = NULL,
  re_formula = NULL,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  log = FALSE,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)

pp_mixture(x, ...)
x |
An R object usually of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
log |
Logical; Indicates whether to return probabilities on the log-scale. |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
... |
Further arguments passed to |
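The returned probabilities can be written as $P(K_n = k \mid Y_n)$, that is the posterior probability
that observation n originates from component k.
They are computed using Bayes' Theorem

$$P(K_n = k \mid Y_n) = \frac{P(Y_n \mid K_n = k) \, P(K_n = k)}{P(Y_n)},$$

where $P(Y_n \mid K_n = k)$ is the (posterior) likelihood
of observation n for component k,
$P(K_n = k)$ is the (posterior) mixing probability of component k
(i.e. parameter theta<k>), and

$$P(Y_n) = \sum_{k=1}^{K} P(Y_n \mid K_n = k) \, P(K_n = k)$$

is a normalizing constant.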
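The Bayes' theorem computation can be traced by hand for a single posterior draw of a two-component normal mixture. All parameter values and observations below are hypothetical, and the sketch uses only base R. In a fitted model the same calculation would be repeated for every posterior draw.

```r
# Hypothetical parameter values for a single posterior draw
theta <- c(0.6, 0.4)   # mixing probabilities of the two components
mu <- c(0, 2)          # component means
sigma <- c(1, 1)       # component standard deviations
y <- c(-1, 0.5, 1, 3)  # observations

# Likelihood of each observation under each component: N x K matrix
lik <- sapply(seq_along(theta), function(k) dnorm(y, mu[k], sigma[k]))

# Numerator of Bayes' theorem: likelihood times mixing probability
num <- sweep(lik, 2, theta, `*`)

# Divide each row by its sum, which is the normalizing constant
membership <- num / rowSums(num)
round(membership, 3)
```

Each row sums to one, and observations near a component mean receive most of their probability from that component.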
If summary = TRUE, an N x E x K array,
where N is the number of observations, K is the number
of mixture components, and E is equal to length(probs) + 2.
If summary = FALSE, an S x N x K array, where
S is the number of posterior draws.
## Not run:
## simulate some data
set.seed(1234)
dat <- data.frame(
  y = c(rnorm(100), rnorm(50, 2)),
  x = rnorm(150)
)

## fit a simple normal mixture model
mix <- mixture(gaussian, nmix = 2)
prior <- c(
  prior(normal(0, 5), Intercept, nlpar = mu1),
  prior(normal(0, 5), Intercept, nlpar = mu2),
  prior(dirichlet(2, 2), theta)
)
fit1 <- brm(bf(y ~ x), dat, family = mix,
            prior = prior, chains = 2, init = 0)
summary(fit1)

## compute the membership probabilities
ppm <- pp_mixture(fit1)
str(ppm)

## extract point estimates for each observation
head(ppm[, 1, ])

## classify every observation according to
## the most likely component
apply(ppm[, 1, ], 1, which.max)

## End(Not run)
This method is an alias of posterior_predict.brmsfit
with additional arguments for obtaining summaries of the computed draws.
## S3 method for class 'brmsfit'
predict(
  object,
  newdata = NULL,
  re_formula = NULL,
  transform = NULL,
  resp = NULL,
  negative_rt = FALSE,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  ntrys = 5,
  cores = NULL,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
transform |
(Deprecated) A function or a character string naming a function to be applied on the predicted responses before summary statistics are computed. |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
negative_rt |
Only relevant for Wiener diffusion models.
A flag indicating whether response times of responses
on the lower boundary should be returned as negative values.
This makes it possible to distinguish responses on the upper and
lower boundary. Defaults to |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
sort |
Logical. Only relevant for time series models.
Indicating whether to return predicted values in the original
order ( |
ntrys |
Parameter used in rejection sampling
for truncated discrete models only
(defaults to |
cores |
Number of cores (defaults to |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
... |
Further arguments passed to |
An array of predicted response values.
If summary = FALSE the output resembles that of
posterior_predict.brmsfit.
If summary = TRUE the output depends on the family: For categorical
and ordinal families, the output is an N x C matrix, where N is the number
of observations, C is the number of categories, and the values are
predicted category probabilities. For all other families, the output is an N
x E matrix where E = 2 + length(probs) is the number of summary
statistics: The Estimate column contains point estimates (either
mean or median depending on argument robust), while the
Est.Error column contains uncertainty estimates (either standard
deviation or median absolute deviation depending on argument
robust). The remaining columns starting with Q contain
quantile estimates as specified via argument probs.
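The N x E summary layout for non-categorical families can be mimicked in base R by summarizing an S x N matrix of draws column by column. The helper summarise_draws_matrix below is hypothetical and the draws are simulated. It is only meant to show how the Estimate, Est.Error, and Q columns relate to the arguments robust and probs.

```r
set.seed(7)
# Hypothetical S x N matrix of predictive draws (S = 2000, N = 4)
draws <- matrix(rnorm(2000 * 4, mean = rep(1:4, each = 2000)), ncol = 4)

# Hypothetical helper: summarize each column (observation) of a draws matrix
summarise_draws_matrix <- function(draws, probs = c(0.025, 0.975),
                                   robust = FALSE) {
  # Point estimate: median if robust, mean otherwise
  est <- if (robust) apply(draws, 2, median) else colMeans(draws)
  # Uncertainty: median absolute deviation if robust, SD otherwise
  err <- if (robust) apply(draws, 2, mad) else apply(draws, 2, sd)
  # One row of quantiles per observation
  qs <- do.call(rbind, lapply(seq_len(ncol(draws)), function(j) {
    quantile(draws[, j], probs = probs, names = FALSE)
  }))
  out <- cbind(est, err, qs)
  colnames(out) <- c("Estimate", "Est.Error", paste0("Q", probs * 100))
  out
}

res <- summarise_draws_matrix(draws)
res
```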
## Not run:
## fit a model
fit <- brm(time | cens(censored) ~ age + sex + (1 + age || patient),
           data = kidney, family = "exponential", init = "0")

## predicted responses
pp <- predict(fit)
head(pp)

## predicted responses excluding the group-level effect of age
pp <- predict(fit, re_formula = ~ (1 | patient))
head(pp)

## predicted responses of patient 1 for new data
newdata <- data.frame(
  sex = factor(c("male", "female")),
  age = c(20, 50),
  patient = c(1, 1)
)
predict(fit, newdata = newdata)

## End(Not run)
Compute posterior draws of predictive errors, that is, observed minus predicted responses. Can be performed for the data used to fit the model (posterior predictive checks) or for new data.
## S3 method for class 'brmsfit'
predictive_error(
  object,
  newdata = NULL,
  re_formula = NULL,
  re.form = NULL,
  method = "posterior_predict",
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
re.form |
Alias of |
method |
Method used to obtain predictions. Can be set to
|
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
sort |
Logical. Only relevant for time series models.
Indicating whether to return predicted values in the original
order ( |
... |
Further arguments passed to |
An S x N array of predictive error draws, where S is the
number of posterior draws and N is the number of observations.
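The definition of a predictive error, observed minus predicted response, can be written out in base R for a matrix of draws. The observed values and predictive draws below are simulated, so this is a sketch of the arithmetic only.

```r
set.seed(8)
y <- c(2.1, 3.4, 5.0)  # observed responses (N = 3)

# Hypothetical S x N matrix of predicted responses (S = 1000)
pred <- matrix(
  rnorm(1000 * 3, mean = rep(c(2, 3, 5), each = 1000), sd = 0.5),
  ncol = 3
)

# Repeat the observed responses in every row, then subtract the predictions
y_mat <- matrix(y, nrow = nrow(pred), ncol = length(y), byrow = TRUE)
errors <- y_mat - pred

dim(errors)
colMeans(errors)
```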
## Not run:
## fit a model
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler, cores = 2)

## extract predictive errors
pe <- predictive_error(fit)
str(pe)

## End(Not run)
Compute intervals from the posterior predictive distribution.
## S3 method for class 'brmsfit'
predictive_interval(object, prob = 0.9, ...)
object |
An R object of class |
prob |
A number p (0 < p < 1) indicating the desired probability mass to
include in the intervals. Defaults to |
... |
Further arguments passed to |
A matrix with 2 columns for the lower and upper bounds of the intervals, respectively, and as many rows as observations being predicted.
## Not run:
fit <- brm(count ~ zBase, data = epilepsy, family = poisson())
predictive_interval(fit)

## End(Not run)
This method helps in preparing brms models for certain post-processing
tasks, most notably various forms of predictions. Unless you are a package
developer, you will rarely need to call prepare_predictions directly.
## S3 method for class 'brmsfit'
prepare_predictions(
  x,
  newdata = NULL,
  re_formula = NULL,
  allow_new_levels = FALSE,
  sample_new_levels = "uncertainty",
  incl_autocor = TRUE,
  oos = NULL,
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  nsamples = NULL,
  subset = NULL,
  nug = NULL,
  smooths_only = FALSE,
  offset = TRUE,
  newdata2 = NULL,
  new_objects = NULL,
  point_estimate = NULL,
  ndraws_point_estimate = 1,
  ...
)

prepare_predictions(x, ...)
x |
An R object typically of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
allow_new_levels |
A flag indicating if new levels of group-level
effects are allowed (defaults to |
sample_new_levels |
Indicates how to sample new levels for grouping
factors specified in |
incl_autocor |
A flag indicating if correlation structures originally
specified via |
oos |
Optional indices of observations for which to compute out-of-sample rather than in-sample predictions. Only required in models that make use of response values to make predictions, that is, currently only ARMA models. |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
nsamples |
Deprecated alias of |
subset |
Deprecated alias of |
nug |
Small positive number for Gaussian process terms only. For
numerical reasons, the covariance matrix of a Gaussian process might not be
positive definite. Adding a very small number to the matrix's diagonal
often solves this problem. If |
smooths_only |
Logical; If |
offset |
Logical; Indicates if offsets should be included in the
predictions. Defaults to |
newdata2 |
A named |
new_objects |
Deprecated alias of |
point_estimate |
Shall the returned object contain only point estimates
of the parameters instead of their posterior draws? Defaults to
|
ndraws_point_estimate |
Only used if |
... |
Further arguments passed to |
An object of class 'brmsprep' or 'mvbrmsprep',
depending on whether a univariate or multivariate model is passed.
brmsfit object
Print a summary for a fitted model represented by a brmsfit object.
## S3 method for class 'brmsfit'
print(x, digits = 2, ...)
x |
An object of class |
digits |
The number of significant digits for printing out the summary; defaults to 2. The effective sample size is always rounded to integers. |
... |
Additional arguments that would be passed
to method |
brmsprior objects
Print method for brmsprior objects.
## S3 method for class 'brmsprior'
print(x, show_df = NULL, ...)
x |
An object of class |
show_df |
Logical; Print priors as a single
|
... |
Currently ignored. |
Extract prior draws of specified parameters
## S3 method for class 'brmsfit'
prior_draws(x, variable = NULL, pars = NULL, ...)

prior_draws(x, ...)

prior_samples(x, ...)
x |
An |
variable |
A character vector providing the variables to extract. By default, all variables are extracted. |
pars |
Deprecated alias of |
... |
Arguments passed to individual methods (if applicable). |
To make use of this function, the model must contain draws of prior distributions. This can be ensured by setting sample_prior = TRUE in function brm. Priors of certain parameters cannot be saved for technical reasons. For instance, this is the case for the population-level intercept, which is only computed after fitting the model by default. If you want to treat the intercept as part of all the other regression coefficients, so that sampling from its prior becomes possible, use ... ~ 0 + Intercept + ... in the formulas.
A data.frame containing the prior draws.
## Not run:
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler, family = "cumulative",
           prior = set_prior("normal(0,2)", class = "b"),
           sample_prior = TRUE)

# extract all prior draws
draws1 <- prior_draws(fit)
head(draws1)

# extract prior draws for the coefficient of 'treat'
draws2 <- prior_draws(fit, "b_treat")
head(draws2)

## End(Not run)
Extract priors of models fitted with brms.
## S3 method for class 'brmsfit'
prior_summary(object, all = TRUE, ...)
object |
An object of class |
all |
Logical; Show all parameters in the model which may have
priors ( |
... |
Further arguments passed to or from other methods. |
A brmsprior object.
## Not run:
fit <- brm(
  count ~ zAge + zBase * Trt + (1|patient) + (1|obs),
  data = epilepsy, family = poisson(),
  prior = prior(student_t(5,0,10), class = b) +
    prior(cauchy(0,2), class = sd)
)

prior_summary(fit)
prior_summary(fit, all = FALSE)
print(prior_summary(fit, all = FALSE), show_df = FALSE)

## End(Not run)
Implementation of Pareto smoothed importance sampling (PSIS), a method for stabilizing importance ratios. The version of PSIS implemented here corresponds to the algorithm presented in Vehtari, Simpson, Gelman, Yao, and Gabry (2024). For PSIS diagnostics see the pareto-k-diagnostic page.
## S3 method for class 'brmsfit'
psis(log_ratios, newdata = NULL, resp = NULL, model_name = NULL, ...)
log_ratios |
A fitted model object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
model_name |
Currently ignored. |
... |
The psis() methods return an object of class "psis", which is a named list with the following components:
log_weights: Vector or matrix of smoothed (and truncated) but unnormalized log weights. To get normalized weights use the weights() method provided for objects of class "psis".
diagnostics: A named list containing two vectors:
pareto_k: Estimates of the shape parameter of the generalized Pareto distribution. See the pareto-k-diagnostic page for details.
n_eff: PSIS effective sample size estimates.
Objects of class "psis" also have the following attributes:
norm_const_log: Vector of precomputed values of colLogSumExps(log_weights) that are used internally by the weights method to normalize the log weights.
tail_len: Vector of tail lengths used for fitting the generalized Pareto distribution.
r_eff: If specified, the user's r_eff argument.
dims: Integer vector of length 2 containing S (posterior sample size) and N (number of observations).
method: Method used for importance sampling, here psis.
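The relationship between log_weights and norm_const_log described above can be sketched in base R. This is only an illustration under stated assumptions: the weight values are made up, and log_sum_exp is a hypothetical helper standing in for colLogSumExps; it is not the internal code of the weights() method.

```r
# Hypothetical helper: stable log-sum-exp of a numeric vector.
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Made-up S x N matrix of unnormalized log weights
# (S = 4 posterior draws, N = 2 observations).
log_weights <- matrix(c(-1.2, -0.3, -2.0, -0.7,
                        -0.1, -0.9, -1.5, -0.4), nrow = 4, ncol = 2)

# One normalizing constant per column (per observation).
norm_const_log <- apply(log_weights, 2, log_sum_exp)

# Subtract the constant from each column, then exponentiate.
weights_norm <- exp(sweep(log_weights, 2, norm_const_log))

# Each column of normalized weights sums to 1 (up to floating point error).
colSums(weights_norm)
```

Working on the log scale until the last step avoids overflow when individual importance ratios are very large.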
Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4 (journal version, preprint arXiv:1507.04544).
Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024). Pareto smoothed importance sampling. Journal of Machine Learning Research, 25(72):1-58.
## Not run:
fit <- brm(rating ~ treat + period + carry,
           data = inhaler)
psis(fit)

## End(Not run)
Function used to set up R2D2(M2) priors in brms. The function does not evaluate its arguments – it exists purely to help set up the model.
R2D2(mean_R2 = 0.5, prec_R2 = 2, cons_D2 = 0.5, autoscale = TRUE, main = FALSE)
mean_R2 |
Mean of the Beta prior on the coefficient of determination R^2. |
prec_R2 |
Precision of the Beta prior on the coefficient of determination R^2. |
cons_D2 |
Concentration vector of the Dirichlet prior on the variance decomposition parameters. Lower values imply more shrinkage. |
autoscale |
Logical; indicating whether the R2D2
prior should be scaled using the residual standard deviation
|
main |
Logical (defaults to |
The prior does not account for scale differences of the terms it is applied on. Accordingly, please make sure that all these terms have a comparable scale to ensure that shrinkage is applied properly.
Currently, the following classes support the R2D2(M2) prior: b (overall regression coefficients), sds (SDs of smoothing splines), sdgp (SDs of Gaussian processes), ar (autoregressive coefficients), ma (moving average coefficients), sderr (SD of latent residuals), sdcar (SD of spatial CAR structures), sd (SD of varying coefficients).
When the prior is only applied to parameter class b, it is equivalent to the original R2D2 prior (with Gaussian kernel). When the prior is also applied to other parameter classes, it is equivalent to the R2D2M2 prior.
Even when the R2D2(M2) prior is applied to multiple parameter classes at once, the concentration vector (argument cons_D2) has to be provided jointly in the one instance of the prior where main = TRUE. The order in which the elements of the concentration vector correspond to the classes' coefficients is the same as the order of the classes provided above.
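To get a feeling for what mean_R2 and prec_R2 imply, one can simulate from the corresponding Beta prior on R^2 in base R. The sketch below assumes the conventional mean-precision parameterization of the Beta distribution (shape1 = mean * precision, shape2 = (1 - mean) * precision); that mapping is an assumption made for illustration, not something stated on this page.

```r
set.seed(1)

mean_R2 <- 0.8
prec_R2 <- 10

# Assumed mapping from (mean, precision) to the two Beta shape parameters.
shape1 <- mean_R2 * prec_R2
shape2 <- (1 - mean_R2) * prec_R2

# Draw from the implied prior on R^2 and inspect it.
draws <- rbeta(1e5, shape1, shape2)
mean(draws)
quantile(draws, c(0.05, 0.95))
```

A larger prec_R2 concentrates the prior more tightly around mean_R2, which can be checked by rerunning the sketch with a different precision.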
Zhang, Y. D., Naughton, B. P., Bondell, H. D., & Reich, B. J. (2020). Bayesian regression using a prior on the model fit: The R2-D2 shrinkage prior. Journal of the American Statistical Association. https://arxiv.org/pdf/1609.00046
Aguilar J. E. & Bürkner P. C. (2022). Intuitive Joint Priors for Bayesian Linear Multilevel Models: The R2D2M2 prior. ArXiv preprint. https://arxiv.org/pdf/2208.07132
set_prior(R2D2(mean_R2 = 0.8, prec_R2 = 10))

# specify the R2D2 prior across multiple parameter classes
set_prior(R2D2(mean_R2 = 0.8, prec_R2 = 10, main = TRUE), class = "b") +
  set_prior(R2D2(), class = "sd")
Extract the group-level ('random') effects of each level from a brmsfit object.
## S3 method for class 'brmsfit'
ranef(
  object,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  pars = NULL,
  groups = NULL,
  ...
)
object |
An object of class |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
pars |
Optional names of coefficients to extract. By default, all coefficients are extracted. |
groups |
Optional names of grouping variables for which to extract effects. |
... |
Currently ignored. |
A list of 3D arrays (one per grouping factor). If summary is TRUE, the 1st dimension contains the factor levels, the 2nd dimension contains the summary statistics (see posterior_summary), and the 3rd dimension contains the group-level effects. If summary is FALSE, the 1st dimension contains the posterior draws, the 2nd dimension contains the factor levels, and the 3rd dimension contains the group-level effects.
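The array layout for summary = TRUE can be illustrated with a mock object in base R. The array below is hypothetical: its values are placeholders and the dimension names are assumptions chosen to resemble typical output, so it only demonstrates how such a 3D array is indexed.

```r
# Mock of one list element as returned with summary = TRUE:
# levels x summary statistics x group-level effects.
levels_g <- c("1", "2", "3")
stats    <- c("Estimate", "Est.Error", "Q2.5", "Q97.5")
effects  <- c("Intercept", "Trt1")

re <- array(
  seq_len(3 * 4 * 2) / 10,              # placeholder numbers
  dim = c(3, 4, 2),
  dimnames = list(levels_g, stats, effects)
)

# All summaries of the varying intercepts: a levels x statistics matrix.
re[, , "Intercept"]

# Point estimate of the 'Trt1' effect for level "2".
re["2", "Estimate", "Trt1"]
```

With a real fit, the same indexing would be applied to an element of the returned list, for example the one named after the grouping factor.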
## Not run:
fit <- brm(count ~ zAge + zBase * Trt + (1+Trt|visit),
           data = epilepsy, family = gaussian(), chains = 2)
ranef(fit)

## End(Not run)
read_csv_as_stanfit is used internally to read CmdStan CSV files into a stanfit object that is consistent with the structure of the fit slot of a brmsfit object.
read_csv_as_stanfit( files, variables = NULL, sampler_diagnostics = NULL, model = NULL, exclude = "", algorithm = "sampling" )
files |
Character vector of CSV files names where draws are stored. |
variables |
Character vector of variables to extract from the CSV files. |
sampler_diagnostics |
Character vector of sampler diagnostics to extract. |
model |
A compiled cmdstanr model object (optional). Provide this argument if you want to allow updating the model without recompilation. |
exclude |
Character vector of variables to exclude from the stanfit. Only
used when |
algorithm |
The algorithm with which the model was fitted.
See |
A stanfit object consistent with the structure of the fit
slot of a brmsfit object.
## Not run:
# fit a model manually via cmdstanr
scode <- stancode(count ~ Trt, data = epilepsy)
sdata <- standata(count ~ Trt, data = epilepsy)
mod <- cmdstanr::cmdstan_model(cmdstanr::write_stan_file(scode))
stanfit <- mod$sample(data = sdata)

# feed the Stan model back into brms
fit <- brm(count ~ Trt, data = epilepsy, empty = TRUE, backend = 'cmdstanr')
fit$fit <- read_csv_as_stanfit(stanfit$output_files(), model = mod)
fit <- rename_pars(fit)
summary(fit)

## End(Not run)
Recompile the Stan model inside a brmsfit object, if necessary. This does not change the model; it simply recreates the executable so that sampling is possible again.
recompile_model(x, recompile = NULL)
x |
An object of class |
recompile |
Logical, indicating whether the Stan model should be
recompiled. If |
A (possibly updated) brmsfit
object.
Compute exact cross-validation for problematic observations for which approximate leave-one-out cross-validation may return incorrect results. Models for problematic observations can be run in parallel using the future package.
## S3 method for class 'brmsfit'
reloo(
  x,
  loo = NULL,
  k_threshold = 0.7,
  newdata = NULL,
  resp = NULL,
  check = TRUE,
  recompile = NULL,
  future_args = list(),
  ...
)

## S3 method for class 'loo'
reloo(x, fit, ...)

reloo(x, ...)
x |
An R object of class |
loo |
An R object of class |
k_threshold |
The threshold at which Pareto |
newdata |
An optional data.frame for which to evaluate predictions. If
|
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
check |
Logical; If |
recompile |
Logical, indicating whether the Stan model should be
recompiled. This may be necessary if you are running |
future_args |
A list of further arguments passed to
|
... |
Further arguments passed to
|
fit |
An R object of class |
Warnings about Pareto $k$ estimates indicate observations for which the approximation to LOO is problematic (this is described in detail in Vehtari, Gelman, and Gabry (2017) and the loo package documentation). If there are $J$ observations with $k$ estimates above k_threshold, then reloo will refit the original model $J$ times, each time leaving out one of the $J$ problematic observations. The pointwise contributions of these observations to the total ELPD are then computed directly and substituted for the previous estimates from these $J$ observations that are stored in the original loo object.
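The substitution step described above can be sketched in base R. All numbers below are made up, and elpd_exact merely stands in for the values that would come from actually refitting the model without each problematic observation; this is an illustration of the bookkeeping, not the implementation of reloo.

```r
k_threshold <- 0.7

# Made-up Pareto k estimates and approximate pointwise ELPD values
# for five observations.
pareto_k  <- c(0.21, 0.85, 0.10, 0.74, 0.33)
elpd_psis <- c(-1.1, -3.9, -0.8, -2.7, -1.4)

# Observations whose k estimate exceeds the threshold.
problematic <- which(pareto_k > k_threshold)
J <- length(problematic)  # number of refits that would be needed

# Placeholder for the exact pointwise ELPD values obtained from J refits,
# each leaving out one problematic observation.
elpd_exact <- c(-4.6, -3.0)

# Substitute the exact values for the approximate ones.
elpd_new <- elpd_psis
elpd_new[problematic] <- elpd_exact

sum(elpd_new)  # updated total ELPD
```

Because each problematic observation costs one full refit, the run time grows with J, which is why running the refits in parallel can pay off.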
An object of the class loo.
## Not run:
fit1 <- brm(count ~ zAge + zBase * Trt + (1|patient),
            data = epilepsy, family = poisson())

# throws warning about some pareto k estimates being too high
(loo1 <- loo(fit1))

# no more warnings after reloo
(reloo1 <- reloo(fit1, loo = loo1, chains = 1))

## End(Not run)
Rename parameters within the stanfit object after model fitting to ensure reasonable parameter names. This function is usually called automatically by brm and users will rarely be required to call it themselves.
rename_pars(x)
x |
A |
Function rename_pars
is a deprecated alias of rename_pars
.
A brmsfit
object with adjusted parameter names.
## Not run:
# fit a model manually via rstan
scode <- stancode(count ~ Trt, data = epilepsy)
sdata <- standata(count ~ Trt, data = epilepsy)
stanfit <- rstan::stan(model_code = scode, data = sdata)

# feed the Stan model back into brms
fit <- brm(count ~ Trt, data = epilepsy, empty = TRUE)
fit$fit <- stanfit
fit <- rename_pars(fit)
summary(fit)

## End(Not run)
This method is an alias of predictive_error.brmsfit
with additional arguments for obtaining summaries of the computed draws.
## S3 method for class 'brmsfit'
residuals(
  object,
  newdata = NULL,
  re_formula = NULL,
  method = "posterior_predict",
  type = c("ordinary", "pearson"),
  resp = NULL,
  ndraws = NULL,
  draw_ids = NULL,
  sort = FALSE,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
method |
Method used to obtain predictions. Can be set to
|
type |
The type of the residuals,
either |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
ndraws |
Positive integer indicating how many posterior draws should
be used. If |
draw_ids |
An integer vector specifying the posterior draws to be used.
If |
sort |
Logical. Only relevant for time series models.
Indicating whether to return predicted values in the original
order ( |
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
... |
Further arguments passed to |
Residuals of type 'ordinary' are of the form $R = Y - Y_{rep}$, where $Y$ is the observed and $Y_{rep}$ is the predicted response. Residuals of type pearson are of the form $R = (Y - Y_{rep}) / SD(Y_{rep})$, where $SD(Y_{rep})$ is an estimate of the standard deviation of $Y_{rep}$.
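Both residual types can be reproduced by hand from a matrix of predictive draws. The base R sketch below uses simulated draws and assumes that the standard deviation of the predicted response is estimated per observation across draws; that choice is an assumption made for illustration.

```r
set.seed(123)

y <- c(2.0, 3.5, 1.0)  # observed responses (made up)
S <- 1000              # number of draws

# Hypothetical predictive draws: an S x N matrix, one column per observation.
yrep <- sapply(y, function(m) rnorm(S, mean = m, sd = 0.5))

# Ordinary residuals per draw: R = Y - Yrep.
res_ord <- sweep(-yrep, 2, y, FUN = "+")

# Pearson residuals: divide by the per-observation SD of the draws.
sd_yrep <- apply(yrep, 2, sd)
res_pearson <- sweep(res_ord, 2, sd_yrep, FUN = "/")

# Summarize over draws into an N x E matrix of summary statistics.
cbind(Estimate = colMeans(res_ord), Est.Error = apply(res_ord, 2, sd))
```

Keeping the residuals as an S x N matrix until the final step mirrors the distinction between summary = FALSE and summary = TRUE.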
An array of predictive error/residual draws. If summary = FALSE the output resembles that of predictive_error.brmsfit. If summary = TRUE the output is an N x E matrix, where N is the number of observations and E denotes the summary statistics computed from the draws.
## Not run:
## fit a model
fit <- brm(rating ~ treat + period + carry + (1|subject),
           data = inhaler, cores = 2)

## extract residuals/predictive errors
res <- residuals(fit)
head(res)

## End(Not run)
restructure is a generic function used to restructure old R objects to work with newer versions of the package that generated them. Its original use is within the brms package, but new methods for use with objects from other packages can be registered to the same generic.
restructure(x, ...)
x |
An object to be restructured. The object's class will determine which method to apply |
... |
Additional arguments to pass to the specific methods |
Usually the version of the package that generated the object will be stored somewhere in the object and this information will be used by the specific method to determine what transformations to apply. See restructure.brmsfit for the default method applied for brms models. You can view the available methods by typing: methods(restructure)
An object of the same class as x compatible with the latest version of the package that generated it.
Restructure old brmsfit objects to work with the latest brms version. This function is called internally when applying post-processing methods. However, in order to avoid unnecessary run time caused by the restructuring, I recommend explicitly calling restructure once per model after updating brms.
## S3 method for class 'brmsfit'
restructure(x, ...)
x |
An object of class |
... |
Currently ignored. |
If you are restructuring an old spline model (fitted with brms < 2.19.3) to avoid prediction inconsistencies between machines (see GitHub issue #1465), please make sure to restructure your model on the machine on which it was originally fitted.
A brmsfit object compatible with the latest version of brms.
Convert information in rows to labels for each row.
rows2labels(x, digits = 2, sep = " & ", incl_vars = TRUE, ...)
x |
A |
digits |
Minimal number of decimal places shown in the labels of numeric variables. |
sep |
A single character string defining the separator between variables used in the labels. |
incl_vars |
Indicates if variable names should
be part of the labels. Defaults to |
... |
Currently unused. |
A character vector of the same length as the number of rows of x.
make_conditions, conditional_effects
Functions used in the definition of smooth terms within model formulas. The functions do not evaluate a (spline) smooth; they exist purely to help set up a model using spline-based smooths.
s(...)

t2(...)
... |
The functions defined here are just simple wrappers of the respective functions of the mgcv package. When using them, please cite the appropriate references obtained via citation("mgcv").
brms uses the "random effects" parameterization of smoothing splines as explained in mgcv::gamm. A nice tutorial on this topic can be found in Pedersen et al. (2019). The answers provided in this Stan discourse post may also be helpful.
Pedersen, E. J., Miller, D. L., Simpson, G. L., & Ross, N. (2019). Hierarchical generalized additive models in ecology: an introduction with mgcv. PeerJ.
brmsformula, mgcv::s, mgcv::t2
## Not run:
# simulate some data
dat <- mgcv::gamSim(1, n = 200, scale = 2)

# fit univariate smooths for all predictors
fit1 <- brm(y ~ s(x0) + s(x1) + s(x2) + s(x3),
            data = dat, chains = 2)
summary(fit1)
plot(conditional_smooths(fit1), ask = FALSE)

# fit a more complicated smooth model
fit2 <- brm(y ~ t2(x0, x1) + s(x2, by = x3),
            data = dat, chains = 2)
summary(fit2)
plot(conditional_smooths(fit2), ask = FALSE)

## End(Not run)
Set up a spatial simultaneous autoregressive (SAR) term in brms. The function does not evaluate its arguments – it exists purely to help set up a model with SAR terms.
sar(M, type = "lag")
M |
An object specifying the spatial weighting matrix.
Can be either the spatial weight matrix itself or an
object of class |
type |
Type of the SAR structure. Either |
The lagsar structure implements SAR of the response values:
$$y = \rho W y + \eta + e$$
The errorsar structure implements SAR of the residuals:
$$y = \eta + u, \quad u = \rho W u + e$$
In the above equations, $\eta$ is the predictor term and $e$ are independent normally or t-distributed residuals. Currently, only families gaussian and student support SAR structures.
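To make the two structures concrete, both can be simulated in base R by solving the defining equations for y. The weight matrix, the value of rho and the predictor term below are hypothetical choices made for illustration only.

```r
set.seed(42)
N <- 5

# Hypothetical row-standardized spatial weight matrix:
# units arranged on a line, each neighboring the adjacent units.
A <- matrix(0, N, N)
A[cbind(1:(N - 1), 2:N)] <- 1
A <- A + t(A)
W <- A / rowSums(A)

rho <- 0.4                  # spatial autocorrelation (assumed value)
eta <- 1 + 0.5 * seq_len(N) # predictor term (made up)
e   <- rnorm(N)             # independent normal residuals

# Lag SAR: y = rho W y + eta + e  <=>  (I - rho W) y = eta + e
y_lag <- solve(diag(N) - rho * W, eta + e)

# Error SAR: y = eta + u with u = rho W u + e  <=>  (I - rho W) u = e
u     <- solve(diag(N) - rho * W, e)
y_err <- eta + u
```

In the lag structure the predictor term itself is propagated through the neighborhood, whereas in the error structure only the residuals are spatially correlated.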
An object of class 'sar_term', which is a list of arguments to be interpreted by the formula parsing functions of brms.
## Not run:
data(oldcol, package = "spdep")
fit1 <- brm(CRIME ~ INC + HOVAL + sar(COL.nb, type = "lag"),
            data = COL.OLD, data2 = list(COL.nb = COL.nb),
            chains = 2, cores = 2)
summary(fit1)
plot(fit1)

fit2 <- brm(CRIME ~ INC + HOVAL + sar(COL.nb, type = "error"),
            data = COL.OLD, data2 = list(COL.nb = COL.nb),
            chains = 2, cores = 2)
summary(fit2)
plot(fit2)

## End(Not run)
Control which (draws of) parameters should be saved in a brms model. The output of this function is meant for usage in the save_pars argument of brm.
save_pars(group = TRUE, latent = FALSE, all = FALSE, manual = NULL)
group |
A flag to indicate if group-level coefficients for
each level of the grouping factors should be saved (default is
|
latent |
A flag to indicate if draws of latent variables obtained by
using |
all |
A flag to indicate if draws of all variables defined in Stan's
|
manual |
A character vector naming Stan variable names which should be saved. These names should match the variable names inside the Stan code before renaming. This feature is meant for power users only and will rarely be useful outside of very special cases. |
A list of class "save_pars"
.
## Not run:
# don't store group-level coefficients
fit <- brm(count ~ zAge + zBase * Trt + (1|patient),
           data = epilepsy, family = poisson(),
           save_pars = save_pars(group = FALSE))
variables(fit)

## End(Not run)
Define priors for specific parameters or classes of parameters.
set_prior(
  prior,
  class = "b",
  coef = "",
  group = "",
  resp = "",
  dpar = "",
  nlpar = "",
  lb = NA,
  ub = NA,
  check = TRUE
)

prior(prior, ...)

prior_(prior, ...)

prior_string(prior, ...)

empty_prior()
prior |
A character string defining a distribution in Stan language |
class |
The parameter class. Defaults to |
coef |
Name of the coefficient within the parameter class. |
group |
Grouping factor for group-level parameters. |
resp |
Name of the response variable. Only used in multivariate models. |
dpar |
Name of a distributional parameter. Only used in distributional models. |
nlpar |
Name of a non-linear parameter. Only used in non-linear models. |
lb |
Lower bound for parameter restriction. Currently only allowed
for classes |
ub |
Upper bound for parameter restriction. Currently only allowed
for classes |
check |
Logical; Indicates whether priors
should be checked for validity (as far as possible).
Defaults to |
... |
Arguments passed to |
set_prior is used to define prior distributions for parameters in brms models. The functions prior, prior_, and prior_string are aliases of set_prior, each allowing for a different kind of argument specification. prior allows specifying arguments as expressions without quotation marks using non-standard evaluation. prior_ allows specifying arguments as one-sided formulas or wrapped in quote. prior_string allows specifying arguments as strings just as set_prior itself.
Below, we explain its usage and list some common prior distributions for parameters. A complete overview on possible prior distributions is given in the Stan Reference Manual available at https://mc-stan.org/.
To combine multiple priors, use c(...) or the + operator (see 'Examples'). brms does not check if the priors are written in correct Stan language. Instead, Stan will check their syntactical correctness when the model is parsed to C++ and return an error if they are not. This, however, does not imply that priors are always meaningful if they are accepted by Stan. Although brms tries to find common problems (e.g., setting bounded priors on unbounded parameters), there is no guarantee that the defined priors are reasonable for the model.
Below, we list the types of parameters in brms models,
for which the user can specify prior distributions.
Below, we provide details for the individual parameter classes that you can set priors on. Often, it may not be immediately clear which parameters are present in the model. To get a full list of parameters and parameter classes for which priors can be specified (depending on the model) use function default_prior.
1. Population-level ('fixed') effects
Every population-level effect has its own regression parameter b_<coef>, where <coef> represents the name of the corresponding population-level effect. Suppose, for instance, that y is predicted by x1 and x2 (i.e., y ~ x1 + x2 in formula syntax). Then, x1 and x2 have regression parameters b_x1 and b_x2 respectively.
The default prior for population-level effects (including monotonic and category specific effects) is an improper flat prior over the reals. Other common options are normal priors or student-t priors. If we want to have a normal prior with mean 0 and standard deviation 5 for x1, and a unit student-t prior with 10 degrees of freedom for x2, we can specify this via set_prior("normal(0,5)", class = "b", coef = "x1") and set_prior("student_t(10, 0, 1)", class = "b", coef = "x2"). To put the same prior on all population-level effects at once, we may write as a shortcut set_prior("<prior>", class = "b"). This also leads to faster sampling, because priors can be vectorized in this case. Both ways of defining priors can be combined using for instance set_prior("normal(0, 2)", class = "b") and set_prior("normal(0, 10)", class = "b", coef = "x1") at the same time. This will set a normal(0, 10) prior on the effect of x1 and a normal(0, 2) prior on all other population-level effects. However, this will break vectorization and may slow down the sampling procedure a bit.
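The combination of a class-wide and a coefficient-specific prior described above might be collected into a single prior object as follows. This is a minimal sketch: it assumes the brms package is installed, the student-t prior on the intercept is an arbitrary illustrative choice, and no model is fitted.

```r
library(brms)

# Class-wide prior for all population-level effects, a more specific prior
# for the (hypothetical) predictor x1, and an illustrative intercept prior.
priors <- c(
  set_prior("normal(0, 2)", class = "b"),
  set_prior("normal(0, 10)", class = "b", coef = "x1"),
  set_prior("student_t(3, 0, 5)", class = "Intercept")
)

# Inspect the combined specification before passing it to brm(prior = ...).
priors
```

Printing the object before fitting is a cheap way to confirm that each prior ended up in the intended class and coefficient slot.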
In case of the default intercept parameterization (discussed in the 'Details' section of brmsformula), general priors on class "b" will not affect the intercept. Instead, the intercept has its own parameter class named "Intercept" and priors can thus be specified via set_prior("<prior>", class = "Intercept"). Setting a prior on the intercept will not break vectorization of the other population-level effects. Note that technically, this prior is set on an intercept that results when internally centering all population-level predictors around zero to improve sampling efficiency. On this centered intercept, specifying a prior is actually much easier and more intuitive than on the original intercept, since the former represents the expected response value when all predictors are at their means. To treat the intercept as an ordinary population-level effect and avoid the centering parameterization, use 0 + Intercept on the right-hand side of the model formula.
In non-linear models, population-level effects are defined separately for each non-linear parameter. Accordingly, it is necessary to specify the non-linear parameter in set_prior so that priors can be assigned correctly. If, for instance, alpha is the parameter and x the predictor for which we want to define the prior, we can write set_prior("<prior>", coef = "x", nlpar = "alpha"). As a shortcut we can use set_prior("<prior>", nlpar = "alpha") to set the same prior on all population-level effects of alpha at once. The same goes for specifying priors for specific distributional parameters in the context of distributional regression, for example, set_prior("<prior>", coef = "x", dpar = "sigma"). For most other parameter classes (see below), you need to indicate non-linear and distributional parameters in the same way as shown here.
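The nlpar and dpar arguments discussed above can be used in one prior specification as sketched below. The distributions are placeholders chosen for illustration, alpha and x are the hypothetical names from the text, and the sketch assumes the brms package is installed.

```r
library(brms)

priors_nl <-
  # prior for the effect of x on the non-linear parameter alpha
  set_prior("normal(0, 1)", coef = "x", nlpar = "alpha") +
  # same prior for all population-level effects of alpha
  set_prior("normal(0, 5)", nlpar = "alpha") +
  # prior for the effect of x on the distributional parameter sigma
  set_prior("normal(0, 0.5)", coef = "x", dpar = "sigma")

priors_nl
```

Whether these rows match parameters of an actual model is only checked once the priors are passed to brm together with a formula.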
If desired, population-level effects can be restricted to fall only within a certain interval using the lb and ub arguments of set_prior. This is often required when defining priors that are not defined everywhere on the real line, such as uniform or gamma priors. When defining a uniform(2,4) prior, you should write set_prior("uniform(2,4)", lb = 2, ub = 4). When using a prior that is defined on the positive reals only (such as a gamma prior) set lb = 0. In most situations, it is not useful to restrict population-level parameters through bounded priors (non-linear models are an important exception), but if you really want to, this is the way to go.
2. Group-level ('random') effects
Each group-level effect of each grouping factor has a standard deviation named sd_<group>_<coef>. Consider, for instance, the formula y ~ x1 + x2 + (1 + x1 | g). We see that the intercept as well as x1 are group-level effects nested in the grouping factor g. The corresponding standard deviation parameters are named as sd_g_Intercept and sd_g_x1 respectively. These parameters are restricted to be non-negative and, by default, have a half student-t prior with 3 degrees of freedom and a scale parameter that depends on the standard deviation of the response after applying the link function. Minimally, the scale parameter is 2.5. This prior is used (a) to be only weakly informative in order to influence results as little as possible, while (b) providing at least some regularization to considerably improve convergence and sampling efficiency. To define a prior distribution only for standard deviations of a specific grouping factor, use set_prior("<prior>", class = "sd", group = "<group>"). To define a prior distribution only for a specific standard deviation of a specific grouping factor, you may write set_prior("<prior>", class = "sd", group = "<group>", coef = "<coef>").
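The three levels of specificity for standard deviation priors described above (all grouping factors, one grouping factor, one coefficient within a grouping factor) can be sketched as follows. The distributions are illustrative placeholders, g and x1 are the hypothetical names from the formula above, and the brms package is assumed to be installed.

```r
library(brms)

priors_sd <- c(
  # all group-level standard deviations
  set_prior("student_t(3, 0, 2.5)", class = "sd"),
  # all standard deviations of grouping factor g
  set_prior("exponential(1)", class = "sd", group = "g"),
  # only the standard deviation of x1 within grouping factor g (sd_g_x1)
  set_prior("normal(0, 1)", class = "sd", group = "g", coef = "x1")
)

priors_sd
```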
If there is more than one group-level effect per grouping factor,
the correlations between those effects have to be estimated.
The prior lkj_corr_cholesky(eta)
or in short
lkj(eta)
with eta > 0
is essentially the only prior for (Cholesky factors of) correlation matrices.
If eta = 1
(the default), all correlation matrices
are equally likely a priori. If eta > 1
, extreme correlations
become less likely, whereas 0 < eta < 1
results in
higher probabilities for extreme correlations.
Correlation matrix parameters in brms
models are named as
cor_<group>
, (e.g., cor_g
if g
is the grouping factor).
To set the same prior on every correlation matrix,
use for instance set_prior("lkj(2)", class = "cor")
.
Internally, the priors are transformed to be put on the Cholesky factors
of the correlation matrices to improve efficiency and numerical stability.
The corresponding parameter class of the Cholesky factors is L
,
but it is not recommended to specify priors for this parameter class directly.
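The effect of eta can be illustrated in base R for the simplest case of a 2 x 2 correlation matrix with a single correlation r. The sketch below relies on the fact that the LKJ density is proportional to det(R)^(eta - 1), which for a 2 x 2 matrix equals (1 - r^2)^(eta - 1), so that (r + 1) / 2 follows a Beta(eta, eta) distribution. The function name lkj_2x2_density is hypothetical and not part of brms.

```r
# Density of the correlation r in a 2 x 2 correlation matrix under lkj(eta).
# Uses the Beta(eta, eta) distribution of (r + 1) / 2; the factor 1 / 2 is
# the Jacobian of the transformation from r to (r + 1) / 2.
lkj_2x2_density <- function(r, eta) {
  dbeta((r + 1) / 2, shape1 = eta, shape2 = eta) / 2
}

r_grid <- c(-0.9, 0, 0.9)

# Rows: values of r; columns: eta = 0.5, 1, 2.
dens <- sapply(c(0.5, 1, 2), function(eta) lkj_2x2_density(r_grid, eta))
dimnames(dens) <- list(paste0("r=", r_grid), paste0("eta=", c(0.5, 1, 2)))
round(dens, 3)
```

With eta = 1 the density is constant in r, with eta = 2 it is highest at r = 0, and with eta = 0.5 it is highest near the extremes, in line with the description above.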
4. Smoothing Splines
Smoothing splines are implemented in brms using the 'random effects'
formulation as explained in gamm
. Thus, each
spline has its corresponding standard deviations modeling the variability
within this term. In brms, this parameter class is called sds
and priors can be specified via
set_prior("<prior>", class = "sds", coef = "<term label>")
.
The default prior is the same as for standard deviations of group-level effects.
5. Gaussian processes
Gaussian processes as currently implemented in brms have two
parameters, the standard deviation parameter sdgp
, and
characteristic length-scale parameter lscale
(see gp
for more details). The default prior of sdgp
is the same as for
standard deviations of group-level effects. The default prior of
lscale
is an informative inverse-gamma prior specifically tuned to
the covariates of the Gaussian process (for more details see
https://betanalpha.github.io/assets/case_studies/gp_part3/part3.html).
This tuned prior may be overly informative in some cases, so please
consider other priors as well to make sure inference is robust to the prior
specification. If tuning fails, a half-normal prior is used instead.
6. Autocorrelation parameters
The autocorrelation parameters currently implemented are named ar
(autoregression), ma
(moving average), sderr
(standard
deviation of latent residuals in latent ARMA models), cosy
(compound
symmetry correlation), car
(spatial conditional autoregression), as
well as lagsar
and errorsar
(spatial simultaneous
autoregression).
Priors can be defined by set_prior("<prior>", class = "ar")
for
ar
and similar for other autocorrelation parameters. By default,
ar
and ma
are bounded between -1
and 1
;
cosy
, car
, lagsar
, and errorsar
are bounded
between 0
and 1
. The default priors are flat over the
respective definition areas.
7. Parameters of measurement error terms
Latent variables induced via measurement error me
terms
require both mean and standard deviation parameters, whose prior classes
are named "meanme"
and "sdme"
, respectively. If multiple
latent variables are induced this way, their correlation matrix will
be modeled as well and corresponding priors can be specified via the
"corme"
class. All of the above parameters have flat priors over
their respective definition spaces by default.
8. Distance parameters of monotonic effects
As explained in the details section of brm
,
monotonic effects make use of a special parameter vector to
estimate the 'normalized distances' between consecutive predictor
categories. This is realized in Stan using the simplex
parameter type. This class is named "simo"
(short for
simplex monotonic) in brms.
The only valid prior for simplex parameters is the
dirichlet prior, which accepts a vector of length K - 1
(K = number of predictor categories) as input defining the
'concentration' of the distribution. Explaining the dirichlet prior
is beyond the scope of this documentation, but we want to describe
how to define this prior in a syntactically correct way.
If a predictor x
with K
categories is modeled as monotonic,
we can define a prior on its corresponding simplex via prior(dirichlet(<vector>), class = simo, coef = mox1)
.
The 1
at the end of coef
indicates that this is the first
simplex in this term. If interactions between multiple monotonic
variables are modeled, multiple simplexes per term are required.
For <vector>
, we can put in any R
expression
defining a vector of length K - 1
. The default is a uniform
prior (i.e. <vector> = rep(1, K-1)
) over all simplexes
of the respective dimension.
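To give a feel for what a concentration vector implies, the following base R sketch draws from a Dirichlet distribution by normalising independent gamma draws. The number of categories K = 4 and the helper rdirichlet_sketch are assumptions made for illustration; they are not part of brms.

```r
set.seed(1)

K <- 4                          # hypothetical number of predictor categories
concentration <- rep(1, K - 1)  # the default: uniform over the simplex

# Draw n vectors from a Dirichlet distribution with parameter 'alpha' by
# normalising independent gamma draws with shapes 'alpha' and rate 1.
rdirichlet_sketch <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = alpha),
              nrow = n, byrow = TRUE)
  g / rowSums(g)
}

draws <- rdirichlet_sketch(1000, concentration)
rowSums(draws[1:3, ])  # every draw sums to 1, i.e. lies on the simplex
colMeans(draws)        # roughly concentration / sum(concentration)

# In brms, such a vector would be passed as, for example,
# prior(dirichlet(c(2, 1, 1)), class = simo, coef = mox1)
```

Larger entries in the concentration vector shift prior mass towards larger normalized distances for the corresponding pair of adjacent categories.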
9. Parameters for specific families
Some families need additional parameters to be estimated.
Families gaussian
, student
, skew_normal
,
lognormal
, and gen_extreme_value
need the parameter
sigma
to account for the residual standard deviation.
By default, sigma
has a half student-t prior that scales
in the same way as the group-level standard deviations.
Further, family student
needs the parameter
nu
representing the degrees of freedom of the Student-t distribution.
By default, nu
has prior gamma(2, 0.1)
, which is
close to a penalized complexity prior (see Stan prior choice Wiki),
and a fixed lower bound of 1
.
Family negbinomial
needs a shape
parameter that, by
default, has an inv_gamma(0.4, 0.3)
prior, which is close to a
penalized complexity prior (see Stan prior choice Wiki).
Families gamma
, weibull
, and inverse.gaussian
need a shape
parameter that has a gamma(0.01, 0.01)
prior by default.
For families cumulative
, cratio
, sratio
,
and acat
, and only if threshold = "equidistant"
,
the parameter delta
is used to model the distance between
two adjacent thresholds.
By default, delta
has an improper flat prior over the reals.
The von_mises
family needs the parameter kappa
, representing
the concentration parameter. By default, kappa
has prior
gamma(2, 0.01)
.
Every family specific parameter has its own prior class, so that
set_prior("<prior>", class = "<parameter>")
is the right way to go.
All of these priors are chosen to be weakly informative,
having only minimal influence on the estimations,
while improving convergence and sampling efficiency.
10. Shrinkage priors
To reduce the danger of overfitting in models with many predictor terms fit
on comparably sparse data, brms supports special shrinkage priors, namely
the (regularized) horseshoe
and the R2D2
prior.
These priors can be applied on many parameter classes, either directly on
the coefficient classes (e.g., class b
), if directly setting priors
on them is supported, or on the corresponding standard deviation
hyperparameters (e.g., class sd
) otherwise. Currently, the following
classes support shrinkage priors: b
(overall regression
coefficients), sds
(SDs of smoothing splines), sdgp
(SDs of
Gaussian processes), ar
(autoregressive coefficients), ma
(moving average coefficients), sderr
(SD of latent residuals),
sdcar
(SD of spatial CAR structures), sd
(SD of varying
coefficients).
11. Fixing parameters to constants
Fixing parameters to constants is possible by using the constant
function, for example, constant(1)
to fix a parameter to 1.
Broadcasting to vectors and matrices is done automatically.
An object of class brmsprior
to be used in the prior
argument of brm
.
prior()
: Alias of set_prior
allowing to
specify arguments as expressions without quotation marks.
prior_()
: Alias of set_prior
allowing to specify
arguments as one-sided formulas or wrapped in quote
.
prior_string()
: Alias of set_prior
allowing to
specify arguments as strings.
empty_prior()
: Create an empty brmsprior
object.
## use alias functions
(prior1 <- prior(cauchy(0, 1), class = sd))
(prior2 <- prior_(~cauchy(0, 1), class = ~sd))
(prior3 <- prior_string("cauchy(0, 1)", class = "sd"))
identical(prior1, prior2)
identical(prior1, prior3)

# check which parameters can have priors
default_prior(rating ~ treat + period + carry + (1|subject),
              data = inhaler, family = cumulative())

# define some priors
bprior <- c(prior_string("normal(0,10)", class = "b"),
            prior(normal(1,2), class = b, coef = treat),
            prior_(~cauchy(0,2), class = ~sd,
                   group = ~subject, coef = ~Intercept))

# verify that the priors indeed found their way into Stan's model code
stancode(rating ~ treat + period + carry + (1|subject),
         data = inhaler, family = cumulative(),
         prior = bprior)

# use the horseshoe prior to model sparsity in regression coefficients
stancode(count ~ zAge + zBase * Trt,
         data = epilepsy, family = poisson(),
         prior = set_prior("horseshoe(3)"))

# fix certain priors to constants
bprior <- prior(constant(1), class = "b") +
  prior(constant(2), class = "b", coef = "zBase") +
  prior(constant(0.5), class = "sd")
stancode(count ~ zAge + zBase + (1 | patient),
         data = epilepsy, prior = bprior)

# pass priors to Stan without checking
prior <- prior_string("target += normal_lpdf(b[1] | 0, 1)", check = FALSE)
stancode(count ~ Trt, data = epilepsy, prior = prior)

# define priors in a vectorized manner
# useful in particular for categorical or multivariate models
set_prior("normal(0, 2)", dpar = c("muX", "muY", "muZ"))
Density, distribution function, quantile function and random generation
for the shifted log normal distribution with mean meanlog
,
standard deviation sdlog
, and shift parameter shift
.
dshifted_lnorm(x, meanlog = 0, sdlog = 1, shift = 0, log = FALSE)

pshifted_lnorm(
  q,
  meanlog = 0,
  sdlog = 1,
  shift = 0,
  lower.tail = TRUE,
  log.p = FALSE
)

qshifted_lnorm(
  p,
  meanlog = 0,
  sdlog = 1,
  shift = 0,
  lower.tail = TRUE,
  log.p = FALSE
)

rshifted_lnorm(n, meanlog = 0, sdlog = 1, shift = 0)
x , q
|
Vector of quantiles. |
meanlog |
Vector of means. |
sdlog |
Vector of standard deviations. |
shift |
Vector of shifts. |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
p |
Vector of probabilities. |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families")
for details
on the parameterization.
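The relationship to the ordinary log-normal distribution can be sketched in base R, assuming that a shifted log-normal variable is a log-normal variable plus the constant shift. The helper names ending in _sketch are hypothetical stand-ins and not the brms functions themselves.

```r
# Shifted log-normal helpers built on the base R log-normal functions.
# Subtracting 'shift' maps the variable back to an ordinary log-normal one.
d_shifted_lnorm_sketch <- function(x, meanlog = 0, sdlog = 1, shift = 0) {
  dlnorm(x - shift, meanlog = meanlog, sdlog = sdlog)
}
p_shifted_lnorm_sketch <- function(q, meanlog = 0, sdlog = 1, shift = 0) {
  plnorm(q - shift, meanlog = meanlog, sdlog = sdlog)
}
q_shifted_lnorm_sketch <- function(p, meanlog = 0, sdlog = 1, shift = 0) {
  qlnorm(p, meanlog = meanlog, sdlog = sdlog) + shift
}
r_shifted_lnorm_sketch <- function(n, meanlog = 0, sdlog = 1, shift = 0) {
  rlnorm(n, meanlog = meanlog, sdlog = sdlog) + shift
}

set.seed(123)
draws <- r_shifted_lnorm_sketch(1000, meanlog = 0, sdlog = 0.5, shift = 0.3)
min(draws) > 0.3                                 # all draws exceed the shift
d_shifted_lnorm_sketch(0.2, 0, 0.5, shift = 0.3) # zero density below the shift
```

This makes the role of shift visible: it moves the lower bound of the support from 0 to shift, which is why the distribution is often used for response times.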
Density, distribution function, and random generation for the
skew-normal distribution with mean mu
,
standard deviation sigma
, and skewness alpha
.
dskew_normal(
  x,
  mu = 0,
  sigma = 1,
  alpha = 0,
  xi = NULL,
  omega = NULL,
  log = FALSE
)

pskew_normal(
  q,
  mu = 0,
  sigma = 1,
  alpha = 0,
  xi = NULL,
  omega = NULL,
  lower.tail = TRUE,
  log.p = FALSE
)

qskew_normal(
  p,
  mu = 0,
  sigma = 1,
  alpha = 0,
  xi = NULL,
  omega = NULL,
  lower.tail = TRUE,
  log.p = FALSE,
  tol = 1e-08
)

rskew_normal(n, mu = 0, sigma = 1, alpha = 0, xi = NULL, omega = NULL)
x , q
|
Vector of quantiles. |
mu |
Vector of mean values. |
sigma |
Vector of standard deviation values. |
alpha |
Vector of skewness values. |
xi |
Optional vector of location values.
If |
omega |
Optional vector of scale values.
If |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
p |
Vector of probabilities. |
tol |
Tolerance of the approximation used in the computation of quantiles. |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families")
for details
on the parameterization.
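How the mean and standard deviation parameterization relates to the location xi and scale omega can be sketched in base R using the standard skew-normal moment formulas. The helper names below are hypothetical, and the conversion is an assumption based on those textbook formulas rather than a copy of the package internals.

```r
# Convert mean 'mu', standard deviation 'sigma', and skewness parameter
# 'alpha' to the location 'xi' and scale 'omega' of the skew-normal.
sn_location_scale <- function(mu, sigma, alpha) {
  delta <- alpha / sqrt(1 + alpha^2)
  omega <- sigma / sqrt(1 - 2 / pi * delta^2)
  xi <- mu - omega * delta * sqrt(2 / pi)
  list(xi = xi, omega = omega)
}

# Skew-normal density: 2 / omega * dnorm(z) * pnorm(alpha * z),
# where z = (x - xi) / omega.
d_skew_normal_sketch <- function(x, mu = 0, sigma = 1, alpha = 0) {
  par <- sn_location_scale(mu, sigma, alpha)
  z <- (x - par$xi) / par$omega
  2 / par$omega * dnorm(z) * pnorm(alpha * z)
}

# Check numerically that mu and sigma are the mean and standard deviation.
mu <- 1; sigma <- 2; alpha <- 3
dens <- function(x) d_skew_normal_sketch(x, mu, sigma, alpha)
m <- integrate(function(x) x * dens(x), -Inf, Inf)$value
v <- integrate(function(x) (x - m)^2 * dens(x), -Inf, Inf)$value
c(mean = m, sd = sqrt(v))  # should be close to mu and sigma
```

With alpha = 0 the conversion gives xi = mu and omega = sigma, and the density reduces to the normal density.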
stancode
is a generic function that can be used to
generate Stan code for Bayesian models. Its original use is
within the brms package, but new methods for use
with objects from other packages can be registered to the same generic.
stancode(object, ...)

make_stancode(formula, ...)
object |
An object whose class will determine which method to apply. Usually, it will be some kind of symbolic description of the model from which Stan code should be generated. |
... |
Further arguments passed to the specific method. |
formula |
Synonym of |
See stancode.default
for the default
method applied for brms models.
You can view the available methods by typing: methods(stancode)
The make_stancode
function is an alias of stancode
.
Usually, a character string containing the generated Stan code.
For pretty printing, we recommend the returned object to be of class
c("character", "brmsmodel")
.
stancode.default
, stancode.brmsfit
stancode(rating ~ treat + period + carry + (1|subject), data = inhaler, family = "cumulative")
brmsfit
objects
Extract Stan code from a fitted brms model.
## S3 method for class 'brmsfit'
stancode(
  object,
  version = TRUE,
  regenerate = NULL,
  threads = NULL,
  backend = NULL,
  ...
)
object |
An object of class |
version |
Logical; indicates if the first line containing the brms
version number should be included. Defaults to |
regenerate |
Logical; indicates if the Stan code should be regenerated
with the current brms version. By default, |
threads |
Controls whether the Stan code should be threaded. See
|
backend |
Controls the Stan backend. See |
... |
Further arguments passed to
|
Stan code for further processing.
Generate Stan code for brms models
## Default S3 method:
stancode(
  object,
  data,
  family = gaussian(),
  prior = NULL,
  autocor = NULL,
  data2 = NULL,
  cov_ranef = NULL,
  sparse = NULL,
  sample_prior = "no",
  stanvars = NULL,
  stan_funs = NULL,
  knots = NULL,
  drop_unused_levels = TRUE,
  threads = getOption("brms.threads", NULL),
  normalize = getOption("brms.normalize", TRUE),
  save_model = NULL,
  ...
)
object |
An object of class |
data |
An object of class |
family |
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a |
prior |
One or more |
autocor |
(Deprecated) An optional |
data2 |
A named |
cov_ranef |
(Deprecated) A list of matrices that are proportional to the
(within) covariance structure of the group-level effects. The names of the
matrices should correspond to columns in |
sparse |
(Deprecated) Logical; indicates whether the population-level
design matrices should be treated as sparse (defaults to |
sample_prior |
Indicate if draws from priors should be drawn
additionally to the posterior draws. Options are |
stanvars |
An optional |
stan_funs |
(Deprecated) An optional character string containing
self-defined Stan functions, which will be included in the functions
block of the generated Stan code. It is now recommended to use the
|
knots |
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See
|
drop_unused_levels |
Should unused factor levels in the data be
dropped? Defaults to |
threads |
Number of threads to use in within-chain parallelization. For
more control over the threading process, |
normalize |
Logical. Indicates whether normalization constants should
be included in the Stan code (defaults to |
save_model |
Either |
... |
Other arguments for internal usage only. |
A character string containing the fully commented Stan code
to fit a brms model. It is of class c("character", "brmsmodel")
to facilitate pretty printing.
stancode(rating ~ treat + period + carry + (1|subject),
         data = inhaler, family = "cumulative")

stancode(count ~ zAge + zBase * Trt + (1|patient),
         data = epilepsy, family = "poisson")
standata
is a generic function that can be used to
generate data for Bayesian models to be passed to Stan. Its original use is
within the brms package, but new methods for use
with objects from other packages can be registered to the same generic.
standata(object, ...)

make_standata(formula, ...)
object |
A formula object whose class will determine which method will be used. A symbolic description of the model to be fitted. |
... |
Further arguments passed to the specific method. |
formula |
Synonym of |
See standata.default
for the default method applied for
brms models. You can view the available methods by typing
methods(standata)
. The make_standata
function is an alias
of standata
.
A named list of objects containing the required data to fit a Bayesian model with Stan.
standata.default
, standata.brmsfit
sdata1 <- standata(rating ~ treat + period + carry + (1|subject),
                   data = inhaler, family = "cumulative")
str(sdata1)
brmsfit
objects
Extract all data that was used by Stan to fit a brms model.
## S3 method for class 'brmsfit'
standata(
  object,
  newdata = NULL,
  re_formula = NULL,
  newdata2 = NULL,
  new_objects = NULL,
  incl_autocor = TRUE,
  ...
)
object |
An object of class |
newdata |
An optional data.frame for which to evaluate predictions. If
|
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
newdata2 |
A named |
new_objects |
Deprecated alias of |
incl_autocor |
A flag indicating if correlation structures originally
specified via |
... |
More arguments passed to
|
A named list containing the data passed to Stan.
Generate data for brms models to be passed to Stan.
## Default S3 method:
standata(
  object,
  data,
  family = gaussian(),
  prior = NULL,
  autocor = NULL,
  data2 = NULL,
  cov_ranef = NULL,
  sample_prior = "no",
  stanvars = NULL,
  threads = getOption("brms.threads", NULL),
  knots = NULL,
  drop_unused_levels = TRUE,
  ...
)
object |
An object of class |
data |
An object of class |
family |
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a |
prior |
One or more |
autocor |
(Deprecated) An optional |
data2 |
A named |
cov_ranef |
(Deprecated) A list of matrices that are proportional to the
(within) covariance structure of the group-level effects. The names of the
matrices should correspond to columns in |
sample_prior |
Indicate if draws from priors should be drawn
additionally to the posterior draws. Options are |
stanvars |
An optional |
threads |
Number of threads to use in within-chain parallelization. For
more control over the threading process, |
knots |
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See
|
drop_unused_levels |
Should unused factor levels in the data be
dropped? Defaults to |
... |
Other arguments for internal use. |
A named list of objects containing the required data to fit a brms model with Stan.
sdata1 <- standata(rating ~ treat + period + carry + (1|subject),
                   data = inhaler, family = "cumulative")
str(sdata1)

sdata2 <- standata(count ~ zAge + zBase * Trt + (1|patient),
                   data = epilepsy, family = "poisson")
str(sdata2)
Prepare user-defined variables to be passed to one of Stan's program blocks. This is primarily useful for defining more complex priors, for refitting models without recompilation despite changing priors, or for defining custom Stan functions.
stanvar(
  x = NULL,
  name = NULL,
  scode = NULL,
  block = "data",
  position = "start",
  pll_args = NULL
)
x |
An R object containing data to be passed to Stan.
Only required if |
name |
Optional character string providing the desired variable
name of the object in |
scode |
Line of Stan code to define the variable
in Stan language. If |
block |
Name of one of Stan's program blocks in
which the variable should be defined. Can be |
position |
Name of the position within the block where the
Stan code should be placed. Currently allowed are |
pll_args |
Optional Stan code to be put into the header
of |
The stanvar
function is not vectorized. Instead, multiple
stanvars
objects can be added together via +
(see Examples).
Special attention is necessary when using stanvars
to inject
code into the 'likelihood'
block while having threading
activated. In this case, your custom Stan code may need adjustments to ensure
correct observation indexing. Please investigate the generated Stan code via
stancode
to see which adjustments are necessary in your case.
An object of class stanvars
.
bprior <- prior(normal(mean_intercept, 10), class = "Intercept")
stanvars <- stanvar(5, name = "mean_intercept")
stancode(count ~ Trt, epilepsy, prior = bprior,
         stanvars = stanvars)

# define a multi-normal prior with known covariance matrix
bprior <- prior(multi_normal(M, V), class = "b")
stanvars <- stanvar(rep(0, 2), "M", scode = " vector[K] M;") +
  stanvar(diag(2), "V", scode = " matrix[K, K] V;")
stancode(count ~ Trt + zBase, epilepsy,
         prior = bprior, stanvars = stanvars)

# define a hierarchical prior on the regression coefficients
bprior <- set_prior("normal(0, tau)", class = "b") +
  set_prior("target += normal_lpdf(tau | 0, 10)", check = FALSE)
stanvars <- stanvar(scode = "real<lower=0> tau;",
                    block = "parameters")
stancode(count ~ Trt + zBase, epilepsy,
         prior = bprior, stanvars = stanvars)

# ensure that 'tau' is passed to the likelihood of a threaded model
# not necessary for this example but may be necessary in other cases
stanvars <- stanvar(scode = "real<lower=0> tau;",
                    block = "parameters", pll_args = "real tau")
stancode(count ~ Trt + zBase, epilepsy,
         stanvars = stanvars, threads = threading(2))
Density, distribution function, quantile function and random generation
for the Student-t distribution with location mu
, scale sigma
,
and degrees of freedom df
.
dstudent_t(x, df, mu = 0, sigma = 1, log = FALSE)

pstudent_t(q, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)

qstudent_t(p, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)

rstudent_t(n, df, mu = 0, sigma = 1)
x |
Vector of quantiles. |
df |
Vector of degrees of freedom. |
mu |
Vector of location values. |
sigma |
Vector of scale values. |
log |
Logical; If |
q |
Vector of quantiles. |
lower.tail |
Logical; If |
log.p |
Logical; If |
p |
Vector of probabilities. |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families")
for details
on the parameterization.
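The location-scale form of the Student-t distribution can be sketched in base R by standardizing with mu and sigma and using the standard t functions. The helper names ending in _sketch are hypothetical stand-ins, not the brms functions themselves.

```r
# Location-scale Student-t helpers built on the base R t functions.
d_student_t_sketch <- function(x, df, mu = 0, sigma = 1) {
  dt((x - mu) / sigma, df = df) / sigma  # dividing by sigma is the Jacobian
}
p_student_t_sketch <- function(q, df, mu = 0, sigma = 1) {
  pt((q - mu) / sigma, df = df)
}
q_student_t_sketch <- function(p, df, mu = 0, sigma = 1) {
  mu + sigma * qt(p, df = df)
}
r_student_t_sketch <- function(n, df, mu = 0, sigma = 1) {
  mu + sigma * rt(n, df = df)
}

# The median equals the location 'mu' for any df and sigma.
q_student_t_sketch(0.5, df = 3, mu = 2, sigma = 1.5)

# With mu = 0 and sigma = 1 the standard t density is recovered.
all.equal(d_student_t_sketch(0.7, df = 5), dt(0.7, df = 5))
```

Note that sigma is a scale parameter rather than the standard deviation of the distribution.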
brmsfit
object
Create a summary of a fitted model represented by a brmsfit
object
## S3 method for class 'brmsfit'
summary(
  object,
  priors = FALSE,
  prob = 0.95,
  robust = FALSE,
  mc_se = FALSE,
  ...
)
object |
An object of class |
priors |
Logical; Indicating if priors should be included
in the summary. Default is |
prob |
A value between 0 and 1 indicating the desired probability to be covered by the uncertainty intervals. The default is 0.95. |
robust |
If |
mc_se |
Logical; Indicating if the uncertainty in |
... |
Other potential arguments |
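The role of prob can be illustrated with base R alone. The sketch below assumes that the uncertainty intervals are equal-tailed quantile intervals of the posterior draws; the draws are simulated stand-ins, not output of a fitted model.

```r
set.seed(42)

# Stand-in for the posterior draws of a single parameter.
draws <- rnorm(4000, mean = 0.5, sd = 0.2)

prob <- 0.95
# Lower and upper tail probabilities that leave 'prob' in the middle.
probs <- c((1 - prob) / 2, 1 - (1 - prob) / 2)

ci <- quantile(draws, probs = probs)
ci

# Share of draws inside the interval; close to 'prob' by construction.
mean(draws >= ci[1] & draws <= ci[2])
```

Lowering prob, for example to 0.9, narrows the interval because more mass is left in the two tails.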
The convergence diagnostics Rhat
, Bulk_ESS
, and
Tail_ESS
are described in detail in Vehtari et al. (2020).
Aki Vehtari, Andrew Gelman, Daniel Simpson, Bob Carpenter, and Paul-Christian Bürkner (2020). Rank-normalization, folding, and localization: An improved R-hat for assessing convergence of MCMC. *Bayesian Analysis*. 1–28. doi:10.1214/20-BA1221
A black theme for ggplot graphics inspired by a blog post of Jon Lefcheck (https://jonlefcheck.net/2013/03/11/black-theme-for-ggplot2-2/).
theme_black(base_size = 12, base_family = "")
base_size |
base font size |
base_family |
base font family |
When using theme_black
in plots powered by the
bayesplot package such as pp_check
or stanplot
,
I recommend using the "viridisC"
color scheme (see examples).
A theme
object used in ggplot2 graphics.
## Not run:
# change default ggplot theme
ggplot2::theme_set(theme_black())

# change default bayesplot color scheme
bayesplot::color_scheme_set("viridisC")

# fit a simple model
fit <- brm(count ~ zAge + zBase * Trt + (1|patient),
           data = epilepsy, family = poisson(), chains = 2)
summary(fit)

# create various plots
plot(marginal_effects(fit), ask = FALSE)
pp_check(fit)
mcmc_plot(fit, type = "hex", variable = c("b_Intercept", "b_Trt1"))

## End(Not run)
This theme is imported from the bayesplot package.
See theme_default
for complete documentation.
base_size |
base font size |
base_family |
base font family |
A theme
object used in ggplot2 graphics.
Use threads for within-chain parallelization in Stan via the brms
interface. Within-chain parallelization is experimental! We recommend its use
only if you are experienced with Stan's reduce_sum
function and have a
slow running model that cannot be sped up by any other means.
threading(threads = NULL, grainsize = NULL, static = FALSE, force = FALSE)
threads |
Number of threads to use in within-chain parallelization. |
grainsize |
Number of observations evaluated together in one chunk on
one of the CPUs used for threading. If |
static |
Logical. Apply the static (non-adaptive) version of
|
force |
Logical. Defaults to |
The adaptive scheduling procedure used by reduce_sum
will
prevent the results from being exactly reproducible even if you set the random
seed. If you need exact reproducibility, you have to set argument
static = TRUE
which may reduce efficiency a bit.
To ensure that chunks (whose size is defined by grainsize
) require
roughly the same amount of computing time, we recommend storing
observations in random order in the data. At the very least, avoid sorting
observations by the response values, because such an ordering often
causes variations in the computing time of the pointwise log-likelihood,
which makes up a big part of the parallelized code.
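Putting the observations into random order before fitting can be sketched in base R as follows. The toy data frame dat is a made-up stand-in that is deliberately sorted by the response to mimic the situation to avoid.

```r
set.seed(2024)

# Toy data sorted by the response 'y'.
dat <- data.frame(y = sort(rpois(20, lambda = 5)), x = rnorm(20))

# Permute the rows so that consecutive chunks of 'grainsize' observations
# contain a mix of small and large response values.
dat_shuffled <- dat[sample(nrow(dat)), ]
rownames(dat_shuffled) <- NULL

head(dat_shuffled)
```

The shuffled data frame contains exactly the same rows and can be passed to the data argument in place of the original one.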
A brmsthreads
object which can be passed to the
threads
argument of brm
and related functions.
## Not run:
# this model just serves as an illustration
# threading may not actually speed things up here
fit <- brm(count ~ zAge + zBase * Trt + (1|patient),
           data = epilepsy, family = negbinomial(),
           chains = 1, threads = threading(2, grainsize = 100),
           backend = "cmdstanr")
summary(fit)

## End(Not run)
Set up an unstructured (UNSTR) correlation term in brms. The function does not evaluate its arguments – it exists purely to help set up a model with UNSTR terms.
unstr(time, gr)
time |
An optional time variable specifying the time ordering of the observations. By default, the existing order of the observations in the data is used. |
gr |
An optional grouping variable. If specified, the correlation structure is assumed to apply only to observations within the same grouping level. |
An object of class 'unstr_term'
, which is a list
of arguments to be interpreted by the formula
parsing functions of brms.
## Not run:
# add an unstructured correlation matrix for visits within the same patient
fit <- brm(count ~ Trt + unstr(visit, patient), data = epilepsy)
summary(fit)

## End(Not run)
Update addition terms used in formulas of brms. See addition-terms for details.
update_adterms(formula, adform, action = c("update", "replace"))
formula |
Two-sided formula to be updated. |
adform |
One-sided formula containing addition terms to update
|
action |
Indicates what should happen to the existing addition terms in
|
An object of class formula.
form <- y | trials(size) ~ x
update_adterms(form, ~ trials(10))
update_adterms(form, ~ weights(w))
update_adterms(form, ~ weights(w), action = "replace")
update_adterms(y ~ x, ~ trials(10))
This method allows updating an existing brmsfit object.
## S3 method for class 'brmsfit'
update(object, formula., newdata = NULL, recompile = NULL, ...)
object |
An object of class brmsfit. |
formula. |
Changes to the formula; for details see
|
newdata |
Optional |
recompile |
Logical, indicating whether the Stan model should
be recompiled. If |
... |
Other arguments passed to |
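The following base R sketch illustrates how formula changes of the kind passed to `formula.` are expanded. It assumes that `formula.` follows the conventions of `stats::update.formula`, as the `~ . - disease` calls in the examples suggest; the formula `old` is hypothetical and no model is fitted.

```r
# Base R only: how the '.' placeholder expands when a formula is updated.
old <- y ~ age * sex + disease

# Keep the response and all existing terms, but drop 'disease'.
new1 <- update(old, ~ . - disease)

# Keep the response, but replace the right-hand side entirely.
new2 <- update(old, . ~ age)

print(new1)
print(new2)
```

A dot on either side stands for the corresponding side of the old formula, so only the parts that change need to be written out.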
When updating a brmsfit created with the cmdstanr backend in a different R session, a recompilation will be triggered because by default, cmdstanr writes the model executable to a temporary directory. To avoid that, set option "cmdstanr_write_stan_file_dir" to a nontemporary path of your choice before creating the original brmsfit (see section 'Examples' below).
## Not run:
fit1 <- brm(time | cens(censored) ~ age * sex + disease + (1|patient),
            data = kidney, family = gaussian("log"))
summary(fit1)

## remove effects of 'disease'
fit2 <- update(fit1, formula. = ~ . - disease)
summary(fit2)

## remove the group specific term of 'patient' and
## change the data (just take a subset in this example)
fit3 <- update(fit1, formula. = ~ . - (1|patient),
               newdata = kidney[1:38, ])
summary(fit3)

## use another family and add population-level priors
fit4 <- update(fit1, family = weibull(), init = "0",
               prior = set_prior("normal(0,5)"))
summary(fit4)

## to avoid a recompilation when updating a 'cmdstanr'-backend fit in a fresh
## R session, set option 'cmdstanr_write_stan_file_dir' before creating the
## initial 'brmsfit'
## CAUTION: the following code creates some files in the current working
## directory: two 'model_<hash>.stan' files, one 'model_<hash>(.exe)'
## executable, and one 'fit_cmdstanr_<some_number>.rds' file
set.seed(7)
fname <- paste0("fit_cmdstanr_", sample.int(.Machine$integer.max, 1))
options(cmdstanr_write_stan_file_dir = getwd())
fit_cmdstanr <- brm(rate ~ conc + state,
                    data = Puromycin,
                    backend = "cmdstanr",
                    file = fname)

# now restart the R session and run the following (after attaching 'brms')
set.seed(7)
fname <- paste0("fit_cmdstanr_", sample.int(.Machine$integer.max, 1))
fit_cmdstanr <- brm(rate ~ conc + state,
                    data = Puromycin,
                    backend = "cmdstanr",
                    file = fname)
upd_cmdstanr <- update(fit_cmdstanr, formula. = rate ~ conc)
## End(Not run)
This method allows updating an existing brmsfit_multiple object.
## S3 method for class 'brmsfit_multiple'
update(object, formula., newdata = NULL, ...)
object |
An object of class brmsfit_multiple. |
formula. |
Changes to the formula; for details see
|
newdata |
List of |
... |
Other arguments passed to |
## Not run:
library(mice)
imp <- mice(nhanes2)

# initially fit the model
fit_imp1 <- brm_multiple(bmi ~ age + hyp + chl, data = imp, chains = 1)
summary(fit_imp1)

# update the model using fewer predictors
fit_imp2 <- update(fit_imp1, formula. = . ~ hyp + chl, newdata = imp)
summary(fit_imp2)
## End(Not run)
Validate new data passed to post-processing methods of brms. Unless you
are a package developer, you will rarely need to call validate_newdata
directly.
validate_newdata(
  newdata,
  object,
  re_formula = NULL,
  allow_new_levels = FALSE,
  newdata2 = NULL,
  resp = NULL,
  check_response = TRUE,
  incl_autocor = TRUE,
  group_vars = NULL,
  req_vars = NULL,
  ...
)
newdata |
A |
object |
A |
re_formula |
formula containing group-level effects to be considered in
the prediction. If |
allow_new_levels |
A flag indicating if new levels of group-level
effects are allowed (defaults to |
newdata2 |
A named |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
check_response |
Logical; Indicates if response variables should
be checked as well. Defaults to |
incl_autocor |
A flag indicating if correlation structures originally
specified via |
group_vars |
Optional names of grouping variables to be validated. Defaults to all grouping variables in the model. |
req_vars |
Optional names of variables required in |
... |
Currently ignored. |
A validated 'data.frame' based on newdata.
Validate priors supplied by the user. Return a complete set of priors for the given model, including default priors.
validate_prior(
  prior,
  formula,
  data,
  family = gaussian(),
  sample_prior = "no",
  data2 = NULL,
  knots = NULL,
  drop_unused_levels = TRUE,
  ...
)
prior |
One or more |
formula |
An object of class |
data |
An object of class |
family |
A description of the response distribution and link function to
be used in the model. This can be a family function, a call to a family
function or a character string naming the family. Every family function has
a |
sample_prior |
Indicate if draws from priors should be drawn
additionally to the posterior draws. Options are |
data2 |
A named |
knots |
Optional list containing user specified knot values to be used
for basis construction of smoothing terms. See
|
drop_unused_levels |
Should unused factors levels in the data be
dropped? Defaults to |
... |
Other arguments for internal usage only. |
An object of class brmsprior.
prior1 <- prior(normal(0,10), class = b) +
  prior(cauchy(0,2), class = sd)
validate_prior(prior1, count ~ zAge + zBase * Trt + (1|patient),
               data = epilepsy, family = poisson())
This function calculates the estimated standard deviations, correlations and covariances of the group-level terms in a multilevel model of class brmsfit. For linear models, the residual standard deviations, correlations and covariances are also returned.
## S3 method for class 'brmsfit'
VarCorr(
  x,
  sigma = 1,
  summary = TRUE,
  robust = FALSE,
  probs = c(0.025, 0.975),
  ...
)
x |
An object of class brmsfit. |
sigma |
Ignored (included for compatibility with
|
summary |
Should summary statistics be returned
instead of the raw values? Default is |
robust |
If |
probs |
The percentiles to be computed by the |
... |
Currently ignored. |
A list of lists (one per grouping factor), each with three elements: a matrix containing the standard deviations, an array containing the correlation matrix, and an array containing the covariance matrix with variances on the diagonal.
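The relationship between the three returned elements can be sketched in base R. The standard deviations and the correlation below are made-up numbers, not output of a fitted model, and the names `Intercept` and `Trt1` are hypothetical.

```r
# Hypothetical group-level standard deviations and correlation matrix for a
# grouping factor with a varying intercept and a varying slope.
sds <- c(Intercept = 1.5, Trt1 = 0.8)
cor_mat <- matrix(c(1, 0.3, 0.3, 1), nrow = 2,
                  dimnames = list(names(sds), names(sds)))

# Covariance matrix: element [i, j] equals sd_i * sd_j * cor_ij.
cov_mat <- diag(sds) %*% cor_mat %*% diag(sds)
dimnames(cov_mat) <- dimnames(cor_mat)

# The diagonal holds the variances, i.e. the squared standard deviations.
print(cov_mat)
```

This is why the covariance matrix carries the variances on its diagonal while the correlation matrix has ones there.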
## Not run:
fit <- brm(count ~ zAge + zBase * Trt + (1+Trt|visit),
           data = epilepsy, family = gaussian(), chains = 2)
VarCorr(fit)
## End(Not run)
Get a point estimate of the covariance or correlation matrix of population-level parameters.
## S3 method for class 'brmsfit'
vcov(object, correlation = FALSE, pars = NULL, ...)
object |
An object of class brmsfit. |
correlation |
Logical; if |
pars |
Optional names of coefficients to extract. By default, all coefficients are extracted. |
... |
Currently ignored. |
Estimates are obtained by calculating the maximum likelihood covariances (correlations) of the posterior draws.
A covariance or correlation matrix of population-level parameters.
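As a rough illustration of computing such a point estimate from posterior draws, the sketch below uses simulated numbers in place of real draws. The matrix `draws` and its column names are hypothetical, and `stats::cov()` is used as a stand-in; whether brms uses exactly this estimator internally is an assumption.

```r
# Simulated stand-in for posterior draws of three population-level
# coefficients (rows = draws, columns = parameters). Not from a real fit.
set.seed(42)
draws <- cbind(
  Intercept = rnorm(1000, mean = 2, sd = 0.5),
  zAge = rnorm(1000, mean = 0.1, sd = 0.2),
  zBase = rnorm(1000, mean = 0.7, sd = 0.1)
)

# Point estimates of the covariance and correlation matrices across draws.
# Note: stats::cov() divides by n - 1; with many draws the difference from
# the n-denominator (maximum likelihood) estimate is negligible.
vcov_est <- cov(draws)
cor_est <- cor(draws)
```

Rescaling the covariance estimate with `cov2cor(vcov_est)` gives the same matrix as `cor_est`, which mirrors the role of the `correlation` argument.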
## Not run:
fit <- brm(count ~ zAge + zBase * Trt + (1+Trt|visit),
           data = epilepsy, family = gaussian(), chains = 2)
vcov(fit)
## End(Not run)
Density, distribution function, and random generation for the von Mises distribution with location mu and precision kappa.
dvon_mises(x, mu, kappa, log = FALSE)
pvon_mises(q, mu, kappa, lower.tail = TRUE, log.p = FALSE, acc = 1e-20)
rvon_mises(n, mu, kappa)
x, q |
Vector of quantiles between |
mu |
Vector of location values. |
kappa |
Vector of precision values. |
log |
Logical; If |
lower.tail |
Logical; If |
log.p |
Logical; If |
acc |
Accuracy of numerical approximations. |
n |
Number of draws to sample from the distribution. |
See vignette("brms_families") for details on the parameterization.
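For orientation, here is a base R sketch of the textbook von Mises density with location `mu` and precision `kappa` on the interval from -pi to pi. The function name `dvm_sketch` is hypothetical, and the sketch assumes the standard parameterization; it is not claimed to match the internal implementation of `dvon_mises()`.

```r
# Textbook von Mises density: exp(kappa * cos(x - mu)) / (2 * pi * I0(kappa)),
# where I0 is the modified Bessel function of the first kind of order 0.
dvm_sketch <- function(x, mu, kappa) {
  # besselI(..., expon.scaled = TRUE) returns exp(-kappa) * I0(kappa), so the
  # numerator is scaled by exp(-kappa) as well to avoid overflow.
  exp(kappa * (cos(x - mu) - 1)) /
    (2 * pi * besselI(kappa, nu = 0, expon.scaled = TRUE))
}

# The density should integrate to 1 over one full circle.
total <- integrate(dvm_sketch, lower = -pi, upper = pi,
                   mu = 0.5, kappa = 2)$value
print(total)
```

With `kappa = 0` the sketch reduces to the uniform density on the circle, and larger values of `kappa` concentrate the mass around `mu`.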
Compute the widely applicable information criterion (WAIC) based on the posterior likelihood using the loo package. For more details see waic.
## S3 method for class 'brmsfit'
waic(
  x,
  ...,
  compare = TRUE,
  resp = NULL,
  pointwise = FALSE,
  model_names = NULL
)
x |
A |
... |
More |
compare |
A flag indicating if the information criteria
of the models should be compared to each other
via |
resp |
Optional names of response variables. If specified, predictions are performed only for the specified response variables. |
pointwise |
A flag indicating whether to compute the full
log-likelihood matrix at once or separately for each observation.
The latter approach is usually considerably slower but
requires much less working memory. Accordingly, if one runs
into memory issues, |
model_names |
If |
See loo_compare for details on model comparisons. For brmsfit objects, WAIC is an alias of waic. Use method add_criterion to store information criteria in the fitted model object for later usage.
If just one object is provided, an object of class loo. If multiple objects are provided, an object of class loolist.
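To make the quantity concrete, the following base R sketch computes WAIC by hand from a pointwise log-likelihood matrix, using the standard definitions from the references below. The data and the "posterior draws" are simulated for illustration only, all object names are hypothetical, and the loo package may use numerically more careful computations than this sketch.

```r
# Simulated pointwise log-likelihood matrix (rows = posterior draws,
# columns = observations). Purely illustrative, not from a fitted model.
set.seed(1)
y <- rnorm(30, mean = 1)
mu_draws <- rnorm(2000, mean = mean(y), sd = 1 / sqrt(length(y)))
ll <- sapply(y, function(yi) dnorm(yi, mean = mu_draws, sd = 1, log = TRUE))

# Log pointwise predictive density: log of the posterior mean likelihood.
lppd <- sum(log(colMeans(exp(ll))))
# Effective number of parameters: posterior variance of the log-likelihood.
p_waic <- sum(apply(ll, 2, var))

elpd_waic <- lppd - p_waic
waic_value <- -2 * elpd_waic
print(c(elpd_waic = elpd_waic, p_waic = p_waic, waic = waic_value))
```

Lower WAIC values (equivalently, higher `elpd_waic`) indicate better expected out-of-sample predictive performance.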
Vehtari, A., Gelman, A., & Gabry, J. (2016). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, doi:10.1007/s11222-016-9696-4. arXiv preprint arXiv:1507.04544.
Gelman, A., Hwang, J., & Vehtari, A. (2014). Understanding predictive information criteria for Bayesian models. Statistics and Computing, 24, 997-1016.
Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. The Journal of Machine Learning Research, 11, 3571-3594.
## Not run:
# model with population-level effects only
fit1 <- brm(rating ~ treat + period + carry,
            data = inhaler)
(waic1 <- waic(fit1))

# model with an additional varying intercept for subjects
fit2 <- brm(rating ~ treat + period + carry + (1|subject),
            data = inhaler)
(waic2 <- waic(fit2))

# compare both models
loo_compare(waic1, waic2)
## End(Not run)
Density function and random generation for the Wiener diffusion model distribution with boundary separation alpha, non-decision time tau, bias beta and drift rate delta.
dwiener(
  x,
  alpha,
  tau,
  beta,
  delta,
  resp = 1,
  log = FALSE,
  backend = getOption("wiener_backend", "Rwiener")
)

rwiener(
  n,
  alpha,
  tau,
  beta,
  delta,
  types = c("q", "resp"),
  backend = getOption("wiener_backend", "Rwiener")
)
x |
Vector of quantiles. |
alpha |
Boundary separation parameter. |
tau |
Non-decision time parameter. |
beta |
Bias parameter. |
delta |
Drift rate parameter. |
resp |
Response: |
log |
Logical; If |
backend |
Name of the package to use as backend for the computations.
Either |
n |
Number of draws to sample from the distribution. |
types |
Which types of responses to return? By default,
return both the response times |
These are wrappers around functions of the RWiener or rtdists package (depending on the chosen backend). See vignette("brms_families") for details on the parameterization.
Density and distribution functions for zero-inflated distributions.
dzero_inflated_poisson(x, lambda, zi, log = FALSE)
pzero_inflated_poisson(q, lambda, zi, lower.tail = TRUE, log.p = FALSE)

dzero_inflated_negbinomial(x, mu, shape, zi, log = FALSE)
pzero_inflated_negbinomial(q, mu, shape, zi, lower.tail = TRUE, log.p = FALSE)

dzero_inflated_binomial(x, size, prob, zi, log = FALSE)
pzero_inflated_binomial(q, size, prob, zi, lower.tail = TRUE, log.p = FALSE)

dzero_inflated_beta_binomial(x, size, mu, phi, zi, log = FALSE)
pzero_inflated_beta_binomial(
  q,
  size,
  mu,
  phi,
  zi,
  lower.tail = TRUE,
  log.p = FALSE
)

dzero_inflated_beta(x, shape1, shape2, zi, log = FALSE)
pzero_inflated_beta(q, shape1, shape2, zi, lower.tail = TRUE, log.p = FALSE)
x | Vector of quantiles. |
zi | zero-inflation probability |
log | Logical; If |
q | Vector of quantiles. |
lower.tail | Logical; If |
log.p | Logical; If |
mu, lambda | location parameter |
shape, shape1, shape2 | shape parameter |
size | number of trials |
prob | probability of success on each trial |
phi | precision parameter |
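The density of a zero-inflated distribution can be specified as follows.
If $x = 0$, set $f(x) = \theta + (1 - \theta) \, g(0)$.
Else set $f(x) = (1 - \theta) \, g(x)$,
where $\theta$ is the zero-inflation probability (argument zi) and $g(x)$ is the density of the non-zero-inflated part.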
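A hand-written base R sketch of this definition for the Poisson case may help: at zero, the density is the zero-inflation probability plus the remaining mass times the Poisson density at zero, and elsewhere it is the Poisson density scaled by one minus the zero-inflation probability. The function name `dzip_sketch` is hypothetical and is not the brms implementation of `dzero_inflated_poisson()`.

```r
# Zero-inflated Poisson density; 'zi' is the zero-inflation probability.
dzip_sketch <- function(x, lambda, zi) {
  g <- dpois(x, lambda)  # density of the non-zero-inflated part
  ifelse(x == 0, zi + (1 - zi) * g, (1 - zi) * g)
}

probs <- dzip_sketch(0:100, lambda = 3, zi = 0.2)
print(probs[1])    # probability of observing a zero
print(sum(probs))  # should be very close to 1
```

The same construction applies to the other families listed above by swapping `dpois()` for the density of the respective non-zero-inflated part.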