Compute the acceptable family of linear subsets for the random intercept model
Source: R/source_subsel.R (accept_family_randint.Rd)
Given output from a Bayesian random intercept model and a candidate set of
subsets, compute the *acceptable family* of subsets that
match or nearly match the predictive accuracy of the "best" subset.
The acceptable family may be computed for any set of covariate values XX;
if XX = X are the in-sample points, then
cross-validation is used to assess out-of-sample predictive performance.
Usage
accept_family_randint(
  post_y_pred,
  post_lpd,
  post_sigma_e,
  post_sigma_u,
  XX,
  YY,
  indicators,
  post_y_pred_sum = NULL,
  eps_level = 0.05,
  eta_level = 0,
  K = 10,
  sir_frac = 0.5,
  plot = TRUE
)
Arguments
- post_y_pred
S x m x n array of posterior predictive draws at the given XX covariate values for m replicates per subject
- post_lpd
S evaluations of the log-likelihood computed at each posterior draw of the parameters
- post_sigma_e
(nsave) draws from the posterior distribution of the observation error SD
- post_sigma_u
(nsave) draws from the posterior distribution of the random intercept SD
- XX
n x p matrix of covariates at which to evaluate
- YY
m x n matrix of response variables (optional)
- indicators
L x p matrix of inclusion indicators (booleans) where each row denotes a candidate subset
- post_y_pred_sum
(nsave x n) matrix of the posterior predictive draws summed over the replicates within each subject (optional)
- eps_level
probability required to match the predictive performance of the "best" model (up to eta_level)
- eta_level
allowable margin in predictive performance between a candidate subset and the "best" model
- K
number of cross-validation folds (optional)
- sir_frac
fraction of the posterior samples to use for SIR (optional)
- plot
logical; if TRUE, include a plot to summarize the predictive performance across candidate subsets
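The shapes listed above can be easier to check with a small example. The following is a minimal sketch in base R, not package code: the dimensions, the simulated draws, and the choice to keep the first column in every subset (treated here as an intercept) are all assumptions made for illustration, and S is assumed to equal nsave.

```r
# Hypothetical dimensions, chosen only for illustration
S <- 100  # posterior draws (assumed to equal nsave)
m <- 3    # replicates per subject
n <- 20   # subjects
p <- 4    # covariates

set.seed(1)
# Stand-in for real posterior predictive draws: an S x m x n array
post_y_pred <- array(rnorm(S * m * n), dim = c(S, m, n))

# Sum over the replicate dimension (second margin) to get an S x n matrix,
# matching the description of post_y_pred_sum
post_y_pred_sum <- apply(post_y_pred, c(1, 3), sum)

# Candidate subsets as an L x p logical matrix, one row per subset.
# Assumption: column 1 is an intercept and is kept in every subset.
grid <- expand.grid(rep(list(c(FALSE, TRUE)), p - 1))
indicators <- cbind(TRUE, as.matrix(grid))
colnames(indicators) <- paste0("x", seq_len(p))
```

Enumerating all subsets this way gives L = 2^(p - 1) rows, which is only practical for small p; for larger problems the candidate rows would come from a search procedure instead.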
Value
a list containing the following elements:
- all_accept: indices (i.e., rows of indicators) that correspond to the acceptable subsets
- beta_hat_small: linear coefficients for the smallest acceptable model
- beta_hat_min: linear coefficients for the "best" acceptable model
- ell_small: index (i.e., row of indicators) of the smallest acceptable model
- ell_min: index (i.e., row of indicators) of the "best" acceptable model
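Because the returned indices refer to rows of indicators, they can be mapped back to covariates with ordinary subsetting. The sketch below uses a hand-written stand-in for the returned list; the object `fit`, its index values, and the column names are hypothetical and do not come from a real call.

```r
# Hypothetical candidate subsets (4 subsets of 3 covariates)
indicators <- rbind(
  c(TRUE, FALSE, FALSE),
  c(TRUE, TRUE,  FALSE),
  c(TRUE, FALSE, TRUE),
  c(TRUE, TRUE,  TRUE)
)
colnames(indicators) <- c("intercept", "x1", "x2")

# Hypothetical stand-in for the list returned by accept_family_randint()
fit <- list(all_accept = c(2, 4), ell_small = 2, ell_min = 4)

# Rows of indicators for every acceptable subset
accept_subsets <- indicators[fit$all_accept, , drop = FALSE]

# Covariates included in the smallest acceptable subset
vars_small <- colnames(indicators)[indicators[fit$ell_small, ]]

# Number of covariates in each acceptable subset
sizes <- rowSums(accept_subsets)

# Share of acceptable subsets that include each covariate
inclusion <- colMeans(accept_subsets)
```

The inclusion shares give a quick view of which covariates appear in most or all acceptable subsets and which appear only occasionally.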
Details
When XX = X contains the observed covariate values,
then post_lpd and YY must be provided. These
are used to compute the cross-validated predictive and empirical
squared errors; the predictive version relies on a sampling importance resampling (SIR)
procedure.
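To make the SIR step concrete, here is a generic sketch of sampling importance resampling for one held-out fold. It is an illustration under stated assumptions, not the package's implementation: the weights are taken proportional to the inverse likelihood of the held-out data (a common choice for approximating the training-data posterior from full-data draws), and `lpd_fold` is a simulated, hypothetical input.

```r
set.seed(42)
S <- 200         # number of posterior draws
sir_frac <- 0.5  # fraction of draws to keep, as in the sir_frac argument

# Hypothetical log-likelihood of the held-out fold at each posterior draw
lpd_fold <- rnorm(S, mean = -10, sd = 1)

# Assumed importance weights: proportional to 1 / likelihood of the
# held-out data, computed on the log scale for numerical stability
log_w <- -lpd_fold
w <- exp(log_w - max(log_w))
w <- w / sum(w)

# Resample a fraction of the draws without replacement, using the weights
n_keep <- ceiling(sir_frac * S)
keep <- sample(seq_len(S), size = n_keep, replace = FALSE, prob = w)
```

The retained indices in `keep` would then select the posterior predictive draws used to evaluate that fold. Sampling without replacement avoids duplicated draws, at the cost of only approximating the target distribution when sir_frac is large.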
When XX corresponds to a new set of covariate values, then set post_lpd = NULL
and YY = NULL. The usage above lists no defaults for these two arguments,
so pass NULL explicitly.
Additional details on the predictive and empirical comparisons are
in pp_loss_randint.
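The roles of eps_level and eta_level can be sketched as a simple rule: a subset is kept if, with probability at least eps_level, its predictive loss is within eta_level of the loss of the "best" subset. The code below is a minimal illustration with simulated loss draws. The matrix `loss`, the choice of the "best" subset as the one with the smallest average loss, and the use of a percent difference for the margin are all assumptions, not details confirmed by this page.

```r
set.seed(7)
S <- 500  # predictive draws
L <- 4    # candidate subsets

# Hypothetical predictive loss draws: S rows (draws) by L columns (subsets)
loss <- cbind(
  rnorm(S, 1.50, 0.05),
  rnorm(S, 1.05, 0.05),
  rnorm(S, 1.02, 0.05),
  rnorm(S, 1.00, 0.05)
)

eps_level <- 0.05
eta_level <- 0

# Assumption: the "best" subset is the one with the smallest average loss
ell_min <- which.min(colMeans(loss))

# Percent difference in loss between each subset and the best, draw by draw
# (the length-S vector is recycled down each column of the matrix)
pct_diff <- 100 * (loss - loss[, ell_min]) / loss[, ell_min]

# Probability that each subset matches the best up to the margin
prob_match <- colMeans(pct_diff <= eta_level)

# Acceptable family: subsets whose probability reaches eps_level
all_accept <- which(prob_match >= eps_level)
```

With eta_level = 0 and a small eps_level, a subset only needs a modest chance of doing as well as the best one to be retained, so the family tends to be inclusive; raising eps_level or lowering eta_level below zero would shrink it.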