Search for the "best" subsets of each size in a linear model, where
"best" means smallest residual sum of squares. The algorithm can collect
the n_best "best" subsets of each size, force certain variables into or
out of every subset, and use exhaustive search, forward selection,
backward selection, or sequential replacement.
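To make "best subsets of each size by residual sum of squares" concrete, here is a minimal brute-force sketch in base R. It is not the package's algorithm: it enumerates every subset, which is only feasible for a handful of covariates. The simulated data, the helper `rss`, and the choice `n_best <- 2` are all illustrative assumptions.

```r
set.seed(1)
n <- 50
X <- cbind(1, matrix(rnorm(n * 4), n, 4))  # column 1 plays the role of an intercept
y <- drop(X[, c(1, 2, 4)] %*% c(1, 2, -1)) + rnorm(n)

# Residual sum of squares of the least-squares fit on the columns in `idx`
# (`rss` is a hypothetical helper, not part of any package)
rss <- function(idx) sum(lm.fit(X[, idx, drop = FALSE], y)$residuals^2)

n_best <- 2         # subsets to keep per size
free <- 2:ncol(X)   # column 1 is forced into every subset

best <- lapply(seq_along(free), function(k) {
  # all subsets with k free columns, each combined with the forced-in column
  subsets <- lapply(combn(free, k, simplify = FALSE), function(s) c(1L, s))
  scores <- vapply(subsets, rss, numeric(1))
  keep <- head(order(scores), n_best)  # indices of the smallest RSS values
  list(subsets = subsets[keep], rss = scores[keep])
})

# The two best subsets containing one free column, with their RSS
best[[1]]
```

Within each size the retained subsets are ordered from smallest to largest RSS, and the smallest RSS cannot increase as the size grows, because any subset can be extended by one column without worsening the fit.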
Usage
branch_and_bound(
  yy,
  XX,
  wts = NULL,
  n_best = 15,
  to_include = 1,
  to_exclude = NULL,
  searchtype = "exhaustive"
)
Arguments
- yy
vector of response values (one per observation)
- XX
matrix of covariates
- wts
vector of observation weights (for weighted least squares)
- n_best
number of "best" subsets for each model size
- to_include
indices of covariates to include in *all* subsets
- to_exclude
indices of covariates to exclude from *all* subsets
- searchtype
search strategy: exhaustive search, forward selection, backward selection, or sequential replacement
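As a reminder of what observation weights mean for the ranking criterion, the sketch below computes a weighted residual sum of squares in base R using the standard definition of weighted least squares. How `branch_and_bound` uses `wts` internally is not stated here, so treat this as background rather than a description of the implementation; the simulated data and weights are hypothetical.

```r
set.seed(2)
n <- 30
X <- cbind(1, rnorm(n), rnorm(n))
y <- drop(X %*% c(1, 2, 0)) + rnorm(n)
wts <- runif(n, 0.5, 2)   # hypothetical observation weights

# Weighted RSS: sum_i w_i * (y_i - x_i' beta_hat)^2,
# with beta_hat from weighted least squares
fit_w <- lm.wfit(X, y, w = wts)
wrss <- sum(wts * fit_w$residuals^2)

# Equivalent: ordinary least squares after scaling each row by sqrt(w_i)
fit_s <- lm.fit(sqrt(wts) * X, sqrt(wts) * y)
wrss_scaled <- sum(fit_s$residuals^2)

all.equal(wrss, wrss_scaled)
```

With equal weights this reduces to the ordinary residual sum of squares.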
Examples
# Simulate data:
dat = simulate_lm(n = 100, p = 10)
# Run branch-and-bound:
indicators = branch_and_bound(yy = dat$y, XX = dat$X)
# Inspect:
head(indicators)
#>            X1    X2    X3    X4    X5    X6    X7    X8    X9   X10   X11
#> force_in TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#>          TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
#>          TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#>          TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
#>          TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
#>          TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
# Dimensions:
dim(indicators)
#> [1] 127 11
# Model sizes:
rowSums(indicators)
#> force_in
#> 1 2 2 2 2 2 2 2
#>
#> 2 2 2 3 3 3 3 3
#>
#> 3 3 3 3 3 3 3 3
#>
#> 3 3 4 4 4 4 4 4
#>
#> 4 4 4 4 4 4 4 4
#>
#> 4 5 5 5 5 5 5 5
#>
#> 5 5 5 5 5 5 5 5
#>
#> 6 6 6 6 6 6 6 6
#>
#> 6 6 6 6 6 6 6 7
#>
#> 7 7 7 7 7 7 7 7
#>
#> 7 7 7 7 7 7 8 8
#>
#> 8 8 8 8 8 8 8 8
#>
#> 8 8 8 8 8 9 9 9
#>
#> 9 9 9 9 9 9 9 9
#>
#> 9 9 9 9 10 10 10 10
#>
#> 10 10 10 10 10 10 11
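The row count of 127 can be checked by hand. The sketch below assumes that each model size contributes the smaller of n_best and the number of subsets available at that size, with one of the 11 columns always included; this assumption agrees with the sizes printed above but is not taken from the package source.

```r
n_best <- 15        # default from the usage above
n_free <- 10        # 11 columns minus the one that is always included
k <- 0:n_free       # number of free covariates in a subset

# At most n_best subsets per size, fewer when fewer exist
per_size <- pmin(n_best, choose(n_free, k))
names(per_size) <- k + 1   # model size counts the forced-in column
per_size

sum(per_size)
#> [1] 127
```

Sizes 2 and 10 contribute only 10 subsets each because choose(10, 1) and choose(10, 9) both equal 10, and sizes 1 and 11 each admit a single subset.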