Generate training data (X, y) and testing data (X_test, y_test) for a transformed linear model. The covariates are correlated Gaussian variables. Half of the true regression coefficients are zero and the other half are one. There are multiple options for the transformation, which define the support of the data (see below).
Arguments
- n
number of observations in the training data
- p
number of covariates
- g_type
type of transformation; must be one of 'beta', 'step', or 'box-cox'
- n_test
number of observations in the testing data
- heterosked
logical; if TRUE, simulate the latent data with heteroskedasticity
- lambda
Box-Cox parameter (only applies for g_type = 'box-cox')
Value
a list with the following elements:
- y: the response variable in the training data
- X: the covariates in the training data
- y_test: the response variable in the testing data
- X_test: the covariates in the testing data
- beta_true: the true regression coefficients
- g_true: the true transformation, evaluated at y
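To make the structure of the returned list concrete, the following is a minimal sketch of the data-generating mechanism described at the top of this page. It is not the package's implementation: the function name sketch_tlm_data, the correlation structure rho^|j - k|, the unit error variance, the ordering of the coefficients, and the fixed log transformation are all assumptions made for illustration, and the heterosked option is omitted.

```r
# Hypothetical sketch (not the documented function): simulate a transformed
# linear model with correlated Gaussian covariates.
sketch_tlm_data <- function(n, p, n_test = 100, rho = 0.5) {
  n_all <- n + n_test

  # Assumed correlation structure: Cor(X_j, X_k) = rho^|j - k|
  Sigma <- rho^abs(outer(1:p, 1:p, "-"))
  X_all <- matrix(rnorm(n_all * p), n_all, p) %*% chol(Sigma)

  # Half of the coefficients are one and the other half are zero
  # (the ordering is an assumption)
  beta_true <- c(rep(1, ceiling(p / 2)), rep(0, floor(p / 2)))

  # Latent Gaussian linear model with unit error variance (assumption)
  z_all <- as.numeric(X_all %*% beta_true) + rnorm(n_all)

  # Assumed transformation g = log, so y = g^{-1}(z) = exp(z) is positive
  y_all <- exp(z_all)

  train <- seq_len(n)
  list(
    y = y_all[train],
    X = X_all[train, , drop = FALSE],
    y_test = y_all[-train],
    X_test = X_all[-train, , drop = FALSE],
    beta_true = beta_true,
    g_true = log(y_all[train])  # true transformation evaluated at y
  )
}

set.seed(1)
dat <- sketch_tlm_data(n = 50, p = 10, n_test = 20)
str(dat)
```

The training and testing sets are drawn jointly and then split, so both share the same coefficients and transformation.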
Details
The transformations vary in complexity and in the support of the observed data, and include the following options: beta yields marginally Beta(0.1, 0.5) data supported on [0,1]; step generates a locally-linear inverse transformation and produces positive data; and box-cox refers to the signed Box-Cox family indexed by lambda, which generates real-valued data with examples including identity, square-root, and log transformations.
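As an illustration of the box-cox option, the sketch below writes out one common form of the signed Box-Cox transformation and its inverse. The helper names signed_box_cox and signed_box_cox_inv are hypothetical, and the exact parameterization used internally is an assumption. Under this form, lambda = 1 gives the identity up to a shift, lambda = 0.5 a signed square root, and lambda = 0 the log.

```r
# Hypothetical helpers (assumed form of the signed Box-Cox family)
signed_box_cox <- function(t, lambda) {
  if (lambda == 0) {
    # Log transformation; the sign factor only matters for negative t
    sign(t) * log(abs(t))
  } else {
    (sign(t) * abs(t)^lambda - 1) / lambda
  }
}

# Inverse transformation, mapping latent values back to the data scale
signed_box_cox_inv <- function(s, lambda) {
  if (lambda == 0) {
    # Inverse of the log; returns positive values only
    exp(s)
  } else {
    u <- lambda * s + 1
    sign(u) * abs(u)^(1 / lambda)
  }
}

t_vals <- c(-4, 0.25, 4)
s_sqrt <- signed_box_cox(t_vals, lambda = 0.5)      # signed square root
t_back <- signed_box_cox_inv(s_sqrt, lambda = 0.5)  # recovers t_vals
```

For lambda different from zero the inverse is defined on the whole real line, which is why the simulated data are real-valued; with lambda = 0 the inverse is the exponential, so the data are positive.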