
Generate data from a (sparse) Gaussian linear model with random intercepts, i.e., a model for repeated measurements (longitudinal data). The covariates are correlated Gaussian variables. The user may control the signal-to-noise ratio, the number of nonzero coefficients, and the intraclass correlation.

Usage

simulate_lm_randint(n, p, m, rho = 0.25, p_sig = min(5, p/2), SNR = 1)

Arguments

n

number of subjects

p

number of covariates

m

number of observations per subject

rho

intraclass correlation coefficient

p_sig

number of true nonzero coefficients (signals)

SNR

signal-to-noise ratio

Value

a list with the following elements:

  • Y: the matrix of response variables

  • X: the matrix of covariates

  • beta_true: the true regression coefficients (including an intercept)

  • Ey_true: the true expectation of y (X%*%beta_true)

  • m_scale_true: the true Mahalanobis scale factor, 1/(sigma_e^2/sigma_u^2 + m)
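The m_scale_true formula can be evaluated by hand once the variance ratio is known. The sketch below assumes the usual definition of the intraclass correlation, rho = sigma_u^2 / (sigma_u^2 + sigma_e^2); this page does not state that definition, so treat it as an assumption. The values of rho and m are the defaults and example settings shown on this page.

```r
# Assumption: rho is the standard intraclass correlation,
#   rho = sigma_u^2 / (sigma_u^2 + sigma_e^2)
rho <- 0.25  # default value of rho in the Usage section
m <- 4       # observations per subject, as in the Examples section

# Under that assumption, sigma_e^2 / sigma_u^2 = (1 - rho) / rho
var_ratio <- (1 - rho) / rho

# Formula from the Value section: 1 / (sigma_e^2 / sigma_u^2 + m)
m_scale <- 1 / (var_ratio + m)
m_scale  # equals 1/7 under the assumed definition of rho
```

If the package defines rho differently, the variance ratio changes but the final formula still applies.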

Details

The true regression coefficients include an intercept equal to -1. Of the remaining coefficients, p_sig are nonzero: half are equal to 1 and half are equal to -1.
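The data-generating process described above can be sketched in base R. This is an illustration only, not the package's implementation. The function name sketch_lm_randint is hypothetical. The covariate correlation structure, the ordering of the coefficients, the definition of SNR, the long-format response vector, and the handling of an odd p_sig are all assumptions that this page does not specify.

```r
# Hypothetical sketch of a random-intercept simulator; NOT the package's code.
sketch_lm_randint <- function(n, p, m, rho = 0.25, p_sig = min(5, p / 2), SNR = 1) {
  N <- n * m                            # total number of rows
  subject <- rep(seq_len(n), each = m)  # subject index for each row

  # Assumption: AR(1)-type correlation among covariates, 0.75^|j - k|
  Sigma_x <- 0.75^abs(outer(seq_len(p), seq_len(p), "-"))
  X <- matrix(rnorm(N * p), N, p) %*% chol(Sigma_x)
  X <- cbind(1, X)                      # prepend the intercept column

  # Intercept of -1, then the signals (+1 and -1), then zeros.
  # Assumption: signals come first; an odd p_sig gets one extra +1.
  p_sig <- floor(p_sig)
  beta_true <- c(-1,
                 rep(1, ceiling(p_sig / 2)),
                 rep(-1, floor(p_sig / 2)),
                 rep(0, p - p_sig))
  Ey_true <- as.numeric(X %*% beta_true)

  # Assumption: SNR = sd(signal) / sd(total noise), where the total noise
  # variance is split between subjects and errors according to rho
  sigma_tot <- sd(Ey_true) / SNR
  sigma_u <- sigma_tot * sqrt(rho)      # random-intercept SD
  sigma_e <- sigma_tot * sqrt(1 - rho)  # observation-error SD

  u <- rnorm(n, sd = sigma_u)           # one random intercept per subject
  y <- Ey_true + u[subject] + rnorm(N, sd = sigma_e)

  list(y = y, X = X, subject = subject,
       beta_true = beta_true, Ey_true = Ey_true,
       m_scale_true = 1 / (sigma_e^2 / sigma_u^2 + m))
}
```

Calling sketch_lm_randint(n = 100, p = 10, m = 4) returns a coefficient vector of length p + 1 = 11 with six nonzero entries (the intercept plus five signals). The response here is a long-format vector of length n * m; the package's Y may be laid out differently.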

Examples

# Simulate data:
dat = simulate_lm_randint(n = 100, p = 10, m = 4)
names(dat) # what is returned
#> [1] "Y"            "X"            "beta_true"    "Ey_true"      "m_scale_true"