Simulate a Continuous Gene Expression Matrix and an Accompanying Perturbation Matrix
Source: R/simulate.R (normal_data_sim.Rd)
Generate a binary perturbation matrix and a continuous gene expression matrix in a bottom-up fashion according to a hierarchical factor model with normal noise terms.
Arguments
- N
Number of samples to simulate
- P
Number of genes to simulate
- beta_true
An \(M\) by \(K\) numeric matrix that stores the true effect sizes of perturbation-factor associations; when offset=TRUE, \(M+1\) rows should be provided instead.
- K
Number of factors to simulate
- M
Number of perturbations to simulate
- pi_true
The true density (proportion of genes with nonzero loadings) of each factor
- G_prob
The Bernoulli probability used to generate the binary perturbation matrix G; determines the frequency of each perturbation in the sample population
- offset
Default is FALSE. If TRUE, beta_true should have \(M+1\) rows, with the last row storing the intercept values \(\beta_0\)
Value
A list object with the following elements:
- Y
a sample by gene matrix with continuous gene expression values;
- G
a binary sample by perturbation matrix;
- Z
a sample by factor matrix;
- F
a binary gene by factor matrix that indicates whether a gene has non-zero loading in the factor;
- U
a gene by factor matrix with normal effect sizes; F*U (element-wise multiplication) gives the loading matrix W.
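Examples
The bottom-up generation described above can be sketched in base R. This is only an illustration of a hierarchical factor model with the documented arguments and return elements, not the package's own code: the helper name sketch_factor_sim is hypothetical, and the unit noise variances and the standard normal draws for U are assumptions, since this page does not state them.

```r
# Hypothetical helper for illustration; NOT the package's implementation.
# Assumed (not stated on this page): all noise terms and U are standard normal.
sketch_factor_sim <- function(N, P, K, M, beta_true, pi_true, G_prob,
                              offset = FALSE) {
  # G (N x M): each entry drawn independently from Bernoulli(G_prob)
  G <- matrix(rbinom(N * M, size = 1, prob = G_prob), nrow = N, ncol = M)

  # With an offset, append a column of ones so that the last row of
  # beta_true acts as the intercept
  design <- if (offset) cbind(G, 1) else G
  stopifnot(nrow(beta_true) == ncol(design), ncol(beta_true) == K)

  # Z (N x K): perturbation effects on factors plus normal noise
  Z <- design %*% beta_true + matrix(rnorm(N * K), nrow = N, ncol = K)

  # F (P x K): inclusion indicators; column k is Bernoulli(pi_true[k])
  pi_vec <- rep(pi_true, length.out = K)
  F_mat <- matrix(rbinom(P * K, size = 1, prob = rep(pi_vec, each = P)),
                  nrow = P, ncol = K)

  # U (P x K): normal effect sizes; W = F * U is the sparse loading matrix
  U <- matrix(rnorm(P * K), nrow = P, ncol = K)
  W <- F_mat * U

  # Y (N x P): factors times loadings plus normal noise
  Y <- Z %*% t(W) + matrix(rnorm(N * P), nrow = N, ncol = P)

  list(Y = Y, G = G, Z = Z, F = F_mat, U = U)
}

# Usage: 2 perturbations, 3 factors; perturbation 1 shifts factor 1,
# perturbation 2 shifts factor 2
set.seed(1)
beta_true <- matrix(0, nrow = 2, ncol = 3)
beta_true[1, 1] <- 1
beta_true[2, 2] <- 0.8
sim <- sketch_factor_sim(N = 50, P = 20, K = 3, M = 2,
                         beta_true = beta_true, pi_true = rep(0.2, 3),
                         G_prob = 0.2)
str(sim)
```

With offset=TRUE, the same sketch expects beta_true to have \(M+1\) rows, matching the argument description above.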