Bayesian Guided Sparse Factor Analysis on Perturbed Gene Expression Matrix
Source: R/GSFA.R
fit_gsfa_multivar_2groups.Rd
Performs GSFA on a given gene expression matrix and matching perturbation information using Gibbs sampling, for samples that come from two groups.
Usage
fit_gsfa_multivar_2groups(
  Y,
  G,
  group,
  K,
  fit0,
  use_neg_control = FALSE,
  neg_control_index = NULL,
  prior_type = c("mixture_normal", "spike_slab"),
  init.method = c("svd", "random"),
  prior_w_s = 50,
  prior_w_r = 0.2,
  prior_beta_s = 20,
  prior_beta_r = 0.2,
  niter = 500,
  used_niter = floor(niter/2),
  lfsr_niter = used_niter,
  verbose = TRUE,
  return_samples = TRUE
)
Arguments
- Y
A sample by gene numeric matrix that stores normalized gene expression values; is.matrix(Y) should be TRUE;
- G
Either a numeric vector or a sample by perturbation numeric matrix that stores sample-level perturbation information; the length or number of rows of G should be the same as nrow(Y);
- group
A vector with one entry per sample (length equal to nrow(Y)), containing exactly two unique values that indicate which of the two groups each sample belongs to;
- K
Number of factors to use in the model; only one of K and fit0 is needed;
- fit0
A list of class 'gsfa_fit' that is obtained from a previous fit_gsfa_multivar run, so that more iterations of Gibbs sampling can continue from the last updates in it; only one of K and fit0 is needed;
- prior_type
Type of sparse prior used on gene weights, either "mixture_normal" or "spike_slab"; "mixture_normal" sometimes works better in inducing sparsity;
- init.method
Method to initialize the factors, one of "svd" (truncated SVD on Y) or "random";
- prior_w_s, prior_w_r
Prior parameters (\(s_{w}\) and \(r_{w}\)) of the gene loadings on the factors;
- prior_beta_s, prior_beta_r
Prior parameters (\(s_{b}\) and \(r_{b}\)) of the effects of perturbations on the factors;
- niter
Total number of Gibbs sampling iterations;
- used_niter
Number of iterations (counting from the last iteration) from which the posterior means of parameters are to be computed;
- lfsr_niter
Number of iterations (counting from the last iteration) of posterior samples to use for the computation of LFSR;
- return_samples
Boolean indicator of whether all posterior samples throughout Gibbs sampling should be returned;
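The input shapes and default iteration settings described in the arguments above can be illustrated with a small base R sketch. The dimensions, the 0/1 group coding, and all object names below are hypothetical choices made for illustration. The iteration window is only one reading of "counting from the last iteration"; the indexing used inside the package may differ.

```r
# Hypothetical toy dimensions; not taken from the GSFA documentation.
set.seed(1)
n_samples <- 40
n_genes <- 30
n_perturbations <- 2

# Y: sample by gene numeric matrix of (simulated) normalized expression.
Y <- matrix(rnorm(n_samples * n_genes), nrow = n_samples, ncol = n_genes)

# G: sample by perturbation numeric matrix (simulated 0/1 indicators).
G <- matrix(rbinom(n_samples * n_perturbations, size = 1, prob = 0.3),
            nrow = n_samples, ncol = n_perturbations)

# group: one label per sample with exactly two unique values
# (coded 0/1 here as an assumption; other codings may also be accepted).
group <- rep(c(0, 1), each = n_samples / 2)

# Checks that mirror the argument descriptions.
# NROW() treats a plain vector G as having length(G) rows.
stopifnot(
  is.matrix(Y), is.numeric(Y),
  is.numeric(G), NROW(G) == nrow(Y),
  length(group) == nrow(Y),
  length(unique(group)) == 2
)

# Defaults from the Usage block: half of the iterations are used for
# posterior means, and the same number for the LFSR computation.
niter <- 500
used_niter <- floor(niter / 2)
lfsr_niter <- used_niter

# Assumed interpretation: the last used_niter iterations of the chain.
posterior_window <- seq(niter - used_niter + 1, niter)
```

Running these checks before a long Gibbs sampling run is a cheap way to catch a mismatch between the rows of Y, the rows of G, and the length of group.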
Value
A list of class 'gsfa_fit' which stores the Gibbs sampling updates and posterior mean estimates, and the prior parameters used during the inference.
Details
Similar to the function fit_gsfa_multivar(), but associations between factors and perturbations are estimated for each group of samples separately.
For details about the GSFA model and prior specification,
please see the GSFA paper and Supplementary Notes (Section 1).
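A minimal sketch of a two-group call on simulated data follows. It assumes the package is installed under the name GSFA and exports this function; it uses only argument names from the Usage block. The toy dimensions, the 0/1 group coding, K = 3, and the short chain length are hypothetical values chosen to keep the run quick, not recommended settings.

```r
# Simulated inputs; all sizes and names here are hypothetical.
set.seed(2)
n_samples <- 40
n_genes <- 30
Y <- matrix(rnorm(n_samples * n_genes), nrow = n_samples)
G <- matrix(rbinom(n_samples * 2, size = 1, prob = 0.3), nrow = n_samples)
group <- rep(c(0, 1), each = n_samples / 2)

fit <- NULL
# Only attempt the fit when the package is available (assumed name: GSFA).
if (requireNamespace("GSFA", quietly = TRUE)) {
  fit <- GSFA::fit_gsfa_multivar_2groups(
    Y = Y, G = G, group = group,
    K = 3,                          # supply K or fit0, not both
    prior_type = "mixture_normal",
    init.method = "svd",
    niter = 50, used_niter = 25,    # short chain, for illustration only
    verbose = FALSE,
    return_samples = FALSE
  )
  # Inspect the top-level structure of the returned 'gsfa_fit' list.
  str(fit, max.level = 1)
}
```

A chain this short is unlikely to have converged; for real analyses the defaults in the Usage block (niter = 500) or longer runs are the more natural starting point.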