
Performs GSFA on a given gene expression matrix and the matching perturbation information using Gibbs sampling, for samples that come from two groups.

Usage

fit_gsfa_multivar_2groups(
  Y,
  G,
  group,
  K,
  fit0,
  use_neg_control = FALSE,
  neg_control_index = NULL,
  prior_type = c("mixture_normal", "spike_slab"),
  init.method = c("svd", "random"),
  prior_w_s = 50,
  prior_w_r = 0.2,
  prior_beta_s = 20,
  prior_beta_r = 0.2,
  niter = 500,
  used_niter = floor(niter/2),
  lfsr_niter = used_niter,
  verbose = TRUE,
  return_samples = TRUE
)

Arguments

Y

A sample by gene numeric matrix that stores normalized gene expression values; is.matrix(Y) should be TRUE;

G

Either a numeric vector or a sample by perturbation numeric matrix that stores sample-level perturbation information; length or nrow of G should be the same as nrow(Y);

group

A vector whose length equals the number of samples, containing exactly two unique values that indicate which of the two groups each sample belongs to;

K

Number of factors to use in the model; only one of K and fit0 is needed;

fit0

A list of class 'gsfa_fit' that is obtained from a previous fit_gsfa_multivar run, so that more iterations of Gibbs sampling can continue from the last updates in it; only one of K and fit0 is needed;

prior_type

Type of sparse prior used on gene weights, either "mixture_normal" or "spike_slab"; "mixture_normal" sometimes works better in inducing sparsity;

init.method

Method to initialize the factors, can be one of "svd" (truncated SVD on Y) or "random";

prior_w_s, prior_w_r

Prior parameters (\(s_{w}\) and \(r_{w}\)) of the gene loadings on the factors;

prior_beta_s, prior_beta_r

Prior parameters (\(s_{b}\) and \(r_{b}\)) of the effects of perturbations on the factors;

niter

Total number of Gibbs sampling iterations;

used_niter

Number of iterations (counting from the last iteration) from which the posterior means of parameters are to be computed;

lfsr_niter

Number of iterations (counting from the last iteration) of posterior samples to use for the computation of LFSR;

return_samples

Boolean indicator of whether all posterior samples throughout Gibbs sampling should be returned;
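
The relationship between niter, used_niter and lfsr_niter can be illustrated with a few lines of base R. This is only a sketch of the arithmetic implied by the defaults in Usage; it assumes iterations are indexed 1 to niter, and it does not show how the package stores its samples internally.

```r
# Defaults taken from the Usage section above
niter      <- 500
used_niter <- floor(niter / 2)  # 250
lfsr_niter <- used_niter        # 250

# Assumption: iterations are indexed 1..niter, and "counting from the last
# iteration" means the trailing window of that sequence.
used_idx <- tail(seq_len(niter), used_niter)
lfsr_idx <- tail(seq_len(niter), lfsr_niter)

# A window larger than niter would make no sense, so check before fitting.
stopifnot(used_niter <= niter, lfsr_niter <= niter)

range(used_idx)  # first and last iteration contributing to posterior means
```

With the defaults, the first half of the chain acts as burn-in, and only the second half contributes to the posterior means and to the LFSR computation.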

Value

A list of class 'gsfa_fit' which stores the Gibbs sampling updates and posterior mean estimates, and the prior parameters used during the inference.

Details

Similar to the function fit_gsfa_multivar(), but associations between factors and perturbations are estimated for each group of samples separately. For details about the GSFA model and prior specification, please see the GSFA paper and Supplementary Notes (Section 1).

Examples

if (FALSE) {
fit0 <- fit_gsfa_multivar_2groups(Y, G, group, 10, init.method = "svd", niter = 500, used_niter = 200)
fit1 <- fit_gsfa_multivar_2groups(Y, G, group, fit0 = fit0, niter = 500, used_niter = 200)
}
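
The example above assumes that Y, G and group already exist. A minimal sketch of simulated inputs with the shapes described under Arguments might look as follows. The dimensions, the random data and the group labels "A" and "B" are hypothetical and chosen only for illustration; the call to the fitting function is left commented out because it requires the package to be installed.

```r
set.seed(1)

# Hypothetical dimensions, for illustration only
n_samples <- 200
n_genes   <- 50
n_perturb <- 3

# Y: sample by gene numeric matrix (random noise stands in for
# normalized expression values)
Y <- matrix(rnorm(n_samples * n_genes), nrow = n_samples, ncol = n_genes)

# G: sample by perturbation numeric matrix of 0/1 indicators
G <- matrix(as.numeric(rbinom(n_samples * n_perturb, size = 1, prob = 0.2)),
            nrow = n_samples, ncol = n_perturb)

# group: one label per sample, with exactly two unique values
# (the labels "A" and "B" are an assumption, not a package requirement)
group <- rep(c("A", "B"), length.out = n_samples)

# Checks that mirror the requirements stated under Arguments
stopifnot(is.matrix(Y), is.numeric(Y))
stopifnot(nrow(G) == nrow(Y))
stopifnot(length(group) == nrow(Y), length(unique(group)) == 2)

# With the package loaded, the call would then follow the example above:
# fit0 <- fit_gsfa_multivar_2groups(Y, G, group, K = 10, niter = 500, used_niter = 200)
```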