Run cTWAS analysis using summary statistics

ctwas_sumstats(
  z_snp,
  weights,
  region_info,
  LD_map,
  snp_map,
  z_gene,
  thin = 0.1,
  niter_prefit = 3,
  niter = 30,
  L = 5,
  init_group_prior = NULL,
  init_group_prior_var = NULL,
  filter_L = TRUE,
  filter_nonSNP_PIP = FALSE,
  min_nonSNP_PIP = 0.5,
  min_p_single_effect = 0.8,
  maxSNP = Inf,
  use_null_weight = TRUE,
  coverage = 0.95,
  min_abs_corr = 0.1,
  LD_format = c("rds", "rdata", "mtx", "csv", "txt", "custom"),
  LD_loader_fun,
  force_compute_cor = FALSE,
  save_cor = FALSE,
  cor_dir = NULL,
  outputdir = NULL,
  outname = "ctwas",
  ncore = 1,
  ncore_LD = max(ncore - 1, 1),
  logfile = NULL,
  verbose = FALSE,
  ...
)

Arguments

z_snp

A data frame with four columns, "id", "A1", "A2", and "z", giving the z-scores for SNPs. "A1" is the effect allele and "A2" is the other allele.
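As a minimal sketch of the expected layout, the following builds a toy z_snp data frame and checks its columns. The SNP IDs, alleles, and z-scores are invented for illustration, and the sanity checks are only a suggestion, not something this function is documented to require.

```r
# Toy GWAS z-scores; IDs, alleles, and values are made up for illustration.
z_snp <- data.frame(
  id = c("rs0001", "rs0002", "rs0003"),
  A1 = c("A", "G", "T"),   # effect allele
  A2 = c("G", "A", "C"),   # other allele
  z  = c(1.8, -0.4, 3.2),
  stringsAsFactors = FALSE
)

# Optional sanity checks before passing the data frame on.
required_cols <- c("id", "A1", "A2", "z")
stopifnot(all(required_cols %in% colnames(z_snp)))
stopifnot(is.numeric(z_snp$z), !anyDuplicated(z_snp$id))
```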

weights

A list of pre-processed prediction weights.

region_info

A data frame of region definitions.

LD_map

A data frame with the filenames of the LD matrices and the SNP information for the regions.

snp_map

A list of data frames with the SNP-to-region map for the reference.

z_gene

A data frame with columns: "id", "z", giving the z-scores for genes.

thin

The proportion of SNPs to be used for estimating parameters and screening regions.
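To give a feel for what a proportion such as thin = 0.1 means, here is a small base R sketch that keeps roughly 10% of a set of SNP IDs. It assumes simple random subsampling and uses hypothetical SNP IDs; it illustrates the idea only and is not taken from the package internals.

```r
set.seed(1)                               # arbitrary seed for a reproducible toy draw
snp_ids <- paste0("rs", seq_len(1000))    # hypothetical SNP IDs
thin <- 0.1

# Keep a random subset whose size is the requested proportion of all SNPs.
n_keep <- ceiling(thin * length(snp_ids))
thinned_ids <- sample(snp_ids, n_keep)
length(thinned_ids)
```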

niter_prefit

The number of iterations of the EM algorithm to perform during the initial parameter estimation step.

niter

The number of iterations of the EM algorithm to perform during the complete parameter estimation step.

L

The number of effects for susie during the fine-mapping steps.

init_group_prior

A vector of initial values of the prior inclusion probabilities for SNPs and genes.

init_group_prior_var

A vector of initial values of the prior variances for SNP and gene effects.

filter_L

If TRUE, screen regions by keeping those with L > 0.

filter_nonSNP_PIP

If TRUE, screen regions by keeping those with total non-SNP PIP >= min_nonSNP_PIP.

min_nonSNP_PIP

Regions with non-SNP PIP >= min_nonSNP_PIP will be selected for fine-mapping using all SNPs.

min_p_single_effect

Regions with probability >= min_p_single_effect of having at most one causal effect will be selected for the final EM step.

maxSNP

Inf or integer. Maximum number of SNPs in a region. Default is Inf, no limit. This can be useful if there are many SNPs in a region and you don't have enough memory to run the program.

use_null_weight

If TRUE, allow for a probability of no effect in susie.

coverage

A number between 0 and 1 specifying the "coverage" of the estimated credible sets.

min_abs_corr

Minimum absolute correlation allowed in a credible set.

LD_format

The file format of the LD matrices. If "custom", a user-defined LD_loader_fun() function is used to load the LD matrices.

LD_loader_fun

A user-defined function to load an LD matrix when LD_format = "custom".
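A custom loader might look like the sketch below. It assumes the loader receives a single file path and returns a square numeric matrix; this page does not spell out the exact signature, so treat that as an assumption. The function name my_LD_loader and the tab-delimited, header-less file layout are hypothetical.

```r
# Hypothetical loader: reads a tab-delimited LD matrix (no header)
# and returns it as a plain numeric matrix.
my_LD_loader <- function(file) {
  R <- as.matrix(read.table(file, header = FALSE, sep = "\t"))
  dimnames(R) <- NULL
  stopifnot(nrow(R) == ncol(R))   # an LD matrix should be square
  R
}

# Round-trip demo with a toy 2 x 2 correlation matrix.
tmp <- tempfile(fileext = ".tsv")
write.table(matrix(c(1, 0.3, 0.3, 1), 2), tmp, sep = "\t",
            row.names = FALSE, col.names = FALSE)
R_toy <- my_LD_loader(tmp)
```

Under these assumptions, such a function would be supplied as LD_loader_fun = my_LD_loader together with LD_format = "custom".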

force_compute_cor

If TRUE, force computing correlation (R) matrices

save_cor

If TRUE, save correlation (R) matrices to cor_dir

cor_dir

The directory to store correlation (R) matrices

outputdir

The directory in which to store output. Outputs are saved only if a directory is specified.

outname

The output name.

ncore

The number of cores used to parallelize computing over regions.

ncore_LD

The number of cores used to parallelize computing correlation matrices, in screening regions and fine-mapping steps with LD.
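The default shown in the usage, ncore_LD = max(ncore - 1, 1), leaves one core free when several are requested but never drops below one. The snippet below simply evaluates that expression for a few example values of ncore (pmax is the vectorized form of max).

```r
# Evaluate the default ncore_LD rule for several example ncore values.
ncore <- c(1, 2, 8)
ncore_LD <- pmax(ncore - 1, 1)
ncore_LD
# 1 1 7
```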

logfile

The log filename. If NULL, print log info on screen.

verbose

If TRUE, print detailed messages

...

Additional arguments passed to susie_rss.

Value

A list that includes z_gene, the estimated parameters, region_data, cross-boundary genes, region screening results, and fine-mapping results.