Extracts the required columns from the summary statistics, checks the chromosome values, removes the X and Y chromosomes, computes z-scores, converts alleles to upper case, optionally removes indels, and sorts the result by chromosome and position.

clean_sumstats(
  sumstats,
  chr = "chr",
  pos = "pos",
  beta = "beta",
  se = "se",
  a0 = "a0",
  a1 = "a1",
  snp = "snp",
  pval = "pval",
  remove_indels = TRUE
)

Arguments

sumstats

A data frame of GWAS summary statistics. It must contain the following columns: chromosome, base pair position, beta, standard error, reference allele (a0), effect allele (a1), SNP ID (rsID), and p-value.

chr

Name of the chromosome column in summary statistics.

pos

Name of the position column (base pair position).

beta

Name of the beta (effect size) column. If your summary statistics report odds ratios, transform them to log(odds ratio) first.

se

Name of the standard error (se) column.

a0

Name of the reference allele column.

a1

Name of the association/effect allele column.

snp

Name of the SNP ID (rsID) column.

pval

Name of the p-value column.

remove_indels

Logical; whether to remove indels. Defaults to TRUE.
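
As noted for the beta argument, odds ratios need to be put on the log scale before cleaning. A minimal sketch of that conversion follows. The data frame and its column names (odds_ratio, ci_lower, ci_upper) are hypothetical, and the standard error is recovered from a 95% confidence interval on the assumption that the interval is symmetric on the log scale; skip that step if your file already reports a standard error for log(OR).

```r
# Hypothetical input: odds ratios with 95% confidence limits.
or_stats <- data.frame(
  snp        = c("rs1", "rs2"),
  odds_ratio = c(1.20, 0.85),
  ci_lower   = c(1.05, 0.70),
  ci_upper   = c(1.37, 1.03),
  stringsAsFactors = FALSE
)

# beta is the natural log of the odds ratio.
or_stats$beta <- log(or_stats$odds_ratio)

# Assumption: the CI is a 95% interval, symmetric on the log scale,
# so its width on that scale is 2 * qnorm(0.975) standard errors.
or_stats$se <- (log(or_stats$ci_upper) - log(or_stats$ci_lower)) /
  (2 * qnorm(0.975))
```

The resulting beta and se columns can then be passed through the beta and se arguments.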

Value

A data frame of cleaned summary statistics, sorted by chromosome and position.
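
To make the cleaning steps concrete, here is a rough base R sketch of the pipeline described above. It is not the package's implementation: clean_sumstats_sketch is a hypothetical name, it expects the default column names, and it assumes that autosomes are labelled 1 to 22 (with an optional "chr" prefix) and that an indel is any variant with an allele longer than one character. Files that encode indels differently (for example as "-", "I" or "D") would need a different filter.

```r
# Illustrative sketch only; not the package's actual code.
clean_sumstats_sketch <- function(sumstats, remove_indels = TRUE) {
  cols <- c("chr", "pos", "beta", "se", "a0", "a1", "snp", "pval")
  missing_cols <- setdiff(cols, names(sumstats))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # Extract the required columns.
  out <- sumstats[, cols, drop = FALSE]

  # Keep autosomes only, which drops X and Y (assumed labels: 1..22).
  out$chr <- sub("^chr", "", as.character(out$chr), ignore.case = TRUE)
  out <- out[out$chr %in% as.character(1:22), , drop = FALSE]
  out$chr <- as.integer(out$chr)

  # z-score is the effect size divided by its standard error.
  out$z <- out$beta / out$se

  # Convert alleles to upper case.
  out$a0 <- toupper(as.character(out$a0))
  out$a1 <- toupper(as.character(out$a1))

  # Assumed indel definition: either allele is not a single character.
  if (remove_indels) {
    is_snv <- nchar(out$a0) == 1 & nchar(out$a1) == 1
    out <- out[is_snv, , drop = FALSE]
  }

  # Sort by chromosome and position.
  out <- out[order(out$chr, out$pos), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy data: one X-chromosome variant and one indel should be dropped.
toy <- data.frame(
  chr  = c("2", "1", "X", "1"),
  pos  = c(500, 200, 100, 100),
  beta = c(0.10, -0.20, 0.30, 0.05),
  se   = c(0.05, 0.10, 0.10, 0.05),
  a0   = c("a", "G", "C", "AT"),
  a1   = c("g", "T", "T", "A"),
  snp  = c("rs1", "rs2", "rs3", "rs4"),
  pval = c(0.05, 0.05, 0.003, 0.3),
  stringsAsFactors = FALSE
)
cleaned <- clean_sumstats_sketch(toy)
```

In this toy input, rs3 sits on the X chromosome and rs4 has a two-base reference allele, so only rs2 and rs1 remain, in that order.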