Extracts the required columns from the summary statistics, checks the chromosome column, removes the X and Y chromosomes, computes z-scores, converts alleles to upper case, removes indels, and sorts the result by chromosome and position.
clean_sumstats(
  sumstats,
  chr = "chr",
  pos = "pos",
  beta = "beta",
  se = "se",
  a0 = "a0",
  a1 = "a1",
  snp = "snp",
  pval = "pval",
  remove_indels = TRUE
)
sumstats: A data frame of GWAS summary statistics. It must contain the following columns: chromosome, position, beta, standard error, a0, a1, SNP ID (rsID), and p-value.
chr: Name of the chromosome column in the summary statistics.
pos: Name of the position column (base-pair position).
beta: Name of the beta (effect size) column. If the summary statistics report odds ratios, transform them to log(odds ratio) first.
se: Name of the standard error (se) column.
a0: Name of the reference allele column.
a1: Name of the association/effect allele column.
snp: Name of the SNP ID (rsID) column.
pval: Name of the p-value column.
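The odds-ratio note on the beta column can be illustrated with a short base R sketch. The data frame and its column names (`or`, `or_lower`, `or_upper`) are hypothetical and the values are invented. The standard error is recovered from a 95% confidence interval, which assumes the interval was computed on the log scale with a normal approximation.

```r
# Toy summary statistics reporting odds ratios; all values are invented for illustration.
sumstats_or <- data.frame(
  snp      = c("rs1", "rs2", "rs3"),
  or       = c(1.20, 0.85, 1.00),
  or_lower = c(1.05, 0.70, 0.90),  # lower bound of the 95% CI (hypothetical column)
  or_upper = c(1.37, 1.03, 1.11),  # upper bound of the 95% CI (hypothetical column)
  stringsAsFactors = FALSE
)

# Effect size on the log-odds scale: beta = log(OR).
sumstats_or$beta <- log(sumstats_or$or)

# Approximate standard error from the width of the 95% CI on the log scale.
sumstats_or$se <- (log(sumstats_or$or_upper) - log(sumstats_or$or_lower)) /
  (2 * qnorm(0.975))
```

After this step, `beta` and `se` can be passed as the effect size and standard error columns.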
A data frame of cleaned summary statistics, sorted by chromosome and position.
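The cleaning steps listed in the description can be approximated in base R as shown below. This is a minimal sketch under stated assumptions, not the implementation of clean_sumstats: the input values are invented, the output column name `zscore` is hypothetical, stripping a "chr" prefix stands in for the undocumented chromosome check, and an indel is taken to be any variant with an allele longer than one base.

```r
# Toy input; column names match the defaults in the usage, values are invented.
sumstats <- data.frame(
  chr  = c("chr2", "1", "X", "1", "3"),
  pos  = c(500L, 2000L, 100L, 1000L, 300L),
  beta = c(0.10, -0.20, 0.05, 0.30, 0.15),
  se   = c(0.05, 0.10, 0.05, 0.10, 0.05),
  a0   = c("a", "G", "C", "T", "AT"),
  a1   = c("g", "A", "T", "c", "A"),
  snp  = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  pval = c(0.046, 0.046, 0.32, 0.0027, 0.0027),
  stringsAsFactors = FALSE
)

# Keep only the required columns.
cleaned <- sumstats[, c("chr", "pos", "beta", "se", "a0", "a1", "snp", "pval")]

# Assumed chromosome check: strip an optional "chr" prefix, then keep
# autosomes only, which drops X and Y.
cleaned$chr <- sub("^chr", "", cleaned$chr, ignore.case = TRUE)
cleaned <- cleaned[cleaned$chr %in% as.character(1:22), ]
cleaned$chr <- as.integer(cleaned$chr)

# z-score = beta / se (the column name "zscore" is hypothetical).
cleaned$zscore <- cleaned$beta / cleaned$se

# Convert alleles to upper case.
cleaned$a0 <- toupper(cleaned$a0)
cleaned$a1 <- toupper(cleaned$a1)

# Drop indels, assumed here to be variants with a multi-base allele.
cleaned <- cleaned[nchar(cleaned$a0) == 1 & nchar(cleaned$a1) == 1, ]

# Sort by chromosome, then by position.
cleaned <- cleaned[order(cleaned$chr, cleaned$pos), ]
rownames(cleaned) <- NULL
```

In this toy input, rs3 is dropped because it lies on the X chromosome and rs5 because its reference allele spans two bases; the remaining rows are ordered by chromosome and position.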