R/ctwas_harmonize_data.R
harmonize_z_ld.Rd
Harmonize GWAS z scores to match the LD reference genotypes. Signs are flipped when the reverse complement matches.
harmonize_z_ld(
z_snp,
ld_snpinfo,
strand_ambig_action = c("drop", "none", "recover"),
ld_pgenfs = NULL,
ld_Rinfo = NULL
)
a data frame with columns "id", "A1", "A2" and "z", giving the z score for every SNP. "A1" is the effect allele.
a data frame of SNP information for the LD reference, with columns "chrom", "id", "pos", "alt" and "ref".
the action to take to harmonize strand-ambiguous variants (A/T, G/C) between the z scores and the LD reference. "drop" removes ambiguous variants from the z scores. "none" treats the variant as unambiguous, flipping the z score to match the LD reference and taking no additional action. "recover" imputes the sign of ambiguous z scores using unambiguous z scores and the LD reference, and flips a z score if its imputed sign and observed sign disagree. This option is computationally intensive.
a character vector of .pgen or .bed files, one file per chromosome, ordered from chromosome 1 to 22; the vector must therefore have length 22. If .pgen files are given, the .pvar and .psam files are assumed to be present in the same directory. If .bed files are given, the .bim and .fam files are assumed to be present in the same directory.
a vector of paths to the variant information for all LD matrices
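The file-vector requirement for ld_pgenfs can be sketched as follows. The directory and the "ld_chr" file prefix are hypothetical placeholders, not names used by the package; only the one-file-per-chromosome ordering and the companion .pvar/.psam files come from the argument description.

```r
# Hypothetical location and file prefix; replace with your own LD reference.
ld_dir <- file.path("path", "to", "ld_reference")

# One .pgen file per chromosome, in the order 1 to 22.
ld_pgenfs <- file.path(ld_dir, paste0("ld_chr", 1:22, ".pgen"))
stopifnot(length(ld_pgenfs) == 22)

# The matching .pvar and .psam files are expected in the same directory.
ld_pvarfs <- sub("\\.pgen$", ".pvar", ld_pgenfs)
ld_psamfs <- sub("\\.pgen$", ".psam", ld_pgenfs)

# List any files that are not on disk before passing ld_pgenfs on.
all_files <- c(ld_pgenfs, ld_pvarfs, ld_psamfs)
missing_files <- all_files[!file.exists(all_files)]
```

Checking missing_files up front is a design choice of this sketch: it reports an incomplete reference panel before any harmonization work starts.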
a data frame: z_snp with the "z" column flipped to match the LD reference.
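The allele matching behind the "drop" and "none" actions can be illustrated with a minimal base R sketch. This is not the package's implementation: complement(), is_strand_ambiguous() and harmonize_sketch() are hypothetical helpers, the sketch assumes that "A1" corresponds to "alt" in the LD reference, and the "recover" action is not covered.

```r
# Illustrative only: hypothetical helpers, not the ctwas implementation.
complement <- function(a) chartr("ACGT", "TGCA", a)

# A/T and G/C pairs are unchanged by a strand flip, hence ambiguous.
is_strand_ambiguous <- function(a1, a2) a1 == complement(a2)

harmonize_sketch <- function(z_snp, ld_snpinfo,
                             strand_ambig_action = c("drop", "none")) {
  strand_ambig_action <- match.arg(strand_ambig_action)

  # Keep SNPs found in the LD reference and line up its alleles.
  idx <- match(z_snp$id, ld_snpinfo$id)
  z_snp <- z_snp[!is.na(idx), , drop = FALSE]
  idx <- idx[!is.na(idx)]
  alt <- ld_snpinfo$alt[idx]
  ref <- ld_snpinfo$ref[idx]   # assumption: A1 corresponds to alt

  a1 <- z_snp$A1
  a2 <- z_snp$A2
  ambiguous <- is_strand_ambiguous(a1, a2)

  same <- a1 == alt & a2 == ref   # alleles already agree
  swap <- a1 == ref & a2 == alt   # alleles swapped: flip the sign
  c1 <- complement(a1)
  c2 <- complement(a2)
  strand_same <- !same & !swap & c1 == alt & c2 == ref
  strand_swap <- !same & !swap & c1 == ref & c2 == alt

  # Flip z where alleles are swapped, directly or after a strand flip.
  flip <- swap | strand_swap
  z_snp$z[flip] <- -z_snp$z[flip]

  # Report matched SNPs with the LD reference alleles.
  matched <- same | swap | strand_same | strand_swap
  z_snp$A1[matched] <- alt[matched]
  z_snp$A2[matched] <- ref[matched]

  if (strand_ambig_action == "drop") {
    z_snp <- z_snp[!ambiguous, , drop = FALSE]
  }
  z_snp
}

z_snp <- data.frame(id = c("rs1", "rs2", "rs3", "rs4"),
                    A1 = c("A", "G", "A", "C"),
                    A2 = c("G", "A", "T", "T"),
                    z = c(1.5, -2.0, 0.7, 3.0),
                    stringsAsFactors = FALSE)
ld_snpinfo <- data.frame(chrom = 1,
                         id = c("rs1", "rs2", "rs3", "rs4"),
                         pos = c(100, 200, 300, 400),
                         alt = c("A", "A", "T", "A"),
                         ref = c("G", "G", "A", "G"),
                         stringsAsFactors = FALSE)

out_drop <- harmonize_sketch(z_snp, ld_snpinfo, "drop")  # rs3 (A/T) removed
out_none <- harmonize_sketch(z_snp, ld_snpinfo, "none")  # rs3 kept, flipped
```

In this toy input, rs2 has swapped alleles and rs4 matches only after taking the complement with swapped alleles, so both signs are flipped. rs3 is an A/T variant, so it is removed under "drop" and flipped like any other swapped variant under "none".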