R/susie_susie_rss.R
susie_rss.Rd
susie_rss
performs sum of single-effect linear
regression with z scores; all posterior calculations are for
z-scores. This function fits the regression model \(z = \sum_l
R*b_l + e\), where e is \(N(0,R)\) and \(\sum_l b_l\) is a
p-vector of effects to be estimated. The required summary data are
the p by p correlation matrix, R
, and the p-vector z
.
susie_rss(
z,
R,
maf = NULL,
maf_thresh = 0,
z_ld_weight = 0,
L = 10,
prior_variance = 50,
residual_variance = NULL,
prior_weights = NULL,
null_weight = NULL,
estimate_residual_variance = FALSE,
estimate_prior_variance = TRUE,
estimate_prior_method = c("optim", "EM", "simple"),
check_null_threshold = 0,
prior_tol = 1e-09,
max_iter = 100,
s_init = NULL,
intercept_value = 0,
coverage = 0.95,
min_abs_corr = 0.5,
tol = 0.001,
verbose = FALSE,
track_fit = FALSE,
check_R = FALSE,
r_tol = 1e-08,
refine = FALSE
)
A p-vector of z scores.
A p by p symmetric, positive semidefinite correlation matrix.
Minor allele frequency; to be used along with
maf_thresh
to filter input summary statistics.
Variants having a minor allele frequency smaller than this threshold are not used.
This feature is not recommended. The weights
assigned to the z scores in the LD matrix. If z_ld_weight >
0
, the LD matrix used in the model is cov2cor((1-w)*R +
w*tcrossprod(z))
, where w = z_ld_weight
.
Number of components (nonzero coefficients) in the susie regression model. If L is larger than the number of covariates, p, L is set to p.
The prior variance. It is either a scalar or a vector of length L.
Variance of the residual. If it is not specified, we set it to 1.
A vector of length p, in which each entry gives the prior probability that SNP j has non-zero effect.
Prior probability of no effect (a number between 0 and 1, and cannot be exactly 1).
The residual variance is
fixed to the value supplied by residual_variance
. We don't
estimate residual variance in susie_rss.
If estimate_prior_variance =
TRUE
, the prior variance is estimated (this is a separate
parameter for each of the L effects). If provided,
prior_variance
is then used as an initial value for
the optimization. When estimate_prior_variance = FALSE
, the
prior variance for each of the L effects is determined by the
value supplied to prior_variance
.
The method used for estimating prior
variance. When estimate_prior_method = "simple"
is used, the
likelihood at the specified prior variance is compared to the
likelihood at a variance of zero, and the setting with the larger
likelihood is retained.
When the prior variance is estimated,
compare the estimate with the null, and set the prior variance to
zero unless the log-likelihood using the estimate is larger by this
threshold amount. For example, if you set
check_null_threshold = 0.1
, this will "nudge" the estimate
towards zero when the difference in log-likelihoods is small. A
note of caution that setting this to a value greater than zero may
lead the IBSS fitting procedure to occasionally decrease the ELBO.
When the prior variance is estimated, compare the
estimated value to prior_tol
at the end of the computation,
and exclude a single effect from PIP computation if the estimated
prior variance is smaller than this tolerance value.
Maximum number of IBSS iterations to perform.
A previous susie fit with which to initialize.
The intercept. (The intercept cannot be
estimated from centered summary data.) This setting will be used by
coef
to assign an intercept value, mainly for consistency
with susie
. Set to NULL
if you want coef
not
to include an intercept term (and so only a p-vector is returned).
A number between 0 and 1 specifying the “coverage” of the estimated confidence sets.
Minimum absolute correlation allowed in a credible set. The default, 0.5, corresponds to a squared correlation of 0.25, which is a commonly used threshold for genotype data in genetic studies.
A small, non-negative number specifying the convergence
tolerance for the IBSS fitting procedure. The fitting procedure
will halt when the difference in the variational lower bound, or
“ELBO” (the objective function to be maximized), is
less than tol
.
If verbose = TRUE
, the algorithm's progress,
and a summary of the optimization settings, are printed to the
console.
If track_fit = TRUE
, trace
is also returned containing detailed information about the
estimates at each iteration of the IBSS fitting procedure.
If check_R = TRUE
, check that R
is
positive semidefinite.
Tolerance level for eigenvalue check of positive semidefinite matrix of R.
If refine = TRUE
, we use a procedure to help
SuSiE get out of local optimum.
A "susie"
object with some or all of the following
elements:
An L by p matrix of posterior inclusion probabilites.
An L by p matrix of posterior means, conditional on inclusion.
An L by p matrix of posterior second moments, conditional on inclusion.
log-Bayes Factor for each single effect.
log-Bayes Factor for each variable and single effect.
Fixed Intercept.
Fixed Residual variance.
Prior variance of the non-zero elements of b, equal to
scaled_prior_variance * var(Y)
.
The value of the variational lower bound, or “ELBO” (objective function to be maximized), achieved at each iteration of the IBSS fitting procedure.
Vector of length n containing the fitted values of the outcome.
Credible sets estimated from model fit; see
susie_get_cs
for details.
A vector of length p giving the (marginal) posterior inclusion probabilities for all p covariates.
Number of IBSS iterations that were performed.
TRUE
or FALSE
indicating whether
the IBSS converged to a solution within the chosen tolerance
level.
An p-vector of t(X)
times fitted values, X
%*% colSums(alpha*mu)
.
set.seed(1)
n = 1000
p = 1000
beta = rep(0,p)
beta[1:4] = 1
X = matrix(rnorm(n*p),nrow = n,ncol = p)
X = scale(X,center = TRUE,scale = TRUE)
y = drop(X %*% beta + rnorm(n))
input_ss <- compute_ss(X,y,standardize = TRUE)
#> Error in compute_ss(X, y, standardize = TRUE): could not find function "compute_ss"
ss <- univariate_regression(X,y)
#> Error in univariate_regression(X, y): could not find function "univariate_regression"
R <- with(input_ss,cov2cor(XtX))
#> Error in with(input_ss, cov2cor(XtX)): object 'input_ss' not found
zhat <- with(ss,betahat/sebetahat)
#> Error in with(ss, betahat/sebetahat): object 'ss' not found
res <- susie_rss(zhat,R,L = 10)
#> Error in susie_rss(zhat, R, L = 10): could not find function "susie_rss"