GaussianSampler
- class knockpy.knockoffs.GaussianSampler(X, mu=None, Sigma=None, invSigma=None, groups=None, sample_tol=1e-05, S=None, method=None, verbose=False, **kwargs)
Bases: knockpy.knockoffs.KnockoffSampler
Samples MX Gaussian (group) knockoffs.
- Parameters
- X np.ndarray
The (n, p)-shaped design.
- mu np.ndarray
(p, )-shaped mean of the features. If None, this defaults to the empirical mean of the features.
- Sigma np.ndarray
(p, p)-shaped covariance matrix of the features. If None, this is estimated using the utilities.estimate_covariance function.
- groups np.ndarray
For group knockoffs, a p-length array of integers from 1 to num_groups such that groups[j] == i indicates that variable j is a member of group i. Defaults to None (regular knockoffs).
- S np.ndarray
The (p, p)-shaped knockoff S-matrix used to generate knockoffs. This is defined such that Cov(X, tilde(X)) = Sigma - S. When None, it is constructed by the knockoff generator. Defaults to None.
- method str
Specifies how to construct the S matrix. This is ignored if S is not None. There are several options:
‘mvr’: Minimum Variance-Based Reconstructability knockoffs.
‘mmi’: Minimizes the mutual information between X and the knockoffs.
‘ci’: Conditional independence knockoffs.
‘sdp’: Minimizes the mean absolute covariance (MAC) between the features and the knockoffs.
‘equicorrelated’: Minimizes the MAC under the constraint that the correlation between each feature and its knockoff is the same.
The default is to use mvr for non-group knockoffs and the group-SDP for grouped knockoffs (the implementation for group mvr knockoffs is currently fairly slow). In both cases, a block-diagonal approximation is used if the number of features is greater than 1000.
- objective str
How to optimize the S matrix if using the SDP for group knockoffs. There are several options:
‘abs’: minimize sum(abs(Sigma - S)) between groups and the group knockoffs.
‘pnorm’: minimize the Lp-th matrix norm. Equivalent to abs when p = 1.
‘norm’: minimize a different type of matrix norm (see norm_type below).
- sample_tol float
Minimum eigenvalue allowed for feature-knockoff covariance matrix. Keep this small but nonzero (1e-5) to prevent numerical errors.
- verbose bool
If True, prints progress over time.
- rec_prop float
The proportion of knockoffs to recycle (see Barber and Candes 2018, https://arxiv.org/abs/1602.03574). If method = ‘mvr’, then S_generation takes this into account, which should increase the power of recycled knockoffs in sparsely-correlated, high-dimensional settings.
- kwargs dict
Other kwargs for S-matrix solvers.
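To make the roles of Sigma, S, and sample_tol concrete, the following is a minimal NumPy sketch of Gaussian knockoff sampling. It is not knockpy's implementation. The helper names (equicorrelated_S, sample_gaussian_knockoffs) are hypothetical, Sigma is assumed to be a correlation matrix, and the shrinking rule is a simplified stand-in for how a tolerance like sample_tol might be enforced. It relies on the standard conditional law for Gaussian model-X knockoffs: given X, the knockoffs are Gaussian with mean X - (X - mu) Sigma^{-1} S and covariance 2S - S Sigma^{-1} S.

```python
import numpy as np


def equicorrelated_S(Sigma):
    """Hypothetical helper: S = s * I with s = min(1, 2 * lambda_min(Sigma)).

    Assumes Sigma is a correlation matrix (unit diagonal).
    """
    lam_min = np.linalg.eigvalsh(Sigma)[0]  # eigvalsh sorts ascending
    return min(1.0, 2.0 * lam_min) * np.eye(Sigma.shape[0])


def sample_gaussian_knockoffs(X, mu, Sigma, S, sample_tol=1e-5, rng=None):
    """Hypothetical helper: draw knockoffs for the rows of X.

    Returns the knockoffs and the (possibly shrunk) S actually used.
    """
    rng = np.random.default_rng() if rng is None else rng
    Sigma_inv = np.linalg.inv(Sigma)

    # Shrink S until the conditional covariance V = 2S - S Sigma^{-1} S
    # has a minimum eigenvalue of at least sample_tol (simplified rule).
    for _ in range(1000):
        V = 2.0 * S - S @ Sigma_inv @ S
        if np.linalg.eigvalsh(V)[0] >= sample_tol:
            break
        S = 0.99 * S
    else:
        raise ValueError("could not reach sample_tol by shrinking S")

    # Conditional mean of the knockoffs given X.
    cond_mean = X - (X - mu) @ Sigma_inv @ S
    # Rows of Z @ L.T have covariance L @ L.T = V.
    L = np.linalg.cholesky(V)
    Z = rng.standard_normal(X.shape)
    return cond_mean + Z @ L.T, S


# Usage on synthetic equicorrelated Gaussian features.
rng = np.random.default_rng(0)
p, rho = 5, 0.5
Sigma = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
mu = np.zeros(p)
X = rng.multivariate_normal(mu, Sigma, size=50_000)
Xk, S_used = sample_gaussian_knockoffs(
    X, mu, Sigma, equicorrelated_S(Sigma), rng=rng
)

# Empirical Cov(X, Xk), which should be close to Sigma - S_used.
cross_cov = (X - X.mean(0)).T @ (Xk - Xk.mean(0)) / (X.shape[0] - 1)
```

With the real class, the equivalent steps would go through the constructor arguments and the sample_knockoffs and fetch_S methods documented on this page.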
Methods
check_PSD_condition(Sigma, S)
Checks that the feature-knockoff cov matrix is PSD.
check_xk_validity(X, Xk[, testname, alpha])
Runs a variety of KS tests on X and Xk to (informally) check that Xk are valid knockoffs for X.
fetch_S()
Fetches knockoff S-matrix.
many_ks_tests(sample1s, sample2s)
Takes two lists of arrays, sample1s and sample2s. Gets p-values by running KS tests and then applies a multiple testing correction.
sample_knockoffs([check_psd])
Samples knockoffs.
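As an illustration of the PSD condition that check_PSD_condition refers to, here is a small NumPy sketch. It is an assumption about what such a check involves, not the library's code, and the names joint_covariance and is_valid_S are hypothetical. Since Cov(X, tilde(X)) = Sigma - S, the joint covariance of the features and knockoffs is the 2p x 2p block matrix built below, and S is usable only if that matrix is positive semidefinite.

```python
import numpy as np


def joint_covariance(Sigma, S):
    """Hypothetical helper: covariance of the stacked vector (X, Xk)."""
    C = Sigma - S  # cross-covariance between features and knockoffs
    return np.block([[Sigma, C], [C, Sigma]])


def is_valid_S(Sigma, S, tol=1e-5):
    """Hypothetical helper: True if the joint covariance has min eigenvalue >= tol."""
    G = joint_covariance(Sigma, S)
    return bool(np.linalg.eigvalsh(G)[0] >= tol)


# Equicorrelated features with correlation 0.5.
p, rho = 4, 0.5
Sigma = (1 - rho) * np.eye(p) + rho * np.ones((p, p))

ok = is_valid_S(Sigma, 0.9 * np.eye(p))       # S small enough: valid
too_big = is_valid_S(Sigma, 1.5 * np.eye(p))  # S too large: joint matrix not PSD
```

The eigenvalues of the block matrix are those of 2 * Sigma - S together with those of S, so the check is equivalent to requiring both of those matrices to be PSD.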
Methods Summary
fetch_S()
Fetches knockoff S-matrix.
sample_knockoffs([check_psd])
Samples knockoffs.
Methods Documentation