Variance Decomposition Module

Public interface

class limix.vardec.VarianceDecomposition(Y, standardize=False)[source]

Class for variance decomposition

Parameters:
  • Y (ndarray) – (N, P) phenotype matrix
  • standardize (bool) – if True, impute missing phenotype values by mean value, zero-mean and unit-variance phenotype. The default value is False.

Examples

Basic example of usage.

>>> from numpy.random import RandomState
>>> from limix.vardec import VarianceDecomposition
>>> from numpy import dot, eye, ones
>>> from numpy import set_printoptions
>>> set_printoptions(4)
>>> random = RandomState(1)
>>>
>>> N = 100
>>> P = 3
>>>
>>> # generate data
>>> pheno = random.randn(N, P)
>>> W = random.randn(N, 10)
>>> kinship = dot(W, W.T) / float(10)
>>> kinship+= 1e-4*eye(N)
>>> F1 = ones((N, 1))
>>> A1 = eye(P)
>>> F2 = random.randn(N, 2)
>>> A2 = ones((1, P))
>>>
>>> vc = VarianceDecomposition(pheno)
>>> vc.addFixedEffect(F1, A1)
>>> vc.addFixedEffect(F2, A2)
>>> vc.addRandomEffect(K=kinship)
>>> vc.addRandomEffect(is_noise=True)
>>> conv = vc.optimize()
>>> print(conv)
True
>>>
>>> print(vc.getTraitCovar(0))
[[ 0.0097  0.023  -0.0032]
 [ 0.023   0.0562 -0.0062]
 [-0.0032 -0.0062  0.0096]]
>>> print(vc.getTraitCovar(1))
[[ 1.0029 -0.1212 -0.0052]
 [-0.1212  0.8284 -0.0412]
 [-0.0052 -0.0412  0.8147]]
>>> print(vc.getWeights(0))
[[ 0.039   0.0899  0.1213]]
>>> print(vc.getWeights(1))
[[-0.0176]
 [-0.0095]]
addFixedEffect(F=None, A=None, Ftest=None)[source]

add fixed effect term to the model

Parameters:
  • F – sample design matrix for the fixed effect [N,K]
  • A – trait design matrix for the fixed effect (e.g. sp.ones((1,P)) common effect; sp.eye(P) any effect) [L,P]
  • Ftest – sample design matrix for test samples [Ntest,K]
addRandomEffect(K=None, is_noise=False, normalize=False, Kcross=None, trait_covar_type=’freeform’, rank=1, fixed_trait_covar=None, jitter=0.0001)[source]

Add random effects term.

Parameters:
  • K – Sample Covariance Matrix [N, N]
  • is_noise – Boolean indicator specifying if the matrix is homoscedastic noise (weighted identity covariance) (default False)
  • normalize – Boolean indicator specifying if K has to be normalized such that K.trace()=N.
  • Kcross – NxNtest cross covariance for predictions
  • trait_covar_type – type of covaraince to use. Default ‘freeform’. possible values are ‘freeform’: general semi-definite positive matrix, ‘fixed’: fixed matrix specified in fixed_trait_covar, ‘diag’: diagonal matrix, ‘lowrank’: low rank matrix. The rank of the lowrank part is specified in the variable rank, ‘lowrank_id’: sum of a lowrank matrix and a multiple of the identity. The rank of the lowrank part is specified in the variable rank, ‘lowrank_diag’: sum of a lowrank and a diagonal matrix. The rank of the lowrank part is specified in the variable rank, ‘block’: multiple of a matrix of ones, ‘block_id’: sum of a multiple of a matrix of ones and a multiple of the idenity, ‘block_diag’: sum of a multiple of a matrix of ones and a diagonal matrix,
  • rank – rank of a possible lowrank component (default 1)
  • fixed_trait_covar – PxP matrix for the (predefined) trait-to-trait covariance matrix if fixed type is used
  • jitter – diagonal contribution added to trait-to-trait covariance matrices for regularization
crossValidation(seed=0, n_folds=10, fullVector=True, verbose=None, D=None, **keywords)[source]

Split the dataset in n folds, predict each fold after training the model on all the others

Parameters:
  • seed – seed
  • n_folds – number of folds to train the model on
  • fullVector – Bolean indicator, if true it stops if no convergence is observed for one of the folds, otherwise goes through and returns a pheno matrix with missing values
  • verbose – if true, prints the fold that is being used for predicitons
  • **keywords – params to pass to the function optimize
Returns:

Matrix of phenotype predictions [N,P]

getLML()[source]

Return log marginal likelihood

getLMLgrad()[source]

Return gradient of log-marginal likelihood

getTraitCorrCoef(term_i=None)[source]
Return the estimated trait correlation coefficient matrix for term_i (or the total if term_i is None)
To retrieve the trait covariance matrix use see getTraitCovar
Parameters:term_i – index of the random effect term we want to retrieve the correlation coefficients
Returns:estimated trait correlation coefficient matrix
getTraitCovar(term_i=None)[source]
Return the estimated trait covariance matrix for term_i (or the total if term_i is None)
To retrieve the matrix of correlation coefficient use see getTraitCorrCoef
Parameters:term_i – index of the random effect term we want to retrieve the covariance matrix
Returns:estimated trait covariance
getTraitCovarFun(term_i)[source]

Return the trait limix covariance function for term term_i

Parameters:term_i – index of the ranodm effect to consider
Returns:LIMIX cvoariance function
getTraitCovarStdErrors(term_i)[source]

Returns standard errors on trait covariances from term_i (for the covariance estimate see getTraitCovar)

Parameters:term_i – index of the term we are interested in
getVarianceCompStdErrors(univariance=False)[source]

Return the standard errors on the estimated variance components (for variance component estimates see getVarianceComps)

Parameters:univariance – Boolean indicator, if True variance components are normalized to sum up to 1 for each trait
Returns:standard errors on variance components [P, n_randEffs matrix]
getVarianceComps(univariance=False)[source]

Return the estimated variance components

Parameters:univariance – Boolean indicator, if True variance components are normalized to sum up to 1 for each trait
Returns:variance components of all random effects on all phenotypes [P, n_randEffs matrix]
getWeights(term_i=None)[source]

Return weights for fixed effect term term_i

Parameters:term_i – fixed effect term index
Returns:weights of the spefied fixed effect term. The output will be a KxL matrix of weights will be returned, where K is F.shape[1] and L is A.shape[1] of the correspoding fixed effect term (L will be always 1 for single-trait analysis).
optimize(init_method=’default’, inference=None, n_times=10, perturb=False, pertSize=0.001, verbose=False)[source]

Train the model using the specified initialization strategy

Parameters:
  • init_method – initialization strategy: ‘default’: variance is equally split across the different random effect terms. For mulit-trait models the empirical covariance between traits is used ‘random’: variance component parameters (scales) are sampled from a normal distribution with mean 0 and std 1, None: no initialization is considered. Initial parameters can be specifies by using the single covariance getTraitCovarfun()
  • inference – inference gp method, by default algebrically efficient inference (i.e., gp2kronSum, gp2KronSumLR, gp3KronSumLR) will be used when possible. For models with high a standard inference scheme (gp_base) will be used.
  • n_times – number of restarts to converge
  • perturb – if true, the initial point (if random initializaiton is not being used) is perturbed with gaussian noise for each restart (default, False)
  • perturbSize – std of the gassian noise used to perturb the initial point
  • verbose – print if convergence is achieved and how many restarted were needed
optimize_with_repeates(fast=None, verbose=None, n_times=10, lambd=None, lambd_g=None, lambd_n=None)[source]

Train the model repeadly up to a number specified by the users with random restarts and return a list of all relative minima that have been found. This list is sorted according to least likelihood. Each list term is a dictionary with keys “counter”, “LML”, and “scales”.

After running this function, the vc object will be set at the last iteration. Thus, if you wish to get the vc object of one of the repeats, then set the scales. For example:

vc.setScales(scales=optimize_with_repeates_output[0][“scales”])

Parameters:
  • fast – Boolean. if set to True initalize kronSumGP
  • verbose – Boolean. If set to True, verbose output is produced. (default True)
  • n_times – number of re-starts of the optimization. (default 10)
predictPhenos(use_fixed=None, use_random=None)[source]

predict the conditional mean (BLUP)

Parameters:
  • use_fixed – list of fixed effect indeces to use for predictions
  • use_random – list of random effect indeces to use for predictions
Returns:

predictions (BLUP)

setTestSampleSize(Ntest)[source]

set sample size of the test dataset

Parameters:Ntest – sample size of the test set
setY(Y, standardize=False)[source]

Set phenotype matrix

Parameters:
  • Y – phenotype matrix [N, P]
  • standardize – if True, phenotype is standardized (zero mean, unit variance)