Interaction Set Test (iSet)

Joint genetic models for multiple traits have helped to enhance association analyses. Most existing multi-trait models have been designed to increase power for detecting associations, whereas the analysis of interactions has received considerably less attention. Here, we implement fit_iSet() [CHRS17], a method based on linear mixed models to test for interactions between sets of variants and environmental states or other contexts. Our model generalizes previous interaction tests and in particular provides a test for local differences in the genetic architecture between contexts. We first use simulations to validate iSet before applying the model to the analysis of genotype-environment interactions in an eQTL study. Our model retrieves a larger number of interactions than alternative methods and reveals that up to 20% of cases show context-specific configurations of causal variants. Finally, we apply iSet to test for sub-group specific genetic effects in human lipid levels in a large human cohort, where we identify a gene-sex interaction for C-reactive protein that is missed by alternative methods.

References

[CHRS17] Casale FP, Horta D, Rakitsch B, Stegle O (2017) Joint genetic analysis using variant sets reveals polygenic gene-context interactions. PLOS Genetics 13(4): e1006693.

Public interface

limix.iset.fit_iSet(Y=None, Xr=None, F=None, Rg=None, Ug=None, Sg=None, Ie=None, n_nulls=10, factr=10000000.0)

Fit interaction set test (iSet).

Parameters:
  • Y (ndarray) – For complete designs, the phenotype ndarray Y for N samples and C contexts has shape (N, C). For stratified designs, the phenotype ndarray Y has shape (N, 1): each individual is phenotyped in only one context, and Ie specifies in which context each individual has been phenotyped.
  • Xr (ndarray) – (N, S) genotype values for N samples and S variants (defines the set component)
  • F (ndarray, optional) – (N, K) ndarray of K covariates for N individuals. By default, F is a (N, 1) array of ones.
  • Rg (ndarray, optional) – (N, N) ndarray of LMM-covariance/kinship coefficients. Ug and Sg can be provided instead of Rg. If neither Rg nor Ug and Sg are provided, the null model has iid normal residuals.
  • Ug (ndarray, optional) – (N, N) ndarray of eigenvectors of Rg. Ug and Sg can be provided instead of Rg. If neither Rg nor Ug and Sg are provided, iid normal residuals are considered.
  • Sg (ndarray, optional) – (N, ) ndarray of eigenvalues of Rg. Ug and Sg can be provided instead of Rg. If neither Rg nor Ug and Sg are provided, iid normal residuals are considered.
  • Ie (ndarray, optional) – (N, 1) boolean ndarray indicator for the analysis of stratified designs. More specifically, Ie specifies in which context each individual has been phenotyped. Needs to be specified for the analysis of stratified designs.
  • n_nulls (int, optional) – number of parametric bootstraps. This parameter determines the minimum P value that can be estimated. The default value is 10.
  • factr (float, optional) – optimization parameter that determines the accuracy of the solution (see scipy.optimize.fmin_l_bfgs_b for more details).
Returns:

tuple containing:
  • df (:class:`pandas.DataFrame`): contains test statistics of mtSet, iSet, and iSet-GxC tests and the variance explained by persistent, GxC and heterogeneity-GxC effects.
  • df0 (:class:`pandas.DataFrame`): contains null test statistics of mtSet, iSet, and iSet-GxC tests.

Return type:

(tuple)
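
The link between n_nulls and the smallest P value that can be estimated can be illustrated with a small sketch. It assumes a generic add-one empirical P value estimator, which is a common choice but not necessarily the one used inside limix; the helper name empirical_pvalue and all numbers are hypothetical. With this estimator, and with nulls taken from a single tested set, the smallest attainable P value is 1 / (n_nulls + 1).

```python
def empirical_pvalue(observed_llr, null_llrs):
    """Empirical P value with add-one correction (an assumed, generic estimator)."""
    # Count null statistics that are at least as large as the observed one.
    n_exceed = sum(1 for llr in null_llrs if llr >= observed_llr)
    return (1 + n_exceed) / (1 + len(null_llrs))


# Hypothetical null log-likelihood ratios, as might come from 10 bootstraps.
null_llrs = [0.0, 0.1, 0.3, 0.0, 0.7, 1.2, 0.05, 0.4, 0.0, 0.9]

# Three nulls (0.7, 1.2, 0.9) reach the observed value of 0.5.
p_weak = empirical_pvalue(0.5, null_llrs)

# No null reaches 5.0, so the estimate hits its floor of 1 / (10 + 1).
p_strong = empirical_pvalue(5.0, null_llrs)
```

Increasing the number of null statistics lowers this floor, at the cost of additional model fits.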

Examples

This example shows how to fit iSet when considering complete designs and modelling population structure/relatedness by introducing the top principal components of the genetic relatedness matrix (pc_rrm) as fixed effects.

>>> from numpy.random import RandomState
>>> from limix.iset import fit_iSet
>>> from numpy import ones, concatenate
>>> import scipy as sp
>>>
>>> random = RandomState(1)
>>> sp.random.seed(0)
>>>
>>> N = 100
>>> C = 2
>>> S = 4
>>>
>>> snps = (random.rand(N, S) < 0.2).astype(float)
>>> pheno = random.randn(N, C)
>>> mean = ones((N, 1))
>>> pc_rrm = random.randn(N, 4)
>>> covs = concatenate([mean, pc_rrm], 1)
>>>
>>> df, df0 = fit_iSet(Y=pheno, Xr=snps, F=covs, n_nulls=2)
>>>
>>> print(df.round(3).T)
                           0
Heterogeneity-GxC var  0.000
Persistent Var         0.005
Rescaling-GxC Var      0.005
iSet LLR               0.137
iSet-het LLR          -0.000
mtSet LLR              0.166
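
In the example above pc_rrm is drawn at random for brevity. A minimal sketch of how such components might be derived from real data follows, assuming a genome-wide genotype matrix G (a hypothetical name, here simulated) and plain NumPy rather than any limix helper. It standardizes the variants, builds a realized relatedness matrix and keeps the eigenvectors with the largest eigenvalues.

```python
import numpy as np

rng = np.random.RandomState(0)
N, M, k = 100, 500, 4  # samples, genome-wide variants, number of PCs (illustrative)

# Hypothetical genome-wide genotypes, distinct from the tested set Xr.
G = (rng.rand(N, M) < 0.3).astype(float)

# Standardize each variant to zero mean and unit variance.
G = (G - G.mean(axis=0)) / G.std(axis=0)

# Realized relatedness matrix and its eigendecomposition.
# numpy.linalg.eigh returns eigenvalues in ascending order.
rrm = G @ G.T / M
eigvals, eigvecs = np.linalg.eigh(rrm)

# Reverse the column order and keep the k leading eigenvectors.
pc_rrm = eigvecs[:, ::-1][:, :k]

# Covariate matrix: intercept plus top PCs, shape (N, 1 + k).
covs = np.concatenate([np.ones((N, 1)), pc_rrm], axis=1)
```

The resulting covs array has the same layout as the F argument used in the example above.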

This example shows how to fit iSet when considering complete designs and modelling population structure/relatedness using the full genetic relatedness matrix.

>>> from numpy import dot, eye
>>> random = RandomState(1)
>>> sp.random.seed(0)
>>>
>>> W = random.randn(N, 10)
>>> kinship = dot(W, W.T) / float(10)
>>> kinship += 1e-4 * eye(N)
>>>
>>> df, df0 = fit_iSet(Y=pheno, Xr=snps, Rg=kinship, n_nulls=2)
>>>
>>> print(df.round(3).T)
                           0
Heterogeneity-GxC var  0.000
Persistent Var         0.005
Rescaling-GxC Var      0.005
iSet LLR               1.098
iSet-het LLR           1.014
mtSet LLR              0.154
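
The parameter list states that Ug and Sg can be provided instead of Rg. The following sketch shows one way to obtain them with numpy.linalg.eigh, assuming that any valid eigendecomposition of the symmetric kinship matrix is acceptable; whether fit_iSet expects a particular eigenvalue ordering is not specified here, so treat that as an open assumption.

```python
import numpy as np

rng = np.random.RandomState(1)
N = 100

# Simulated low-rank kinship with a small ridge, as in the example above.
W = rng.randn(N, 10)
kinship = W @ W.T / 10.0 + 1e-4 * np.eye(N)

# eigh is intended for symmetric matrices: it returns the eigenvalues with
# shape (N,) and the eigenvectors as columns of an (N, N) array.
Sg, Ug = np.linalg.eigh(kinship)

# Sanity check: Ug diag(Sg) Ug^T should reproduce the kinship matrix.
reconstructed = (Ug * Sg) @ Ug.T
```

Precomputing the decomposition once can be useful when many variant sets are tested against the same kinship matrix, since Ug and Sg can then be passed to each call in place of Rg.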

This example shows how to fit iSet when considering stratified designs and modelling population structure/relatedness by introducing the top principal components of the genetic relatedness matrix (pc_rrm) as fixed effects. iSet does not support models with a full genetic relatedness matrix for stratified designs.

>>> random = RandomState(1)
>>> sp.random.seed(0)
>>>
>>> pheno = random.randn(N, 1)
>>> Ie = random.randn(N) < 0.
>>>
>>> df, df0 = fit_iSet(Y=pheno, Xr=snps, F=covs, Ie=Ie, n_nulls=2)
>>>
>>> print(df.round(3).T)
                           0
Heterogeneity-GxC var -0.000
Persistent Var         0.064
Rescaling-GxC Var      0.006
iSet LLR               0.648
iSet-het LLR           0.000
mtSet LLR              1.177
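
In practice the indicator Ie would come from recorded context labels rather than random draws. The sketch below assumes a hypothetical array of sex labels and one phenotype value per individual, and builds inputs with the shapes used in the stratified example above. Which of the two contexts is encoded as True is an arbitrary choice made here for illustration.

```python
import numpy as np

# Hypothetical context labels, one per individual.
context = np.array(["female", "male", "male", "female", "male"])

# Hypothetical phenotype, measured once per individual in its own context.
y = np.array([0.3, -1.2, 0.8, 0.1, -0.4])

# Stratified designs use a single phenotype column of shape (N, 1).
Y = y.reshape(-1, 1)

# Boolean indicator of the context in which each individual was phenotyped.
# A flat (N,) array is built here, matching the example above.
Ie = context == "male"
```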

For more info and examples see the iSet tutorial.