TWAS measures the genetic association between gene expression and a complex phenotype using only GWAS summary-level data (see: Gusev et al. 2016 Nat Genet). The TWAS central dogma is that associated genes are more likely to be causal mediators of the disease and are thus informative about disease biology or useful as targets for experimental follow-up.
Rat TWAS Hub is an interactive browser of results from integrative analyses of GWAS and functional data. The aim is to facilitate the investigation of individual TWAS associations; pleiotropic disease/trait associations for a given gene of interest; predicted gene associations for a given disease/trait of interest with detailed per-locus statistics; and pleiotropic relationships between traits based on shared associated genes. See the USAGE tab for detailed examples of each analysis type. Rat TWAS Hub is managed in the labs of Abraham Palmer at UC San Diego and Pejman Mohammadi at Seattle Children’s Research Institute and University of Washington. It was adapted from the (human) TWAS hub developed in the Gusev Lab at the Dana-Farber Cancer Institute and Harvard Medical School. For questions or comments please contact Daniel Munro at dmunro@health.ucsd.edu.
For each trait, a TWAS is carried out using the FUSION software. FUSION post-processing is then used to extract all significant associations (after Bonferroni correction) and group them into contiguous loci, and a step-wise conditional analysis is performed to identify independent associations (see more below). TWAS-reporter is then run on all traits to generate Markdown-formatted reports, which are human-readable and can be flexibly converted to HTML, PDF, etc. We use a custom Jekyll layout to present these reports as a static website with data elements made interactive through JavaScript. Tables are handled by datatables.js and plots by plotly.js. All code is available on GitHub, originally forked from gusevlab/TWAS_HUB. All data for each trait are available from the links in the TRAITS tab.
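As a rough illustration of the post-processing step described above, the following Python sketch applies a Bonferroni threshold and then merges significant genes into contiguous loci. It is not the FUSION code: the `Assoc` record, the function names, and the 500 kb padding window are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Assoc:
    """One TWAS test result (hypothetical record layout)."""
    gene: str
    chrom: str
    start: int   # start of the gene/model, in bp
    end: int     # end of the gene/model, in bp
    pval: float  # TWAS P-value


def bonferroni_significant(assocs, alpha=0.05):
    """Keep associations with P < alpha / (number of tests)."""
    threshold = alpha / len(assocs)
    return [a for a in assocs if a.pval < threshold]


def group_into_loci(assocs, window=500_000):
    """Merge genes on the same chromosome whose padded intervals overlap.

    `window` is an assumed padding (bp) added on each side of a gene.
    """
    loci = []
    for a in sorted(assocs, key=lambda a: (a.chrom, a.start)):
        lo, hi = a.start - window, a.end + window
        last = loci[-1] if loci else None
        if last and last["chrom"] == a.chrom and lo <= last["end"]:
            # Overlaps the current locus: extend it.
            last["end"] = max(last["end"], hi)
            last["genes"].append(a.gene)
        else:
            # Start a new locus.
            loci.append({"chrom": a.chrom, "start": lo, "end": hi,
                         "genes": [a.gene]})
    return loci


if __name__ == "__main__":
    results = [
        Assoc("GeneA", "1", 1_000_000, 1_010_000, 1e-8),
        Assoc("GeneB", "1", 1_200_000, 1_210_000, 1e-7),
        Assoc("GeneC", "1", 50_000_000, 50_010_000, 1e-9),
        Assoc("GeneD", "2", 1_000_000, 1_010_000, 0.2),
    ]
    for locus in group_into_loci(bonferroni_significant(results)):
        print(locus["chrom"], locus["start"], locus["end"], locus["genes"])
```

In this toy input, GeneA and GeneB fall into one locus because their padded intervals overlap, GeneC forms its own locus, and GeneD is dropped by the Bonferroni filter.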
Please read this blog post for much more about interpreting TWAS signals and the relationship between TWAS, other methods, and complex disease architectures.
The predictive models for all analyses were generated using the default Pantry (Pan-transcriptome phenotyping) pipeline. Weights were generated from gene expression, with multiple gene expression PCs and genotype PCs as covariates.
All analyses include weakly predictive models up to a heritability P-value of 0.01. This means you will sometimes see models with negative cross-validation (adjusted) R2 values, because the heritable signal is no longer predictive once each cross-validation model is trained on only 4 of the 5 folds. These models are included primarily for individual gene look-up, where the multiple-testing burden is negligible and weakly significant models may still be informative (conversely, if you don’t see a model for a gene, it’s because there wasn’t a hint of signal in the data). For genome-wide scans we recommend interpreting these models with caution.
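To see why an adjusted R2 can be negative, here is a minimal Python sketch of the standard adjusted R2 formula. It assumes the cross-validation R2 comes from regressing observed expression on a single predicted-expression variable (one predictor); the function name and the sample size in the example are illustrative, not taken from the pipeline.

```python
def adjusted_r2(r2, n_samples, n_predictors=1):
    """Standard adjusted R2: penalizes R2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


if __name__ == "__main__":
    # A barely predictive model in a modest sample: the penalty exceeds R2,
    # so the adjusted value is negative (about -0.005).
    print(adjusted_r2(0.005, n_samples=100))
    # A clearly predictive model stays positive after adjustment.
    print(adjusted_r2(0.30, n_samples=100))
```

Whenever the raw R2 is smaller than the penalty term, the adjusted value drops below zero, which is what happens for the weakest models described above.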
The conditional analysis is a simple summary-based step-wise model selection process that iteratively adds predictors to the model in decreasing order of conditional TWAS significance until no significant associations remain. Across models, conditional results should be interpreted as estimating the number of jointly significant models, but the selected models are not necessarily more likely to be causal than unselected features (either due to high correlation or different levels of noise). To prioritize causal genes, we instead recommend using a formal fine-mapping procedure (e.g. FOCUS). Additionally, the SNP conditioning analysis (and Manhattan plots) provide an estimate of the variance in the locus explained by the predicted model. A small fraction of variance explained is a strong indicator that the predicted model is tagging another causal feature (or that there are multiple causal features in the locus). A large fraction of variance explained is consistent with the predicted model explaining all of the genetic effect, which is necessary but not sufficient for it to be the single causal mediator.
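The step-wise selection described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the FUSION implementation: it takes marginal TWAS z-scores `z` and a correlation matrix `R` between predicted expression models (in practice estimated from an LD reference panel), and the function name, the `z_thresh` cutoff, and the collinearity guard are hypothetical choices.

```python
import numpy as np


def stepwise_conditional(z, R, z_thresh=1.96, min_resid_var=1e-8):
    """Greedy forward selection on summary statistics.

    z : marginal z-scores, one per gene model
    R : correlation matrix between predicted expression of the models
    Returns (indices of selected models, conditional z-scores).
    """
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    selected = []
    cond_z = z.copy()
    while True:
        candidates = [i for i in range(len(z)) if i not in selected]
        if not candidates:
            break
        if selected:
            R_ss_inv = np.linalg.inv(R[np.ix_(selected, selected)])
            for i in candidates:
                r_is = R[i, selected]
                # Variance of model i left after regressing out selected models.
                resid_var = 1.0 - r_is @ R_ss_inv @ r_is
                if resid_var <= min_resid_var:
                    cond_z[i] = 0.0  # collinear with the selected set
                    continue
                cond_z[i] = (z[i] - r_is @ R_ss_inv @ z[selected]) / np.sqrt(resid_var)
        # Add the most significant remaining model, if it passes the cutoff.
        best = max(candidates, key=lambda i: abs(cond_z[i]))
        if abs(cond_z[best]) < z_thresh:
            break
        selected.append(best)
    return selected, cond_z


if __name__ == "__main__":
    z = [5.0, 4.5, 4.0]
    R = [[1.0, 0.9, 0.0],
         [0.9, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
    selected, cond_z = stepwise_conditional(z, R)
    print(selected, cond_z)
```

In the toy input, the second model is highly correlated with the first and loses its signal after conditioning, while the third is independent and is retained. This shows why the selected set estimates the number of joint signals rather than identifying the causal gene.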
The conditional analysis uses an LD-reference panel and is therefore approximate, so you may see loci that behave unusually (for example, becoming extremely significant after conditioning). These are most likely instances of LD mismatch between reference and GWAS data. In instances where the full conditional analysis is unstable, the “top SNP corr” column still provides a useful estimate of the marginal correlation between the gene model and the top GWAS SNP, the square of which is the estimate of variance explained by that model alone.
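As a worked example of the last point, with illustrative numbers only: writing $r$ for the value in the “top SNP corr” column, the variance explained by that model alone is estimated as its square.

```latex
\widehat{\text{variance explained}} = r^{2},
\qquad \text{e.g. } r = 0.8 \;\Rightarrow\; r^{2} = 0.64 .
```

So a model with a top-SNP correlation of 0.8 accounts for roughly 64% of the signal at the top GWAS SNP, while a correlation of 0.3 would account for only about 9%.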
All transcriptome-wide significant associations are run through the coloc colocalization model, with posterior probabilities PP3 (distinct causal variants) and PP4 (shared causal variant) reported in the locus view. coloc assumes a single causal variant model, whereas TWAS directly models multiple eQTLs, so we tend to use low PP3 as an indicator of colocalization rather than high PP4 (as done in Raj et al. 2017 biorxiv).
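For readers unfamiliar with how PP3 and PP4 arise, the Python sketch below combines per-SNP log approximate Bayes factors for two traits into posterior probabilities for the five coloc hypotheses. It follows the single-causal-variant formulation and assumes prior values of p1 = p2 = 1e-4 and p12 = 1e-5. The function names and toy inputs are illustrative, and how the Hub derives its Bayes factors and priors is not specified here.

```python
import numpy as np


def logsumexp(x):
    """Numerically stable log(sum(exp(x)))."""
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + np.log(np.exp(x - m).sum())


def logdiffexp(a, b):
    """log(exp(a) - exp(b)), valid for a > b."""
    return a + np.log1p(-np.exp(b - a))


def coloc_posteriors(labf_gwas, labf_eqtl, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities PP0..PP4 from per-SNP log ABFs.

    Requires at least two SNPs; priors are assumed values.
    """
    l1 = np.asarray(labf_gwas, dtype=float)
    l2 = np.asarray(labf_eqtl, dtype=float)
    s1, s2, s12 = logsumexp(l1), logsumexp(l2), logsumexp(l1 + l2)
    log_h = np.array([
        0.0,                                                  # H0: no association
        np.log(p1) + s1,                                      # H1: GWAS trait only
        np.log(p2) + s2,                                      # H2: expression only
        np.log(p1) + np.log(p2) + logdiffexp(s1 + s2, s12),   # H3: distinct variants
        np.log(p12) + s12,                                    # H4: shared variant
    ])
    post = np.exp(log_h - logsumexp(log_h))
    return dict(zip(["PP0", "PP1", "PP2", "PP3", "PP4"], post))


if __name__ == "__main__":
    # Same SNP carries the signal for both traits: PP4 dominates.
    print(coloc_posteriors([10, 0, 0], [10, 0, 0]))
    # Different SNPs carry the signals: PP3 exceeds PP4.
    print(coloc_posteriors([10, 0, 0], [0, 10, 0]))
```

Because H3 and H4 both assume exactly one causal variant per trait, a locus with several eQTLs can lower PP4 without raising PP3, which is one way to read the preference for low PP3 stated above.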
The TWAS hub logo is from André Luiz Gollo and the Noun Project.
Mancuso et al. 2018 biorxiv | A method for fine-mapping credible sets of TWAS genes |
Barfield et al. 2018 Gen Epi | A method for distinguishing co-localization in TWAS tests |
Gusev et al. 2018 biorxiv | TWAS of ovarian cancer |
Mancuso et al. 2018 biorxiv | TWAS of prostate cancer |
Wu et al. 2018 Nat Genet | TWAS of breast cancer |
Gusev et al. 2018 Nat Genet | Integration of TWAS with chromatin features |
Mancuso et al. 2017 AJHG | TWAS of 30 traits and methods for cross-trait analyses |
Gusev et al. 2016 Nat Genet | Primary TWAS method paper |