Integrative analysis for identification of high-order genetic interactions

Genome-wide association studies (GWAS) have been widely used to explore associations between genetic variants and complex diseases. Many complex traits are regulated by biological mechanisms that may involve multiple interacting genes. Not surprisingly, it has been suggested that incorporating gene-gene interactions can help not only to explain the missing heritability of complex traits but also to further characterize their genetic architecture (Moore and Williams, 2009).

Projects

Systematic review of diagnostic test accuracy

With a gold standard. Consider the systematic review of diagnostic imaging modalities for surveillance of melanoma patients, which involves both cohort and case-control studies. Standard likelihood-based inference often encounters computational issues, such as non-convergence and sensitivity to initial values, due to the complexity of the likelihood and the small number of studies. We developed an inference procedure based on composite likelihood to avoid the multidimensional integrals (Chen et al., 2014). In addition, when study types are mixed, the standard methods for evaluating diagnostic accuracy focus only on sensitivity and specificity and ignore the information on disease prevalence that is contained in cohort studies. Consequently, such methods cannot provide estimates of measures related to disease prevalence, such as positive and negative predictive values (PPV and NPV), which reflect the clinical utility of a diagnostic test. To address this issue, we have developed procedures to jointly analyze case-control and cohort studies (Chen et al., 2015, 2016), in which the information on disease prevalence in cohort studies is modeled jointly with sensitivity and specificity. For example, the figure in the right panel shows the estimated PPV and NPV for four diagnostic imaging modalities; these measures are useful for clinicians who want to obtain PPV and NPV for a specific cohort of patients under investigation.
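To make the dependence of PPV and NPV on disease prevalence concrete, the following minimal sketch applies Bayes' rule to a single test. It is only an illustration, not the joint model of Chen et al.; the function name and all numerical values are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for one test via Bayes' rule.

    All three inputs are probabilities strictly between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical accuracy values, evaluated at three assumed prevalences.
    for prevalence in (0.05, 0.20, 0.50):
        ppv, npv = predictive_values(0.90, 0.85, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, PPV rises and NPV falls as prevalence increases, which is why prevalence information from cohort studies is needed to estimate these measures.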

Meta-analysis for clinical trial outcomes

Multivariate network meta-analysis. Unlike traditional pairwise meta-analysis, in which a pair of treatments is compared, network meta-analysis compares multiple treatments by considering the entire network of evidence simultaneously (i.e., both direct and indirect evidence). Our motivating example is a systematic review of pharmacological treatments for alcohol dependence, in which not only are multiple treatments under comparison, but multiple alcohol dependence outcomes are also reported. Network structures of two alcohol outcomes are displayed in the right panel figure. As is common in RCTs, secondary outcomes are often selectively reported (Chan et al., 2005). We proposed a multilevel joint model that borrows information across multivariate outcomes as well as across treatments through direct and indirect comparisons in the complex network. Borrowing information across outcomes can reduce the impact of selective outcome reporting (Liu, DeSantis, and Chen, 2016, Bayesian network meta-analysis with correlated outcomes subject to publication bias: application to a systematic review of alcohol dependence (under minor revision)).
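As a rough illustration of how direct and indirect evidence combine, the sketch below uses the standard adjusted indirect comparison through a common comparator and a fixed-effect inverse-variance pool. It is a univariate simplification and not the Bayesian multilevel joint model described above; the function names and numbers are hypothetical.

```python
import math


def indirect_comparison(d_ab, se_ab, d_ac, se_ac):
    """Indirect estimate of C versus B through the common comparator A.

    d_ab is the effect of B relative to A, d_ac the effect of C relative
    to A. Independence of the two direct estimates is assumed.
    """
    d_bc = d_ac - d_ab
    se_bc = math.sqrt(se_ab ** 2 + se_ac ** 2)
    return d_bc, se_bc


def pool_inverse_variance(estimates, std_errors):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


if __name__ == "__main__":
    # Hypothetical log odds ratios from B-vs-A and C-vs-A trials.
    d_bc_ind, se_bc_ind = indirect_comparison(0.50, 0.30, 0.20, 0.40)
    # Hypothetical direct C-vs-B evidence, combined with the indirect estimate.
    pooled, se = pool_inverse_variance([-0.10, d_bc_ind], [0.35, se_bc_ind])
    print(f"indirect={d_bc_ind:.3f} (SE {se_bc_ind:.3f})")
    print(f"pooled={pooled:.3f} (SE {se:.3f})")
```

The indirect estimate is always less precise than either direct estimate it is built from, which is one reason joint modeling of the whole network can be more efficient than separate pairwise analyses.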

Copyright © 2016 Yu-Lun Liu

Without a gold standard. In studies of diagnostic test accuracy (DTA), the selected reference test may not be a gold standard because of measurement error, high cost, or the non-existence of a gold standard. Failure to account for errors in the reference test can lead to substantial bias in the evaluation of the candidate test's accuracy. Two models have been proposed in the literature to account for such an imperfect reference test: a multivariate generalized linear mixed model (Chu et al., 2009) and a hierarchical summary receiver operating characteristic (HSROC) model (Dendukuri et al., 2012). In practice, researchers may have to choose between these two models. To provide a useful guideline for DTA modeling, we have shown that these two models, despite their very different formulations, are closely related and are mathematically equivalent in the absence of study-level covariates (Liu et al., 2015). Moreover, we have provided the exact relations between the parameters of the two models and the assumptions under which the two models reduce to equivalent submodels. Our results generalize the relationship between the bivariate generalized linear mixed model (Reitsma et al., 2005) and the HSROC model (Rutter et al., 2001), which holds when the reference test is a gold standard, and unify the existing models.
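The bias caused by an imperfect reference test can be illustrated with a small calculation. The sketch below computes the apparent sensitivity and specificity of a candidate test when it is scored against an imperfect reference, assuming the two tests are conditionally independent given true disease status. This assumption, the function name, and the numbers are illustrative and are not taken from the models cited above.

```python
def apparent_accuracy(se_t, sp_t, se_r, sp_r, prevalence):
    """Apparent (sensitivity, specificity) of candidate test T when the
    imperfect reference test R is treated as if it were a gold standard.

    Assumes T and R are conditionally independent given true disease
    status, and that prevalence is strictly between 0 and 1.
    """
    pi = prevalence
    # Probability that the reference is positive, and that both are positive.
    p_ref_pos = pi * se_r + (1.0 - pi) * (1.0 - sp_r)
    p_both_pos = pi * se_t * se_r + (1.0 - pi) * (1.0 - sp_t) * (1.0 - sp_r)
    # Probability that the reference is negative, and that both are negative.
    p_ref_neg = 1.0 - p_ref_pos
    p_both_neg = pi * (1.0 - se_t) * (1.0 - se_r) + (1.0 - pi) * sp_t * sp_r
    return p_both_pos / p_ref_pos, p_both_neg / p_ref_neg


if __name__ == "__main__":
    # Hypothetical true accuracies of the candidate and reference tests.
    app_se, app_sp = apparent_accuracy(0.90, 0.90, 0.80, 0.95, 0.20)
    print(f"apparent sensitivity={app_se:.3f}, apparent specificity={app_sp:.3f}")
```

When the reference test is perfect, the function returns the candidate test's true sensitivity and specificity; otherwise the apparent values generally differ from the true ones.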


Gene-gene interactions (YETI). A major challenge for standard detection methods is the large number of possible combinations, with a requisite need to correct for multiple testing. Assuming large marginal effects in order to reduce the search space may be restrictive and can miss higher-order interactions with modest marginal effects. We extended the method of Kooperberg and LeBlanc (2008) to the multiple-study setting, which we term KL-meta. Additionally, to relax the large marginal effect assumption, we proposed a new procedure, phylogenY-aware Effect-size Tests for Interactions (YETI), which detects gene-gene interactions through heterogeneity in estimated low-order (e.g., marginal) effect sizes by leveraging population structure, or ancestral differences, among studies in which the same phenotypes were measured (Liu et al., 2016). YETI has two major advantages over KL-meta. First, unlike that of KL-meta, the power function of YETI is monotonic in the interaction effect (as shown in the right panel figure). This is because YETI relies on the heterogeneity in marginal effects across populations; that is, it requires that the minor allele frequency differ across populations. Second, YETI does not require large marginal effects, and it is indeed biologically possible that marginal effects are small while the interaction is not. Our proposed method is particularly suitable for integrating information within large research consortia.
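The idea that an interaction induces heterogeneity in marginal effects can be sketched under a simple working model. Assume a linear model with additive genotype coding (0, 1, 2), two independent loci, and Hardy-Weinberg equilibrium; then the marginal slope for locus 1 is b1 + 2 * p2 * b12, where p2 is the allele frequency at locus 2. The code below evaluates this slope in populations with different p2 and computes Cochran's Q as a generic heterogeneity statistic. This is an illustration of the principle only, not the published YETI test; all names and values are hypothetical.

```python
def marginal_effect(b1, b12, maf_locus2):
    """Marginal slope for locus 1 under E[y | g1, g2] = b1*g1 + b2*g2 + b12*g1*g2,
    with g2 independent of g1 and E[g2] = 2 * maf_locus2."""
    return b1 + b12 * 2.0 * maf_locus2


def cochran_q(estimates, std_errors):
    """Cochran's Q statistic for heterogeneity across study-level estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))


if __name__ == "__main__":
    # Hypothetical allele frequencies at locus 2 in three populations.
    mafs = [0.10, 0.25, 0.40]
    std_errors = [0.05, 0.05, 0.05]  # assumed standard errors
    for b12 in (0.0, 0.3):
        effects = [marginal_effect(0.0, b12, p) for p in mafs]
        print(f"b12={b12}: marginal effects={effects}, Q={cochran_q(effects, std_errors):.2f}")
```

With no interaction (b12 = 0) the marginal effects are identical across populations and Q is zero; with a nonzero interaction they diverge as the allele frequencies diverge, even though the main effect b1 is zero.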

YU-LUN LIU

Email: yulunliu@upenn.edu

My research focuses on statistical methods for the effective integration of heterogeneous data sources in biomedical studies. It includes conventional and modern meta-analysis of clinical trial outcomes, systematic reviews of diagnostic test accuracy, integrative analysis for identification of high-order genetic interactions, and precision medicine.

I did my graduate work in Biostatistics at The University of Texas at Houston (Ph.D. 2016), advised by Drs. Yong Chen and Paul Sheet. I am currently a postdoctoral research fellow in the Department of Biostatistics and Epidemiology at the University of Pennsylvania, exploring statistical methods in the presence of outcome reporting bias, working with Dr. Yong Chen. I am also collaborating with Dr. Stephen E. Kimmel, Dr. Gui-Shuang Ying, and Dr. Jason Moore at the University of Pennsylvania.