Integrative Analysis of Omics Data to Construct Whole-Genome Gene Regulatory Networks

Seminar Series

Friday, April 2, 2021 - 12:00
Zoom Conference
Min Zhang, PhD

Abstract: Constructing gene regulatory networks is crucial to understanding the molecular interactions underlying complex diseases. By integrating transcriptomic and genomic data, we propose a parallel algorithm, the two-stage penalized least squares method (2SPLS), to infer the causal relationships between all genes in an organism within a model-based framework. With a huge number of transcriptional variables and even more genotypic variables, 2SPLS limits memory consumption and avoids intensive computation by fitting one linear model per gene, in parallel, at each stage. The first stage yields consistent estimates of a set of well-defined surrogate variables, and the second stage consistently selects regulatory genes from a massive pool of candidates. The entire system of gene regulation can be constructed with bounded errors. We also demonstrate the superior performance of the method using Monte Carlo simulation studies and real data analysis.
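
For readers who want a concrete picture of the two-stage idea, the following is a minimal sketch and not the speaker's implementation. The abstract does not specify the penalties or how genotypes are assigned to genes, so everything here is an assumption made for illustration: ridge regression at the first stage, a plain lasso at the second stage, a hypothetical `own` argument listing the genotype columns attached to each gene, and all function names and tuning values. Each gene gets its own small linear model at each stage, which is what makes the per-gene fits easy to run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def ridge_fitted(X, y, lam):
    """Stage 1 for one gene: ridge-regress expression y on genotypes X, return fitted values."""
    p = X.shape[1]
    coef = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    return X @ coef

def lasso_cd(Z, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent: min (1/2n)||y - Zb||^2 + lam * ||b||_1."""
    n, q = Z.shape
    b = np.zeros(q)
    scale = (Z ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()          # resid = y - Z @ b, kept up to date
    for _ in range(n_iter):
        for j in range(q):
            if scale[j] == 0.0:
                continue
            rho = Z[:, j] @ resid / n + scale[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / scale[j]  # soft threshold
            resid += Z[:, j] * (b[j] - new)
            b[j] = new
    return b

def project_out(A, W):
    """Residuals of A after least-squares regression on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, A, rcond=None)
    return A - W @ coef

def two_stage_network(Y, X, own, lam1=1.0, lam2=0.05, workers=4):
    """Return B where B[g, k] is the estimated effect of gene k on gene g.

    Y: (n, genes) expression; X: (n, markers) genotypes;
    own[g]: genotype columns assumed to act directly on gene g (hypothetical input).
    """
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    n_genes = Y.shape[1]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Stage 1: one ridge model per gene gives surrogate (predicted) expression.
        fitted = list(pool.map(lambda g: ridge_fitted(Xc, Yc[:, g], lam1), range(n_genes)))
        Y_hat = np.column_stack(fitted)

        def stage2(g):
            # Stage 2: one lasso model per gene on the other genes' surrogates,
            # after removing the gene's own genotype effects from both sides.
            others = [k for k in range(n_genes) if k != g]
            W = Xc[:, own[g]]
            y_g = project_out(Yc[:, g], W)
            Z = project_out(Y_hat[:, others], W)
            row = np.zeros(n_genes)
            row[others] = lasso_cd(Z, y_g, lam2)
            return row

        B = np.vstack(list(pool.map(stage2, range(n_genes))))
    return B

# Toy data (simulated, not from the talk): gene 0 regulates gene 1; gene 2 is isolated.
rng = np.random.default_rng(0)
n = 500
X = rng.binomial(2, 0.5, size=(n, 3)).astype(float)
E = rng.normal(scale=0.5, size=(n, 3))
Y = np.empty((n, 3))
Y[:, 0] = X[:, 0] + E[:, 0]
Y[:, 1] = 0.8 * Y[:, 0] + X[:, 1] + E[:, 1]
Y[:, 2] = X[:, 2] + E[:, 2]
B = two_stage_network(Y, X, own=[[0], [1], [2]])
print(np.round(B, 2))
```

In this toy setup, the entry `B[1, 0]` should be clearly nonzero while the reverse entry `B[0, 1]` should be shrunk toward zero, because each gene's own genotype effects are removed before the second-stage fit. A thread pool is used only to keep the sketch short; at whole-genome scale the per-gene fits would more plausibly be distributed across processes or machines.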

Speaker: Min Zhang is Professor of Statistics at Purdue University and Associate Director of Data Science at Purdue University Center for Cancer Research. Jointly trained in clinical medicine (MD), neuroscience (PhD), and biological statistics and computational biology (PhD), she has focused her research on developing new statistical methods for high-dimensional data in biomedical research. With her collaborators, Dr. Zhang has proposed generalized thresholding estimators for gene expression data, Bayesian classification methods for quantitative trait loci mapping, semi-supervised machine learning methods for genome-wide association analysis with whole-genome sequencing data, and integrative analysis of multi-omics data for large-scale network construction.

Zoom: https://duke.zoom.us/j/94444118346?pwd=Qjlvc1JSWWRJem4vbERMWHRMQ2d6dz09  

Passcode: 273373