Omics Lab

Ongoing Projects

Our research focuses mainly on the development of models, algorithms, and computational techniques to extract information from a variety of data sources that relate to Molecular Biology, Systems Biology, and Genetics. The main challenges in analyzing this type of data include (i) the complexity of biological systems at multiple levels (from populations to molecules), (ii) the dynamic nature of biological phenomena in space and time, (iii) the large scale and high dimensionality of the data, along with the combinatorial nature of interactions between different entities, and (iv) the incompleteness and noise of data collected from high-throughput experiments. While addressing these challenges, we often encounter sophisticated abstractions and intractable computational problems, which in turn give us the opportunity to contribute to the computational sciences through the development of advanced algorithms and computational techniques. The following projects are among those currently undertaken by our group.

Enhancing Genome-Wide Association Studies via Integrative Network Analysis

Genome-wide association studies (GWAS) comprehensively compare common genetic variants in affected and control populations to identify variants that are potentially associated with complex diseases. In recent years, GWAS have successfully identified susceptibility genes for many diseases. However, researchers recognize many limitations of GWAS in characterizing the genetic bases of complex diseases, including reduced statistical power due to small sample sizes, the inadequacy of considering individual variants separately when multiple factors interact, modest success in predicting individual disease risk, and a lack of insight into the biological and functional mechanisms that relate identified variants to the disease. We aim to enhance GWAS by using protein-protein interaction (PPI) networks as an integrative framework to interpret the outcome of GWAS within a functional context. PPI networks characterize the physical and functional interactions among proteins; thus they are useful in understanding the functional relationships between multiple genetic factors. This project is supported by National Institutes of Health Award R01-LM011247.

A subnetwork of the human protein interaction network associated with type II diabetes

Discovery of Coordinately Dysregulated Subnetworks in Complex Phenotypes

Cellular systems are orchestrated through combinatorial organization of thousands of biomolecules. This complexity is reflected in the diversity of phenotypic effects, which generally present themselves as weak signals in the expression profiles of single molecules. For this reason, researchers increasingly focus on identifying multiple markers that together exhibit differential expression with respect to various phenotypes. In collaboration with the research group of Mark Chance, we focus on human colorectal cancer and develop abstractions and algorithms to define coordinate dysregulation of multiple genes within a network context and to identify such network patterns, with a view to establishing markers for cancer prognosis and targets for therapeutic intervention. For this purpose, our algorithms integrate genomic, transcriptomic, proteomic, and interactomic data. This project is supported in part by NSF CAREER Award CCF-0953195.
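To make the idea of markers that are weak individually but informative together more concrete, the following is a minimal sketch, not the algorithms developed in this project. It assumes each gene already has a differential-expression z-score, combines the scores of a connected gene set as the sum divided by the square root of the set size, and grows a subnetwork greedily from a seed gene. The gene names, scores, interactions, and the function names `aggregate_score` and `greedy_subnetwork` are all hypothetical.

```python
from math import sqrt

# Hypothetical per-gene differential-expression z-scores (toy values).
Z = {"A": 1.2, "B": 1.0, "C": 1.1, "D": -0.2, "E": 0.1}

# Hypothetical undirected interaction network as adjacency lists.
NEIGHBORS = {
    "A": ["B", "E"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["A"],
}


def aggregate_score(genes, z):
    """Combine per-gene z-scores of a gene set: sum / sqrt(set size)."""
    genes = list(genes)
    return sum(z[g] for g in genes) / sqrt(len(genes))


def greedy_subnetwork(seed, neighbors, z, max_size=5):
    """Grow a connected gene set from `seed`, adding at each step the
    adjacent gene that most improves the aggregate score; stop when no
    adjacent gene improves it or `max_size` is reached."""
    sub = {seed}
    best = aggregate_score(sub, z)
    while len(sub) < max_size:
        # Genes adjacent to the current subnetwork but not yet in it.
        frontier = {n for g in sub for n in neighbors.get(g, ())} - sub
        candidates = [(aggregate_score(sub | {n}, z), n) for n in sorted(frontier)]
        if not candidates:
            break
        score, gene = max(candidates)
        if score <= best:
            break
        sub.add(gene)
        best = score
    return sub, best


subnet, score = greedy_subnetwork("A", NEIGHBORS, Z)
print(sorted(subnet), round(score, 3))
```

In this toy example no single gene scores above 1.2, yet the connected set A, B, C reaches a combined score of about 1.9. Greedy search is only one simple heuristic; it can miss better subnetworks that require passing through a low-scoring gene.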

Characterization of Copy Number Variation in Human Genome

Not long ago, it was discovered that individuals may differ in the copy numbers of their genes, meaning that a segment of DNA may have more or fewer copies than usual in an individual's chromosome. Recent research suggests that these variations are associated with many diseases, including autism and schizophrenia. Copy number variation (CNV) in somatic cells also underlies various cancers. Copy numbers are usually identified using SNP microarrays; however, short-read sequence data is emerging as an important resource for characterizing structural variation in the human genome. In collaboration with the research group of Thomas LaFramboise, we develop optimization-based algorithms for fast and accurate identification of rare and de novo CNVs from these two data sources, with a view to enabling personalized genomics applications. This project is supported by National Science Foundation Award IIS-0916102.
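As a rough illustration of what calling copy number changes from probe-level data involves, the sketch below uses a naive threshold-and-run rule rather than the optimization-based algorithms developed in this project. It assumes the input is a list of log2 intensity ratios ordered along a chromosome, with values near zero indicating the usual copy number. The thresholds, the minimum run length, and the function name `call_cnv_segments` are hypothetical choices made for the example.

```python
def call_cnv_segments(log_ratios, gain=0.3, loss=-0.3, min_probes=3):
    """Return (start_index, end_index, label) for each run of at least
    `min_probes` consecutive probes whose log2 ratio is >= `gain`
    (label "gain") or <= `loss` (label "loss"). Indices are inclusive."""

    def state(x):
        if x >= gain:
            return "gain"
        if x <= loss:
            return "loss"
        return "neutral"

    segments = []
    start = 0
    n = len(log_ratios)
    for i in range(1, n + 1):
        # Close the current run at the end of the data or on a state change.
        if i == n or state(log_ratios[i]) != state(log_ratios[start]):
            label = state(log_ratios[start])
            if label != "neutral" and i - start >= min_probes:
                segments.append((start, i - 1, label))
            start = i
    return segments


# Toy log2 ratios (made up): one gained region, one lost region, and a
# single high probe at the end that is too short to be called.
ratios = [0.02, -0.05, 0.45, 0.52, 0.48, 0.03,
          -0.6, -0.55, -0.7, -0.62, 0.01, 0.4]
segments = call_cnv_segments(ratios)
print(segments)
```

A rule like this is sensitive to noise and to the choice of thresholds, which is one reason more principled segmentation and optimization approaches are used in practice.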