School of Informatics and Computing

Chi Zhang, Ph.D.: Development of bi-clustering algorithm to capture intra-tissue heterogeneity in single cell RNA-Seq data and bi-clustering based prediction for gain or loss of functions led by somatic mutations

Friday, October 21 at 12:30 p.m. in IT 252

Abstract

Large-scale single-cell RNA-Seq has emerged as a way to capture intra-tissue heterogeneity in cancer, in the brain at different stages of development, and in immune cell populations in blood samples. The large number of zero observations and the complicated covariance structure caused by diverse biological variation in the observed data create a demand for novel computational frameworks for single-cell RNA-Seq data analysis. In this work, we developed a probabilistic model-based bi-clustering approach to capture the heterogeneous characteristics of different cell groups. In the second part of the talk, we describe a computational framework that predicts the gain or loss of function caused by certain mutation patterns through integrative analysis of genomic mutation and transcriptomics data. It is unknown which mutation patterns lead to gain or loss of function (GoLoF), or which GoLoFs are caused by a given mutation; answering both questions together forms a bi-clustering problem. Our data integration approach has successfully identified GoLoFs caused by the single, concurrent, and collective effects of multiple mutations among 20 cancer-associated and 30 frequently observed mutations in a pan-cancer study.

About Chi Zhang

Dr. Chi Zhang received his bachelor's degree in Mathematics from Peking University in 2010 and his Ph.D. in Bioinformatics with a Statistics minor from the University of Georgia in 2015. In the summer of 2016, he joined the Center of Computational Biology and Bioinformatics and the Department of Medical and Molecular Biology at the Indiana University School of Medicine. His research interests include computational modeling of the cancer micro-environment through integrative analysis of cancer omics data, and the development of novel computational algorithms and pipelines to characterize the complexity of tissue-level data.