Demystifying the Drop-Outs in Single Cell RNA-Seq Data

March 25, 2022
12:00 pm to 1:00 pm

Event sponsored by:

Biostatistics and Bioinformatics


Adkins, Judy


Mengjie Chen, PhD, Speaker


Mengjie Chen, PhD, Assistant Professor, Medicine & Human Genetics, University of Chicago
Abstract: Droplet-based single-cell RNA-sequencing (scRNA-seq) methods have changed the landscape of genomics research in complex biological systems by producing single-cell resolution data at affordable cost. In state-of-the-art protocols, a barcoding step using unique molecular identifiers (UMIs) has been introduced to remove amplification bias and further improve data quality. Recent literature suggests that barcoding has led to a different error structure in the count data, with much less technical noise. Nevertheless, many tools do not acknowledge the differences between read count data and UMI count data, and still assume that both suffer from excessive technical noise. In this presentation, I will give a brief overview of scRNA-seq data analysis pipelines and then present extensive analyses of publicly available UMI data sets that challenge the assumptions of most existing pre-processing tools. Our results suggest that resolving cell-type heterogeneity should be the foremost step of the scRNA-seq analysis pipeline. Normalizing or imputing the data set before resolving the heterogeneity can lead to adverse consequences in downstream analysis. As a result, we provide a new perspective on scRNA-seq data analysis by fully integrating pre-processing and clustering, which has traditionally been classified as part of the downstream analysis. The proposed procedures have been implemented in the software package HIPPO. If time permits, I will also talk about other single-cell analysis tools developed in my group: VIPER, an imputation method for SMART-seq data, and dmatch, an alignment tool for batch correction across multiple scRNA-seq samples.

Bio: Mengjie Chen is an assistant professor of Genetic Medicine in the Department of Medicine and Human Genetics at the University of Chicago. She was an assistant professor in the Department of Biostatistics and Genetics at UNC-Chapel Hill from 2014 to 2016. She obtained her PhD in Computational Biology and Bioinformatics from Yale University in 2014. Dr. Chen was a recipient of the Alfred P. Sloan Research Fellowship in Computational and Evolutionary Molecular Biology in 2019. As a computational biologist and statistician by training, Dr. Chen conducts research that bridges statistical methodological advances and biomedical applications. She and her group develop computational methods and open-source tools to address the challenges that high-throughput technologies pose for data analysis and interpretation.

Zoom passcode: 914091