Selecting gene features for unsupervised analysis of single-cell gene expression data

Jie Sheng, Wei Vivian Li

Research output: Contribution to journal › Review article › peer-review

14 Scopus citations


Single-cell RNA sequencing (scRNA-seq) technologies facilitate the characterization of transcriptomic landscapes in diverse species, tissues, and cell types with unprecedented molecular resolution. To evaluate various biological hypotheses using high-dimensional single-cell gene expression data, most computational and statistical methods depend on a gene feature selection step to identify genes with high biological variability and reduce computational complexity. Even though many gene selection methods have been developed for scRNA-seq analysis, a systematic comparison of the assumptions, statistical models, and selection criteria used by these methods has been lacking. In this article, we summarize and discuss 17 computational methods for selecting gene features in unsupervised analysis of single-cell gene expression data, with unified notations and statistical frameworks. Our discussion provides a useful summary to help practitioners select appropriate methods based on their assumptions and applicability, and to assist method developers in designing new computational tools for unsupervised learning of scRNA-seq data.
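To make the idea of a gene feature selection step concrete, the following is a minimal sketch that ranks genes by their Fano factor (variance divided by mean) across cells and keeps the top-ranked ones. It is not taken from the article and is not claimed to be one of the 17 methods it reviews; the function name `select_variable_genes`, the gene names, and the toy counts are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance


def select_variable_genes(counts, gene_names, n_top=2):
    """Rank genes by Fano factor (variance / mean) and keep the top n_top.

    counts: list of rows, one per gene; each row holds that gene's
    counts across cells. Returns (selected gene names, all scores).
    """
    scores = {}
    for name, row in zip(gene_names, counts):
        mu = mean(row)
        # A gene with zero mean carries no signal; score it 0
        # to avoid dividing by zero.
        scores[name] = pvariance(row) / mu if mu > 0 else 0.0
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_top], scores


# Toy matrix: 4 hypothetical genes x 6 cells (made-up counts).
gene_names = ["geneA", "geneB", "geneC", "geneD"]
counts = [
    [5, 5, 5, 5, 5, 5],      # constant across cells
    [0, 0, 20, 0, 25, 0],    # expressed strongly in a few cells
    [1, 2, 1, 2, 1, 2],      # small fluctuations only
    [0, 10, 0, 12, 0, 9],    # expressed in about half the cells
]

selected, scores = select_variable_genes(counts, gene_names, n_top=2)
print(selected)
```

A raw variance-to-mean ratio does not separate biological from technical variability, which is one reason more elaborate statistical models of the kind surveyed in the article may be preferred in practice.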

Original language: English (US)
Article number: bbab295
Journal: Briefings in Bioinformatics
Issue number: 6
State: Published - Nov 1 2021

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Molecular Biology


Keywords

  • feature selection
  • highly variable genes
  • single-cell genomics
  • unsupervised learning

