A mixture model for estimating the local false discovery rate in DNA microarray analysis

J. G. Liao, Yong Lin, Zachariah E. Selvanayagam, Weichung Shih

Research output: Contribution to journal › Article

62 Citations (Scopus)

Abstract

Motivation: Statistical methods based on controlling the false discovery rate (FDR) or positive false discovery rate (pFDR) are now well established for identifying differentially expressed genes in DNA microarray studies. Several authors have recently raised the important issue that FDR or pFDR may give misleading inference when specific genes are of interest, because they average the genes under consideration with genes that show stronger evidence for differential expression. This paper proposes a flexible and robust mixture model for estimating the local FDR, which quantifies how plausible it is that each specific gene is differentially expressed.

Results: We develop a special mixture model tailored to multiple testing by requiring the P-value distribution for the differentially expressed genes to be stochastically smaller than the P-value distribution for the non-differentially expressed genes. A smoothing mechanism is built in. The proposed model gives robust estimation of the local FDR for any reasonable underlying P-value distribution. It also provides a single framework for estimating the proportion of differentially expressed genes, pFDR, negative predictive values, sensitivity and specificity. A cervical cancer study shows that the local FDR gives a more specific and relevant quantification of the evidence for differential expression, one that can differ substantially from pFDR.
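The contrast between local FDR and pFDR drawn in the abstract can be illustrated with a toy two-group mixture of P-values. The sketch below is not the authors' model and involves no estimation or smoothing. It assumes known parameters, a uniform null density, and a hypothetical Beta(a, 1) alternative with 0 < a < 1, chosen only because its density decreases in p and is therefore stochastically smaller than the uniform. The function names and the values pi0 = 0.8 and a = 0.3 are illustrative assumptions.

```python
"""Local FDR versus pFDR under a toy two-group P-value mixture.

Assumptions (illustrative, not taken from the paper):
  * null P-values are Uniform(0, 1), so f0(p) = 1;
  * alternative P-values are Beta(a, 1) with 0 < a < 1, so f1(p) = a * p**(a - 1);
  * the null proportion pi0 and the shape a are known rather than estimated.
"""


def _check(p: float, pi0: float, a: float) -> None:
    """Validate the domain used by this toy model."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < pi0 <= 1.0:
        raise ValueError("pi0 must lie in (0, 1]")
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie in (0, 1)")


def mixture_density(p: float, pi0: float, a: float) -> float:
    """Marginal density f(p) = pi0 * f0(p) + (1 - pi0) * f1(p)."""
    _check(p, pi0, a)
    return pi0 + (1.0 - pi0) * a * p ** (a - 1.0)


def local_fdr(p: float, pi0: float, a: float) -> float:
    """Posterior probability that a gene with P-value exactly p is null."""
    return pi0 / mixture_density(p, pi0, a)


def pfdr(t: float, pi0: float, a: float) -> float:
    """Probability that a gene is null given that its P-value is <= t."""
    _check(t, pi0, a)
    cdf = pi0 * t + (1.0 - pi0) * t ** a  # F(t) under the mixture
    return pi0 * t / cdf


if __name__ == "__main__":
    pi0, a = 0.8, 0.3  # hypothetical values for illustration
    for p in (0.001, 0.01, 0.05):
        print(
            f"p = {p:<6} local FDR = {local_fdr(p, pi0, a):.3f}   "
            f"pFDR for rejecting p <= {p}: {pfdr(p, pi0, a):.3f}"
        )
```

Under these assumptions the local FDR at a given P-value always exceeds the pFDR of the rejection region ending at that P-value, because the pFDR averages in genes with smaller P-values and hence stronger evidence.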

Original language: English (US)
Pages (from-to): 2694-2701
Number of pages: 8
Journal: Bioinformatics
Volume: 20
Issue number: 16
DOI: 10.1093/bioinformatics/bth310
State: Published - Nov 1 2004

Fingerprint

Microarray Analysis
DNA Microarray
Microarrays
Oligonucleotide Array Sequence Analysis
Mixture Model
DNA
Genes
Gene
False Positive
Value Distribution
Differential Expression
Multiple Testing
False
Robust Estimation
Uterine Cervical Neoplasms
Statistical method
Quantification
Specificity
Smoothing
Statistical methods

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Liao, J. G. ; Lin, Yong ; Selvanayagam, Zachariah E. ; Shih, Weichung. / A mixture model for estimating the local false discovery rate in DNA microarray analysis. In: Bioinformatics. 2004 ; Vol. 20, No. 16. pp. 2694-2701.
@article{1bf078409b4f4157a17e65fdbcdd171c,
title = "A mixture model for estimating the local false discovery rate in DNA microarray analysis",
abstract = "Motivation: Statistical methods based on controlling the false discovery rate (FDR) or positive false discovery rate (pFDR) are now well established in identifying differentially expressed genes in DNA microarray. Several authors have recently raised the important issue that FDR or pFDR may give misleading inference when specific genes are of interest because they average the genes under consideration with genes that show stronger evidence for differential expression. The paper proposes a flexible and robust mixture model for estimating the local FDR which quantifies how plausible each specific gene expresses differentially. Results: We develop a special mixture model tailored to multiple testing by requiring the P-value distribution for the differentially expressed genes to be stochastically smaller than the P-value distribution for the non-differentially expressed genes. A smoothing mechanism is built in. The proposed model gives robust estimation of local FDR for any reasonable underlying P-value distributions. It also provides a single framework for estimating the proportion of differentially expressed genes, pFDR, negative predictive values, sensitivity and specificity. A cervical cancer study shows that the local FDR gives more specific and relevant quantification of the evidence for differential expression that can be substantially different from pFDR.",
author = "Liao, {J. G.} and Yong Lin and Selvanayagam, {Zachariah E.} and Weichung Shih",
year = "2004",
month = "11",
day = "1",
doi = "10.1093/bioinformatics/bth310",
language = "English (US)",
volume = "20",
pages = "2694--2701",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "16",
}

TY - JOUR

T1 - A mixture model for estimating the local false discovery rate in DNA microarray analysis

AU - Liao, J. G.

AU - Lin, Yong

AU - Selvanayagam, Zachariah E.

AU - Shih, Weichung

PY - 2004/11/1

Y1 - 2004/11/1

AB - Motivation: Statistical methods based on controlling the false discovery rate (FDR) or positive false discovery rate (pFDR) are now well established in identifying differentially expressed genes in DNA microarray. Several authors have recently raised the important issue that FDR or pFDR may give misleading inference when specific genes are of interest because they average the genes under consideration with genes that show stronger evidence for differential expression. The paper proposes a flexible and robust mixture model for estimating the local FDR which quantifies how plausible each specific gene expresses differentially. Results: We develop a special mixture model tailored to multiple testing by requiring the P-value distribution for the differentially expressed genes to be stochastically smaller than the P-value distribution for the non-differentially expressed genes. A smoothing mechanism is built in. The proposed model gives robust estimation of local FDR for any reasonable underlying P-value distributions. It also provides a single framework for estimating the proportion of differentially expressed genes, pFDR, negative predictive values, sensitivity and specificity. A cervical cancer study shows that the local FDR gives more specific and relevant quantification of the evidence for differential expression that can be substantially different from pFDR.

UR - http://www.scopus.com/inward/record.url?scp=8844258766&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=8844258766&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bth310

DO - 10.1093/bioinformatics/bth310

M3 - Article

C2 - 15145810

AN - SCOPUS:8844258766

VL - 20

SP - 2694

EP - 2701

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 16

ER -