Using Fisher's Method to Identify Enriched Gene Sets

Volha Tryputsen, Javier Cabrera, An De Bondt, Dhammika Amaratunga

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

In gene expression studies where gene-level p-values have been calculated, Fisher's method for pooling p-values (here referred to as MLP for "mean log p") can be used to identify predefined gene sets that are enriched in the sense that the genes that comprise them have comparatively low p-values. Since gene-level p-values tend not to follow a uniform distribution even in situations that could be regarded as null, a permutation procedure is the most effective way to assess significance. However, this may prove computationally burdensome if a large number of analyses need to be done. In this article, we derive a highly accurate approximation to the permutation p-value that can be used to assess the significance of Fisher's test statistic in a computationally efficient manner. In addition, we show the superiority of this approach compared to methods based on the (regular or weighted) Kolmogorov-Smirnov statistic, which is the basis of the popular GSEA method for gene set enrichment analysis, and Fisher's exact test, which is the basis of several other gene set analysis modalities such as Ingenuity, GoMiner, MAPPFinder, and EASE. We also explore some simple but novel variations of the MLP and find that one of them, MLQ, essentially Fisher's method based on FDR-adjusted p-values or q-values, has comparable performance to MLP for small gene set sizes, but for large gene set sizes, offers noticeable improvement over MLP.
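The abstract's baseline (the MLP statistic with a permutation p-value) and its MLQ variation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sign convention (mean of -log p, so larger values indicate enrichment), the null scheme (random gene sets of the same size drawn from all genes), the add-one correction, and the use of Benjamini-Hochberg adjustment for the q-values are all assumptions, and every function name is hypothetical. The analytic approximation derived in the article is not reproduced here.

```python
import numpy as np


def mlp_statistic(p_values):
    """Mean of -log(p) over the genes in one set (assumed sign convention)."""
    p = np.asarray(p_values, dtype=float)
    return float(np.mean(-np.log(p)))


def permutation_p_value(all_p, set_index, n_perm=2000, seed=0):
    """Permutation p-value for the MLP statistic of one gene set.

    Assumed null: random gene sets of the same size, drawn without
    replacement from all genes in the study.
    """
    rng = np.random.default_rng(seed)
    neg_log_p = -np.log(np.asarray(all_p, dtype=float))
    set_index = np.asarray(set_index)
    observed = neg_log_p[set_index].mean()
    k = set_index.size
    null = np.empty(n_perm)
    for b in range(n_perm):
        draw = rng.choice(neg_log_p.size, size=k, replace=False)
        null[b] = neg_log_p[draw].mean()
    # Add-one correction so the estimate is never exactly zero.
    return float((1 + np.sum(null >= observed)) / (n_perm + 1))


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (one common choice of q-value)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q


if __name__ == "__main__":
    # Synthetic data: 1000 gene-level p-values, with the first 20 genes
    # forming a hypothetical gene set that has very small p-values.
    rng = np.random.default_rng(1)
    all_p = rng.uniform(size=1000)
    all_p[:20] = 1e-6
    gene_set = np.arange(20)
    print("MLP p-value:", permutation_p_value(all_p, gene_set))
    # MLQ: the same statistic computed on adjusted p-values.
    print("MLQ p-value:", permutation_p_value(bh_adjust(all_p), gene_set))
```

The permutation loop is what becomes costly when many gene sets are analyzed, which is the burden the article's approximation is meant to avoid.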

Original language: English (US)
Pages (from-to): 154-162
Number of pages: 9
Journal: Statistics in Biopharmaceutical Research
Volume: 6
Issue number: 2
State: Published - Apr 2014

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Pharmaceutical Science

Keywords

  • Edgeworth
  • Mean log p
  • Microarray
  • Saddlepoint
