The Golden Spike Project:

Assessment of Microarray Analysis Methods

A project of the Halfon Lab and our collaborators


DOWNLOAD the original Golden Spike experiment paper (Choe et al. 2005. "Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset." Genome Biology 6:R16) and additional data including scripts, CEL files, and clone-to-probe mappings.

DNA microarrays have become a leading research method in a wide range of biological and biomedical disciplines. However, assessing the accuracy of microarray data has been difficult because the “correct” answers typically are not known: given the vast number of genes interrogated in a microarray experiment, only a relatively small fraction of gene expression differences tends to be validated in any given study. This is of particular concern given the tremendous number of proposed microarray data analysis methods, which have proliferated in tandem with the increased use of the arrays themselves. To illustrate the scale of the problem, consider just a single journal, Bioinformatics: in the six-month period ending December 2004, this journal published over 40 papers involving various aspects of microarray analysis, of which at least half dealt with basic issues such as normalization or discovery of differentially expressed genes. Unfortunately, despite the large number of proposed algorithms, relatively few studies have assessed their relative performance. The microarray user is thus left in the difficult position of choosing among a large number of analysis options without knowing which methods work best.

The Golden Spike Project represents an attempt to rectify this situation by making available control microarray datasets in which the relative concentrations of all present genes are known. All data, scripts, technical comments, and links to publications will be available through this site.

Our first dataset (Choe et al. 2005) is a wholly defined Affymetrix GeneChip experiment in which 1309 individual cRNAs are “spiked in” at known relative concentrations between the two samples (spike-in and control). This large number of defined RNAs enabled us to generate accurate estimates of false negative and false positive rates at each fold-change level, beginning at a concentration difference of only 1.2x. Such small fold changes can be biologically relevant, yet they are frequently overlooked in microarray datasets because it is not known how reliably they can be detected. Our dataset uses a defined background sample of 2551 RNA species present at identical concentrations in both sets of microarrays, rather than a biological RNA sample of unknown composition. This background RNA population was sufficiently large for normalization purposes, yet also enabled us to observe the distribution of truly non-specific signal from probe sets that correspond to RNAs not present in the sample.

We used this dataset to compare several common Affymetrix array analysis algorithms. Our results demonstrated that at several steps of analysis, large differences exist in the effectiveness of the various options that we considered. We found a significant limit to the sensitivity of the microarray experiments to detect small changes: in the best-case scenario, we could detect approximately 95% of true differentially expressed genes (DEGs) with changes greater than 2-fold, but less than 30% of those with changes below 1.7-fold, before exceeding a 10% false discovery rate. Importantly, we found that accurate detection of DEGs was maximized by combining aspects of different published methods, rather than by any one existing method.
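Because the true status of every spiked-in RNA is known, figures of this kind can be computed directly from a ranked list of probe sets. The following Python code is a minimal sketch of that calculation, not the scripts distributed with the dataset. The function names and the input layout (one score, one true/false DEG label, and one known fold change per probe set) are assumptions made for illustration. The sketch also makes one particular choice: it takes the longest top-ranked list whose observed false discovery rate stays at or below the threshold, and it does not treat tied scores specially.

```python
def calls_at_fdr(scores, is_deg, max_fdr=0.10):
    """Return the indices of probe sets called at the given FDR threshold.

    scores  : higher means stronger evidence of differential expression
    is_deg  : known truth for each probe set (True for spiked-in changes)
    max_fdr : largest tolerated fraction of false positives among the calls
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    false_positives = 0
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if not is_deg[i]:
            false_positives += 1
        # Observed FDR of the top `rank` calls, using the known truth.
        if false_positives / rank <= max_fdr:
            cutoff = rank
    return set(order[:cutoff])


def sensitivity_by_fold_change(called, is_deg, fold_change):
    """Return, for each known fold change, the fraction of true DEGs called."""
    totals = {}
    hits = {}
    for i, (deg, fc) in enumerate(zip(is_deg, fold_change)):
        if not deg:
            continue  # background RNAs do not count toward sensitivity
        totals[fc] = totals.get(fc, 0) + 1
        if i in called:
            hits[fc] = hits.get(fc, 0) + 1
    return {fc: hits.get(fc, 0) / n for fc, n in totals.items()}
```

Calling `calls_at_fdr` on the statistics produced by one analysis pipeline and passing the result to `sensitivity_by_fold_change` gives one sensitivity value per spike-in level, which is the kind of per-fold-change comparison described above.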

We are continuing to analyze the data from this set of experiments, and are planning additional experiments using Affymetrix, long oligonucleotide, and cDNA arrays.