Abstract

One of the biggest problems facing microarray experiments is the difficulty of translating results to other microarray formats or comparing them with results from other biochemical methods. We believe that this is largely the result of poor gene identification. We re-identified the probesets on the Affymetrix U133 plus 2.0 GeneChip array, basing the identification on the sequence of the probes and the sequence of the human genome. Using the BLAST program, we matched probes with documented and postulated human transcripts. This resulted in the redefinition of approximately 37% of the probes on the U133 plus 2.0 array. The updated identification specifically points out where the identification is complicated by cross-hybridization from splice variants or closely related genes. More than 5000 probesets detect multiple transcripts, so the exact protein affected cannot be readily inferred from the performance of one probeset alone. This makes naming difficult and affects any downstream analysis, such as associating gene ontologies, mapping affected pathways or simply validating expression changes. We have now automated the sequence-based identification and can more appropriately annotate any array for which the sequence on each spot is known.
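
The sequence-based identification described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the probe sequences have already been searched against a transcript database with BLAST and that the hits are available in the standard 12-column tabular format (query, subject, percent identity, alignment length, ...). The function names, the probe-to-probeset mapping, the identifiers in the usage example, and the thresholds (a perfect match over the full 25-base probe) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def parse_blast_tabular(lines, min_identity=100.0, min_length=25):
    """Yield (probe_id, transcript_id) for hits that pass the thresholds.

    Expects BLAST tabular output: qseqid, sseqid, pident, length, ...
    The default thresholds are assumptions, not values from the paper.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        probe_id, transcript_id = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            yield probe_id, transcript_id


def annotate_probesets(hits, probe_to_probeset):
    """Collect the transcripts matched by each probeset and flag ambiguity.

    A probeset is marked ambiguous when its probes match more than one
    transcript, e.g. splice variants or closely related genes.
    """
    matched = defaultdict(set)
    for probe_id, transcript_id in hits:
        matched[probe_to_probeset[probe_id]].add(transcript_id)
    return {
        probeset: {"transcripts": sorted(ts), "ambiguous": len(ts) > 1}
        for probeset, ts in matched.items()
    }
```

A short usage example with made-up identifiers:

```python
blast_lines = [
    "probe1\ttxA\t100.0\t25\t0\t0\t1\t25\t101\t125\t1e-6\t50.1",
    "probe2\ttxB\t100.0\t25\t0\t0\t1\t25\t201\t225\t1e-6\t50.1",
    "probe3\ttxC\t96.0\t25\t1\t0\t1\t25\t301\t325\t1e-4\t42.0",  # filtered out
]
mapping = {"probe1": "probesetX", "probe2": "probesetX", "probe3": "probesetY"}
annotation = annotate_probesets(parse_blast_tabular(blast_lines), mapping)
```

Here `probesetX` would be flagged as ambiguous because its probes match two different transcripts, which is the situation in which one probeset alone cannot identify the affected protein.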

  • Publication date: 2005