An ontology-based search engine for protein-protein interactions

Authors: Park Byungkyu; Han Kyungsook*
Source: BMC Bioinformatics, 2010, 11: S23.
DOI:10.1186/1471-2105-11-S1-S23

Abstract

Background: Keyword matching and ID matching are the most common search methods for large databases of protein-protein interactions. Both are purely syntactic: they retrieve only the records that contain a keyword or ID specified in the query. Such syntactic methods often return few or no results even when the database contains many potential matches.
Results: We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. The representation is hidden from users, but it enables a search engine to search protein-protein interactions efficiently and in a biologically meaningful way. Given a query protein with optional search conditions expressed as one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions.
Conclusion: Representing the biological relations of proteins and their GO annotations by modified Gödel numbers allows a search engine to find all matching protein-protein interactions efficiently by prime factorization of the numbers. Keyword matching and ID matching often miss interactions involving a protein that has no explicit annotation matching the search condition. Our search engine retrieves such interactions as well, provided the protein is annotated with a more specific term in the ontology that satisfies the search condition.
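
The idea of answering ontology queries by prime factorization can be illustrated with a minimal Python sketch. It is not the authors' modified Gödel numbering, which the abstract does not specify. It assumes a simpler scheme in which each GO term receives a distinct prime and a protein is encoded as the product of the primes of its annotated terms and all of their ancestors. The toy ontology, the protein names (Q, P1, P2, P3), and all function names are hypothetical.

```python
# Toy ontology (hypothetical): each term maps to its parent terms.
PARENTS = {
    "binding": [],
    "protein binding": ["binding"],
    "enzyme binding": ["protein binding"],
    "kinase binding": ["enzyme binding"],
    "catalytic activity": [],
}

def primes():
    """Yield 2, 3, 5, ... by trial division (fine for a toy example)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

# Assumption: one distinct prime per term, assigned in dictionary order.
_gen = primes()
PRIME = {term: next(_gen) for term in PARENTS}

def closure_code(terms):
    """Product of the primes of the given terms and all their ancestors."""
    seen, stack = set(), list(terms)
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    code = 1
    for t in seen:
        code *= PRIME[t]
    return code

# Hypothetical annotations and interaction records.
ANNOTATIONS = {
    "P1": ["kinase binding"],
    "P2": ["catalytic activity"],
    "P3": ["protein binding"],
}
INTERACTIONS = [("Q", "P1"), ("Q", "P2"), ("Q", "P3")]

def protein_code(protein):
    """Encode a protein by the ancestor closure of its annotations."""
    return closure_code(ANNOTATIONS.get(protein, []))

def partners(query, conditions=()):
    """Interaction partners of `query` whose code is divisible by the
    code of the search conditions (no conditions gives code 1)."""
    required = closure_code(conditions)
    result = []
    for a, b in INTERACTIONS:
        if query in (a, b):
            other = b if a == query else a
            if protein_code(other) % required == 0:
                result.append(other)
    return result

print(partners("Q", ["protein binding"]))  # ['P1', 'P3']
print(partners("Q", ["enzyme binding"]))   # ['P1']
```

In this sketch P1 is annotated only with "kinase binding", so a keyword search for "protein binding" would miss it. Because its code (2 × 3 × 5 × 7 = 210) is divisible by the code of "protein binding" (2 × 3 = 6), the divisibility test still retrieves it, which mirrors the behaviour the conclusion describes for more specific terms.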