Abstract

This study aimed to describe breast atypical hyperplasia (BAH)-related gene expression and to systematically analyze the functions, pathways, and networks of BAH-related hub genes. Genes associated with BAH were extracted from the PubMed database by text mining based on natural language processing. Enriched Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways were obtained using the Database for Annotation, Visualization and Integrated Discovery (DAVID). A protein-protein interaction network was constructed using the STRING database. Hub genes were defined as genes that interact with at least 10 other genes within the BAH-related gene network. In total, 138 BAH-associated genes were identified as significant (P < 0.05), and 133 pathways were identified as significant (P < 0.05, false discovery rate < 0.05). A BAH-related protein network that included 81 interactions was constructed. Twenty genes interacted with at least 10 others (P < 0.05, false discovery rate < 0.05) and were identified as the BAH-related hub genes of this protein-protein interaction network: TP53, PIK3CA, JUN, MYC, EGFR, CCND1, AKT1, ERBB2, CTNNB1, ESR1, IGF1, VEGFA, HRAS, CDKN1B, CDKN1A, PCNA, HGF, HIF1A, RB1, and STAT5A. This study may help to elucidate the molecular mechanisms of BAH development and may inform BAH-targeted therapy or even breast cancer prevention. Nevertheless, the connections between certain genes and BAH require further exploration.
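The hub-gene criterion stated above (a gene that interacts with at least 10 other genes in the network) amounts to a degree threshold on an undirected graph. The following is a minimal sketch of that rule only, not the pipeline used in the study: the function name `find_hub_genes`, the edge-list input format, and the toy gene labels are assumptions made for illustration, and the sketch does not reproduce the significance testing reported alongside the hub genes.

```python
from collections import defaultdict


def find_hub_genes(interactions, min_degree=10):
    """Return genes with at least `min_degree` distinct interaction partners.

    `interactions` is an iterable of (gene_a, gene_b) pairs, treated as
    undirected edges. This input format is a hypothetical choice for the
    sketch; it is not the export format of any particular database.
    """
    neighbors = defaultdict(set)
    for gene_a, gene_b in interactions:
        if gene_a == gene_b:
            continue  # ignore self-interactions
        # Sets make duplicate edges count only once.
        neighbors[gene_a].add(gene_b)
        neighbors[gene_b].add(gene_a)
    return sorted(g for g, partners in neighbors.items()
                  if len(partners) >= min_degree)


# Toy network with placeholder labels (not real data): "HUB" has ten
# partners, and G0 and G1 additionally interact with each other.
toy_edges = [("HUB", f"G{i}") for i in range(10)] + [("G0", "G1")]
hubs = find_hub_genes(toy_edges)
print(hubs)
```

Counting distinct partners rather than raw edges keeps the result stable when the same interaction is listed more than once or in both directions.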