Automatic Algorithm to Classify and Locate Research Papers Using Natural Language

Authors: Calvillo E A*; Mendoza R*; Munoz J*; Martinez J C*; Vargas M*; Rodriguez L C*
Source: IEEE Latin America Transactions, 2016, 14(3): 1367-1371.
DOI: 10.1109/tla.2016.7459622

Abstract

The objective of this paper was to provide an automatic engine that classifies and locates information using natural language. The proposal integrates two algorithms: it extracts information from different repositories through their own open APIs and builds a knowledge database with a natural language approach, using a Bayesian algorithm to classify each paper and a second algorithm to clean it. Combining these techniques yields a strong alternative that addresses common gaps in classifying and locating information, including drawing information from the whole paper rather than only from the information entered when the paper is uploaded to the digital library. To better describe the contribution, the proposal was oriented toward classifying and locating research papers; however, the findings could apply to a wide range of scenarios. An adaptation of the popular CRISP-DM methodology was used to evaluate the performance of the algorithm, with good results in classifying, searching, and feeding the knowledge base.
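
As a rough illustration of the clean-then-classify idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a multinomial naive Bayes classifier with add-one smoothing as the "Bayesian algorithm" and a simple tokenizer with stopword removal as the "cleaning" step. The names `clean`, `NaiveBayes`, `STOPWORDS`, and the sample labels are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on", "with"}


def clean(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_docs = Counter()               # documents seen per label
        self.word_counts = defaultdict(Counter)   # token counts per label
        self.vocab = set()                        # all tokens seen in training

    def train(self, text, label):
        tokens = clean(text)
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def classify(self, text):
        tokens = clean(text)
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = NaiveBayes()
nb.train("Bayesian text classification of documents", "ml")
nb.train("Neural networks for text classification", "ml")
nb.train("Protein folding in yeast cells", "bio")
nb.train("Gene expression in yeast", "bio")
print(nb.classify("classification of text documents"))
```

In this sketch the training texts stand in for paper content retrieved from a repository; a real knowledge base would hold far more documents per category.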

  • Publication date: 2016-3