Abstract

Named entity recognition (NER) is a fundamental problem in many natural language processing applications. Combining word segmentation and part-of-speech analysis, this paper proposes an NER method based on conditional random fields (CRFs) that takes the granularity of candidate entities into account. Recognition is performed at two levels of granularity: word-based and character-based. Features are extracted from the segmented text according to feature templates whose weights were learned in the training phase, and the conditional probability P(y|x) is then computed to select the best label sequence for the input sequence. The paper evaluates the algorithm experimentally at both granularities on a large-scale corpus, and the results indicate that the method is feasible and merits further study.
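
For orientation, the conditional probability referred to above can be written out in the standard linear-chain CRF form. This is a sketch under assumptions: the abstract does not state the exact model, and the symbols $f_k$ (feature functions instantiated from the templates), $\lambda_k$ (their learned weights), $Z(x)$ (the normalization term), and $T$ (the sequence length in words or characters) are introduced here only for illustration.

```latex
% Assumed standard linear-chain CRF; notation is illustrative, not taken from the paper.
\begin{align}
  P(y \mid x) &= \frac{1}{Z(x)}
      \exp\!\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Bigr), \\
  Z(x) &= \sum_{y'} \exp\!\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Bigr), \\
  y^{*} &= \arg\max_{y} P(y \mid x).
\end{align}
```

Under this reading, the word-based and character-based levels would differ only in what a position $t$ denotes (a segmented word or a single character), and the best label sequence $y^{*}$ would typically be found with the Viterbi algorithm.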