Abstract

Although pattern-based approaches are widely used to detect new semantic classes and named entities, they suffer from limited coverage because patterns typically have fixed structures and carry limited information. To accommodate flexible structure and richer information, soft patterns and feature vectors are introduced to generalize the patterns and improve performance. Both support fuzzy matching through similarity scores. Experiments on Chinese names of diseases, weapons, and vehicles from the People's Daily corpus show improved named entity recognition.