Abstract

Morphological segmentation is the task of segmenting words into morphemes, the smallest meaning-bearing units of a language. It is one of the most fundamental tasks in natural language processing, especially for morphologically rich languages. In this paper, we treat morphological segmentation as a character-level sequence-to-sequence learning problem and propose an attention-based neural network model to solve it. Our method uses a bidirectional long short-term memory (LSTM) network as the encoder, which increases the amount of input information available to the network and captures both past and future context. In addition, the decoder uses an attention mechanism so that the model focuses on the relevant context of the character currently being tagged. We conduct experiments on several languages, including Turkish, Finnish, and English. Experimental results show that our model achieves results that are better than or comparable to those of existing methods for morphological segmentation.
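To make the character-level framing concrete, the following minimal sketch shows one way a segmented word could be turned into a character sequence with one tag per character, and back again. The two-label scheme ("B" for the first character of a morpheme, "M" for any other character), the "+" separator, and the function names are illustrative assumptions; the abstract does not specify the tag set or data format the model actually uses.

```python
from typing import List, Tuple


def to_char_labels(segmented: str, sep: str = "+") -> Tuple[List[str], List[str]]:
    """Split a segmented word such as 'walk+ing' into characters and tags.

    Assumed scheme (not taken from the paper): 'B' marks the first
    character of each morpheme, 'M' marks every other character.
    """
    chars: List[str] = []
    labels: List[str] = []
    for morpheme in segmented.split(sep):
        for i, ch in enumerate(morpheme):
            chars.append(ch)
            labels.append("B" if i == 0 else "M")
    return chars, labels


def from_char_labels(chars: List[str], labels: List[str], sep: str = "+") -> str:
    """Rebuild the segmented word from characters and their tags."""
    pieces: List[str] = []
    for ch, label in zip(chars, labels):
        # A 'B' tag on any character after the first one starts a new morpheme.
        if label == "B" and pieces:
            pieces.append(sep)
        pieces.append(ch)
    return "".join(pieces)


if __name__ == "__main__":
    chars, labels = to_char_labels("walk+ing")
    print(chars)
    print(labels)
    print(from_char_labels(chars, labels))
```

Under this framing, the encoder would read the character sequence and the decoder would emit one tag per character; the sketch covers only the data conversion, not the neural model.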

Full text