Abstract

In recent years, there has been growing interest in the computational treatment of nominalized noun phrases because of the rich semantic information they contain. These noun phrases can be understood as paraphrases of verbal constructions and, like them, can denote argument and thematic-role relations. This paper presents the methodology followed to annotate the argument structure of deverbal nominalizations in the Spanish AnCora-Es corpus. We focus on the automated annotation process, which is based mostly on the semantic information specified in a verbal lexicon but also on the syntactic and semantic information annotated in the corpus. The heuristic rules that use this information rely on linguistic assumptions, which we evaluate together with the reliability of the automated process. The automated annotation was manually checked to ensure the accuracy of the final resource. We demonstrate the feasibility of the automated process (77% F-measure) and show that it facilitates corpus annotation, which is always time-consuming and costly. The result is the enrichment of the AnCora-Es corpus with the argument structure and thematic roles of deverbal nominalizations. It is the first freely available Spanish corpus with this kind of information.

  • Publication date: 2012-12