摘要

Automatic coding of short text responses opens new doors in assessment. We implemented and integrated baseline methods of natural language processing and statistical modelling by means of software components that are available under open licenses. The accuracy of automatic text coding is demonstrated by using data collected in the Programme for International Student Assessment (PISA) 2012 in Germany. Free text responses of 10 items with n = 41, 990 responses in total were analyzed. We further examined the effect of different methods, parameter values, and sample sizes on performance of the implemented system. The system reached fair to good up to excellent agreement with human codings (.458 <= kappa <= .959): Especially items that are solved by naming specific semantic concepts appeared properly coded. The system performed equally well with n >= 1, 661 and somewhat poorer but still acceptable down to n = 249. Based on our findings, we discuss potential innovations for assessment that are enabled by automatic coding of short text responses.

  • 出版日期2016-4