Abstract

In this study, we explored the potential for machine scoring of short written responses to the Classroom-Video-Analysis (CVA) assessment, which is designed to measure teachers' usable mathematics teaching knowledge. We created naive Bayes classifiers for CVA scales assessing three different topic areas and compared computer-generated scores to those assigned by trained raters. Under cross-validation, average correlations between rater- and computer-generated total scores exceeded .85 for each assessment, providing some evidence for the convergent validity of machine scores. These correlations remained moderate to large when we controlled for response length. Machine scores also exhibited internal consistency, which we view as a measure of reliability. Finally, correlations between machine scores and another measure of teacher knowledge were close in size to those observed for human scores, providing further evidence for the validity of machine scores. Findings from this study suggest that machine learning techniques hold promise for automating scoring of the CVA.
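To make the scoring pipeline described above concrete, the sketch below shows one plausible way to train a naive Bayes classifier on short written responses, generate cross-validated machine scores, and correlate them with rater-assigned scores. It is an illustrative reconstruction, not the authors' implementation: the example responses, rater scores, scikit-learn components, and fold settings are all hypothetical assumptions.

```python
# Illustrative sketch (hypothetical data and settings, not the authors' code):
# score short text responses with a multinomial naive Bayes classifier and
# compare cross-validated machine scores to human rater scores.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_predict
from scipy.stats import pearsonr

# Hypothetical data: one short written response per teacher, paired with the
# integer score a trained rater assigned to that response.
responses = [
    "The teacher asked students to compare the two fraction strategies.",
    "Students were told the answer without discussing their reasoning.",
    "The class connected the area model to the standard algorithm.",
    "The teacher repeated the procedure step by step.",
] * 25  # repeated only so cross-validation has enough examples
rater_scores = [2, 0, 2, 1] * 25

# Bag-of-words features feeding a multinomial naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())

# Cross-validated predictions: each response is scored by a classifier
# trained on the remaining folds, mimicking scoring of unseen responses.
machine_scores = cross_val_predict(model, responses, rater_scores, cv=5)

# Agreement between machine- and rater-generated scores.
r, _ = pearsonr(machine_scores, rater_scores)
print(f"Correlation between machine and rater scores: {r:.2f}")
```

In practice, item-level machine scores of this kind could be summed into total scores before computing rater-machine correlations, and partial correlations controlling for response length could be examined in the same framework.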