Developing a Bottom-up, User-Based Method of Web Register Classification

Authors: Egbert Jesse*; Biber Douglas; Davies Mark
Source: Journal of the Association for Information Science and Technology, 2015, 66(9): 1817-1831.
DOI: 10.1002/asi.23308

Abstract

This paper introduces a project to develop a reliable, cost-effective method for classifying Internet texts into register categories, and apply that approach to the analysis of a large corpus of web documents. To date, the project has proceeded in 2 key phases. First, we developed a bottom-up method for web register classification, asking end users of the web to utilize a decision-tree survey to code relevant situational characteristics of web documents, resulting in a bottom-up identification of register and subregister categories. We present details regarding the development and testing of this method through a series of 10 pilot studies. Then, in the second phase of our project we applied this procedure to a corpus of 53,000 web documents. An analysis of the results demonstrates the effectiveness of these methods for web register classification and provides a preliminary description of the types and distribution of registers on the web.

  • Publication date: 2015-9