Abstract

Data clustering and analysis techniques are studied using the hierarchical clustering method. A word matrix is built from data collected from a randomly chosen RSS list. In the matrix, each row corresponds to an article and each column represents a word. Based on this matrix, a hierarchical clustering algorithm is designed in which the Pearson correlation coefficient is used to compute the distances between contents. A dendrogram is used to describe the hierarchical relationships among contents and words, and a 2-D graph is used to present the same relationships in another format.
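
The approach summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pearson_distance` and `hcluster`, the toy word counts, and the choice to represent a merged cluster by the average of its two members are all assumptions made for illustration. The distance is taken as 1 minus the Pearson correlation coefficient, so rows with the same word-usage profile have distance 0.

```python
from math import sqrt


def pearson_distance(v1, v2):
    """Return 1 - Pearson r, so perfectly correlated rows have distance 0."""
    n = len(v1)
    mean1, mean2 = sum(v1) / n, sum(v2) / n
    d1 = [x - mean1 for x in v1]
    d2 = [y - mean2 for y in v2]
    num = sum(a * b for a, b in zip(d1, d2))
    den = sqrt(sum(a * a for a in d1) * sum(b * b for b in d2))
    if den == 0:
        return 1.0  # assumption: a constant row is treated as uncorrelated
    return 1.0 - num / den


def hcluster(rows, distance=pearson_distance):
    """Agglomerative clustering; returns a nested tuple of row indices."""
    # Each cluster is (vector, tree); a leaf tree is just the row index.
    clusters = [(list(row), i) for i, row in enumerate(rows)]
    while len(clusters) > 1:
        best = None
        # Find the closest pair of current clusters.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i][0], clusters[j][0])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        (vec_i, tree_i), (vec_j, tree_j) = clusters[i], clusters[j]
        # Assumption: the merged cluster is the element-wise average.
        merged = [(a + b) / 2 for a, b in zip(vec_i, vec_j)]
        del clusters[j]  # j > i, so delete j first
        del clusters[i]
        clusters.append((merged, (tree_i, tree_j)))
    return clusters[0][1]


# Hypothetical word-count matrix: rows are articles, columns are words.
rows = [
    [3, 0, 1, 0],
    [6, 0, 2, 0],
    [0, 4, 0, 2],
    [0, 5, 1, 2],
]
tree = hcluster(rows)
print(tree)
```

The nested tuple returned by `hcluster` encodes the merge order and is the structure a dendrogram would draw. Transposing the matrix before clustering would group words instead of articles.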

  • Publication date: 2015-10
  • Institution: Harbin University of Commerce
