An efficient and scalable family of algorithms for combining clusterings

Mimaroglu Selim*; Erdil Ertunc

doi:10.1016/j.engappai.2013.08.001

Abstract

Clustering is the process of grouping similar objects, where similarity between objects is usually measured by a distance metric. The groups formed by a clustering method are referred to as clusters. Clustering is widely used, with applications ranging from biology to economics. Each clustering technique has its own advantages and disadvantages, and some clustering algorithms require input parameters that strongly affect the result. In most cases, it is not possible to choose the best distance metric, the best clustering method, and the best input argument values for a given data set. Multiple clusterings can therefore be obtained by using several distance metrics, several clustering methods, and several input argument values, and these clusterings can then be combined into a new, better-quality final clustering. We propose a family of algorithms for combining multiple clusterings that are memory efficient, scalable, robust, and intuitive. By working at the cluster level, our new algorithms offer substantial speed gains and low memory requirements while producing very good quality final clusters. Extensive experimental evaluations on very challenging artificially generated and real data sets from a diverse set of domains establish the usefulness of our methods.
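To make the idea of combining clusterings concrete, the following is a minimal sketch of one well-known baseline, co-association (evidence accumulation) consensus. It is not the family of algorithms proposed in the paper: it works at the object level and compares every pair of objects, whereas the abstract states that the proposed methods work at the cluster level. The function names `co_association` and `combine`, the 0.5 threshold, and the toy label lists are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(labelings):
    """Return, for each object pair, the fraction of input clusterings
    that place both objects in the same cluster."""
    n = len(labelings[0])
    counts = {}
    for labels in labelings:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[(i, j)] = counts.get((i, j), 0) + 1
    return {pair: c / len(labelings) for pair, c in counts.items()}


def combine(labelings, threshold=0.5):
    """Merge objects whose co-association exceeds the (assumed) threshold,
    using union-find, and return one consensus label per object."""
    n = len(labelings[0])
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), score in co_association(labelings).items():
        if score > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical clusterings of the same six objects.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 1, 1, 1, 1],
]
consensus = combine(labelings)  # [0, 0, 1, 1, 1, 1]
```

Objects 0 and 1 share a cluster in all three inputs, while object 2 joins objects 3 to 5 because each linking pair agrees in two of the three inputs. The pairwise loop is quadratic in the number of objects, which is the kind of cost a cluster-level approach is meant to avoid.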

Publication date: 2013-11


Last updated: 2017-04-24 15:04
