A weighted multivariate Fuzzy C-Means method in interval-valued scientific production data

Pimentel Bruno Almeida; de Souza Renata M C R<sup>*</sup>

doi:10.1016/j.eswa.2013.11.013

Abstract

Clustering is the process of organizing objects into groups whose members are similar in some way. Most clustering methods handle numeric data only. However, this representation may not be adequate to model complex information such as histograms, distributions, or intervals. Symbolic Data Analysis (SDA) was developed to deal with these types of data. In multivariate data analysis, it is common for some variables to be more relevant than others, and less relevant variables can mask the cluster structure. This work proposes a clustering method based on a fuzzy approach that produces weighted multivariate memberships for interval-valued data. These memberships can change at each iteration of the algorithm, and they differ from one variable to another and from one cluster to another. Furthermore, a relevance weight is associated with each variable, and this weight may also differ from one cluster to another. The advantage of this method is that it is robust to ambiguous cluster membership assignment, since the weights represent how important the different variables are to the clusters. Experiments are performed with synthetic data sets to compare the performance of the proposed method against other methods already established in the clustering literature. An application to interval-valued scientific production data is also presented. Clustering quality results show that the proposed method offers higher accuracy when variables have different variabilities.
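To make the setting concrete, the following is a minimal sketch of a plain Fuzzy C-Means loop on interval-valued data. It is not the weighted multivariate method proposed in the paper: there is a single membership per object and cluster, and no per-variable relevance weights. The function names, the `init` parameter, and the choice of distance (squared Euclidean on lower and upper bounds) are assumptions made for illustration only.

```python
import random


def interval_sq_dist(x, g):
    """Squared Euclidean distance between two interval vectors [(lo, hi), ...]."""
    return sum((a - c) ** 2 + (b - d) ** 2 for (a, b), (c, d) in zip(x, g))


def fuzzy_c_means_intervals(data, c, m=2.0, iters=100, tol=1e-6, init=None, seed=0):
    """Plain FCM on interval data (illustrative; no variable weights).

    data: list of objects, each a list of (lo, hi) tuples, one per variable.
    Returns (memberships, prototypes).
    """
    n, p = len(data), len(data[0])
    if init is None:
        # Assumed initialisation: c objects drawn at random as prototypes.
        rng = random.Random(seed)
        init = [data[i] for i in rng.sample(range(n), c)]
    protos = [list(g) for g in init]
    u = [[0.0] * c for _ in range(n)]

    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-1/(m-1)).
        max_change = 0.0
        for i, x in enumerate(data):
            d = [interval_sq_dist(x, g) for g in protos]
            zeros = [k for k, v in enumerate(d) if v == 0.0]
            if zeros:
                # Object coincides with a prototype: share membership among those.
                new = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(c)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d]
                total_inv = sum(inv)
                new = [v / total_inv for v in inv]
            max_change = max(max_change, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        # Prototype update: membership-weighted mean of lower and upper bounds.
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            if total > 0.0:
                protos[k] = [
                    (sum(w[i] * data[i][j][0] for i in range(n)) / total,
                     sum(w[i] * data[i][j][1] for i in range(n)) / total)
                    for j in range(p)
                ]

        if max_change < tol:
            break
    return u, protos


if __name__ == "__main__":
    # Two well-separated groups of objects described by two interval variables.
    toy = [
        [(0.0, 1.0), (0.0, 1.0)],
        [(0.2, 1.1), (0.1, 0.9)],
        [(10.0, 11.0), (10.0, 11.0)],
        [(10.1, 11.2), (9.9, 11.1)],
    ]
    memberships, prototypes = fuzzy_c_means_intervals(toy, 2, init=[toy[0], toy[2]])
    for row in memberships:
        print([round(v, 4) for v in row])
```

In this baseline every variable contributes equally to the distance, which is the situation the abstract describes as problematic when less relevant variables mask the cluster structure. The proposed method replaces the single membership per object and cluster with memberships and relevance weights that vary by variable and by cluster.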

Publication date: 2014-6-1

