mrMoulder: A recommendation-based adaptive parameter tuning approach for big data processing platform

Cai, Lin; Qi, Yong*; Wei, Wei; Wu, Jinsong; Li, Jingwei

doi:10.1016/j.future.2018.05.080

Abstract

In the big data era, processing platforms such as Hadoop and Spark are increasingly adopted by many applications, and they expose numerous parameters that platform operators can tune to improve processing performance. However, because these parameters are numerous and the relationships among them are complex, tuning them manually is very time-consuming. The challenge is therefore to configure the parameters automatically, and as quickly as possible, so as to optimize the performance of the current job. Existing auto-tuning methods often spend a certain amount of time before the job runs to find the optimal configuration, which increases the job's total processing time and reduces the overall efficiency of the cluster. In this paper, we propose an adaptive tuning framework, mrMoulder, that recommends a near-optimal configuration for a new job in a short time. mrMoulder combines a self-extending configuration repository with a collaborative-filtering-based recommendation engine to speed up the optimization of the parameter configuration. We have deployed mrMoulder in a Hadoop cluster, and the experimental results demonstrate that, for a new big data application, the recommendation time of mrMoulder is only 20% to 30% of that of existing auto-tuning methods, while the recommendation quality remains almost unchanged.
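The abstract does not give the details of the recommendation engine, so the following is only a minimal sketch of the general idea: a repository of previously tuned jobs, a similarity-weighted recommendation for a new job, and a repository that grows as new jobs are tuned. The class `ConfigRepository`, the job profile vectors, the cosine similarity measure, and all parameter values are assumptions made for illustration and are not taken from the paper; the two keys are ordinary Hadoop MapReduce parameter names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-negative feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ConfigRepository:
    """Hypothetical store of (job profile, tuned configuration) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, profile, config):
        # "Self-extending": every newly tuned job becomes a future neighbour.
        self.entries.append((list(profile), dict(config)))

    def recommend(self, profile, k=2):
        """Blend the configs of the k most similar stored jobs, weighted by similarity."""
        if not self.entries:
            raise ValueError("repository is empty")
        scored = sorted(
            ((cosine(profile, p), c) for p, c in self.entries),
            key=lambda pair: pair[0],
            reverse=True,
        )[:k]
        total = sum(score for score, _ in scored)
        if total <= 0:
            # No stored job resembles the new one: fall back to the top entry.
            return dict(scored[0][1])
        # Assumes every stored config has the same numeric keys.
        return {
            key: sum(score * cfg[key] for score, cfg in scored) / total
            for key in scored[0][1]
        }


# Illustrative values only; the profiles and settings are made up.
repo = ConfigRepository()
repo.add([1.0, 0.0], {"mapreduce.task.io.sort.mb": 100, "mapreduce.job.reduces": 4})
repo.add([0.0, 1.0], {"mapreduce.task.io.sort.mb": 200, "mapreduce.job.reduces": 8})
recommended = repo.recommend([1.0, 1.0], k=2)
```

In this sketch the lookup replaces a per-job search over the parameter space, which is the source of the time saving the abstract describes; once the new job has run, its profile and final configuration would be stored with `repo.add(...)`.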

Publication date: 2019-4
Affiliations: Xi'an University of Technology; Xi'an Jiaotong University


Last updated: 2024-05-10 14:05
