Abstract

The MapReduce programming model, in which the data nodes perform both data storage and computation, was introduced for big-data processing. We therefore need to understand the different resource requirements of data-storage and computation tasks and schedule them efficiently on multi-core processors. In particular, providing high-performance data storage has become increasingly critical because of the continuously growing volume of data uploaded to distributed file systems and database servers. However, analyzing the performance characteristics of the processes that store uploaded data is intricate, because their operation involves heavy network and disk input/output (I/O). In this paper, we analyze the impact of core affinity on both network and disk I/O performance and propose a novel dynamic core-affinity approach for high-throughput file upload. We consider the dynamic changes in processor load and the intensiveness of the file upload at run time, and accordingly decide the core affinity of service threads, with the objective of maximizing parallelism, data locality, and resource efficiency. We apply the dynamic core-affinity scheme to the Hadoop Distributed File System (HDFS). Measurement results show that our implementation improves the file upload throughput of end applications by more than 30% compared with the default HDFS and provides better scalability.
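The abstract describes deciding the core affinity of service threads at run time from the current processor load. As a rough illustration only, not the paper's HDFS implementation, the following C sketch pins the calling thread to the least-loaded core using the Linux CPU-affinity API; the per-core load values and the selection policy are hypothetical placeholders for the run-time load monitoring the abstract refers to.

```c
/*
 * Minimal sketch, NOT the paper's implementation: bind the calling
 * service thread to the least-loaded core with the Linux CPU-affinity
 * API.  The per-core load figures and the selection policy below are
 * hypothetical placeholders for run-time load monitoring.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-core load estimates (e.g., sampled from /proc/stat). */
static const double core_load[] = { 0.80, 0.35, 0.90, 0.20 };
#define NUM_CORES ((int)(sizeof(core_load) / sizeof(core_load[0])))

/* Pick the least-loaded core and pin the calling thread to it. */
static int bind_to_least_loaded_core(void)
{
    int best = 0;
    for (int c = 1; c < NUM_CORES; c++)
        if (core_load[c] < core_load[best])
            best = c;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(best, &set);

    int err = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (err != 0) {
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
        return -1;
    }
    return best;
}

int main(void)
{
    int core = bind_to_least_loaded_core();
    if (core >= 0)
        printf("service thread pinned to core %d\n", core);
    return 0;
}
```

In the paper's setting, the affinity decision would be revisited as processor load and upload intensiveness change, rather than made once as in this sketch.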

  • Publication date: 2014-12