A variable-sized stripe level data layout strategy for HDD/SSD hybrid parallel file systems

Authors: Liu, Yan; Huang, Xin; Huang, Yizi; Geng, Shaofeng; Peng, Xin; Li, Renfa
Source: Concurrency and Computation: Practice and Experience (CCPE), 2017, 29(20): e4039.
DOI: 10.1002/cpe.4035

Abstract

Parallel file systems commonly distribute a file across multiple file servers with a fixed-size stripe, so that the data can be accessed through several file servers at once. This default data layout works well in traditional homogeneous storage systems, but when solid state disks (SSDs) are introduced into a storage system, the data layout of the resulting hybrid parallel file system can be adapted to obtain better I/O performance. In this study, we propose a variable-sized stripe level data layout strategy for hybrid parallel file systems (SLDP). SLDP divides a file into several regions according to the data access pattern and then finds the optimal configuration for each region across the SSD file server nodes and the mechanical hard disk drive (HDD) file server nodes. It uses variable stripe sizes to reorganize the data layout of the file system. SLDP also takes the limited SSD space into account. The main idea is to distribute the key regions of a file across the hybrid parallel file system using their optimal stripe configuration, which can significantly improve system I/O throughput. The remaining parts of the file are then distributed according to an SSD free space threshold, so that the SSD servers are used as much as possible. To achieve this, SLDP divides a large file into many fine-grained regions and adjusts the data layout method for each region according to its access pattern. Experimental results show that SLDP is feasible and can improve system performance.
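The region-by-region idea described in the abstract can be illustrated with a small sketch. This is not the SLDP algorithm from the paper: the cost model, the latency and bandwidth constants, the candidate stripe sizes, and the names `Region`, `request_cost`, `best_stripes`, and `plan_layout` are all assumptions made for illustration. The sketch picks, for each region, a pair of stripe sizes (one for HDD servers, one for SSD servers) that minimizes a toy request cost, serves the most frequently accessed regions first, and falls back to an HDD-only layout once a hypothetical SSD space budget is exhausted.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Region:
    offset: int        # start of the region within the file, in bytes
    length: int        # size of the region, in bytes
    request_size: int  # typical request size seen in this region, in bytes
    accesses: int      # how often the region is accessed


# Assumed device parameters (illustrative only, not taken from the paper).
HDD_LATENCY, HDD_BW = 5e-3, 100e6  # seconds, bytes per second
SSD_LATENCY, SSD_BW = 1e-4, 400e6


def request_cost(request_size, n_hdd, n_ssd, h, s):
    """Toy cost of one request when each HDD server holds h bytes and each
    SSD server holds s bytes per stripe round. The request completes when
    the slowest participating server finishes."""
    round_bytes = n_hdd * h + n_ssd * s
    hdd_time = HDD_LATENCY + (request_size * h / round_bytes) / HDD_BW if h else 0.0
    ssd_time = SSD_LATENCY + (request_size * s / round_bytes) / SSD_BW if s else 0.0
    return max(hdd_time, ssd_time)


def best_stripes(region, n_hdd, n_ssd, candidates, allow_ssd=True):
    """Exhaustively search candidate (h, s) pairs for the cheapest one."""
    pairs = [
        (h, s)
        for h, s in product(candidates, repeat=2)
        if (h or s) and (allow_ssd or s == 0)
    ]
    return min(
        pairs,
        key=lambda p: request_cost(region.request_size, n_hdd, n_ssd, *p),
    )


def plan_layout(regions, n_hdd, n_ssd, ssd_free_bytes, candidates):
    """Map each region offset to an (hdd_stripe, ssd_stripe) pair, placing
    the hottest regions first and respecting the SSD space budget."""
    layout = {}
    remaining = ssd_free_bytes
    for r in sorted(regions, key=lambda r: r.accesses, reverse=True):
        h, s = best_stripes(r, n_hdd, n_ssd, candidates)
        # Bytes of this region that would land on SSD servers.
        ssd_bytes = r.length * n_ssd * s // (n_hdd * h + n_ssd * s)
        if ssd_bytes > remaining:
            # Simplified fallback: not enough SSD space, use HDD servers only.
            h, s = best_stripes(r, n_hdd, n_ssd, candidates, allow_ssd=False)
            ssd_bytes = 0
        remaining -= ssd_bytes
        layout[r.offset] = (h, s)
    return layout


KiB, MiB = 1024, 1024 * 1024
regions = [
    Region(offset=0, length=64 * MiB, request_size=1 * MiB, accesses=1000),
    Region(offset=64 * MiB, length=64 * MiB, request_size=64 * MiB, accesses=10),
]
candidates = [0, 64 * KiB, 256 * KiB]  # a stripe size of 0 means "server type unused"
layout = plan_layout(regions, n_hdd=4, n_ssd=2, ssd_free_bytes=64 * MiB,
                     candidates=candidates)
```

With these assumed numbers, the frequently accessed region with small requests is placed on the SSD servers only, because the HDD startup latency dominates its cost, and the second region falls back to an HDD-only layout once the budget is spent. The HDD-only fallback is a deliberate simplification: the abstract states that the remaining parts are distributed according to an SSD free space threshold, without giving the rule.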