Abstract

Cloud-based file systems have been widely adopted for personal and business use in recent years. Statistics show that approximately 25% of a typical user's file operations are random writes. Like the traditional disk-based file systems they descend from, most distributed file systems are built on fixed-size objects or chunks, which work well for sequential writes but poorly for random writes. This paper investigates the design paradigm of variable-sized objects for a distributed file system and proposes a new file write interface that provides rich write semantics. We present VarFS, a novel distributed file system that incorporates variable object indexing, supports the random write interface, and remains POSIX compatible. In the face of updates, VarFS reduces both the amount of unnecessary data read and the number of objects modified, and consequently reduces the total amount of data transferred. VarFS is implemented on top of Ceph, and performance measurements show that its latency on random writes is one to two orders of magnitude lower than that of Ceph. At the same time, the overhead for initial writes and re-writes is acceptable.