摘要

Graph processing systems have been widely used in enterprises like online social networks to process their daily jobs. With the fast growing of social applications, they have to efficiently handle massive concurrent jobs. However, due to the inherent design for a single job, existing systems incur great inefficiency in terms of memory usage, execution and fault tolerance. Motivated by this issue, in this paper we introduce Seraph, a graph processing system that enables efficient job-level parallelism. Seraph is designed based on a decoupled computation model, which decouples both the runtime job data and the computation logic. Decoupling the runtime data allows multiple concurrent jobs to share graph structure data in memory, which fundamentally increases job-level concurrency and reduces fault tolerance overhead. Decoupling computation logic could extend scheduling space, which benefits both execution performance and memory consumption. Seraph adopts a copy-on-write semantic to isolate the graph mutation of concurrent jobs, and a lazy snapshot protocol to generate consistent graph snapshots for jobs submitted at different time. Based on the decoupled model, it provides unified programming interfaces for both synchronous and asynchronous graph applications. Moreover, Seraph implements a lightweight checkpoint mechanism which can tremendously reduce the fault tolerance overhead. The evaluation results show that Seraph significantly outperforms popular systems (such as Giraph, Spark, GraphX and PowerLyra) in both memory usage and job completion time, when executing concurrent graph jobs.