Deep Learning at Scale and at Ease

Authors: Wang, Wei*; Chen, Gang; Chen, Haibo; Tien Tuan Anh Dinh; Gao, Jinyang; Ooi, Beng Chin; Tan, Kian-Lee; Wang, Sheng; Zhang, Meihui
Source: ACM Transactions on Multimedia Computing, Communications, and Applications, 2016, 12(4): 69.
DOI: 10.1145/2996464

Abstract

Recently, deep learning techniques have enjoyed success in various multimedia applications, such as image classification and multimodal data analysis. Large deep learning models are developed for learning rich representations of complex data. Two challenges must be overcome before deep learning can be widely adopted in multimedia and other applications. One is usability: nonexperts must be able to implement different models and training algorithms without much effort, especially when the model is large and complex. The other is scalability: the deep learning system must be able to provision the large amount of computing resources needed to train large models on massive datasets. To address these two challenges, in this article we design a distributed deep learning platform called SINGA, which has an intuitive programming model based on the common layer abstraction of deep learning models. Good scalability is achieved through a flexible distributed training architecture and specific optimization techniques. SINGA runs on both GPUs and CPUs, and we show that it outperforms many other state-of-the-art deep learning systems. Our experience with developing and training deep learning models for real-life multimedia applications in SINGA shows that the platform is both usable and scalable.