
Current service discovery approaches rely mainly on syntactic matchmaking, which carries too little semantic information to support automatic service discovery. This paper proposes a scalable automatic service discovery approach based on a probabilistic topic model. Specifically, a novel service description model, PTWSDM, is proposed. With this model, heterogeneous service descriptions can be represented as topic vectors in a common space. Because word co-occurrence patterns are scarce in service functional descriptions, the biterm topic model is used to extract latent topics. Finally, a streaming algorithm for updating the topic model is introduced so that the approach scales and adapts to large, dynamic registries. Experimental results show that the proposed approach outperforms state-of-the-art solutions in terms of precision and normalised discounted cumulative gain, and that it also offers good time performance and scalability.
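
To make the biterm idea concrete, the following minimal Python sketch shows how unordered word pairs (biterms) can be collected from service descriptions and aggregated over the whole corpus, which is what lets a biterm topic model compensate for sparse per-document co-occurrence. The function name `extract_biterms` and the sample descriptions are hypothetical illustrations, not taken from the paper, and the sketch covers only biterm extraction, not topic inference or PTWSDM itself.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered pair of tokens from one description."""
    # Sorting each pair makes ("room", "hotel") and ("hotel", "room") identical.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical, already-tokenised service descriptions (assumed for illustration).
descriptions = [
    ["book", "hotel", "room"],
    ["hotel", "room", "price"],
    ["weather", "forecast", "city"],
]

# Corpus-level biterm counts: co-occurrence evidence is pooled across all
# descriptions instead of being estimated separately per document.
biterm_counts = Counter(
    biterm for tokens in descriptions for biterm in extract_biterms(tokens)
)
```

Pooling the pairs over the corpus means that a pair such as `("hotel", "room")` accumulates evidence from every description in which it appears, even though each individual description contains only a handful of words.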
