Abstract

Environmental sounds are everyday audio events that are neither music nor speech and are often more diverse and chaotic in structure. Because they occur frequently in natural settings, they are promising carrier signals for covert communication; one example is marine communication that mimics dolphin or sea lion whistles. However, collecting such carrier signals in large quantities remains challenging. Recently proposed generative models, represented by Generative Adversarial Nets (GANs), offer an effective way to synthesize environmental sounds. In this study, an end-to-end convolutional neural network (CNN) is proposed that directly transforms randomly sampled Gaussian noise into an environmental sound containing the secret message. The proposed network is composed of upsampling groups and an orthogonal quantization layer, which together realize factor analysis and information embedding simultaneously. The design of the orthogonal quantization layer, which performs the message embedding, is inspired by spread spectrum, model-based modulation, and compensative quantization. The underlying idea is to treat the secret message as constraint information in the generative model, with the aim of maximizing the complete data model. The alternating back-propagation algorithm is used to train the overall network. Experimental results show that the proposed scheme can generate realistic environmental sounds that convey secret messages while guaranteeing a high degree of communication reliability.