摘要

We formulate the alignment problem of multiple and partially overlapping audio sequences in a probabilistic framework. We define and compare five generative models for several time varying features extracted from audio clips that are recorded independently and asynchronously. For each model, we derive the associated scoring function that evaluates the quality of an alignment. The matching is then achieved with a sequential algorithm. The derived score functions are also able to identify the cases where the sequences do not overlap and handle multiple sequences where no sequence is covering the entire timeline. The simulation results on real data suggest that the approach is able to handle difficult, ambiguous scenarios and partial matchings where simple baseline methods such as correlation fail.

  • 出版日期2015-7