Abstract

The integration of multiblock high-throughput data from multiple sources is one of the major challenges in several disciplines, including metabolomics, computational biology, genomics, and clinical psychology. A central difficulty in this line of research is obtaining interpretable results that 1) give insight into the common and distinctive sources of variation associated with the multiple, heterogeneous data blocks and 2) facilitate the identification of relevant variables. We present a novel variable selection method for data integration that provides easily interpretable results and recovers underlying data structure, such as common and distinctive components. The flexibility and applicability of this method are showcased via numerical simulations and an application to metabolomics data.

  • Publication date: 2016-11-15