DataSHIELD: resolving a conflict in contemporary bioscience - performing a pooled analysis of individual-level data without sharing the data

Authors: Wolfson Michael; Wallace Susan E; Masca Nicholas; Rowe Geoff; Sheehan Nuala A; Ferretti Vincent; LaFlamme Philippe; Tobin Martin D; Macleod John; Little Julian; Fortier Isabel; Knoppers Bartha M; Burton Paul R*
Source: International Journal of Epidemiology, 2010, 39(5): 1372-1382.
DOI: 10.1093/ije/dyq111

Abstract

Methods: Data aggregation through anonymous summary-statistics from harmonized individual-level databases (DataSHIELD) provides a simple approach to analysing pooled data that circumvents this conflict. This is achieved via parallelized analysis and modern distributed computing and, in one key setting, takes advantage of the properties of the updating algorithm for generalized linear models (GLMs).

Results: The conceptual use of DataSHIELD is illustrated in two different settings.

Conclusions: As the study of the aetiological architecture of chronic diseases advances to encompass more complex causal pathways (e.g. to include the joint effects of genes, lifestyle and environment), sample size requirements will increase further and the analysis of pooled individual-level data will become ever more important. An aim of this conceptual article is to encourage others to address the challenges and opportunities that DataSHIELD presents, and to explore potential extensions, for example to its use when different data sources hold different data on the same individuals.
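
The abstract does not spell out the GLM updating algorithm, so the following is only a minimal sketch of the general idea, assuming a logistic regression fitted by Newton-Raphson (Fisher scoring). In each iteration every site computes a score vector and an information matrix from its own rows at the current coefficient estimate, and only those sums are sent to the analysis centre, which adds them and updates the estimate. The function names (`site_summaries`, `solve`, `pooled_fit`) and the data layout are hypothetical and are not taken from the article or from any DataSHIELD software.

```python
import math


def site_summaries(rows, beta):
    """Run inside one study site (hypothetical helper).

    rows: list of (x, y) pairs, x a tuple of covariates (including a 1.0
    for the intercept), y a 0/1 outcome. Returns the information matrix
    and score vector of a logistic model at beta. Only these sums leave
    the site; the individual-level rows do not.
    """
    p = len(beta)
    info = [[0.0] * p for _ in range(p)]
    score = [0.0] * p
    for x, y in rows:
        eta = sum(b * xi for b, xi in zip(beta, x))
        mu = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
        w = mu * (1.0 - mu)                 # GLM working weight
        for i in range(p):
            score[i] += x[i] * (y - mu)
            for j in range(p):
                info[i][j] += w * x[i] * x[j]
    return info, score


def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def pooled_fit(sites, p, iterations=25, tol=1e-10):
    """Run at the analysis centre: sum per-site summaries, update beta."""
    beta = [0.0] * p
    for _ in range(iterations):
        info = [[0.0] * p for _ in range(p)]
        score = [0.0] * p
        for rows in sites:  # in practice, a request sent to each site
            s_info, s_score = site_summaries(rows, beta)
            for i in range(p):
                score[i] += s_score[i]
                for j in range(p):
                    info[i][j] += s_info[i][j]
        step = solve(info, score)
        beta = [b + d for b, d in zip(beta, step)]
        if max(abs(d) for d in step) < tol:
            break
    return beta
```

Because the score and information are plain sums over individuals, adding the per-site summaries gives the same update as computing them on the physically pooled data, so `pooled_fit([site_a, site_b], 2)` and `pooled_fit([site_a + site_b], 2)` should agree up to floating-point rounding. The sketch leaves out disclosure safeguards, such as refusing to return summaries computed from very small groups.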

  • Publication date: 2010-10