Abstract

The systematic mapping of protein interactions by bait-prey techniques, such as affinity purification-mass spectrometry and the yeast two-hybrid system, contributes a unique and relevant perspective on the comprehensive picture of cellular machines. Here we describe a protocol for the statistical analysis of node-and-edge graph representations of these data using R and Bioconductor, recognizing that steps may be added or omitted depending on the data set at hand. The fundamental purpose of such analyses is feature estimation, defined here as the estimation of data-type-specific biological features, such as protein complex composition and the physical interaction integrity of known or estimated complexes. In preparation for feature estimation, we outline a progression through three analytic components common to all bait-prey data types: preliminary setup, exploratory analysis, and quality assessment. The end result is a collection of descriptive and inferred characteristics of the data, in a computationally tractable form that is ready for biological interpretation.