Abstract

We study the inverse contagion problem (ICP). In the direct contagion problem, the network structure is known and the question is when each node will be contaminated. In the inverse problem, the links of the network are unknown, but a sequence of contagion histories (the times at which each node was contaminated) is observed. We consider two versions of the ICP: the strong problem (SICP), which is the reconstruction of the network and has been studied before, and the weak problem (WICP), which requires "only" the prediction, at each time step, of the nodes that will be contaminated at the next time step. The weak problem is often the real-life situation, in which a contagion is observed and predictions are made in real time. Our focus is on how the accuracy of the solution increases as a function of the number of contagion histories already observed. For simplicity, we discuss the simplest (deterministic and synchronous) contagion dynamics and the simplest solution algorithm, which we have applied to different network types. The main result of this paper is that the complex problem of the convergence of the ICP for a network can be reduced to an individual property of pairs of nodes: the "false link difficulty". By definition, given a pair of unlinked nodes i and j, the difficulty of the false link (i, j) is the probability that, in a random contagion history, the contamination times of i and j do not differ by more than one time step (that is, i and j are contaminated at the same time step or at consecutive time steps). In other words, the "false link difficulty" of a non-existing network link is the probability that the observations during a random contagion history would not rule out that link. This probability is relatively straightforward to calculate, and in most instances depends only on the relative positions of the two nodes i and j and not on the entire network structure.
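The definition of false link difficulty can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: each contagion history starts from a single seed chosen uniformly at random, the network is connected and undirected, and under the deterministic synchronous dynamics a node is contaminated one step after its first contaminated neighbor, so contamination times equal breadth-first distances from the seed. The function names `contagion_times` and `false_link_difficulty` are hypothetical and are not taken from the paper.

```python
from collections import deque
from itertools import combinations


def contagion_times(adj, seed):
    """Contamination time of every node, assuming deterministic synchronous
    spreading from a single seed (equal to the BFS distance from the seed)."""
    times = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in times:
                times[v] = times[u] + 1
                queue.append(v)
    return times


def false_link_difficulty(adj):
    """For every unlinked pair (i, j), the fraction of single-seed histories
    in which the contamination times of i and j differ by at most one step,
    i.e. histories that do not rule out a link between i and j.
    Assumes a connected, undirected graph and a uniformly random seed."""
    nodes = sorted(adj)
    histories = [contagion_times(adj, s) for s in nodes]
    difficulty = {}
    for i, j in combinations(nodes, 2):
        if j in adj[i]:
            continue  # (i, j) is a true link, not a false one
        compatible = sum(1 for t in histories if abs(t[i] - t[j]) <= 1)
        difficulty[(i, j)] = compatible / len(histories)
    return difficulty


# Toy example: the path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
difficulty = false_link_difficulty(path)
```

On this toy path graph, the false link (0, 2) survives only the history seeded at node 1, while (0, 3) survives the histories seeded at nodes 1 and 2, so the second pair is harder to rule out. This also illustrates the remark that the difficulty is determined by the relative positions of the two nodes.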
We have observed the distribution of false link difficulty for various network types, estimated it theoretically, and compared the estimate successfully with numerical simulations. Based on it, we estimated analytically the convergence of the ICP solution (as a function of the number of contagion histories observed), and found it to be in perfect agreement with simulation results. Finally, the most important insight we obtained is that SICP and WICP have quite different properties: if one is interested only in the operational aspect of predicting how contagion will spread, the links which are most difficult to decide about are the least influential on contagion dynamics. In other words, the parts of the network which are harder to reconstruct are also the least important for predicting the contagion dynamics, up to the point where a (large) constant number of false links in the network (i.e. non-convergence of the network reconstruction procedure) is compatible with a zero rate of node contagion prediction errors (perfect convergence of the WICP). Thus, the difficulty of the contagion prediction problem (WICP) is very different from that of the network reconstruction problem (SICP), insofar as links which are difficult to reconstruct are quite harmless in terms of contagion prediction capability.
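The contrast between SICP and WICP can be sketched with a small simulation. Everything here is an assumption made for illustration, since the abstract does not specify the algorithm: the "simplest solution algorithm" is taken to be elimination (start from all pairs as candidate links and discard a pair whenever a history shows its contamination times more than one step apart), histories start from one uniformly random seed, and a node counts as mispredicted in a history if a surviving candidate link announces its contamination one step too early. The names `contagion_times` and `run_elimination` are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def contagion_times(adj, seed):
    """Contamination times under deterministic synchronous spreading
    from a single seed (BFS distances from the seed)."""
    times = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in times:
                times[v] = times[u] + 1
                queue.append(v)
    return times


def run_elimination(adj, n_histories, rng_seed=0):
    """Assumed elimination algorithm. Returns the surviving candidate links,
    the true links, and a per-history log of
    (remaining false links, nodes mispredicted in that history)."""
    rng = random.Random(rng_seed)
    nodes = sorted(adj)
    true_links = {(i, j) for i, j in combinations(nodes, 2) if j in adj[i]}
    candidates = set(combinations(nodes, 2))  # start from the complete graph
    log = []
    for _ in range(n_histories):
        t = contagion_times(adj, rng.choice(nodes))
        # Candidate links contradicted by this history.
        ruled_out = {(i, j) for i, j in candidates if abs(t[i] - t[j]) > 1}
        # Each contradicted link predicted its later node one step too early.
        mispredicted = {max(pair, key=t.get) for pair in ruled_out}
        candidates -= ruled_out
        log.append((len(candidates - true_links), len(mispredicted)))
    return candidates, true_links, log


# Toy example: a ring of 8 nodes.
n = 8
ring = {k: [(k - 1) % n, (k + 1) % n] for k in range(n)}
candidates, true_links, log = run_elimination(ring, n_histories=50)
```

In this sketch a false link causes a prediction error only in a history that also rules it out, so a false link with difficulty d produces errors in roughly a fraction 1 - d of the histories. The false links that survive longest are therefore the ones that disturb the prediction least, which is consistent with the observation that links hard to reconstruct are quite harmless for WICP.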

  • Publication date: 2017-10-15
