摘要

Synchronization is often necessary in parallel computing, but it can create delays whenever the receiving processor is idle, waiting for the information to arrive. This is especially true for barrier, or global, synchronization, in which every processor must synchronize with every other processor. Nonetheless, barriers are the only form of synchronization explicitly supplied in OpenMP, and they occur whenever collective communication operations are used in MPI. Many applications do not actually require global synchronization; local synchronization, in which a processor synchronizes only with those processors from or to which information or resources are needed, is often adequate. However, when tasks take varying amounts of time, the behavior of a system under local synchronization is more difficult to analyze since processors do not start tasks at the same time. We show that when the synchronization dependencies form a directed cycle and the task times are geometrically distributed with p = 0.5, then as the number of processors tends to infinity the processors are working 2-root 2 approximate to 0.59% of the time. Under global synchronization, however, the time to complete each task is unbounded, increasing logarithmically with the number of processors. Similar results apply for p not equal 0.5. We also present some of the combinatorial properties of the synchronization problem with geometrically distributed tasks on an undirected cycle. Nondeterministic synchronization is also examined, where processors decide randomly at the beginning of each task which neighbors(s) to synchronize with. We show that the expected number of task dependencies for random synchronization on an undirected cycle is the same as for deterministic synchronization on a directed cycle. Simulations are included to extend the analytic results. They show that more heavy-tailed distributions can actually create fewer delays than less heavy-tailed ones if the number of processors is small for some random-neighbor synchronization models. The results also show the rate of convergence to the steady state for various task distributions and synchronization graphs.

  • 出版日期2010