Abstract

We implement and undertake an empirical study of the cache-oblivious variant of the polygon indecomposability testing algorithm of Gao and Lauder, based on a depth-first search (DFS) traversal of the computation tree. According to Abu Salem, the cache-oblivious variant exhibits improved spatial and temporal locality over the original one, and its spatial locality is optimal. Our implementation revolves around eight different variants of the DFS-based algorithm, tailored to assess the trade-offs between computation and memory performance as originally proposed by Abu Salem. We analyse performance sensitivity to manipulations of the several parameters comprising the input size. We describe how to construct suitably random families of input that elicit such variations, and how to handle redundancies in vector computations at no asymptotic increase in the work and cache complexities. We report extensively on our experimental results. In all eight variants, the DFS-based algorithm achieves excellent performance in terms of L1 and L2 cache misses as well as total run time, when compared to the original variant of Gao and Lauder. We also benchmark the DFS variant against the powerful computer algebra system MAGMA, in the context of bivariate polynomial irreducibility testing using polygons. For polynomials of sufficiently high degree, MAGMA either runs out of memory or fails to terminate after about 4 hours of execution. In contrast, the DFS-based version processes such input in a couple of seconds. In particular, we report on absolute irreducibility testing of bivariate polynomials of total degree reaching 19,000 in about 2 seconds for the DFS variant, using a single processor.
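The polygon method referred to above rests on the Newton polygon of a bivariate polynomial, i.e. the convex hull of its exponent vectors: by Gao's criterion, if this polygon is integrally indecomposable (not a Minkowski sum of two smaller integral polygons), the polynomial is absolutely irreducible. The sketch below is only an illustration of that polygon construction, not the authors' DFS-based implementation; the dict-based polynomial representation and the function name `newton_polygon` are assumptions made for this example.

```python
# Minimal sketch: build the Newton polygon of a bivariate polynomial as the
# convex hull of its exponent vectors.  Indecomposability of this polygon
# (tested by the Gao-Lauder algorithm, not shown here) certifies absolute
# irreducibility of the polynomial.

def newton_polygon(poly):
    """Return the vertices of the Newton polygon of `poly`.

    `poly` maps exponent pairs (i, j) to nonzero coefficients, e.g.
    x^3*y + 2*x*y^2 + 5  ->  {(3, 1): 1, (1, 2): 2, (0, 0): 5}.
    """
    pts = sorted(poly.keys())
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Orientation of the turn o -> a -> b (positive = counterclockwise).
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    # Andrew's monotone-chain convex hull: lower and upper chains.
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain (it repeats the other chain's start).
    return lower[:-1] + upper[:-1]


if __name__ == "__main__":
    # f(x, y) = x^3*y + 2*x*y^2 + 5: its Newton polygon is a triangle.
    f = {(3, 1): 1, (1, 2): 2, (0, 0): 5}
    print(newton_polygon(f))  # [(0, 0), (3, 1), (1, 2)]
```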

  • Publication date: 2010-6
