An Analysis of Cache Performance for a Hypercube Multicomputer
Abstract
This paper presents multicomputer cache simulation results derived from address traces collected from an Intel iPSC/2 hypercube multicomputer. The primary emphasis of this study is on examining how increasing the number of processor nodes executing a parallel application affects overall multicomputer cache performance. The effects of application-specific data partitioning, data access patterns, communication distribution, and communication frequency on multicomputer direct-mapped cache performance are illustrated. The effects of system accesses on total cache performance are explored, as well as the reasons for application-specific differences in cache behavior for system and user accesses. When the parallel applications partition data well among the nodes, user code cache analysis shows that the cache miss ratio tends to decrease for large caches as the dimension of the hypercube increases. However, under certain conditions, when a significant portion of the data is replicated among the nodes, the cache miss ratio can increase with increasing multicomputer size. Comparing user code results with full user and system code analysis reveals the significant effect of system accesses, and this effect increases with multicomputer size. The time distribution of an application’s message-passing operations is found to affect cache performance more strongly than the total amount of time spent in message-passing code. © 1992 IEEE
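
As an illustration of the kind of trace-driven, direct-mapped cache simulation summarized above, the following is a minimal sketch and not the simulator used in this study. The function names, the cache and line sizes, and the synthetic sweep trace are all assumptions chosen for the example; no iPSC/2 trace data is reproduced. It models one node repeatedly sweeping its slice of an evenly partitioned array, so that the per-node working set shrinks as the node count grows.

```python
def direct_mapped_miss_ratio(trace, cache_bytes, line_bytes):
    """Return misses / accesses for a direct-mapped cache fed byte addresses."""
    if not trace:
        raise ValueError("trace must contain at least one address")
    num_lines = cache_bytes // line_bytes
    # One slot per cache line; each slot remembers which memory block it holds.
    resident = [None] * num_lines
    misses = 0
    for addr in trace:
        block = addr // line_bytes   # memory block containing this address
        index = block % num_lines    # the single slot this block can occupy
        if resident[index] != block:
            resident[index] = block  # miss: replace whatever was there
            misses += 1
    return misses / len(trace)


def sweep_trace(start, length, word_bytes, passes):
    """Hypothetical trace: sweep `length` bytes from `start`, `passes` times."""
    return [start + off
            for _ in range(passes)
            for off in range(0, length, word_bytes)]


def partitioned_miss_ratio(data_bytes, nodes, cache_bytes, line_bytes=16,
                           word_bytes=4, passes=4):
    """Miss ratio seen by node 0 when the data is split evenly among nodes."""
    slice_bytes = data_bytes // nodes
    trace = sweep_trace(0, slice_bytes, word_bytes, passes)
    return direct_mapped_miss_ratio(trace, cache_bytes, line_bytes)


if __name__ == "__main__":
    # Assumed sizes: 64 KiB of data, 16 KiB cache per node.
    for nodes in (1, 2, 4, 8):
        ratio = partitioned_miss_ratio(64 * 1024, nodes, 16 * 1024)
        print(nodes, ratio)
```

Under these assumed sizes, the per-node slice stops overflowing the cache once the data is split four ways, and the miss ratio of the sweep drops accordingly. Replicated data would keep each node's working set constant regardless of node count, so the sketch would show no such improvement for it; the sketch does not attempt to model system accesses or message-passing code.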