M. Aater Suleman, Onur Mutlu, et al.
ASPLOS 2009
A key challenge in architecting a CMP with many cores is maintaining cache coherence in an efficient manner. Directory-based protocols avoid the bandwidth overhead of snoop-based protocols, and therefore scale to a large number of cores. Unfortunately, conventional directory structures incur significant area overheads in larger CMPs. The Tagless Coherence Directory (TL) is a scalable coherence solution that uses an implicit, conservative representation of sharing information. Conceptually, TL consists of a grid of small Bloom filters. The grid has one column per core and one row per cache set. TL uses 48% less area, 57% less leakage power, and 44% less dynamic energy than a conventional coherence directory for a 16-core CMP with 1MB private L2 caches. Simulations of commercial and scientific workloads indicate that TL has no statistically significant impact on performance, and incurs only a 2.5% increase in bandwidth utilization. Analytical modelling predicts that TL continues to scale well up to at least 1024 cores. Copyright 2009 ACM.
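The grid organization described above can be illustrated with a minimal sketch. It only models the conceptual structure (one Bloom filter per core and cache set, queried for a conservative sharer list) and is not the hardware design from the paper. The class name `TaglessDirectorySketch`, the filter size, the number of hash functions, the hash mixing constant, and the method names are all assumptions made for illustration. Removing blocks on eviction is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class TaglessDirectorySketch:
    """Hypothetical model of a grid of Bloom filters: one row per cache set,
    one column per core. All sizes and hashes below are illustrative."""
    num_cores: int
    num_sets: int
    filter_bits: int = 64   # assumed filter size, not taken from the paper
    num_hashes: int = 2     # assumed number of hash functions
    grid: list = field(init=False)

    def __post_init__(self):
        # grid[set_index][core] holds one Bloom filter as an integer bitmask.
        self.grid = [[0] * self.num_cores for _ in range(self.num_sets)]

    def _set_index(self, block_addr: int) -> int:
        return block_addr % self.num_sets

    def _bit_positions(self, block_addr: int) -> list:
        # Simple multiplicative mixing of the tag; a placeholder hash only.
        tag = block_addr // self.num_sets
        return [((tag + i) * 2654435761 >> 7) % self.filter_bits
                for i in range(self.num_hashes)]

    def record_fill(self, core: int, block_addr: int) -> None:
        """Mark that `core` may now hold `block_addr`."""
        row = self.grid[self._set_index(block_addr)]
        for pos in self._bit_positions(block_addr):
            row[core] |= 1 << pos

    def possible_sharers(self, block_addr: int) -> list:
        """Return a conservative sharer list: it can contain false positives
        but never omits a core that recorded a fill for this block."""
        row = self.grid[self._set_index(block_addr)]
        positions = self._bit_positions(block_addr)
        return [core for core, bits in enumerate(row)
                if all(bits >> pos & 1 for pos in positions)]


directory = TaglessDirectorySketch(num_cores=4, num_sets=8)
directory.record_fill(core=1, block_addr=0x1234)
directory.record_fill(core=3, block_addr=0x1234)
sharers = directory.possible_sharers(0x1234)
```

Because each filter can only report "possibly present", a lookup may name a core that does not hold the block. In a coherence protocol that costs an unnecessary message rather than correctness, which matches the abstract's description of the representation as implicit and conservative.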
Swagath Venkataramani, Jungwook Choi, et al.
IEEE Micro
Snehasish Kumar, Arrvindh Shriraman, et al.
PACT 2014
Sae Kyu Lee, Ankur Agrawal, et al.
IEEE JSSC