Malolan Chetlur, Umamaheswari C. Devi, et al.
IBM J. Res. Dev
With an increasing number of cores per chip, it is becoming harder to guarantee optimal performance for parallel shared memory applications due to interference caused by kernel threads, interrupts, bus contention, and temperature management schemes (referred to as jitter). We demonstrate that the performance of parallel programs gets reduced (up to 35.22 percent) in large CMP based systems. In this paper, we characterize the jitter for large multi-core processors, and evaluate the loss in performance. We propose a novel jitter measurement unit that uses a distributed protocol to keep track of the number of wasted cycles. Subsequently, we try to compensate for jitter by using DVFS across a region of timing critical instructions called a frame. Additionally, we propose an OS cache that intelligently manages the OS cache lines to reduce memory interference. By performing detailed cycle accurate simulations, we show that we are able to execute a suite of Splash2 and Parsec benchmarks with a deterministic timing overhead limited to 2 percent for 14 out of 17 benchmarks with modest DVFS factors. We reduce the overall jitter by an average 13.5 percent for Splash2 and 6.4 percent for Parsec. The area overhead of our scheme is limited to 1 percent. © 2013 IEEE.
Malolan Chetlur, Umamaheswari C. Devi, et al.
IBM J. Res. Dev
Parul Gupta, Pradipta Dey, et al.
China Communications
Parul Gupta, Arun Vishwanath, et al.
COMSNETS 2010
Zhenbo Zhu, Parul Gupta, et al.
CF 2011