A parallel, incremental, mostly concurrent garbage collector for servers
Abstract
Multithreaded applications with multigigabyte heaps running on modern servers provide new challenges for garbage collection (GC). The challenges for "server-oriented" GC include: ensuring short pause times on a multigigabyte heap while minimizing throughput penalty, good scaling on multiprocessor hardware, and keeping the number of expensive multicycle fence instructions required by weak ordering to a minimum. We designed and implemented a collector facing these demands building on the mostly concurrent garbage collector proposed by Boehm et al. [1991]. Our collector incorporates new ideas into the original collector. We make it parallel and incremental; we employ concurrent low-priority background GC threads to take advantage of processor idle time; we propose novel algorithmic improvements to the basic mostly concurrent algorithm improving its efficiency and shortening its pause times; and finally, we use advanced techniques, such as a low-overhead work packet mechanism to enable full parallelism among the incremental and concurrent collecting threads and ensure load balancing. We compared the new collector to the mature, well-optimized, parallel, stop-the-world mark-sweep collector already in the IBM JVM. When allowed to run aggressively, using 72% of the CPU utilization during a short concurrent phase, our collector prototype reduces the maximum pause time from 161 ms to 46 ms while only losing 11.5% throughput when running the SPECjbb2000 benchmark on a 600-MB heap on an 8-way PowerPC 1.1-GHz processors. When the collector is limited to a nonintrusive operation using only 29% of the CPU utilization, the maximum pause time obtained is 79 ms and the loss in throughput is 15.4%. © 2005 ACM.