Abstract
This paper provides: 1) a brief motivation, supported by technology trend data, showing why hard and soft errors are expected to be of increasing concern in the future; 2) a summary review of current chip-level error tolerance practices, with brief reference to IBM's POWER6 and POWER7 designs; 3) open research challenges and promising solution approaches, based on the published literature; and 4) concluding remarks. © 2011 IEEE.