In this paper we describe techniques for implementing a fast processor simulator. We have used these techniques to build Turandot, a detailed out-of-order processor simulator that executes over 350 million instructions per hour.