Failure diagnosis with incomplete information in cable networks
Yun Mao, Hani Jamjoom, et al.
CoNEXT 2006
An analytical technique called thermal diagnostics is presented as a tool for determining the root cause of thermal anomalies arising in electronic equipment. The technique utilizes a dynamically constructed flow network model, real-time inventory, temperature, utilization metrics, and statistical hypothesis testing to select the most likely scenario from among thousands of potential causes of thermal problems. This paper describes the concept of thermal diagnostics and concludes with results from a laboratory evaluation in which we physically trigger thermal anomalies on a running IBM eServer™ BladeCenter® system and record the diagnosis given by the algorithm. In these tests, our algorithm correctly diagnosed the thermal situation and provided meaningful guidance toward clearing the detected problems. ©Copyright 2005 by International Business Machines Corporation.