IBM predictive analytics reduces server downtime
Abstract
IBM has deployed its Predictive Analytics for Server Incident Reduction (PASIR) solution to more than 360 information technology (IT) environments worldwide since 2013. These environments, covering sectors from banking to travel to e-commerce, are serviced by IBM support groups. Incidents occurring on servers, including problem descriptions and resolutions, are documented in client account-specific ticket management systems. PASIR uses machine learning to classify the incident tickets within an IT environment, using each ticket's description and resolution to identify high-impact incidents that involve server outages. It then correlates these high-impact tickets with server properties and utilization measurements to identify problematic server configurations. Finally, for such configurations, PASIR uses statistical multivariate analysis and simulation methods to prescribe improvement and modernization actions. In this paper, we present the results achieved from deploying this solution. We describe the PASIR approach, from ticket classification to the recommendation of remediation actions (e.g., hardware and software upgrades). We demonstrate the model's effectiveness by comparing its predictions of the impact of prescriptive actions with actual system improvements. Since 2013, we have applied PASIR to more than 840,000 client servers, resulting in more targeted upgrade spending and more stable environments, and saving our clients an estimated $7 billion.
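The first two stages described above (classifying tickets, then correlating high-impact tickets with server configurations) can be illustrated with a minimal sketch. The abstract does not specify which learning algorithm or features PASIR uses, so a small multinomial naive Bayes classifier is substituted here purely for illustration. All function names, configuration labels, and ticket texts are hypothetical, and the third stage (multivariate analysis and simulation) is not covered.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled tickets (description plus resolution text).
# These are invented examples, not PASIR data.
TRAINING = [
    ("server unresponsive after kernel panic rebooted host", "outage"),
    ("host down no ping hardware failure replaced disk", "outage"),
    ("server crashed out of memory restarted services", "outage"),
    ("password reset requested for user account", "other"),
    ("disk usage warning cleaned temp files", "other"),
    ("scheduled backup job completed with warning", "other"),
]

# Hypothetical inventory: server id -> configuration label.
SERVERS = {
    "srv1": "legacy-os",
    "srv2": "legacy-os",
    "srv3": "current-os",
    "srv4": "current-os",
}

# Hypothetical unlabeled tickets: (server id, ticket text).
TICKETS = [
    ("srv1", "host down after kernel panic"),
    ("srv2", "server crashed and rebooted"),
    ("srv1", "password reset for user"),
    ("srv3", "backup job warning"),
    ("srv4", "disk usage warning"),
]


def tokenize(text):
    """Lowercase whitespace tokenization; real tickets need more cleanup."""
    return text.lower().split()


def train_naive_bayes(labeled_tickets):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_tickets:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest Laplace-smoothed log-probability."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def outage_rate_by_config(model, tickets, servers):
    """Outage-classified tickets per server, grouped by configuration."""
    servers_per_config = Counter(servers.values())
    outages = Counter()
    for server_id, text in tickets:
        if classify(model, text) == "outage":
            outages[servers[server_id]] += 1
    return {cfg: outages[cfg] / n for cfg, n in servers_per_config.items()}


MODEL = train_naive_bayes(TRAINING)
RATES = outage_rate_by_config(MODEL, TICKETS, SERVERS)
print(RATES)
```

Normalizing by the number of servers in each configuration, rather than reporting raw ticket counts, keeps a large but healthy group of servers from appearing more problematic than a small, failure-prone one.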