PriSM: Discovering and prioritizing severe technical issues from product discussion forums
Abstract
Online forums provide a channel for users to report and discuss problems related to products and troubleshooting, for faster resolution. These could garner negative publicity if left unattended by the companies. Manually monitoring these massive amounts of discussions is laborious. This paper makes the first attempt at collecting issues that require immediate action by the product supplier by analyzing the immense information on forums. Features that are specific to forum discussions, in conjunction with linguistic cues help in capturing and better prioritizing issues. Any attempt to collect training data for learning a classifier for this task will require enormous labeling effort. Hence, this paper adopts a co-training approach, which uses minimal manual labeling, coupled with linguistic features extracted using a set-expansion algorithm to discover severe problems. Further, most distinct and recent issues are obtained by incorporating a measure of 'centrality', 'diversity' and temporal aspect of the forum threads. We show that this helps in better prioritizing longstanding issues and identify issues that need to be addressed immediately. © 2012 ACM.