Fast unsupervised location category inference from highly inaccurate mobility data
Abstract
Understanding a mobile user’s behavior, e.g., inferring whether she is exercising in a gym or dining in a restaurant, is key to a variety of applications. However, in many real-world scenarios, precisely determining which venue a user visited is extremely challenging because of the uncertainty in mobile location updates, whose errors can be hundreds of meters or more. We model each update as a location uncertainty circle, with the reported coordinates as its center and the associated location error as its radius. Such a circle is likely to cover multiple location categories, especially in densely populated areas. Worse still, mobile users are often anonymous, and we have no access to their personal information or other labeled data, which compels us to develop an unsupervised learning approach. We capture user behavior in a user-time-location category tensor and propose a novel tensor factorization framework to accurately infer the location categories visited by mobile users. This framework leverages several key observations, including the negative-unlabeled nature of the data and the intrinsic correlations between users. The proposed algorithm can also predict where users are even in the absence of location information. To solve the resulting optimization problem efficiently, we propose a parameter-free and scalable optimization algorithm that exploits the sparse and low-rank structure of the tensor. Our empirical studies show that the proposed algorithm is both effective and scalable: it can solve problems with millions of users and billions of location updates, and it provides superior prediction accuracy on real-world location update and check-in datasets.
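To make the location uncertainty circle and the user-time-location category tensor concrete, the following is a minimal sketch of how the candidate entries of such a tensor might be assembled from raw updates. It covers only this preprocessing idea, not the tensor factorization framework or its optimization algorithm. Everything named here is an assumption for illustration: the toy points of interest and updates, the hourly time slots, the helper functions, and the use of the haversine formula for distance. It also assumes one reading of the negative-unlabeled observation, namely that categories outside the circle can be treated as negatives while those inside remain unlabeled candidates.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def candidate_categories(lat, lon, error_m, pois):
    """Categories of all points of interest inside the uncertainty circle."""
    return {
        cat
        for (plat, plon, cat) in pois
        if haversine_m(lat, lon, plat, plon) <= error_m
    }


def candidate_entries(updates, pois, categories):
    """Sparse coordinates (user, time_slot, category_index) of candidate visits.

    Coordinates absent from the returned set are categories that fall outside
    the circle (assumed negatives) or slots with no update at all.
    """
    cat_index = {cat: i for i, cat in enumerate(categories)}
    entries = set()
    for user, slot, lat, lon, error_m in updates:
        for cat in candidate_categories(lat, lon, error_m, pois):
            entries.add((user, slot, cat_index[cat]))
    return entries


# Hypothetical toy data: three venues along one meridian.
CATEGORIES = ["gym", "restaurant", "park"]
POIS = [
    (40.0000, -75.0, "gym"),
    (40.0010, -75.0, "restaurant"),  # roughly 111 m north of the gym
    (40.0100, -75.0, "park"),        # roughly 1.1 km north of the gym
]
# Each update: (user id, hour-of-day slot, latitude, longitude, error radius in meters)
UPDATES = [
    (0, 9, 40.0005, -75.0, 100.0),   # circle covers both gym and restaurant
    (1, 18, 40.0100, -75.0, 50.0),   # circle covers only the park
]

entries = candidate_entries(UPDATES, POIS, CATEGORIES)
print(sorted(entries))
```

In this toy setup the first update is ambiguous between two categories, which is exactly the situation the abstract describes for dense areas. Storing only the nonzero coordinates, rather than a dense array, reflects the sparsity that a scalable solver would presumably rely on.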