A scalable approach for performing proximal search for verbose patent search queries
Abstract
Although queries received by traditional information retrieval systems are typically short, there are many application scenarios where long natural language queries are more effective. Incorporating term position information can further improve results for long queries. However, existing techniques for incorporating term position information were developed for terse queries and hence cannot be applied directly to long queries. Although some methods for performing proximal search for long queries exist, they do not scale because of their long query response times. We describe a simple and intuitive, yet effective, technique that implicitly incorporates term position information for long queries in a scalable manner. Compared with a state-of-the-art method for performing proximal search for very long queries, our proposed approach achieves more than 700% faster query response times while maintaining the quality of the retrieved results. © 2012 ACM.