Customizing search results for non-native speakers
Abstract
Blog posts, news articles and other webpages are present on the web in multiple languages. Standard search engines evaluate the relevance of the candidate documents to the given query. However, when considering documents with overlapping content, many of them written in a foreign language other than the user's own native tongue, it is beneficial to promote documents that are easy enough for the user to read. Here, we show how to rank a collection of foreign documents based on both: a) relevance to the query, and b) the comprehension difficulty of the document. We design effective ranking operators that evaluate the difficulty of a foreign document with respect to the user's native language. We show that existing search engines can easily augment their scoring function by incorporating the proposed comprehensibility metrics. Finally, we provide extensive experimental evidence that the comprehensibility-aware ranking model significantly improves the standard relevance-based ranking paradigm. © 2012 ACM.