Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction
@article{Marco2013ClusteringAD,
  title={Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction},
  author={Antonio Di Marco and Roberto Navigli},
  journal={Computational Linguistics},
  year={2013},
  volume={39},
  pages={709-754},
  url={https://api.semanticscholar.org/CorpusID:1775181}
}
The key to the approach is to first acquire the various senses of an ambiguous query and then cluster the search results by their semantic similarity to the induced word senses; the approach outperforms both Web clustering and search engines.
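The second step of the approach, assigning each search result to its closest induced sense, can be illustrated with a minimal sketch. It assumes the senses have already been induced and are given as plain sets of words. The function name `cluster_by_sense`, the toy "jaguar" data, and the word-overlap similarity are all hypothetical stand-ins chosen for illustration; the paper's graph-based sense induction and its actual similarity measures are not reproduced here.

```python
from collections import defaultdict


def cluster_by_sense(snippets, senses):
    """Assign each snippet to the induced sense sharing the most words with it.

    snippets: list of strings (search result snippets).
    senses: dict mapping a sense label to a set of lowercase words
            (assumed to come from an earlier sense induction step).
    Returns a dict mapping each sense label (or None for snippets that
    overlap with no sense) to a list of snippet indices.
    """
    clusters = defaultdict(list)
    for i, snippet in enumerate(snippets):
        words = set(snippet.lower().split())
        best_label, best_overlap = None, 0
        for label, sense_words in senses.items():
            # Word overlap is a simple stand-in for semantic similarity.
            overlap = len(words & sense_words)
            if overlap > best_overlap:
                best_label, best_overlap = label, overlap
        clusters[best_label].append(i)
    return dict(clusters)


# Hypothetical induced senses for the ambiguous query "jaguar".
senses = {
    "animal": {"cat", "wild", "species", "habitat"},
    "car": {"car", "luxury", "british", "dealer"},
}
snippets = [
    "Jaguar is a wild cat species native to the Americas",
    "Find a Jaguar luxury car dealer near you",
    "Jaguar official fan page",
]
result = cluster_by_sense(snippets, senses)
print(result)
```

Snippets that share no words with any sense are collected under the `None` key, which keeps unassignable results visible instead of forcing them into an arbitrary cluster.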
Topics
Word Sense Induction, B-MST, Search Results Clustering, HyperLex, Web Search Result Clustering, MORESQUE, Query-senses, Induced Senses, SRC Systems, Web Search
158 Citations
Retrieving web search results using Max-Max soft clustering for Hindi query
- 2014
Computer Science
This is the first attempt at fuzzy IR for queries in the Hindi language; experimental evaluations show promising results.
Multilingual Word Sense Induction to Improve Web Search Result Clustering
- 2015
Computer Science
Some preliminary ideas are given for applying a multilingual Word Sense Induction method to Web search result clustering in order to improve the WSI results.
Neural Embedding Language Models in Semantic Clustering of Web Search Results
- 2016
Computer Science
It is shown that in the task of semantically clustering search results, prediction-based models slightly but stably outperform traditional count-based ones, with the same training corpora.
Graph-Based Concept Clustering for Web Search Results
- 2015
Computer Science
This paper proposes a method to cluster Web search results with high clustering quality using graph-based clustering over concepts extracted from an external knowledge source, and compares the clustering results of this method with two well-known search results clustering techniques, Suffix Tree Clustering and Lingo.
PageRank-based Word Sense Induction within Web Search Results Clustering
- 2014
Computer Science
The evaluation results show that PageRank-based sense induction achieves interesting results when compared to state-of-the-art content-based algorithms in the context of Web Search Results Clustering.
A Relative Study on Search Results Clustering Algorithms - K-means, Suffix Tree and LINGO
- 2013
Computer Science
A comparative analysis of three common search results clustering algorithms is carried out to study the performance of the Web search engine using multiple test collections and evaluation measures.
Web Search Results Clustering Using Frequent Termset Mining
- 2015
Computer Science
This work acquires the senses of a query by means of a word sense induction method that identifies meanings as trees of closed frequent termsets, and clusters the search results based on their lexical and semantic intersection with the induced senses.
A comparison of graph-based word sense induction clustering algorithms in a pseudoword evaluation framework
- 2018
Computer Science
A self-sufficient pseudoword-based evaluation framework for graph-based WSI clustering algorithms is presented, defining a new evaluation measure (top2) and a secondary clustering process (hyperclustering).
A Hybrid Approach for Web Search Result Clustering Based on Genetic Algorithm with K-Means
- 2021
Computer Science
An efficient hybrid Web search results clustering algorithm, referred to as G-K-M, is presented; it combines K-means with a modified genetic algorithm, and the proposed approach demonstrates significant advantages over traditional clustering.
A Novel Method for Clustering Web Search Results with Wikipedia Disambiguation Pages
- 2015
Computer Science
A novel method is proposed to cluster the search results of an ambiguous query into topics about the query constructed from Wikipedia disambiguation pages (WDP), together with a concept filtering method that removes semantically unrelated concepts from each topic.
118 References
Inducing Word Senses to Improve Web Search Result Clustering
- 2010
Computer Science
This work first acquires the senses of a query by means of a graph-based clustering algorithm that exploits cycles in the co-occurrence graph of the query, then clusters the search results based on their semantic similarity to the induced word senses.
An Unsupervised Approach to Cluster Web Search Results Based on Word Sense Communities
- 2008
Computer Science
The clustering problem is reformulated as a word sense discovery problem and addressed with an unsupervised method; the modularity score of the discovered keyword community structure is used to measure page clustering necessity.
Graph-based Word Clustering using a Web Search Engine
- 2006
Computer Science
An unsupervised algorithm for word clustering is proposed, based on a word similarity measure computed from web counts; a graph clustering method, called Newman clustering, is used to identify word clusters efficiently.
Web Search Clustering and Labeling with Hidden Topics
- 2009
Computer Science
This article introduces a novel framework for clustering Web search results in Vietnamese, which is able to cluster and label short snippets effectively in a topic-oriented manner without relying on whole Web pages.
Word Sense Induction & Disambiguation Using Hierarchical Random Graphs
- 2010
Computer Science, Linguistics
The inferred hierarchical structures are applied to the problem of word sense disambiguation, where it is shown that the method performs significantly better than traditional graph-based methods and agglomerative clustering, yielding improvements over state-of-the-art WSD systems based on sense induction.
Web document clustering: a feasibility demonstration
- 1998
Computer Science
To satisfy the stringent requirements of the Web domain, an incremental, linear-time algorithm called Suffix Tree Clustering (STC) is introduced, which creates clusters based on phrases shared between documents; STC is shown to be faster than standard clustering methods in this domain.
Clustering Web Search Results with Maximum Spanning Trees
- 2011
Computer Science
This work presents a novel method for clustering Web search results based on Word Sense Induction, which improves on classical search result clustering methods in terms of both clustering quality and degree of diversification.
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
- 2010
Computer Science
The study indicates that (i) Wikipedia has much better coverage of search results; (ii) the distribution of senses in search results can be estimated using the internal graph structure of Wikipedia and the relative number of visits received by each sense in Wikipedia; and (iii) associating Web pages with Wikipedia senses using simple and efficient algorithms can produce modified rankings that cover 70% more Wikipedia senses than the original search engine rankings.
Word sense disambiguation in queries
- 2005
Computer Science
A new approach to determining the senses of words in queries using WordNet is presented; it has 100% applicability and 90% accuracy on the 250 queries of the most recent robust track of the TREC collection, and its retrieval effectiveness is 7% better than the best reported result in the literature.
Information retrieval using word senses: root sense tagging approach
- 2004
Computer Science
This paper proposes a new method for using word senses in information retrieval: a root sense tagging method that assigns coarse-grained word senses defined in WordNet to query terms and document terms in an unsupervised way, using automatically constructed co-occurrence information.