2015 | OriginalPaper | Book chapter

Web Search Results Clustering Using Frequent Termset Mining

Written by: Marek Kozlowski

Published in: Pattern Recognition and Machine Intelligence

Publisher: Springer International Publishing

Abstract

We present a novel method for clustering web search results based on frequent termset mining. First, we acquire the senses of a query by means of a word sense induction method that identifies meanings as trees of closed frequent termsets. Then we cluster the search results based on their lexical and semantic intersection with the induced senses. We show that our approach is better than or comparable to state-of-the-art classical search result clustering methods in terms of both clustering quality and degree of diversification.
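The two steps named in the abstract can be illustrated with a small sketch. It is not the chapter's algorithm: it mines closed frequent termsets by brute force (rather than with an efficient miner such as CHARM), treats each closed termset of two or more terms as a flat "sense" instead of building trees, and uses plain term overlap in place of the lexical and semantic intersection. The snippets, the function names `closed_frequent_termsets` and `assign_to_senses`, and the parameters `min_support` and `max_size` are all assumptions made for illustration.

```python
from itertools import combinations


def closed_frequent_termsets(docs, min_support=2, max_size=3):
    """Brute-force closed frequent termset mining (illustrative only).

    Closedness is checked only among termsets of at most max_size terms.
    """
    term_sets = [frozenset(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*term_sets))
    support = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            candidate = frozenset(combo)
            count = sum(1 for terms in term_sets if candidate <= terms)
            if count >= min_support:
                support[candidate] = count
    # A termset is closed if no proper superset has the same support.
    return {
        ts: s
        for ts, s in support.items()
        if not any(ts < other and s == support[other] for other in support)
    }


def assign_to_senses(docs, senses):
    """Label each doc with the index of the sense it overlaps most (-1 if none)."""
    labels = []
    for d in docs:
        terms = set(d.lower().split())
        best, best_overlap = -1, 0
        for i, sense in enumerate(senses):
            overlap = len(terms & sense)
            if overlap > best_overlap:
                best, best_overlap = i, overlap
        labels.append(best)
    return labels


# Hypothetical result snippets for the ambiguous query "jaguar".
snippets = [
    "jaguar car dealer",
    "jaguar car price",
    "jaguar cat habitat",
    "jaguar cat jungle",
]

closed = closed_frequent_termsets(snippets, min_support=2)
# Assumption: termsets with at least two terms stand in for induced senses.
senses = sorted((ts for ts in closed if len(ts) >= 2), key=lambda ts: tuple(sorted(ts)))
labels = assign_to_senses(snippets, senses)
print(labels)  # [0, 0, 1, 1]
```

In this toy run the termsets {car, jaguar} and {cat, jaguar} act as the two senses, and each snippet is placed in the cluster of the sense it shares the most terms with.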

http://credo.fub.it/ambient/.

http://lcl.uniroma1.it/moresque/.

Title: Web Search Results Clustering Using Frequent Termset Mining
Written by: Marek Kozlowski
Publisher: Springer International Publishing
Book: Pattern Recognition and Machine Intelligence
Print ISBN: 978-3-319-19940-5

Electronic ISBN: 978-3-319-19941-2

Copyright year: 2015
DOI: https://doi.org/10.1007/978-3-319-19941-2_50
