DOI: 10.1145/2348283.2348329
research-article

Multi-aspect query summarization by composite query

Published: 12 August 2012

ABSTRACT

Conventional search engines usually return a ranked list of web pages in response to a query. Users have to visit several pages to locate the relevant parts. A promising future search scenario should involve: (1) understanding user intents; (2) providing relevant information directly to satisfy searchers' needs, as opposed to relevant pages. In this paper, we present a search paradigm that summarizes a query's information from different aspects. Query aspects can be aligned with user intents. The generated summaries for query aspects are expected to be both specific and informative, so that users can easily and quickly find relevant information. Specifically, we use a "Composite Query for Summarization" method, in which a set of component queries is used to provide additional information for the original query. The system leverages the search engine to proactively gather information by submitting multiple component queries according to the original query and its aspects. In this way, we can get more relevant information for each query aspect and roughly classify that information. By comparatively mining the search results of different component queries, the system is able to identify query (dependent) aspect words, which help to generate more specific and informative summaries. The experimental results on two data sets, Wikipedia and TREC ClueWeb2009, are encouraging. Our method outperforms two baseline methods in generating informative summaries.
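The pipeline outlined in the abstract (build component queries, gather their results, compare the results to find aspect words, then pick summary sentences) can be illustrated with a minimal sketch. The abstract does not specify the scoring model or the search interface, so everything below is an assumption made for illustration: the `search` callable, all function names, the smoothed log-ratio score used as a stand-in for comparative mining, and the sentence-ranking rule. It is not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def component_queries(query, aspects):
    """Assumed scheme: one component query per aspect (original query + aspect label)."""
    return {aspect: f"{query} {aspect}" for aspect in aspects}


def gather(query, aspects, search):
    """Submit every component query to a caller-supplied (hypothetical) search function."""
    return {a: search(q) for a, q in component_queries(query, aspects).items()}


def aspect_words(results_by_aspect, top_k=5):
    """Simplified comparative mining: favor words that are frequent in one
    aspect's results and comparatively rare in the other aspects' results.
    The score p_in * log(p_in / p_out) with add-one smoothing is an
    illustrative choice, not the model used in the paper."""
    counts = {
        a: Counter(w for doc in docs for w in tokenize(doc))
        for a, docs in results_by_aspect.items()
    }
    selected = {}
    for aspect, inside in counts.items():
        outside = Counter()
        for other, c in counts.items():
            if other != aspect:
                outside.update(c)
        vocab = len(set(inside) | set(outside))
        n_in, n_out = sum(inside.values()), sum(outside.values())
        scores = {}
        for word, n in inside.items():
            p_in = (n + 1) / (n_in + vocab)
            p_out = (outside[word] + 1) / (n_out + vocab)
            scores[word] = p_in * math.log(p_in / p_out)
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        selected[aspect] = ranked[:top_k]
    return selected


def summarize(docs, words, max_sentences=1):
    """Return the sentences that contain the most distinct aspect words."""
    sentences = [
        s.strip()
        for doc in docs
        for s in re.split(r"(?<=[.!?])\s+", doc)
        if s.strip()
    ]
    wanted = set(words)
    ranked = sorted(sentences, key=lambda s: -len(wanted & set(tokenize(s))))
    return ranked[:max_sentences]
```

In use, `gather` would be given a function that wraps a real search backend and returns a list of result texts per query; the words returned by `aspect_words` for each aspect then drive `summarize` over that aspect's results. Words shared by all aspects, such as the original query terms, score low because they are equally likely inside and outside an aspect.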

Published in

SIGIR '12: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
August 2012
1236 pages
ISBN: 9781450314725
DOI: 10.1145/2348283

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
