research-article

Effective pseudo-relevance for Microblog retrieval

Authors:
Khaled Albishre, Queensland University of Technology (QUT), Brisbane, Australia and Umm Al-Qura University, Makkah, Saudi Arabia
Yuefeng Li, Queensland University of Technology (QUT), Brisbane, Australia
Yue Xu, Queensland University of Technology (QUT), Brisbane, Australia

ACSW '17: Proceedings of the Australasian Computer Science Week Multiconference, January 2017, Article No.: 51, Pages 1–6
https://doi.org/10.1145/3014812.3014865

Published: 31 January 2017

ABSTRACT

Microblog services such as Twitter have become part of daily life for many users, with thousands of documents published each second. Microblog documents are often very short, heavy in informal language, and hard to understand because they lack contextual clues. Retrieving relevant documents from microblogs is therefore challenging, both because of the nature of these documents and because of the massive scale of the data. Moreover, microblog retrieval models suffer from a vocabulary mismatch problem that degrades retrieval performance. In this paper, we address these limitations by proposing a pseudo-relevance feedback model. Our model uses discriminative expansion to meet user interests. Experimental results on the TREC 2011 and 2012 microblog datasets show that our model achieves significant improvements over the baseline models.
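
To illustrate the general idea of pseudo-relevance feedback mentioned in the abstract, the following is a minimal sketch in Python. It is not the authors' model: the abstract does not specify their scoring or their discriminative expansion step, so the term-overlap ranking, the TF-IDF-style term weight, the function names (`tokenize`, `expand_query`), the parameters (`top_k`, `n_terms`), and the toy documents are all assumptions chosen for illustration. The sketch only shows how terms from top-ranked documents can be added to a short query to reduce vocabulary mismatch.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real tweets need more care)."""
    return text.lower().split()


def expand_query(query, docs, top_k=2, n_terms=2):
    """Return the query terms plus n_terms expansion terms from the top_k documents."""
    q_terms = tokenize(query)
    tokenized = [tokenize(d) for d in docs]

    # First-pass retrieval: rank documents by how often the query terms occur.
    def overlap(tokens):
        counts = Counter(tokens)
        return sum(counts[t] for t in q_terms)

    ranked = sorted(tokenized, key=overlap, reverse=True)
    feedback = ranked[:top_k]  # treated as relevant without any user judgement

    # Document frequency over the whole collection, for an IDF-style weight.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    # Term frequency inside the pseudo-relevant (feedback) documents only.
    tf = Counter(t for tokens in feedback for t in tokens)

    weights = {
        t: tf[t] * math.log(1 + len(docs) / df[t])
        for t in tf
        if t not in q_terms
    }
    # Highest weight first; ties are broken alphabetically for determinism.
    best = sorted(weights, key=lambda t: (-weights[t], t))[:n_terms]
    return q_terms + best


# Hypothetical toy collection, not taken from the TREC microblog datasets.
docs = [
    "earthquake hits japan tsunami warning issued",
    "japan earthquake tsunami alert coast",
    "football match tonight in brisbane",
    "new phone released today",
]
expanded = expand_query("japan earthquake", docs)
print(expanded)
```

In this toy collection the expanded query picks up "tsunami", a term that co-occurs with the query terms in the top-ranked documents but was absent from the original query. A second retrieval pass with the expanded query could then match documents that mention the tsunami without using the words "japan" or "earthquake".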

Index Terms

1. Information systems
  1. Information retrieval

Published in

ACSW '17: Proceedings of the Australasian Computer Science Week Multiconference
January 2017
615 pages
ISBN: 9781450347686
DOI: 10.1145/3014812

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
language model
microblog search
query expansion
topic model
Acceptance Rates
ACSW '17 Paper Acceptance Rate: 78 of 156 submissions, 50%
Overall Acceptance Rate: 204 of 424 submissions, 48%
Article Metrics
- Total Citations: 13
- Total Downloads: 133
- Downloads (last 12 months): 1
- Downloads (last 6 weeks): 1