Abstract
Advertising keyword recommendation is an indispensable component of online advertising: keywords selected from target Web pages are used for contextual advertising or sponsored search. Several ranking-based algorithms have been proposed for recommending advertising keywords. However, most of them still perform poorly, especially on short-text target Web pages, that is, pages containing too little textual information for ranking. In some cases, short-text Web pages may not even contain enough candidate keywords to select from. A natural alternative is to recommend relevant keywords that are not present in the target Web page. In this article, we propose a novel algorithm for advertising keyword recommendation for short-text Web pages that leverages the content of Wikipedia, a user-contributed online encyclopedia. Wikipedia contains numerous entities, and related entities on a topic are linked to each other. Given a target Web page, we propose to use a content-biased PageRank on the Wikipedia graph to rank the related entities. Furthermore, in order to recommend high-quality advertising keywords, we also add an advertisement-biased factor to our model. With these two biases, the recommended keywords are both relevant to the target Web page and valuable for advertising. In our experiments, we compare against several state-of-the-art approaches for keyword recommendation. The experimental results demonstrate that our proposed approach substantially improves the precision of the top 20 recommended keywords on short-text Web pages over existing approaches.
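The abstract does not give the model's equations, so the following is only a minimal sketch of how a PageRank with two biases could be set up, not the authors' formulation. It assumes the two biases are normalized and combined linearly into a single teleport distribution; the function name `biased_pagerank`, the `mix` weight, the toy link graph, and all scores are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List


def biased_pagerank(
    links: Dict[str, List[str]],
    content_bias: Dict[str, float],
    ad_bias: Dict[str, float],
    damping: float = 0.85,
    mix: float = 0.5,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power-iteration PageRank whose teleport vector blends two biases.

    Assumption (not from the paper): the teleport distribution is
    mix * content_bias + (1 - mix) * ad_bias, each normalized to sum to 1.
    """
    nodes = sorted(links)

    def normalize(weights: Dict[str, float]) -> Dict[str, float]:
        total = sum(weights.get(n, 0.0) for n in nodes)
        if total == 0.0:
            # Fall back to a uniform distribution when no bias is given.
            return {n: 1.0 / len(nodes) for n in nodes}
        return {n: weights.get(n, 0.0) / total for n in nodes}

    content = normalize(content_bias)
    ads = normalize(ad_bias)
    teleport = {n: mix * content[n] + (1.0 - mix) * ads[n] for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        # Every node first receives its share of the teleport mass.
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = [t for t in links[src] if t in links]
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: spread its mass along the teleport vector.
                for n in nodes:
                    nxt[n] += damping * rank[src] * teleport[n]
        rank = nxt
    return rank


# Hypothetical toy graph of Wikipedia-style entities and their outlinks.
links = {
    "Digital camera": ["Camera lens", "Photography"],
    "Camera lens": ["Digital camera", "Photography"],
    "Photography": ["Digital camera"],
    "History of art": ["Photography"],
}
# Hypothetical relevance of each entity to a short target page.
content_bias = {"Digital camera": 1.0, "Photography": 0.5}
# Hypothetical advertising value of each entity.
ad_bias = {"Digital camera": 0.8, "Camera lens": 0.9, "History of art": 0.05}

ranks = biased_pagerank(links, content_bias, ad_bias)
for entity, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: {score:.3f}")
```

In this sketch, an entity that is absent from the target page can still rank well if it is linked from relevant entities and carries advertising value, which is the intuition the abstract describes. The linear mix is one simple design choice; other ways of combining the two biases are equally plausible.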
Index Terms
- Advertising Keywords Recommendation for Short-Text Web Pages Using Wikipedia