DOI: 10.1145/1367497.1367554
research-article

Contextual advertising by combining relevance with click feedback

Published: 21 April 2008

ABSTRACT

Contextual advertising supports much of the Web's ecosystem today. User experience and revenue (shared by the site publisher and the ad network) depend on the relevance of the displayed ads to the page content. As with other document retrieval systems, relevance is determined by scoring the match between individual ads (documents) and the content of the page where the ads are shown (query). In this paper we show how this match can be improved significantly by augmenting the ad-page scoring function with extra parameters from a logistic regression model on the words in the pages and ads. A key property of the proposed model is that it can be mapped to standard cosine similarity matching, which makes it suitable for efficient and scalable implementation over inverted indexes. The model parameter values are learnt from logs containing ad impressions and clicks, with shrinkage estimators used to combat sparsity. To train on an extremely large corpus consisting of several gigabytes of data, we parallelize our fitting algorithm in a Hadoop framework [10]. Experimental evaluation shows improved click prediction over a holdout set of impression and click events from a large-scale real-world ad placement engine. At a recall of 10% of the clicks in our test data, our best model achieves a 25% lift in precision relative to a traditional information retrieval model based on cosine similarity.
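To make the mapping to cosine-style matching concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the model from the paper: it assumes page and ad vectors are already length-normalized tf-idf dictionaries (so a plain dot product plays the role of cosine similarity), and it assumes the logistic model contributes one multiplicative weight per word that the page and ad share. The names `click_logit`, `fold_weights`, `word_weight`, and `bias` are hypothetical, as are all the numbers.

```python
import math


def dot(u, v):
    """Sparse dot product over the words two vectors share."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v[t] for t, w in u.items() if t in v)


def click_logit(page, ad, word_weight, bias):
    """Assumed model: bias + sum over shared words of weight * page * ad.

    Words without a learned weight default to 1.0, so with no click
    feedback at all the score falls back to bias + plain relevance.
    """
    return bias + sum(
        word_weight.get(t, 1.0) * w * ad[t] for t, w in page.items() if t in ad
    )


def click_probability(page, ad, word_weight, bias):
    """Logistic link applied to the assumed logit."""
    return 1.0 / (1.0 + math.exp(-click_logit(page, ad, word_weight, bias)))


def fold_weights(ad, word_weight):
    """Rescale the ad vector once, offline, so that query-time scoring
    is an ordinary dot product that an inverted index can evaluate."""
    return {t: word_weight.get(t, 1.0) * w for t, w in ad.items()}


# Hypothetical, already-normalized vectors and learned weights.
page = {"camera": 0.8, "lens": 0.6}
ads = {
    "ad_camera_sale": {"camera": 0.6, "sale": 0.8},
    "ad_lens": {"lens": 1.0},
}
word_weight = {"camera": 2.0, "lens": 0.5}
bias = -3.0

for name, ad in ads.items():
    indexed = fold_weights(ad, word_weight)
    print(name, dot(page, indexed), click_probability(page, ad, word_weight, bias))
```

Because the logistic function is monotone and the bias is shared by all ads, ranking ads by the dot product against the rescaled vectors gives the same order as ranking by the predicted click probability. That is the property that lets a learned model of this shape reuse existing retrieval infrastructure.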

References

  1. R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. ACM, 1999.
  2. A. Z. Broder, D. Carmel, M. Herscovici, A. Soffer, and J. Zien. Efficient query evaluation using a two-level retrieval process. In CIKM '03: Proc. of the twelfth intl. conf. on Information and knowledge management, pages 426--434, New York, NY, 2003. ACM.
  3. A. Z. Broder, M. Fontoura, V. Josifovski, and L. Riedel. A semantic approach to contextual advertising. In SIGIR, pages 559--566, 2007.
  4. P. Chatterjee, D. L. Hoffman, and T. P. Novak. Modeling the clickstream: Implications for web-based advertising efforts. Marketing Science, 22(4):520--541, 2003.
  5. C. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton methods for large-scale logistic regression. In International Conference on Machine Learning, 2007.
  6. C. R. Rao. Linear Statistical Inference and its Applications. Wiley-Interscience, 2002.
  7. D. C. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45:503--528, 1989.
  8. S. Derksen and H. J. Keselman. Backward, forward and stepwise automated subset selection algorithms: Frequency of obtaining authentic and noise variables. British Journal of Mathematical and Statistical Psychology, 45:265--282, 1992.
  9. Online ad spending to total $19.5 billion in 2007. eMarketer, February 2007. Available from http://www.emarketer.com/Article.aspx?id=1004635.
  10. A. Foundation. Apache Hadoop project. In lucene.apache.org/hadoop.
  11. G. King and L. Zeng. Logistic regression in rare events data. Political Analysis, 9:137--162, 2001.
  12. J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. In Sixth Symposium on Operating System Design and Implementation, pages 137--150, 2004.
  13. A. Lacerda, M. Cristo, M. A. G., W. Fan, N. Ziviani, and B. Ribeiro-Neto. Learning to advertise. In SIGIR '06: Proc. of the 29th annual intl. ACM SIGIR conf., pages 549--556, New York, NY, 2006. ACM.
  14. P. McCullagh and J. A. Nelder. Generalized Linear Models. Chapman and Hall, 1989.
  15. M. J. Silvapulle. On the existence of maximum likelihood estimates for the binomial response models. Journal of the Royal Statistical Society, Series B, 43:310--313, 1981.
  16. P. Komarek and A. W. Moore. Making logistic regression a core data mining tool with TR-IRLS. In International Conference on Data Mining, pages 685--688, 2005.
  17. M. Regelson and D. Fain. Predicting click-through rate using keyword clusters. In Proc. of the Second Workshop on Sponsored Search Auctions, 2006.
  18. B. Ribeiro-Neto, M. Cristo, P. B. Golgher, and E. S. de Moura. Impedance coupling in content-targeted advertising. In SIGIR '05: Proc. of the 28th annual intl. ACM SIGIR conf., pages 496--503, New York, NY, 2005. ACM.
  19. M. Richardson, E. Dominowska, and R. Ragno. Predicting clicks: estimating the click-through rate for new ads. In WWW, pages 521--530, 2007.
  20. S. D. Pietra, V. D. Pietra, and J. Lafferty. Inducing features of random fields. IEEE PAMI, 19:380--393, 1997.
  21. C. Wang, P. Zhang, R. Choi, and M. D. Eredita. Understanding consumers attitude toward advertising. In Eighth Americas conf. on Information System, pages 1143--1148, 2002.
  22. W. Yih, J. Goodman, and V. R. Carvalho. Finding advertising keywords on web pages. In WWW '06: Proc. of the 15th intl. conf. on World Wide Web, pages 213--222, New York, NY, 2006. ACM.

Published in

WWW '08: Proceedings of the 17th international conference on World Wide Web
April 2008
1326 pages
ISBN: 9781605580852
DOI: 10.1145/1367497

      Copyright © 2008 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 1,899 of 8,196 submissions, 23%
