ABSTRACT
The online advertising industry has a perennial need to refresh ad creatives, i.e., the images and text used to entice online users towards a brand. Such refreshes are required to reduce the likelihood of ad fatigue among online users, and to incorporate insights from other successful campaigns in related product categories. Given a brand, coming up with themes for a new ad is a painstaking and time-consuming process for creative strategists. Strategists typically draw inspiration from the images and text used in past ad campaigns, as well as from world knowledge about the brands. To automatically infer ad themes from such multimodal sources of information in past ad campaigns, we propose a theme (keyphrase) recommender system for ad creative strategists. The theme recommender aggregates results from a visual question answering (VQA) task, which ingests the following: (i) ad images, (ii) text associated with the ads as well as Wikipedia pages on the brands in the ads, and (iii) questions around the ad. We leverage transformer-based cross-modality encoders to train visual-linguistic representations for our VQA task. We study two formulations of the VQA task, classification and ranking; via experiments on a public dataset, we show that cross-modal representations lead to significantly better classification accuracy and ranking precision-recall metrics than separate image and text representations. In addition, the use of multimodal information shows a significant lift over using only textual or visual information.