International Journal of Data Science and Analytics

20 March 2020 | Applications

Classifying sensitive content in online advertisements with deep learning

Published in: International Journal of Data Science and Analytics | Issue 3/2020

Abstract

In online advertising, an important quality control step is to audit advertising images (“creatives”) before they appear on publishers’ Web pages. This ensures that advertisements appear only on Web pages where they are appropriate. If a creative with sensitive content such as gambling or pornography is displayed on the wrong Web page, it can ruin the user’s experience, damage the publisher’s reputation, and have legal implications. To protect against this, humans must audit every creative before it is displayed through our ad exchange; this process is costly and time-consuming. To detect sensitive content, we use a pre-trained deep convolutional neural network (Xception; Chollet, in: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017) to process the creative image, and merge its features with the historical distribution of categories associated with the creative’s landing page (the Web page that loads when the ad is clicked, which may also contain sensitive content). This representation is then passed through a series of fully connected layers to predict the sensitive category. The trained model achieves slightly better than human performance (model accuracy 99.92%; human accuracy 99.88%) on a large fraction of creatives (61%), while making 3.5 times fewer mistakes in very sensitive categories. The main challenges we faced were detecting, with high accuracy, creatives from 10 “very sensitive” categories as determined by our Creative Audit team, and handling a highly imbalanced data set in which 95% of creatives have no sensitive category. This paper extends the work we described in Austin et al. (in: Proceedings of the 2018 IEEE International Conference on Data Science and Advanced Analytics (DSAA), DSAA’18, 2018). It demonstrates the successful use of deep learning in production as a method for detecting sensitive creatives, while respecting the constraints set by the business.
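The fusion step described in the abstract (image features merged with the landing page's historical category distribution, then passed through fully connected layers) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python stand-in with untrained random weights, so it runs without a deep learning framework. The function names, the toy layer sizes, the ReLU and softmax choices, and the uniform fallback for landing pages without history are all assumptions made for illustration; in the paper the image features come from a pre-trained Xception network.

```python
import math
import random


def normalize_counts(counts):
    """Turn historical category counts for a landing page into a distribution."""
    total = sum(counts)
    if total == 0:
        # No audit history: fall back to a uniform distribution (an assumption).
        return [1.0 / len(counts)] * len(counts)
    return [c / total for c in counts]


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    outputs = [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]
    if activation == "relu":
        outputs = [max(0.0, v) for v in outputs]
    return outputs


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_in, n_out):
    """Untrained random weights and zero biases, for illustration only."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def predict(image_features, landing_page_counts, layers):
    """Concatenate both inputs, then apply the dense layers and a softmax."""
    fused = list(image_features) + normalize_counts(landing_page_counts)
    hidden = fused
    for weights, biases in layers[:-1]:
        hidden = dense(hidden, weights, biases, activation="relu")
    weights, biases = layers[-1]
    return softmax(dense(hidden, weights, biases))


rng = random.Random(0)
N_IMAGE, N_CATEGORIES, N_HIDDEN = 8, 4, 6  # toy sizes, hypothetical
layers = [
    random_layer(rng, N_IMAGE + N_CATEGORIES, N_HIDDEN),
    random_layer(rng, N_HIDDEN, N_CATEGORIES),
]
image_features = [rng.random() for _ in range(N_IMAGE)]  # stand-in for CNN features
landing_page_counts = [0, 12, 3, 0]                      # stand-in audit history
probabilities = predict(image_features, landing_page_counts, layers)
```

Here `probabilities` holds one score per category and sums to 1. A single-label softmax output is assumed because the abstract speaks of predicting "the sensitive category"; a multi-label variant would replace the softmax with independent sigmoids.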

References

1. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI’16), pp. 265–284 (2016)

2. adperium: Appnexus standards (2017). https://www.adperium.com/misc/AppNexus_standards.pdf

3. Andrews, M.: File name hashing: creating a hashed directory structure (2017). https://medium.com/eonian-technologies/file-name-hashing-creating-a-hashed-directory-structure-eabb03aa4091

4. Caruana, R., Lawrence, S., Giles, C.L.: Overfitting in neural nets: backpropagation, conjugate gradient, and early stopping. In: Proceedings of the 13th International Conference on Neural Information Processing Systems, pp. 381–387 (2000)

5. Chen, J., Sun, B., Li, H., Lu, H., Hua, X.S.: Deep ctr prediction in display advertising. In: Proceedings of the 2016 ACM on Multimedia Conference, MM ’16, pp. 811–820. ACM, New York, NY, USA (2016). https://doi.org/10.1145/2964284.2964325. http://doi.acm.org/10.1145/2964284.2964325

6. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)

7. Chollet, F., et al.: Keras (2015). https://keras.io

8. Clarifai: Nsfw (2016). https://www.clarifai.com/models/nsfw-image-recognition-model-e9576d86d2004ed1a38ba0cf39ecb4b1

9. Connie, T., Al-Shabi, M., Goh, M.: Smart content recognition from images using a mixture of convolutional neural networks. In: Kim, K.J., Kim, H., Baek, N. (eds.) IT Convergence and Security 2017, pp. 11–18. Springer, Singapore (2018)

10. Corp., I.: Zeromq whitepapers- multithreading magic (2017). http://zeromq.org/whitepapers:multithreading-magic

11. Facebook: advertising policies (2017). https://www.facebook.com/policies/ads/

12. Ge, T., Zhao, L., Zhou, G., Chen, K., Liu, S., Yi, H., Hu, Z., Liu, B., Sun, P., Liu, H., Yi, P., Huang, S., Zhang, Z., Zhu, X., Zhang, Y., Gai, K.: Image matters: visually modeling user behaviors using advanced model server. arXiv preprint arXiv:1711.06505v2 (2017)

13. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, vol. 1. MIT Press, Cambridge (2016)

14. Google: Adwords policies (2017). https://support.google.com/adwordspolicy/answer/6008942?hl=en

15. Group, T.H.: The hdf5 library and file format (2018). https://www.hdfgroup.org/solutions/hdf5/

16. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. CoRR arXiv:1512.03385 (2015)

17. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning, Volume 37, pp. 448–456 (2015)

18. Ling, X., Deng, W., Gu, C., Zhou, H., Li, C., Sun, F.: Model ensemble for click prediction in bing search ads. In: Proceedings of the 26th International Conference on World Wide Web Companion, pp. 689–698. International World Wide Web Conferences Steering Committee (2017)

19. Mo, K., Liu, B., Xiao, L., Li, Y., Jiang, J.: Image feature learning for cold start problem in display advertising. In: Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI’15, pp. 3728–3734. AAAI Press (2015). http://dl.acm.org/citation.cfm?id=2832747.2832769

20. Moustafa, M.: Applying deep learning to classify pornographic images and videos. arXiv preprint arXiv:1511.08899 (2015)

21. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814 (2010)

22. Ng, A.: Machine learning yearning (2018). https://www.deeplearning.ai/content/uploads/2018/09/Ng-MLY01-12.pdf

23. van Rossum, G., Drake, F.L.: The Python Language Reference Manual. Network Theory Ltd., New York (2011)

24. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. (IJCV) 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y

25. Sculley, D., Otey, M.E., Pohl, M., Spitznagel, B., Hainsworth, J., Zhou, Y.: Detecting adversarial advertisements in the wild. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’11, pp. 274–282. ACM, New York, NY, USA (2011). https://doi.org/10.1145/2020408.2020455. http://doi.acm.org/10.1145/2020408.2020455

26. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556v6 (2015)

27. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

28. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567v3 (2015)

29. Wang, R., Fu, B., Fu, G., Wang, M.: Deep & cross network for ad click predictions. arXiv preprint arXiv:1708.05123 (2017)

30. Wang, S., Cao, L.: Inferring implicit rules by learning explicit and hidden item dependency. IEEE Trans. Syst. Man Cybern. Syst. (2017). https://doi.org/10.1109/TSMC.2017.2768547

31. Wang, S., Liu, W., Wu, J., Cao, L., Meng, Q., Kennedy, P.J.: Training deep neural networks on imbalanced data sets. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 4368–4374 (2016). https://doi.org/10.1109/IJCNN.2016.7727770

32. Wehrmann, J., Simões, G.S., Barros, R.C., Cavalcante, V.F.: Adult content detection in videos with convolutional and recurrent neural networks. Neurocomputing 272, 432–438 (2018). https://doi.org/10.1016/j.neucom.2017.07.012

33. Woodard, R.: A keras multithreaded dataframe generator for millions of image files (2017). https://techblog.appnexus.com/a-keras-multithreaded-dataframe-generator-for-millions-of-image-files-84d3027f6f43

34. Yahoo: Open nsfw model (2016). https://github.com/yahoo/open_nsfw

35. Zhou, G., Song, C., Zhu, X., Ma, X., Yan, Y., Dai, X., Zhu, H., Jin, J., Li, H., Gai, K.: Deep interest network for click-through rate prediction. arXiv preprint arXiv:1706.06978 (2017)

36. Zhou, K., Zhuo, L., Geng, Z., Zhang, J., Li, X.G.: Convolutional neural networks based pornographic image classification. In: 2016 IEEE Second International Conference on Multimedia Big Data (BigMM), pp. 206–209. IEEE (2016)

Title: Classifying sensitive content in online advertisements with deep learning
Publication date: 20 March 2020
Published in: International Journal of Data Science and Analytics / Issue 3/2020
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI: https://doi.org/10.1007/s41060-020-00212-6
