Top

Journal of Intelligent Information Systems

Published in:

04-06-2018

EVE: explainable vector based embedding technique using Wikipedia

Authors: M. Atif Qureshi, Derek Greene

Published in: Journal of Intelligent Information Systems | Issue 1/2019

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

We present an unsupervised explainable vector embedding technique, called EVE, which is built upon the structure of Wikipedia. The proposed model defines the dimensions of a semantic vector representing a concept using human-readable labels, thereby it is readily interpretable. Specifically, each vector is constructed using the Wikipedia category graph structure together with the Wikipedia article link structure. To test the effectiveness of the proposed model, we consider its usefulness in three fundamental tasks: 1) intruder detection—to evaluate its ability to identify a non-coherent vector from a list of coherent vectors, 2) ability to cluster—to evaluate its tendency to group related vectors together while keeping unrelated vectors in separate clusters, and 3) sorting relevant items first—to evaluate its ability to rank vectors (items) relevant to the query in the top order of the result. For each task, we also propose a strategy to generate a task-specific human-interpretable explanation from the model. These demonstrate the overall effectiveness of the explainable embeddings generated by EVE. Finally, we compare EVE with the Word2Vec, FastText, and GloVe embedding techniques across the three tasks, and report improvements over the state-of-the-art.

previous article Collaborative filtering recommendation based on trust and emotion

next article AWML: adaptive weighted margin learning for knowledge graph embedding

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

http://lsa.colorado.edu/

Besides the basic similarity with PageRank

With an intuition to penalize untrusted pages (or spam)

This can be an exact match or a partial best match using an information retrieval algorithm

This dimension is the most relevant dimension defining the concept which is the article itself.

In case of best match strategy, where more than one article is mapped to a concept i.e., A_concept1,A_concept2,... the score computed is further scaled by the relevance score of each article for the top-k articles, then reduced by the vector addition, and normalized again.

In case of the partial best match it is the relevance score returned by BM25 algorithm.

By simply, 1—normalized similarity score over each dimension

The full set of experimental visualizations is available at http://mlg.ucd.ie/eve/

Adler, P., Falk, C., Friedler, S.A, Rybeck, G., Scheidegger, C., Smith, B., Venkatasubramanian, S. (2016). Auditing black-box models for indirect influence. In 2016 IEEE 16th international conference on data mining (ICDM) (pp. 1–10). IEEE.

Agirre, E., & Soroa, A. (2009). Personalizing pagerank for word sense disambiguation. In Proceedings of the 12th conference of the European chapter of the association for computational linguistics, association for computational linguistics (pp. 33–41).

Arora, S., Li, Y., Liang, Y., Ma, T., Risteski, A. (2016). A latent variable model approach to pmi-based word embeddings. Transactions of the Association for Computational Linguistics, 4, 385–399.CrossRef

Baroni, M., & Lenci, A. (2010). Distributional memory: a general framework for corpus-based semantics. Computational Linguistics, 36 (4), 673–721.CrossRef

Baroni, M., Dinu, G., Kruszewski, G. (2014). Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In ACL (Vol. 1, pp. 238–247).

Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C. (2003). A neural probabilistic language model. JMLR, 3, 1137–1155.MATH

Bhargava, P., Phan, T., Zhou, J., Lee, J. (2015). Who, what, when, and where: multi-dimensional collaborative recommendations using tensor factorization on sparse user-generated data. In Proceedings of the 24th international conference on world wide web (pp. 130–140). ACM.

Bian, J., Gao, B., Liu, T.Y. (2014). Knowledge-powered deep learning for word embedding. In Joint European conference on machine learning and knowledge discovery in databases (pp. 132–148). Springer.

Bojanowski, P., Grave, E., Joulin, A., Mikolov, T. (2016). Enriching word vectors with subword information. arXiv preprint arXiv:160704606.

Bordes, A., Weston, J., Collobert, R., Bengio, Y. (2011). Learning structured embeddings of knowledge bases. In Conference on artificial intelligence, EPFL-CONF-192344.

Budanitsky, A., & Hirst, G. (2006). Evaluating wordnet-based measures of lexical semantic relatedness. Computational Linguistics, 32 (1), 13–47.CrossRefMATH

Caliński, T., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics-Theory and Methods, 3 (1), 1–27.MathSciNetCrossRefMATH

Collobert, R., & Weston, J. (2008). A unified architecture for natural language processing: deep neural networks with multitask learning. In Proceedings of the ICML’2008 (pp. 160–167). ACM.

Datta, A., Sen, S., Zick, Y. (2016). Algorithmic transparency via quantitative input influence: theory and experiments with learning systems. In 2016 IEEE symposium on security and privacy (SP) (pp. 598–617). IEEE.

Deerwester, S. (1988). Improving information retrieval with latent semantic indexing. In Proceedings of the 51st annual meeting of the American Society for information science (Vol. 25, pp. 36–40).

Diao, Q., Qiu, M., Wu, C.Y., Smola, A.J., Jiang, J., Wang, C. (2014). Jointly modeling aspects, ratings and sentiments for movie recommendation (jmars). In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 193–202). ACM.

Diaz, F., Mitra, B., Craswell, N. (2016). Query expansion with locally-trained word embeddings. In Association for computational linguistics (pp. 367–377).

Everitt, B., Landau, S., Leese, M. (2001). Cluster analysis. Wiley: Hodder Arnold Publication.MATH

Faruqui, M., Dodge, J., Jauhar, S.K, Dyer, C., Hovy, E., Smith, N.A. (2014). Retrofitting word vectors to semantic lexicons. arXiv preprint arXiv:14114166.

Firth, J. (1957). A synopsis of linguistic theory 1930–1955. In Studies in linguistic analysis (pp. 1–32).

Fu, X., Wang, T., Li, J., Yu, C., Liu, W. (2016). Improving distributed word representation and topic model by word-topic mixture model. In Proceedings of the 8th Asian conference on machine learning (pp. 190–205).

Gabrilovich, E., & Markovitch, S. (2007). Computing semantic relatedness using wikipedia-based explicit semantic analysis. In Proceedings of the IJCAI’07 (Vol. 7, pp. 1606–1611).

Gallant, S.I., Caid, W.R., Carleton, J., Hecht-Nielsen, R., Qing, K.P., Sudbeck, D. (1992). Hnc’s matchplus system. In ACM SIGIR Forum (Vol. 26, pp. 34–38). ACM.

Ganguly, D., Roy, D., Mitra, M., Jones, G.J. (2015). Word embedding based generalized language model for information retrieval. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval (pp. 795–798). ACM.

Ganitkevitch, J., Van Durme, B., Callison-Burch, C. (2013). Ppdb: the paraphrase database. In HLT-NAACL (pp. 758–764).

Globerson, A., Chechik, G., Pereira, F., Tishby, N. (2007). Euclidean embedding of co-occurrence data. JMLR, 8, 2265–2295.MathSciNetMATH

Goodman, B., & Flaxman, S. (2016). European union regulations on algorithmic decision-making and a “right to explanation”. arXiv preprint arXiv:160608813.

Gyöngyi, Z., Garcia-Molina, H., Pedersen, J. (2004). Combating web spam with trustrank. In Proceedings of the thirtieth international conference on very large data bases. VLDB Endowment (Vol. 30, pp. 576–587).

Harris, Z.S. (1954). Distributional structure. Word, 10 (2–3), 146–162.CrossRef

Harris, Z.S. (1968). Mathematical structures of language. New York: Wiley.MATH

Henelius, A., Puolamäki, K., Boström, H., Asker, L., Papapetrou, P. (2014). A peek into the black box: exploring classifiers by randomization. Data Mining and Knowledge Discovery, 28 (5–6), 1503.MathSciNetCrossRef

Hoffart, J., Seufert, S., Nguyen, D.B., Theobald, M., Weikum, G. (2012). Kore: keyphrase overlap relatedness for entity disambiguation. In Proceedings of the 21st ACM international conference on information and knowledge management (pp. 545–554).

Hunt, J., & Price, C. (1988). Explaining qualitative diagnosis. Engineering Applications of Artificial Intelligence, 1 (3), 161–169.CrossRef

Jarmasz, M. (2012). Roget’s thesaurus as a lexical resource for natural language processing. arXiv preprint arXiv:12040140.

Jiang, Y., Zhang, X., Tang, Y., Nie, R. (2015). Feature-based approaches to semantic similarity assessment of concepts using wikipedia. Info Processing & Management, 51 (3), 215–234.CrossRef

Kuzi, S., Shtok, A., Kurland, O. (2016). Query expansion using word embeddings. In Proceedings of the 25th ACM international on conference on information and knowledge management (pp. 1929–1932). ACM.

Landauer, T.K., Foltz, P.W, Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25 (2–3), 259–284.CrossRef

Levy, O., & Goldberg, Y. (2014). Neural word embedding as implicit matrix factorization. In Proceedings of the NIPS’2014 (pp. 2177–2185).

Levy, O., Goldberg, Y., Ramat-Gan, I. (2014). Linguistic regularities in sparse and explicit word representations. In CoNLL (pp. 171–180).

Levy, O., Goldberg, Y., Dagan, I. (2015). Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3, 211–225.CrossRef

Lipton, Z.C. (2016). The mythos of model interpretability. arXiv preprint arXiv:160603490.

Liu, Y., Liu, Z., Chua, T.S., Sun, M. (2015). Topical word embeddings. In AAAI (pp. 2418–2424).

Lopez-Suarez, A., & Kamel, M. (1994). Dykor: a method for generating the content of explanations in knowledge systems. Knowledge-Based Systems, 7 (3), 177–188.CrossRef

Manning, C.D., Raghavan, P., Schütze, H. (2008). Introduction to information retrieval. New York: Cambridge University Press.CrossRefMATH

Metzler, D., Dumais, S., Meek, C. (2007). Similarity measures for short segments of text. In European conference on information retrieval (pp. 16–27). Springer.

Mihalcea, R., & Tarau, P. (2004). Textrank: bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing.

Mikolov, T., Chen, K., Corrado, G., Dean, J. (2013a). Efficient estimation of word representations in vector space. arXiv preprint arXiv:13013781.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S, Dean, J. (2013b). Distributed representations of words and phrases and their compositionality. In Proceedings of the NIPS’2013 (pp. 3111–3119).

Nikfarjam, A., Sarker, A., O’Connor, K., Ginn, R., Gonzalez, G. (2015). Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features. Journal of the American Medical Informatics Association, 22, 671–681.

Niu, L., Dai, X., Zhang, J., Chen, J. (2015). Topic2vec: learning distributed representations of topics. In 2015 International conference on asian language processing (IALP) (pp. 193–196). IEEE.

Page, L., Brin, S., Motwani, R., Winograd, T. (1999). The pagerank citation ranking: bringing order to the web. Tech. rep., Stanford InfoLab.

Pennington, J., Socher, R., Manning, C.D. (2014). Glove: global vectors for word representation. In Empirical methods in natural language processing (EMNLP) (pp. 1532–1543).

Qureshi, M.A. (2015). Utilising wikipedia for text mining applications. PhD thesis, National University of Ireland Galway.

Ren, Z., Liang, S., Li, P., Wang, S., de Rijke, M. (2017). Social collaborative viewpoint regression with explainable recommendations. In Proceedings of the tenth ACM international conference on web search and data mining (pp. 485–494). ACM.

Ribeiro, M.T., Singh, S., Guestrin, C. (2016). Why should i trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135–1144). ACM.

Salton, G., & McGill, M.J. (1986). Introduction to modern information retrieval. New York: McGraw-Hill, Inc.MATH

Sari, Y., & Stevenson, M. (2016). Exploring word embeddings and character n-grams for author clustering. In Working notes. CEUR Workshop Proceedings, CLEF.

Schütze, H. (1992). Word space. In Proceedings of the NIPS’1992 (pp. 895–902).

Sherkat, E., & Milios, E.E. (2017). Vector embedding of wikipedia concepts and entities. In International conference on applications of natural language to information systems (pp. 418–428). Springer.

Socher, R., Chen, D., Manning, C.D, Ng, A. (2013). Reasoning with neural tensor networks for knowledge base completion. In Proceedings of the NIPS’2013 (pp. 926–934).

Strube, M., & Ponzetto, S.P. (2006). Wikirelate! Computing semantic relatedness using wikipedia. In Proceedings of the 21st national conference on artificial intelligence (pp. 1419–1424).

Tintarev, N., & Masthoff, J. (2015). Explaining recommendations: design and evaluation. In Recommender systems handbook (pp. 353–382). Springer.

van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-sne. JMLR, 9, 2579–2605.MATH

Wang, Z., Zhang, J., Feng, J., Chen, Z. (2014). Knowledge graph and text jointly embedding. In EMNLP, Citeseer (Vol. 14, pp. 1591–1601).

Wang, P., Xu, B., Xu, J., Tian, G., Liu, C.L, Hao, H. (2016). Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification. Neurocomputing, 174, 806–814.CrossRef

Wick, M.R, & Thompson, W.B. (1992). Reconstructive expert system explanation. Artificial Intelligence, 54 (1–2), 33–70.CrossRef

Witten, I., & Milne, D. (2008). An effective, low-cost measure of semantic relatedness obtained from wikipedia links. In AAAI workshop on wikipedia and artificial intelligence: an evolving synergy (pp. 25–30).

Wu, F., Song, J., Yang, Y., Li, X., Zhang, Z.M, Zhuang, Y. (2015). Structured embedding via pairwise relations and long-range interactions in knowledge base. In AAAI (pp. 1663–1670).

Xu, C., Bai, Y., Bian, J., Gao, B., Wang, G., Liu, X., Liu, T.Y. (2014). Rc-net: a general framework for incorporating knowledge into word representations. In Proceedings of the 23rd ACM international conference on conference on information and knowledge management (pp. 1219–1228).

Yeh, E., Ramage, D., Manning, C.D, Agirre, E., Soroa, A. (2009). Wikiwalk: random walks on wikipedia for semantic relatedness. In Proceedings of the 2009 workshop on graph-based methods for natural language processing (pp. 41–49).

Yu, M., & Dredze, M. (2014). Improving lexical embeddings with semantic knowledge. In ACL (Vol. 2, pp. 545–550).

Zesch, T., & Gurevych, I. (2007). Analysis of the wikipedia category graph for nlp applications. In Proceedings of the TextGraphs-2 Workshop (NAACL-HLT 2007) (pp. 1–8).

Zhang, Y., Lai, G., Zhang, M., Zhang, Y., Liu, Y., Ma, S. (2014). Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In Proceedings of the 37th international ACM SIGIR conference on research & development in information retrieval (pp. 83–92). ACM.

Zheng, G., & Callan, J. (2015). Learning to reweight terms with distributed representations. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval (pp. 575–584). ACM.

Zuccon, G., Koopman, B., Bruza, P., Azzopardi, L. (2015). Integrating and evaluating neural word embeddings in information retrieval. In Proceedings of the 20th Australasian document computing symposium (p. 12). ACM.

Title: EVE: explainable vector based embedding technique using Wikipedia
Authors: M. Atif Qureshi
Derek Greene
Publication date: 04-06-2018
Publisher: Springer US
Published in: Journal of Intelligent Information Systems / Issue 1/2019
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI: https://doi.org/10.1007/s10844-018-0511-x

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Other articles of this Issue 1/2019

Activity prediction in process mining using the WoMan framework

Efficient question classification and retrieval using category information and word embedding on cQA services

AWML: adaptive weighted margin learning for knowledge graph embedding

Collaborative filtering recommendation based on trust and emotion

Real-time recommendation with locality sensitive hashing

Clustering of semantically enriched short texts

Premium Partner