Published in: Knowledge and Information Systems 3/2021

9 January 2021 | Regular Paper

Closed form word embedding alignment

Authors: Sunipa Dev, Safia Hassan, Jeff M. Phillips


Abstract

We develop a family of techniques to align word embeddings which are derived from different source datasets or created using different mechanisms (e.g., GloVe or word2vec). Our methods are simple and have a closed form: they optimally rotate, translate, and scale one embedding to minimize the root mean squared error, or to maximize the average cosine similarity, between two embeddings of the same vocabulary in a space of the same dimension. Our methods extend approaches known as absolute orientation, which are popular for aligning objects in three dimensions, and generalize an approach by Smith et al. (ICLR 2017). We prove new results for optimal scaling and for maximizing cosine similarity. Then, we demonstrate how to evaluate the similarity of embeddings from different sources or mechanisms, and that certain properties like synonyms and analogies are preserved across the embeddings and can be enhanced by simply aligning and averaging ensembles of embeddings.
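
As a rough illustration of the kind of closed-form alignment the abstract describes, the NumPy sketch below applies the classical absolute orientation (orthogonal Procrustes) solution with translation and a single scale factor. It is not the authors' implementation. The function name `align_embeddings` and the arguments `A` and `B` are hypothetical, and the sketch assumes that row i of both matrices is the vector of the same word and that both embeddings have the same dimension.

```python
import numpy as np

def align_embeddings(A, B):
    """Align the rows of A onto the rows of B (hypothetical helper).

    Minimizes sum_i || s * (a_i - mean(A)) @ R + mean(B) - b_i ||^2
    over orthogonal matrices R and scalars s.
    Returns (A_aligned, R, s).
    """
    # Translation: center both point sets at the origin.
    a_mean = A.mean(axis=0)
    b_mean = B.mean(axis=0)
    A_c = A - a_mean
    B_c = B - b_mean

    # Rotation: SVD of the d x d cross-covariance matrix.
    H = A_c.T @ B_c
    U, S, Vt = np.linalg.svd(H)
    R = U @ Vt  # orthogonal matrix maximizing trace(R.T @ H)

    # Scale: least-squares optimal scalar once R is fixed.
    s = S.sum() / (A_c ** 2).sum()

    A_aligned = s * (A_c @ R) + b_mean
    return A_aligned, R, s
```

This sketch allows any orthogonal matrix, including reflections. Restricting the result to proper rotations would need an extra sign correction on the last singular vector, and maximizing cosine similarity rather than squared error is a separate objective that the sketch does not cover.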

References
1. Arun KS, Huang TS, Blostein SD (1987) Least-squares fitting of two 3-d points sets. IEEE Trans Pattern Anal Mach Intell 9(5):698–700
2. Bay H, Tuytelaars T, Van Gool L (2006) SURF: speeded up robust features. In: ECCV
3. Besl PJ, McKay ND (1992) A method for registration of 3-d shapes. IEEE Trans Pattern Anal Mach Intell 14(2):239–256
4. Bojanowski P, Grave E, Joulin A, Mikolov T (2016) Enriching word vectors with subword information
5. Bollegala D, Hayashi K, Kawarabayashi KI (2017) Learning linear transformations between counting-based and prediction-based word embeddings. PLoS ONE 12:e0184544
6. Cai H, Zheng VW, Chang KC (2017) A comprehensive survey of graph embedding: problems, techniques and applications. Technical report, arXiv:1709.07604
7. Chen Y, Medioni G (1992) Object modelling by registration of multiple range images. Image Vis Comput 10:145–155
8. Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. Proc NAACL-HLT 2019:4171–4186
9. Dong Y, Chawla NV, Swami A (2017) metapath2vec: scalable representation learning for heterogeneous networks. In: KDD
10. Eggert DW, Lorusso A, Fisher RB (1997) Estimating 3-d rigid body transformations: a comparison of four major algorithms. Mach Vis Appl 9:272–290
11. Faugeras OD, Hebert M (1983) A 3-d recognition and positioning algorithm using geometric matching between primitive surfaces. Proc Int Jt Conf Artif Intell 8:996–1002
12. Finkelstein L, Gabrilovich E, Matias Y, Rivlin E, Solan Z, Wolfman G et al (2002) Placing search in context: the concept revisited. ACM Trans Inf Syst 20:116–131
13. Goyal P, Ferrara E (2017) Graph embedding techniques, applications, and performance: a survey. Technical report, arXiv:1705.02801
14. Grover A, Leskovec J (2016) node2vec: scalable feature learning for networks. In: KDD
15. Hanson RJ, Norris MJ (1981) Analysis of measurements based on the singular value decomposition. SIAM J Sci Stat Comput 27(3):363–373
16. Hermann KM, Blunsom P (2013) Multilingual distributed representations without word alignment. ArXiv e-prints
17. Hill F, Reichart R, Korhonen A (2015) Simlex-999: evaluating semantic models with (genuine) similarity estimation. Comput Linguist 41:665–695
18. Horn BKP (1987) Closed-form solution of absolute orientation using unit quaternions. J Opt Soc Am A 4:629–642
19. Joulin A, Grave E, Bojanowski P, Mikolov T (2016) Bag of tricks for efficient text classification
20. Levy O, Goldberg Y (2013) Neural word embedding as implicit matrix factorization. In: NIPS
21. Levy O, Goldberg Y (2014) Linguistic regularities of sparse and explicit word representations. In: CoNLL
22. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60:91–110
23. Luong MT, Pham H, Manning CD (2015) Bilingual word representations with monolingual quality in mind. In: NAACL-HLT, pp 151–159
24. Mahadevan S, Boucher T, Carey CJ, Dyar MD (2014) Aligning mixed manifolds. In: AAAI
25. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. Technical report, arXiv:1301.3781
26. Mikolov T, Le QV, Sutskever I (2013) Exploiting similarities among languages for machine translation. ArXiv e-prints
27. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: NIPS, pp 3111–3119
28. Miller G, Charles W (1998) Contextual correlates of semantic similarity. Lang Cogn Process 6:1–28
29. Pennington J, Socher R, Manning CD (2014) Glove: global vectors for word representation. In: EMNLP
30. Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: online learning of social representations. In: KDD
31. Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of NAACL
32. Rahimi A, Recht B (2007) Random features for large-scale kernel machines. In: NIPS
33. Rahimi A, Recht B (2008) Weighted sums of random kitchen sinks: replacing minimization with randomization in learning. In: NIPS
35. Rubenstein H, Goodenough JB (1965) Contextual correlates of synonymy. Commun ACM 8:627–633
36. Sahin CS, Caceres RS, Oselio B, Campbell WM (2017) Consistent alignment of word embedding models. ArXiv e-prints
37. Schönemann PH (1966) A generalized solution to the orthogonal procrustes problem. Psychometrika 31(1):1–10
38. Schwartz JT, Sharir M (1987) Identification of partially obscured objects in two and three dimensions by matching noisy characteristic curves. Int J Robot Res 6(2):29–44 (Summer 1987)
39. Shi Q, Petterson J, Dror G, Langford J, Smola A, Vishwanathan SVN (2009) Hash kernels for structured data. JMLR 10:2615–2637
40. Smith SL, Turban DH, Hamblin S, Hammerla NY (2017) Offline bilingual word vectors, orthogonal transformations and the inverted softmax. In: ICLR
41. Walker MW, Shao L, Volz RA (1991) Estimating 3-D location parameters using dual number quaternions. CVGIP: Image Underst 54(3):358–367
42. Weinberger K, Dasgupta A, Langford J, Smola A, Attenberg J (2009) Feature hashing for large scale multitask learning. In: ICML
Metadata
Title: Closed form word embedding alignment
Authors: Sunipa Dev, Safia Hassan, Jeff M. Phillips
Publication date: 9 January 2021
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 3/2021
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-020-01531-7
