Published in: Neural Processing Letters 1/2016

1 February 2016

Locality Alignment Discriminant Analysis for Visualizing Regional English

Authors: Peng Tang, Mingbo Zhao, Tommy W. S. Chow


Abstract

This paper proposes a novel dimensionality reduction algorithm, locality alignment discriminant analysis (LADA), for visualizing regional English. In LADA, the proposed intrinsic and penalty graphs measure the similarity between each pair of textual slices, which better characterizes intra-class compactness and inter-class separability. The projection matrix obtained by the method is orthogonal, which eliminates redundancy between projection directions and is more effective for preserving the intrinsic geometry and improving discriminating ability. To evaluate the algorithm, a corpus of regional written English was designed and collected. Articles are split into slices, and each slice is transformed into a 140-dimensional data point using 140 text style markers. LADA is then applied to recognize the variations that exist in regional written English, and the similarity among different types of English can be observed in the resulting data plots. Visualization results and numerical comparisons indicate that LADA outperforms other existing algorithms on regional English data because it better preserves the local discriminative information embedded in the data, which makes it suitable for pattern classification.
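
The abstract does not give LADA's objective function, so the following is only a rough sketch of the general idea it describes: build an intrinsic (same-class) graph and a penalty (different-class) graph over the data points, then find an orthogonal projection that spreads the penalty graph while keeping the intrinsic graph compact. It uses a generic graph-embedding criterion solved by a symmetric eigendecomposition, not the authors' method. The function names, the neighbourhood size `k`, the trade-off weight `mu`, and the synthetic 140-dimensional data are all assumptions made for illustration.

```python
import numpy as np

def knn_graph(X, y, k, same_class):
    """Symmetric 0/1 adjacency: link each point to its k nearest
    neighbours taken from its own class (or from the other classes)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.zeros((n, n))
    for i in range(n):
        mask = (y == y[i]) if same_class else (y != y[i])
        mask[i] = False  # a point is never its own neighbour
        cand = np.flatnonzero(mask)
        nearest = cand[np.argsort(d2[i, cand])[:k]]
        A[i, nearest] = 1.0
    return np.maximum(A, A.T)

def laplacian(A):
    """Unnormalized graph Laplacian D - A."""
    return np.diag(A.sum(axis=1)) - A

def orthogonal_projection(X, y, dim=2, k=5, mu=1.0):
    """Return a (features x dim) matrix with orthonormal columns that
    maximizes tr(W^T (S_p - mu * S_i) W), a stand-in criterion."""
    Xc = X - X.mean(axis=0)
    S_i = Xc.T @ laplacian(knn_graph(Xc, y, k, True)) @ Xc   # intrinsic scatter
    S_p = Xc.T @ laplacian(knn_graph(Xc, y, k, False)) @ Xc  # penalty scatter
    M = S_p - mu * S_i
    M = (M + M.T) / 2  # enforce exact symmetry before eigh
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return vecs[:, -dim:][:, ::-1]  # eigenvectors of the largest eigenvalues

# Synthetic stand-in for text slices described by 140 style markers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 1.0, size=(30, 140)) for c in (0.0, 0.5, 1.0)])
y = np.repeat([0, 1, 2], 30)

W = orthogonal_projection(X, y)
Z = (X - X.mean(axis=0)) @ W  # 2-D coordinates that could be plotted
```

Because the columns of `W` are eigenvectors of a symmetric matrix, they are orthonormal by construction, which mirrors the orthogonality property the abstract highlights. The rows of `Z` could be passed to any scatter-plot routine, coloured by class label.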


Metadata
Title
Locality Alignment Discriminant Analysis for Visualizing Regional English
Authors
Peng Tang
Mingbo Zhao
Tommy W. S. Chow
Publication date
1 February 2016
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2016
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-015-9422-9
