International Journal of Speech Technology

1 June 2014

A semantic parsing approach for Bhutanese language of Dzongkha

Author: P. V. Arun

Published in: International Journal of Speech Technology | Issue 2/2014

Abstract

Progress in the computational analysis of Dzongkha has been limited by the syntactic complexity of the language. Although natural language processing has developed rapidly over the past decade, very little work has been done on Dzongkha, despite its being the national language of Bhutan. In this paper, we investigate the major problems in Dzongkha processing and propose a semantic parsing approach for processing the language effectively. We use a probabilistic approach and apply the linguistic rules of Dzongkha to resolve ambiguities. Semantic representations, along with belief net concepts, are used to increase the accuracy of segmentation and of syntactic and semantic analysis. The proposed framework solves the major issues in Dzongkha processing but needs further improvement to cover all syntactic variations.

Title: A semantic parsing approach for Bhutanese language of Dzongkha
Author: P. V. Arun
Publication date: 1 June 2014
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 2/2014
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-013-9218-0
