Published in: International Journal of Speech Technology 2/2014

01-06-2014

A semantic parsing approach for Bhutanese language of Dzongkha

Author: P. V. Arun

Abstract

Progress in the computational analysis of Dzongkha has been limited by the syntactic complexity of the language. Although natural language processing has developed rapidly over the past decade, very little work has been done on Dzongkha, despite its being the national language of Bhutan. In this paper, we investigate the major problems in Dzongkha processing and propose a semantic parsing approach for processing the language effectively. We use a probabilistic approach and apply the linguistic rules of Dzongkha to resolve ambiguities. Semantic representations, together with belief net concepts, are used to increase the accuracy of segmentation and of syntactic and semantic analysis. The proposed framework solves the major issues related to Dzongkha processing, but it needs further improvement to cover all syntactic variations.
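
The abstract does not give the details of the probabilistic model, so the following is only a minimal sketch of what probabilistic segmentation can look like in general, not the paper's method. It assumes the input is already split into syllables, uses a unigram word model, and picks the most probable segmentation by dynamic programming. The lexicon entries, the probabilities, the unknown-syllable penalty, and the names `LEXICON`, `UNKNOWN_LOGPROB`, `MAX_WORD_LEN`, and `segment` are all hypothetical placeholders.

```python
import math

# Hypothetical toy lexicon: word (tuple of placeholder syllables) -> probability.
# The entries and numbers are invented for illustration; they are not from the paper.
LEXICON = {
    ("ka",): 0.10,
    ("kha",): 0.05,
    ("ka", "kha"): 0.20,
    ("ga",): 0.15,
    ("kha", "ga"): 0.02,
}
UNKNOWN_LOGPROB = math.log(1e-6)  # assumed penalty for one out-of-lexicon syllable
MAX_WORD_LEN = 3                  # assumed cap on syllables per word


def segment(syllables):
    """Return the most probable split of `syllables` into words (unigram model)."""
    n = len(syllables)
    best = [float("-inf")] * (n + 1)  # best[i]: best log-probability of syllables[:i]
    back = [0] * (n + 1)              # back[i]: start index of the last word ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = tuple(syllables[start:end])
            if word in LEXICON:
                score = math.log(LEXICON[word])
            elif len(word) == 1:
                score = UNKNOWN_LOGPROB  # always allow a single unknown syllable
            else:
                continue  # unknown multi-syllable candidates are not considered
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start
    # Follow the back-pointers from the end to recover the word sequence.
    words, i = [], n
    while i > 0:
        words.append(tuple(syllables[back[i]:i]))
        i = back[i]
    return words[::-1]


print(segment(["ka", "kha", "ga"]))
```

With this toy lexicon, the ambiguity between grouping the middle syllable with its left or right neighbour is resolved in favour of the higher-probability reading. Language-specific rules, as mentioned in the abstract, could be layered on top by filtering candidate words before they are scored.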

Metadata
Title: A semantic parsing approach for Bhutanese language of Dzongkha
Author: P. V. Arun
Publication date: 01-06-2014
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 2/2014
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-013-9218-0