Notes
The term “translation spotting” refers to the task of identifying the target-language words that correspond to a given set of source-language words in a pair of text segments known to be mutual translations (Simard 2003).
Definition of lexical bundle from Wikipedia: “a sequence of two or more words that occur in language with high frequency but are not idiomatic; a bundle, chunk, or cluster”.
References
Alegria I, Ansa O, Artola X, Ezeiza N, Gojenola K, Urizar R (2004) Representation and treatment of multiword expressions in Basque. In: Proceedings of the workshop on multiword expressions: integrating processing. Barcelona, pp 48–55
Baldwin T, Kim SN (2010) Multiword expressions. In: Indurkhya N, Damerau FJ (eds) Handbook of natural language processing, 2nd edn. CRC Press, Boca Raton, pp 267–292
Biber D, Johansson S, Leech G, Conrad S, Finegan E (1999) Longman grammar of spoken and written English. Longman, London
Dunning T (1993) Accurate methods for the statistics of surprise and coincidence. Comput Linguist 19(1):61–74
Firth JR (1957) Papers in linguistics 1934–1951. Oxford University Press, Oxford
Jackendoff R (1997) The architecture of the language faculty. MIT Press, Cambridge
Koehn P, Hoang H (2007) Factored translation models. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL). Prague, pp 868–876
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: ACL 2007, proceedings of the interactive poster and demonstration sessions. Prague, pp 177–180
Mitkov R (2002) Anaphora resolution. Longman, London
Ramisch C (2012) A generic framework for multiword expressions treatment: from acquisition to applications. In: Proceedings of ACL 2012 student research workshop. Jeju Island, pp 61–66
Ramisch C (2015) Multiword expressions acquisition: a generic and open framework. Springer, New York
Ramisch C, Villavicencio A, Kordoni V (2013) Introduction to the special issue on multiword expressions: from theory to practice and use. ACM Trans Speech Lang Process 10(2):3
Sag IA, Baldwin T, Bond F, Copestake A, Flickinger D (2002) Multiword expressions: a pain in the neck for NLP. In: Gelbukh A (ed) International conference on intelligent text processing and computational linguistics. Lecture notes in computer science. Springer, Berlin, pp 1–15
Samaridi N, Markantonatou S (2014) Parsing Modern Greek verb MWEs with LFG/XLE grammars. In: Proceedings of the 10th workshop on multiword expressions (MWE). Gothenburg, pp 33–37
Schmid H (1994) Probabilistic part-of-speech tagging using decision trees. In: Proceedings of international conference on new methods in language processing (NEMLAP), vol 12. Manchester, pp 44–49
Simard M (2003) Translation spotting for translation memories. In: Proceedings of the HLT-NAACL 2003 workshop on building and using parallel texts: data driven machine translation and beyond. Edmonton, pp 65–72
Varga D, Halácsy P, Kornai A, Nagy V, Németh L, Trón V (2007) Parallel corpora for medium density languages. In: Proceedings of RANLP 2007: recent advances in natural language processing. Borovets, pp 590–596
Wehrli E, Seretan V, Nerima L, Russo L (2009) Collocations in a rule-based MT system: a case study evaluation of their translation adequacy. In: Proceedings of the 13th annual meeting of the European association for machine translation. Barcelona, pp 128–135
Zhang Y, Kordoni V, Villavicencio A, Idiart M (2006) Automated multiword expression prediction for grammar engineering. In: Proceedings of the workshop on multiword expressions: identifying and exploiting underlying properties. Sydney, pp 36–44
Acknowledgements
The ADAPT Centre for Digital Content Technology is funded under the Science Foundation Ireland (SFI) Research Centres Programme (Grant No. 13/RC/2106) and is co-funded under the European Regional Development Fund. This project has partially received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 713567, and the publication has emanated from research supported in part by a research grant from SFI under Grant No. 13/RC/2077.
Cite this article
Haque, R., Hasanuzzaman, M. & Way, A. Ruslan Mitkov, Johanna Monti, Gloria Corpas Pastor, and Violeta Seretan (eds): Multiword units in machine translation and translation technology. Machine Translation 33, 349–354 (2019). https://doi.org/10.1007/s10590-019-09239-4
DOI: https://doi.org/10.1007/s10590-019-09239-4