
2017 | Book

Machine Translation with Minimal Reliance on Parallel Resources

Authors: Dr. George Tambouratzis, Dr. Marina Vassiliou, Dr. Sokratis Sofianopoulos

Publisher: Springer International Publishing

Book Series: SpringerBriefs in Statistics


About this book

This book provides a unified view of a new methodology for Machine Translation (MT). This methodology extracts information from widely available resources (extensive monolingual corpora) while assuming the existence of only a very limited parallel corpus, which gives it a starting point distinct from that of Statistical Machine Translation (SMT). A detailed presentation of the methodology's principles and system architecture is followed by a series of experiments, in which the proposed system is compared to other MT systems using a set of established metrics including BLEU, NIST, Meteor and TER. Additionally, free-to-use code is available that allows the creation of new MT systems. The volume is addressed to both language professionals and researchers. Prerequisites are limited to a basic understanding of machine translation and of the basic tools of natural language processing.

Table of Contents

Frontmatter
Chapter 1. Preliminaries
Abstract
This chapter gives a general introduction to the topic of the book. It presents the current challenges of Machine Translation (MT), in particular for languages where only a limited amount of specialised resources is readily available. To that end, it reviews the state of the art in MT comprehensively. The focus is on related work on MT methodologies that are portable to new language pairs, with emphasis on issues such as stability and extensibility. It is widely accepted that language portability necessitates an algorithmic approach that extracts information from large corpora in an unsupervised manner. This includes both Statistical MT (SMT) and Example-based MT (EBMT). The strengths and shortcomings of the different approaches are reviewed in terms of the a priori, externally provided linguistic knowledge and the specialised resources they require. This review leads to the concept of the proposed MT methodology.
George Tambouratzis, Marina Vassiliou, Sokratis Sofianopoulos
Chapter 2. Implementation
Abstract
This chapter introduces the general design characteristics of PRESEMT and provides a detailed description of all resources required as well as all pre-processing steps needed, such as corpora processing and model creation.
George Tambouratzis, Marina Vassiliou, Sokratis Sofianopoulos
Chapter 3. Main Translation Process
Abstract
This chapter presents in detail the main translation process of PRESEMT, delving deeper into the core of the system and its inner workings.
George Tambouratzis, Marina Vassiliou, Sokratis Sofianopoulos
Chapter 4. Assessing PRESEMT
Abstract
This chapter evaluates the performance of PRESEMT, both per se and in comparison with other MT systems, where performance refers to the translation quality achieved. It is possible to employ humans for this task (subjective evaluation), who assess an MT system in terms of fluency (i.e. grammaticality) and adequacy (i.e. fidelity to the original text) (van Slype 1979). Because this is a laborious and time-consuming process, evaluation normally relies on automatic metrics (objective evaluation) that calculate the similarity between what an MT system produces (system output) and what it should have produced (reference translation).
George Tambouratzis, Marina Vassiliou, Sokratis Sofianopoulos
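As a rough illustration of the objective evaluation described in the Chapter 4 abstract, the sketch below computes a clipped n-gram precision between a system output and a reference translation, which is the overlap idea underlying metrics such as BLEU. It is not the book's evaluation code and not a full BLEU implementation (there is no brevity penalty, no support for multiple references, and no combination of n-gram orders). The function names and the whitespace tokenisation are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(system_output, reference, n=1):
    """Fraction of system n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference ("clipping"), so repeating a correct word is not rewarded.
    Tokenisation is a plain whitespace split (an assumption of this sketch).
    """
    sys_counts = Counter(ngrams(system_output.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(sys_counts.values())
    if total == 0:
        # Output shorter than n tokens: no n-grams to score.
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in sys_counts.items())
    return matched / total


if __name__ == "__main__":
    hyp = "the cat sat on the mat"
    ref = "the cat is on the mat"
    print(clipped_precision(hyp, ref, n=1))
    print(clipped_precision(hyp, ref, n=2))
```

A higher value means the system output shares more n-grams with the reference. Established metrics build on this kind of overlap with further components, so scores from this sketch are not comparable to published BLEU, NIST, Meteor or TER figures.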
Chapter 5. Expanding the System
Abstract
Following the detailed description of the PRESEMT Machine Translation system and the report on its performance, the current chapter focuses on the system’s portability. Portability is a term intended to signify the process of integrating a new language pair into the system. This involves reviewing all the necessary system modules and resources and making all the necessary modifications.
George Tambouratzis, Marina Vassiliou, Sokratis Sofianopoulos
Chapter 6. Extensions to the PRESEMT Methodology
Abstract
This chapter describes a number of improvements made to the basic PRESEMT system. These improvements target specific modules of the system, for which alternative implementations have been suggested, in an effort to achieve gains in translation accuracy. The extensions concern different modules of the PRESEMT architecture. The first extension covers the pre-processing stage, where an improved phrasing model for the SL side is proposed. The second extension involves the use of supplementary language models (LM) in the TL, to improve translation accuracy both at the phrasal level and in the post-editing and token generation steps.
George Tambouratzis, Marina Vassiliou, Sokratis Sofianopoulos
Chapter 7. Conclusions and Future Work
Abstract
This chapter reviews the research work discussed in the previous chapters of the present volume, summarising the outcomes of the research within the PRESEMT project. From these outcomes, a set of key directions for future work is identified in order to further improve the MT methodology. The second part of the chapter gives a brief report of the most promising ones.
George Tambouratzis, Marina Vassiliou, Sokratis Sofianopoulos
Backmatter
Metadata
Title
Machine Translation with Minimal Reliance on Parallel Resources
Authors
Dr. George Tambouratzis
Dr. Marina Vassiliou
Dr. Sokratis Sofianopoulos
Copyright Year
2017
Electronic ISBN
978-3-319-63107-3
Print ISBN
978-3-319-63105-9
DOI
https://doi.org/10.1007/978-3-319-63107-3