
About this book

Interaction among communities that use different languages is increasing, so we need services that can effectively support multilingual communication. The Language Grid is an initiative to build an infrastructure that allows end users to create composite language services for intercultural collaboration. The aim is to help communities create customized multilingual environments by using language services to overcome local language barriers. The stakeholders of the Language Grid are the language resource providers, the language service users, and the language grid operators who coordinate between them.

This book includes 18 chapters in six parts that summarize various research results and associated development activities on the Language Grid. The chapters in Part I describe the framework of the Language Grid, i.e., service-oriented collective intelligence, used to bridge providers, users and operators. Two kinds of software are introduced, the service grid server software and the Language Grid Toolbox, and code for both is available via open source licenses. Part II describes technologies for service workflows that compose atomic language services. Part III reports on research work and activities relating to sharing and using language services. Part IV describes various applications of language services as applicable to intercultural collaboration. Part V contains reports on applying the Language Grid for translation activities, including localization of industrial documents and Wikipedia articles. Finally, Part VI illustrates how the Language Grid can be connected to other service grids, such as DFKI's Heart of Gold and smart classroom services in Tsinghua University in Beijing.

The book will be valuable for researchers in artificial intelligence, natural language processing, services computing and human-computer interaction, particularly those who are interested in bridging technologies and user communities.



Language Grid Framework


Chapter 1. The Language Grid: Service-Oriented Approach to Sharing Language Resources

Since various communities, which use multiple languages, now want to interact in daily life, tools that can effectively support multilingual communication are necessary. However, we often observe that the success of a multilingual tool in one situation does not guarantee its success in another. To develop a multilingual environment that can handle various situations in various communities, existing language resources (dictionaries, parallel texts, part-of-speech taggers, machine translators, etc.) should be easily shared and customized. Therefore, we designed our proposal, the Language Grid, as service-oriented collective intelligence; it allows users to freely create language services from existing language resources and combine those language services to develop new services to meet their own requirements. This chapter explains the design concept and service architecture of the Language Grid, and the approach of user involvement in the collective intelligence activities.
Toru Ishida, Yohei Murakami, Donghui Lin

Chapter 2. Service Grid Architecture

The Language Grid is an infrastructure that enables users to share language services developed by language specialists and end user communities. Users can also create new services to support their intercultural/multilingual activities by composing language services from a range of providers. Since the Language Grid takes the service-oriented collective intelligence approach, the platform requires service management that satisfies each stakeholder’s needs: access control for service providers, dynamic service composition for service users, and service grid composition and system configurability for service grid operators. To realize the Language Grid, this chapter describes the design concept and the system architecture of the platform based on the service grid.
Yohei Murakami, Donghui Lin, Masahiro Tanaka, Takao Nakaguchi, Toru Ishida

Chapter 3. Intercultural Collaboration Tools Based on the Language Grid

As the number of online machine translation tools continues to expand, the importance of utilizing machine translation in multilingual communities is also increasing. Yet several problems arise when existing machine translation tools are used for intercultural communication in a multilingual community. 1) Translation of community-specific terms or sentences within communities is always of low quality. 2) Machine translation tools do not consider how a multilingual community’s activities could include improving low-quality translations. 3) They do not provide a means for customization based on the requirements unique to a community. To address these issues, we developed Language Grid Toolbox. Language Grid Toolbox aims to support intercultural collaboration using the Language Grid and provides various functions, such as the creation of community-specific dictionaries combined with a machine translator, and a multilingual BBS where translations can be corrected collectively by community members. Moreover, since Toolbox is developed as open source software and provides APIs for its basic functions, customized functions for each community can easily be developed. Several customized communication tools extended from Toolbox’s basic modules have already been implemented by universities and local governments.
Masahiro Tanaka, Rieko Inaba, Akiyo Nadamoto, Tomohiro Shigenobu

Composing Language Services


Chapter 4. Horizontal Service Composition for Language Services

In the Language Grid, automatically composing Web services is a crucial task. This task involves vertical and horizontal composition. Vertical composition consists of defining an appropriate combination of simple processes to perform a composition task. Horizontal composition consists of determining the most appropriate Web service from among a set of functionally equivalent ones for each component process. The latter is important in language services. For the horizontal composition of Web services, we propose a generic formalization of any Web service composition problem based on a constraint optimization problem (COP) and then propose an incremental user-intervention-based protocol to find the optimal composite Web service according to some predefined criteria at run-time.
Ahlem Ben Hassine, Shigeo Matsubara, Toru Ishida
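
The horizontal composition described in the Chapter 4 abstract can be illustrated with a minimal sketch. It is not the chapter's COP formalization or its incremental protocol; it is a brute-force search under stated assumptions. The process names, service names, and the (cost, quality) values are all hypothetical, and the objective (maximize the product of qualities under a total-cost budget) is one assumed choice of "predefined criteria".

```python
from itertools import product

# Hypothetical data: for each component process, a set of functionally
# equivalent services with invented (cost, quality) attributes.
CANDIDATES = {
    "morphological_analysis": {"analyzer_a": (1.0, 0.80), "analyzer_b": (2.0, 0.90)},
    "translation": {"translator_a": (3.0, 0.70), "translator_b": (5.0, 0.85)},
}


def select_services(candidates, max_cost):
    """Pick one service per process, maximizing the product of qualities
    subject to a total-cost budget. Returns (None, 0.0) if nothing fits."""
    processes = list(candidates)
    best_choice, best_quality = None, 0.0
    # Enumerate every assignment of one service to each process.
    for combo in product(*(candidates[p].items() for p in processes)):
        cost = sum(attrs[0] for _, attrs in combo)
        quality = 1.0
        for _, attrs in combo:
            quality *= attrs[1]
        # Hard constraint: budget. Objective: overall quality.
        if cost <= max_cost and quality > best_quality:
            best_choice = {p: name for p, (name, _) in zip(processes, combo)}
            best_quality = quality
    return best_choice, best_quality
```

With a budget of 6.0, `select_services(CANDIDATES, 6.0)` selects `analyzer_a` and `translator_b`, because the higher-quality pair exceeds the budget. Exhaustive enumeration grows exponentially with the number of processes, which is one reason a constraint optimization formulation is attractive for larger workflows.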

Chapter 5. Service Supervision for Runtime Service Management

The Language Grid offers language services with a standardized interface and different non-functional properties. This allows us to create a specialized composite service for our own goals simply by selecting the appropriate services. The language services are, however, provided in various formats with their own policies. In an environment for service-collective intelligence, it is essential to have many service providers join by strongly ensuring that their policies are satisfied. In doing this, we have to solve the following problems. First, service composition relies on the products of various stakeholders that belong to different organizations, such as service products and composite service designers. This makes it difficult to modify existing services in line with given requirements. Next, selection of services may impose constraints on execution. We therefore often need to apply a certain amount of runtime adaptation to a composite service in order to enforce given policies. To solve these problems, we proposed an architecture for runtime service management called Service Supervision. Service Supervision provides meta-level execution functions for composite services. These allow operators to modify the behavior of a composite service without changing its model. Service Supervision is also capable of effectively managing a comprehensive process of runtime service selection and adaptation in order to ensure the service providers’ policies are satisfied. We implemented the Service Supervision prototype and showed that applying meta-level execution control barely decreases performance.
Masahiro Tanaka, Toru Ishida, Yohei Murakami

Chapter 6. Language Service Ontology

The Language Grid is a distinctive language service infrastructure in the sense that it accommodates a wide variety of user needs, ranging from technical novices to experts; language resource consumers to language resource providers. As these language services are various in type and each of them can be idiosyncratic in many aspects, the service infrastructure has to address the issue of interoperability. A key to solve this issue is not only to build the services around standardized resources and interfaces, but also to establish a knowledge structure that copes effectively with a range of language services. Given this knowledge structure, referred to as a service ontology, each language service can be systematically classified and its usage specified by a corresponding API. This not only enables the utilization of existing language resources but facilitates the dissemination of newly created language resources as services.
Yoshihiko Hayashi, Thierry Declerck, Nicoletta Calzolari, Monica Monachini, Claudia Soria, Paul Buitelaar

Language Grid for Using Language Resources


Chapter 7. Cascading Translation Services

The Language Grid offers a broad array of language services such as dictionaries and translation, and cascading them enables people in different parts of the world to communicate with one another in their mother tongue. However, when cascading several translation services, words’ meanings often drift due to the inconsistency, asymmetry and intransitivity of word selection. In this section, we propose context-based coordination to maintain the consistency of word meanings. For this, we put forth a method to automatically generate multilingual equivalent terms based on the use of bilingual dictionaries. We generated trilingual equivalent noun terms and implemented a Japanese-to-German-and-back translation, cascading four translation services. The evaluation results showed that the generated terms can cover over 58% of all nouns. Translation quality was improved by 41% for all sentences, and the quality rating for all sentences increased by an average of 0.47 points on a five-point scale.
Rie Tanaka, Yohei Murakami, Toru Ishida
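
The idea in the Chapter 7 abstract of generating multilingual equivalent terms from bilingual dictionaries can be sketched as follows. This is only one plausible reading, not the chapter's actual method: a (Japanese, English, German) triple is kept only when all three bilingual dictionaries agree on it, which filters out meanings that would drift along a translation cascade. The function name and the toy dictionaries are invented for illustration.

```python
def trilingual_terms(ja_en, en_de, de_ja):
    """Return (ja, en, de) triples that are consistent across all three
    bilingual dictionaries, i.e. ja -> en -> de -> ja closes a loop."""
    triples = set()
    for ja, en_words in ja_en.items():
        for en in en_words:
            for de in en_de.get(en, ()):
                # Keep the triple only if German maps back to the same
                # Japanese word, so the meaning has not drifted.
                if ja in de_ja.get(de, ()):
                    triples.add((ja, en, de))
    return triples


# Toy dictionaries (hypothetical): English "bank" is ambiguous between
# a financial institution and a riverbank.
JA_EN = {"銀行": {"bank"}, "土手": {"bank"}}
EN_DE = {"bank": {"Bank", "Ufer"}}
DE_JA = {"Bank": {"銀行"}, "Ufer": {"土手", "岸"}}
```

On the toy data, `trilingual_terms(JA_EN, EN_DE, DE_JA)` keeps 銀行/bank/Bank and 土手/bank/Ufer while rejecting the mixed pairings, which is the kind of word-sense drift the abstract attributes to inconsistent and intransitive word selection.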

Chapter 8. Sharing Multilingual Resources to Support Hospital Receptions

In the medical field, there exists a serious problem with regard to communications between hospital staff and foreign patients. According to statistics, many countries worldwide have a low rate of literacy. Illiterate people engaging in multilingual communication face problems. Therefore, this situation requires the provision of support in various ways. Currently, medical translators accompany patients to medical care facilities, and the number of requests for medical translators has been increasing. However, medical translators cannot provide support at all times. Therefore, the medical field has high expectations from information technology. However, a useful system has yet to be developed and introduced in the medical field for practical use. In this chapter, we propose a multilingual communication support system called “M3.” M3 uses parallel texts and voice data to achieve high accuracy in communication between people speaking different languages. The Language Grid provides various parallel texts provided by a multilingual parallel text sharing system and parallel text providers. The proposed system can obtain and share parallel texts using Web services via the Language Grid.
Mai Miyabe, Takashi Yoshino, Aguri Shigeno

Chapter 9. Exploring Cultural Differences in Pictogram Interpretations

Pictogram communication is successful when participants at both ends of the communication channel share a common pictogram interpretation. Not all pictograms carry a universal interpretation, however; the issue of ambiguous pictogram interpretation must be addressed to assist pictogram communication. To unveil the ambiguity possible in pictogram interpretation, we conduct a human subject experiment to identify culture-specific criteria employed by humans by detecting cultural differences in pictogram interpretations. Based on the findings, we propose a categorical semantic relevance measure which calculates how relevant a pictogram is to a given interpretation in terms of a given pictogram category. The proposed measure is applied to categorized pictogram interpretations to enhance pictogram retrieval performance. The WordNet, the ChaSen, and the EDR Electronic Dictionary registered to the Language Grid are utilized to merge synonymous pictogram interpretations and to categorize pictogram interpretations into super-concept categories. We show how the Language Grid can assist the cross-cultural research process.
Heeryon Cho, Toru Ishida

Language Grid for Communication


Chapter 10. Intercultural Community Development for Kids around the World

Communication methods and tools are key factors in developing online intercultural communities, especially when community members use their own mother tongue. This chapter introduces a case of an online intercultural community for international youths in NPO Pangaea. Youths and volunteer staff are from different countries and communication in this community is not English-based. Pictograms are used for youth communication and machine translations are used for staff communication. This chapter reports the participatory design and development processes of a pictogram communication system for youths and a multilingual community site for staff. Community-based communication tools such as the Pangaea Staff Community Site benefit from the collective intelligence aspect of the Language Grid, because the Language Grid enables community users such as Pangaea volunteers to improve machine translation quality, for example, by adding a Pangaea community dictionary.
Toshiyuki Takasaki, Yumiko Mori, Alvin W. Yeo

Chapter 11. Language-Barrier-Free Room for Second Life

Thanks to the widespread use of the Internet, three-dimensional (3D) online virtual spaces such as Second Life may become a familiar communication medium. Some people view Second Life as the successor of the Internet. However, as in the real world, language differences pose significant barriers to intercultural communication in the virtual world. We can consider a virtual space to be the simulated environment of a real space. We consider the Language Grid to be the multilingual language environment of the future that can include a variety of language resources. We have developed communication support systems that facilitate multilingual chat in Second Life, called language-barrier-free rooms. The objective of this study is to develop a communication support system in virtual space that is identical to a system in real space. We will use the findings of the experiment to enhance the communication support systems in real space. From the results of the experiments and the trial experiments of the communication systems, we obtained the following result: in a virtual space where communication similar to that in the real world can be simulated, human adjustment of the machine translations is necessary.
Takashi Yoshino, Katsuya Ikenobu

Chapter 12. Conversational Grounding in Machine Translation Mediated Communication

When people communicate in their native languages using machine translation, they face various problems in constructing common ground. This study, based on the Language Grid framework, investigates the difficulties of constructing common ground when pairs and triads communicate using machine translation. We compare referential communication of pairs and triads under two conditions: in their shared second language (English) and in their native languages using machine translation. Consequently, to support natural referring behaviour in machine translation mediated communication between pairs, our study suggests the importance of resolving the asymmetries and inconsistencies caused by machine translations. Furthermore, to successfully build common ground among triads, it is important for addressees to be able to monitor what is going on between a speaker and other addressees. The findings serve as a basis for designing future machine translation embedded communication systems. The proposed design implications, in particular, are fed back to the Language Grid development process and incorporated into the recent Language Grid Toolbox.
Naomi Yamashita, Toru Ishida

Language Grid for Translation


Chapter 13. Humans in the Loop of Localization Processes

The Language Grid is a service-oriented infrastructure for language services. In the Language Grid, machine translation services play important roles in supporting multilingual activities for communities. Although the effectiveness of using machine translation services for multilingual communication has been shown in previous reports, the gap between human translators and machine translators remains huge, especially in the domain of localization processes that require high translation quality. In this chapter, we aim at improving localization processes by introducing humans into the loop to utilize machine translation services. We compare several different types of localization processes (i.e., absolute machine translation processes, absolute human translation processes and processes by human and machine translation services) in the dimensions of translation quality and translation cost. The experiment results show that monolinguals can help improve the translation quality of machine translators with the aid of community dictionary services, and that collaboration of human and machine translation services makes it possible to reduce the cost compared with absolute human translations.
Donghui Lin

Chapter 14. Collaborative Translation Protocols

In this chapter, we present a protocol for collaborative translation, where two non-bilingual people who use different languages collaborate to perform the task of translation using machine translation (MT) services. The key idea of this protocol is that one person, who handles the source language and knows the original sentence (source language side), evaluates the adequacy between the original sentence and the translation of the sentence made fluent by the other person, who handles the target language (target language side). In addition, by determining whether the meaning of the machine-translated sentence is understandable, it is ensured that the two non-bilingual people can do the above tasks without stopping the protocol. As a result, this protocol 1) improves MT quality; and 2) terminates successfully only when the translation result becomes adequate and fluent. An experiment shows that when the protocol terminates successfully, the quality of the translation is increased to about 83 percent for Japanese-English translation and 91 percent for Japanese-Chinese translation. We contributed to the Language Grid Project by proposing a new way to use MT services efficiently in real fields.
Daisuke Morita, Toru Ishida

Chapter 15. Multi-Language Discussion Platform for Wikipedia Translation

The multilingual Wikipedia is the largest existing collaboratively edited encyclopedia, where several translation communities are working towards translating Wikipedia articles. The different language communities are largely independent in terms of policy creation, behavior and community mechanisms. We conducted a case study on the Wikipedia community from a multilingual point of view to better understand community behavior. We also conducted a collaborative Wiki-to-Wiki translation experiment using machine translation tools provided by the Language Grid. Based on the findings of the two studies we designed and developed a multi-language discussion platform for Wikipedia translation communities. In this chapter, we discuss the results of the case study and a laboratory experiment and how the results are applied to facilitate the creation of multilingual collective intelligence in Wikipedia through a multi-language discussion platform.
Ari Hautasaari, Toshiyuki Takasaki, Takao Nakaguchi, Jun Koyama, Yohei Murakami, Toru Ishida

Towards Federation of Service Grids


Chapter 16. Pipelining Software and Services for Language Processing

This chapter reports on our experiences with combining two different platforms in natural language processing research, i.e. Heart of Gold and the Language Grid, to make more language resources available on the Web. Heart of Gold is known as a middleware architecture for pipelining deep and shallow natural language processing components. The Language Grid is one of the service grid infrastructures built on top of the Internet to provide pipelined language services. Both of these frameworks provide composite language services and components. Having Heart of Gold integrated in the Language Grid environment contributes to increased interoperability among various language services. The integrated architecture also supports the combination of pipelined language services in the Language Grid and the pipelined natural language processing components in Heart of Gold to provide a better quality of language services available on the Web. Thus, language services with different characteristics can be combined based on the concept of Web service with different treatment of each combination. An evaluation is presented to show that the overhead of the integration is not significant.
Arif Bramantoro, Ulrich Schäfer, Toru Ishida

Chapter 17. Integrating Smart Classroom and Language Services

The real-time interactive virtual classroom with tele-education experience is an important approach in distance learning. However, most current systems fail to meet the new challenges raised by the development of the service-oriented architecture. First, the learning systems should be able to facilitate easier integration of increasingly dedicated services, such as language services on the Internet. Second, the learning systems must open their internal interfaces as web services to other systems, so as to enable deeper integration of these systems and easier deployment. Third, the systems are expected to provide flexible interfaces to support mobile device interaction. To address these issues, we build a prototype system, called Open Smart Classroom, by upgrading the original Smart Classroom into a service-oriented open system. With the help of Language Grid services, two Open Smart Classrooms deployed in Tsinghua University and Kyoto University are connected and experimental co-classes have been successfully held. The results of the user study show that integrating Smart Classroom and language services is an interesting and promising approach to building future multicultural distance learning systems.
Yue Suo, Yuanchun Shi, Toru Ishida

Chapter 18. Federated Operation Model for Service Grids

The concept of collective intelligence is contributing significantly to knowledge creation on the Web. While current knowledge creation activities tend to be founded on the approach of assembling content such as texts, images and videos, we propose here the service-oriented approach. We use the term service grid to refer to a framework of collective intelligence based on Web services. This chapter provides an institutional design mainly for non-profit service grids that are open to the public. In particular, we deepen the discussion of 1) intellectual property rights, 2) application systems, and 3) federated operations from the perspective of the following stakeholders: service providers, service users and service grid operators respectively. The Language Grid has been operating, based on the proposed institutional framework, since December 2007.
Toru Ishida, Yohei Murakami, Eri Tsunokawa, Yoko Kubota, Virach Sornlertlamvanich

