2015 | OriginalPaper | Book chapter

Mixed Language Arabic-English Information Retrieval

Written by: Mohammed Mustafa, Hussein Suleman

Published in: Computational Linguistics and Intelligent Text Processing

Publisher: Springer International Publishing

For many non-English languages in developing countries, such as Arabic, text switching/mixing (e.g. between Arabic and English) is very prevalent, especially in scientific domains, because most technical terms are borrowed from English and/or are neither included in the native languages nor have a precise translation or transliteration in them. This makes it difficult to search in the native language alone, because either non-English-speaking users, such as Arabic speakers, cannot express the terminology in their native languages, or the concepts need to be expanded using context. The result is mixed queries and documents in the non-English-speaking world, and in the Arabic world in particular. Mixed-language querying is a challenging problem that has not attracted major attention in the IR community. Current search engines and traditional CLIR systems do not handle mixed-language querying adequately and do not exploit this natural human tendency. This paper attempts to address the problem of mixed querying in CLIR. It proposes a mixed-language (language-aware) IR solution, in the form of a cross-lingual re-weighting model, in which mixed queries are used to retrieve the most relevant documents regardless of their languages. To conduct the experiments, a new multilingual and mixed Arabic-English corpus in the computer science domain was created. Test results showed that the proposed cross-lingual re-weighting model could yield statistically significantly better results for mixed-language queries, and that it achieved more than 94% of monolingual baseline effectiveness.

Metadata
Title
Mixed Language Arabic-English Information Retrieval
Written by
Mohammed Mustafa
Hussein Suleman
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-18117-2_32