
Arabic Cross-Language Information Retrieval: A Review

Published: 28 January 2016

Abstract

Cross-language information retrieval (CLIR) deals with retrieving relevant documents in one language using queries expressed in another language. Because CLIR tools rely on translation techniques, they are challenged by the properties of highly derivational and inflectional languages such as Arabic. Much work has been done on CLIR for different languages, including Arabic. In this article, we introduce the reader to the motivations for solving some problems related to Arabic CLIR approaches. The evaluation of these approaches is discussed starting from the 2001 and 2002 TREC Arabic CLIR tracks, which aimed to evaluate CLIR systems objectively. We also study many other research works to highlight the problems that remain unresolved or require further investigation. These works are discussed in light of an in-depth study of the specificities and tasks of Arabic information retrieval (IR). Particular attention is given to translation techniques and CLIR resources, which are key issues challenging Arabic CLIR. To advance research in this field, we discuss how a new standard collection can improve Arabic IR and CLIR tracks.

  113. S. Khoja and R. Garside. 1999. Stemming Arabic Text. Technical Report. Computing Department, Lancaster University, Lancaster, UK. http://www.comp.lancs.ac.uk/computing/users/khoja/stemmer.ps.Google ScholarGoogle Scholar
  114. S. Khoja, R. Garside, and G. Knowles. 2001. A Tagset for the Morphosyntactic Tagging of Arabic. Retrieved December 1, 2015, from http://zeus.cs.pacificu.edu/shereen/CL2001.pdf.Google ScholarGoogle Scholar
  115. K. Kishida. 2005. Technical issues of cross-language information retrieval: A review. Information Processing and Management. 41, 3, 433--455. DOI:http://dx.doi.org/10.1016/j.ipm.2004.06.007 Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. K. Kishida. 2008. Prediction of performance of cross-language information retrieval using automatic evaluation of translation. Library and Information Science Research 30, 2, 138--144. DOI:10.1016/j.lisr. 2007.09.003Google ScholarGoogle ScholarCross RefCross Ref
  117. A. Kumar. 2012. Profound survey on cross-language information retrieval methods (CLIR). In Proceedings of the 2012 2nd International Conference on Advanced Computing and Communication Technologies (ACCT’12). IEEE, Los Alamitos, CA, 64--68. DOI:10.1109/ACCT.2012.91 Google ScholarGoogle ScholarDigital LibraryDigital Library
  118. K.-L. Kwok, S. Choi, and N. Dinstl. 2005. Rich results from poor resources: NTCIR-4 monolingual and cross-lingual retrieval of Korean texts using Chinese and English. ACM Transactions on Asian Language Information Processing 4, 2, 136--162. DOI:http://doi.acm.org/10.1145/1105696.1105700 Google ScholarGoogle ScholarDigital LibraryDigital Library
  119. W. Lahbib, I. Bounhas, and B. Elayeb. 2014. Arabic-English domain terminology extraction from aligned corpora. In Proceedings of the 13th International Conference on Ontologies, Databases, and Applications of Semantics (ODBASE’14). 745--759. DOI:10.1007/978-3-662-45563-0_46Google ScholarGoogle Scholar
  120. W. Lahbib, I. Bounhas, B. Elayeb, F. Evrard, and Y. Slimani. 2013. A hybrid approach for Arabic semantic relation extraction. In Proceedings of the 26th International FLAIRS Conference. 315--320. http://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS13/paper/view/5891/6090.Google ScholarGoogle Scholar
  121. L. S. Larkey, J. Allan, M. E. Connell, A. Bolivar, and C. Wade. 2002a. UMass at TREC 2002: Cross-language and novelty tracks. In Proceedings of the 11th Text Retrieval Conference (TREC’02). 721--732. http://trec.nist.gov/pubs/trec11/papers/umass.wade.pdf.Google ScholarGoogle Scholar
  122. L. S. Larkey, L. Ballesteros, and M. E. Connell. 2002b. Improving stemming for Arabic information retrieval: Light stemming and co-occurrence analysis. In Proceedings of the 25th ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 275--282. DOI:http://doi.acm.org/10.1145/564376.564425 Google ScholarGoogle ScholarDigital LibraryDigital Library
  123. L. S. Larkey and M. E. Connell. 2001. Arabic information retrieval at UMass in TREC-10. In Proceedings of the 10th Text Retrieval Conference (TREC’01). 562--570. http://trec.nist.gov/pubs/trec10/papers/UMass_TREC10_Final.pdf.Google ScholarGoogle Scholar
  124. C.-J. Lee, C.-H. Chen, S.-H. Kao, and P.-J. Cheng. 2010. To translate or not to translate? In Proceedings of the 33rd ACM SIGIR Conference. ACM, New York, NY, 651--658. DOI:10.1145/1835449.1835558 Google ScholarGoogle ScholarDigital LibraryDigital Library
  125. G.-A. Levow. 2003. Issues in pre- and post-translation document expansion: Untranslatable cognates and missegmented words. In Proceedings of the 6th International Workshop on Information Retrieval with Asian Languages. 77--83. DOI:http://doi.acm.org/10.1145/1118935.1118945 Google ScholarGoogle ScholarDigital LibraryDigital Library
  126. G.-A. Levow, D. W. Oard, and P. Resnik. 2005. Dictionary-based techniques for cross-language information retrieval. Information Processing and Management 41, 3, 523--547. DOI:http://dx.doi.org/10.1016/j.ipm.2004.06.012 Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. D. Lewandowski. 2012. Web Search Engine Research, Vol. 4. Emerald Group Publishing.Google ScholarGoogle Scholar
  128. Linguistic Data Consortium. 2001. Arabic Newswire Part 1. Retrieved December 1, 2015, from http://catalog. ldc.upenn.edu/LDC2001T55Google ScholarGoogle Scholar
  129. Linguistic Data Consortium. 2003. Arabic Gigaword. Retrieved December 1, 2015, from http://catalog.ldc. upenn.edu/LDC2003T12Google ScholarGoogle Scholar
  130. Linguistic Data Consortium. 2006. Arabic Gigaword Second Edition. Retrieved December 1, 2015, from http://catalog.ldc.upenn.edu/LDC2006T02Google ScholarGoogle Scholar
  131. M. Maamouri and C. Cieri. 2002. Resources for Arabic natural language processing. In Proceedings of the International Symposium on Processing Arabic. 125--146.Google ScholarGoogle Scholar
  132. W. Magdy. 2013. TweetMogaz: A news portal of tweets. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1095--1096. DOI:10.1145/2484028.2484212 Google ScholarGoogle ScholarDigital LibraryDigital Library
  133. W. Magdy, A. Ali, and K. Darwish. 2012. A summarization tool for time-sensitive social media. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM’12). 2695--2697. DOI:10.1145/2396761.2398730 Google ScholarGoogle ScholarDigital LibraryDigital Library
  134. S. Mallat, M. A. Ben Mohamed, E. Hkiri, A. Zouaghi, and M. Zrigui. 2014. Semantic and contextual knowledge representation for lexical disambiguation: Case of Arabic-French query translation. Journal of Computing and Information Technology 22, 3, 191--215. DOI:10.2498/cit.1002234Google ScholarGoogle ScholarCross RefCross Ref
  135. J. Mayfield, P. McNamee, C. Costello, C. D. Piatko, and A. Banerjee. 2001. JHU/APL at TREC 2001: Experiments in filtering and in Arabic, video, and Web retrieval. In Proceedings of the 10th Text Retrieval Conference (TREC’01). 322--330. http://trec.nist.gov/pubs/trec10/papers/jhuapl01.pdf.Google ScholarGoogle Scholar
  136. P. McNamee and J. Mayfield. 2002a. Scalable multilingual information access. In Advances in Cross-Language Information Retrieval. Lecture Notes in Computer Science, Vol. 2785. Springer, 207--218. DOI:http://dx.doi.org/10.1007/978-3-540-45237-9_17Google ScholarGoogle Scholar
  137. P. McNamee and J. Mayfield. 2002b. Comparing cross-language query expansion techniques by degrading translation resources. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 159--166. DOI:10.1145/564376.564406 Google ScholarGoogle ScholarDigital LibraryDigital Library
  138. P. McNamee, C. D. Piatko, and J. Mayfield. 2002. JHU/APL at TREC 2002: Experiments in filtering and Arabic retrieval. In Proceedings of the 11th Text Retrieval Conference (TREC’02). 358--363. http://trec.nist.gov/pubs/trec11/papers/jhuapl.mcnamee.pdf.Google ScholarGoogle Scholar
  139. M. Moussa, M. W. Fakhr, and K. Darwish. 2012. Statistical denormalization for Arabic text. In Proceedings of the 11th Conference on Natural Language Processing (KONVENS’12). 228--232. http://www.oegai.at/konvens2012/proceedings/32_moussa12p/32_moussa12p.pdf.Google ScholarGoogle Scholar
  140. S. H. Mustafa. 2005. Character contiguity in n-gram--based word matching: The case for Arabic text. Information Processing and Management 41, 4, 819--827. DOI:http://dx.doi.org/10.1016/j.ipm.2004.02.003 Google ScholarGoogle ScholarDigital LibraryDigital Library
  141. D. W. Oard and F. C. Gey. 2002. The TREC-2002 Arabic/English CLIR track. In Proceedings of the 11th Text Retrieval Conference (TREC’02). 17--26. http://trec.nist.gov/pubs/trec11/papers/OVERVIEW.gey.ps.gz.Google ScholarGoogle Scholar
  142. D. W. Oard, D. He, and J. Wang. 2008. User-assisted query translation for interactive cross-language information retrieval. Information Processing and Management 44, 1, 181--211. DOI:10.1016/j.ipm.2006.12.009 Google ScholarGoogle ScholarDigital LibraryDigital Library
  143. D. W. Oard, G.-A. Levow, and C. I. Cabezas. 2000. CLEF experiments at Maryland: Statistical stemming and back off translation. In Cross-Language Information Retrieval and Evaluation. Lecture Notes in Computer Science, Vol. 2069. Springer, 176--187. DOI:http://dx.doi.org/10.1007/3-540-44645-1_17 Google ScholarGoogle ScholarDigital LibraryDigital Library
  144. J. Olive, C. Christianson, and J. McCary. 2011. Handbook of Natural Language Processing and Machine Translation. Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  145. I. Ounis, C. Macdonald, J. Lin, and I. Soboroff. 2011. Overview of the TREC-2011 microblog track. In Proceedings of the 2011 Text Retrieval Conference (TREC’11). http://trec.nist.gov/pubs/trec20/papers/MICROBLOG.OVERVIEW.pdf.Google ScholarGoogle Scholar
  146. A. Pasha, M. Al-Badrashiny, M. Altantawy, N. Habash, M. Pooleery, O. Rambow, and R. M. Roth. 2013. DIRA: Dialectal Arabic information retrieval assistant. In Proceedings of International Joint Conference on Natural Language Processing (IJCNLP’13): System Demonstrations. 13--16. http://aclweb.org/anthology/I/I13/I13-2004.pdf.Google ScholarGoogle Scholar
  147. A. Pasha, M. Al-Badrashiny, M. Diab, A. El Kholy, R. Eskander, N. Habash, M. Pooleery, O. Rambow, and R. M. Roth. 2014. MADAMIRA: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC’14). 1094--1101. http://www.lrec-conf.org/proceedings/lrec2014/pdf/593_Paper.pdf.Google ScholarGoogle Scholar
  148. A. Peñas, E. H. Hovy, P. Forner, A. Rodrigo, R. Sutcliffe, and R. Morante. 2013. QA4MRE 2011-2013: Overview of question answering for machine reading evaluation. In Information Access Evaluation, Multilinguality, Multimodality, and Visualization. Lecture Notes in Computer Science, Vol. 8138. Springer, 303--320. DOI:10.1007/978-3-642-40802-1_29Google ScholarGoogle ScholarDigital LibraryDigital Library
  149. A. Peñas, E. H. Hovy, P. Forner, A. Rodrigo, R. Sutcliffe, C. Sporleder, C. Forascu, Y. Benajiba, and P. Osenova. 2012. Overview of QA4MRE at CLEF 2012: Question answering for machine reading evaluation. In Proceedings of the 2012 Conference and Labs of the Evaluation Forum (CLEF’12). DOI:10.1.1.360.5198Google ScholarGoogle Scholar
  150. A. Pirkola. 1998. The effects of query structure and dictionary setups in dictionary-based cross-language information retrieval. In Proceedings of the 21st ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 55--63. DOI:http://doi.acm.org/10.1145/290941.290957 Google ScholarGoogle ScholarDigital LibraryDigital Library
  151. A. Pirkola, T. Hedlund, H. Keskustalo, and K. Järvelin. 2001. Dictionary-based cross-language information retrieval: Problems, methods, and research findings. Information Retrieval 4, 3--4, 209--230. DOI:http://dx.doi.org/10.1023/A:1011994105352 Google ScholarGoogle ScholarDigital LibraryDigital Library
  152. A. Pirkola, H. Keskustalo, E. Leppänen, A.-P. Känsälä, and K. Järvelin. 2002. Targeted s-gram matching: A novel n-gram matching technique for cross- and mono-lingual word form variants. Information Research 7, 2. http://InformationR.net/ir/7-2/paper126.html.Google ScholarGoogle Scholar
  153. A. Pirkola, D. Puolamäki, and K. Järvelin. 2003. Applying query structuring in cross-language retrieval. Information Processing and Management 39, 3, 391--402. DOI:http://dx.doi.org/10.1016/S0306-4573(02)00091-2 Google ScholarGoogle ScholarDigital LibraryDigital Library
  154. M. F. Porter. 1980. An algorithm for suffix stripping. Program 14, 3, 130--137. DOI:10.1108/eb046814Google ScholarGoogle ScholarCross RefCross Ref
  155. M. F. Porter. 2006. The English (Porter2) Stemming Algorithm. Retrieved December 1, 2015, from http://snowball.tartarus.org/algorithms/english/stemmer.html.Google ScholarGoogle Scholar
  156. M. Rammel, M. Sanan, and K. Zreik. 2011. Improving Arabic information retrieval system using n-gram method. WSEAS Transactions on Computers 10, 4, 125--133. DOI:http://dl.acm.org/citation.cfm?id=2001184.2001187 Google ScholarGoogle ScholarDigital LibraryDigital Library
  157. R. Roth, O. Rambow, N. Habash, M. T. Diab, and C. Rudin. 2008. Arabic morphological tagging, diacritization, and lemmatization using lexeme models and feature ranking. In Proceedings of the Association for Computational Linguistics Conference (ACL’08). 117--120. http://www.aclweb.org/anthology/P08-2030. Google ScholarGoogle ScholarDigital LibraryDigital Library
  158. H. Sajjad, K. Darwish, and Y. Belinkov. 2013. Translating dialectal Arabic to English. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL’13). 1--6. http://aclweb.org/anthology/P/P13/P13-2001.pdf.Google ScholarGoogle Scholar
  159. X. Saralegi and M. de Lacalle. 2010. Dictionary and monolingual corpus-based query translation for Basque-English CLIR. In Proceedings of the International Conference on Language Resources and Evaluation (LREC’10). 1353--1358. http://www.lrec-conf.org/proceedings/lrec2010/pdf/63_Paper.pdf.Google ScholarGoogle Scholar
  160. G. Salton. 1973. Experiments in multi-lingual information retrieval. Information Processing Letters 2, 1, 6--11. DOI:http://dx.doi.org/10.1016/0020-0190(73)90017-3Google ScholarGoogle ScholarCross RefCross Ref
  161. J. Savoy. 2002. Report on CLEF-2002 experiments: Combining multiple sources of evidence. In Advances in Cross-Language Information Retrieval. Lecture Notes in Computer Science, Vol. 2785. Springer, 66--90.Google ScholarGoogle Scholar
  162. J. Savoy and Y. Rasolofo. 2002. Report on the TREC-11 experiment: Arabic, named page and topic distillation searches. In Proceedings of the 11th Text Retrieval Conference (TREC’02). 765--774. http://trec.nist.gov/pubs/trec11/papers/uneuchatel.pdf.Google ScholarGoogle Scholar
  163. M. Q. Shatnawi, Q. Q. Abuein, and O. Darwish. 2011. Verification Hadith correctness in Islamic Web pages using information retrieval techniques. In Proceedings of the International Conference on Information and Communication Systems. http://www.icics.info/icics/proceeding/icics.paper/64.pdf.Google ScholarGoogle Scholar
  164. N. Soudani, I. Bounhas, B. Elayeb, and Y. Slimani. 2014a. Toward an Arabic ontology for Arabic word sense disambiguation based on normalized dictionaries. In Proceedings of the 13th International Conference on Ontologies, Databases, and Applications of Semantics (ODBASE’14). 655--658. DOI:10.1007/978-3-662-45550-0_68Google ScholarGoogle ScholarDigital LibraryDigital Library
  165. N. Soudani, I. Bounhas, B. Elayeb, and Y. Slimani. 2014b. An LMF-based normalization approach of Arabic Islamic dictionaries for Arabic word sense disambiguation: Application on hadith. In Proceedings of the 2nd International Conference on Islamic Applications in Computer Science and Technologies (IMAN’14).Google ScholarGoogle Scholar
  166. N. Soudani, I. Bounhas, B. Elayeb, and Y. Slimani. 2014c. Generic normalization approach of Arabic dictionaries for Arabic word sense disambiguation. In Proceedings of Cinquième Journées Francophones sur les Ontologies (JFO’14). 309--315.Google ScholarGoogle Scholar
  167. S. Strassel, M. A. Przybocki, K. Peterson, Z. Song, and K. Maeda. 2008. Linguistic resources and evaluation techniques for evaluation of cross-document automatic content extraction. In Proceedings of the 2008 International Conference on Language Resources and Evaluation (LREC’08). http://www.itl.nist.gov/iad/mig//publications/storage_paper/ACEXDOC_FinalPaperV3_NIST.pdf.Google ScholarGoogle Scholar
  168. P. Sujatha and P. Dhavachelvan. 2011. A review on the cross and multilingual information retrieval. International Journal of Web and Semantic Technology 2, 4, 115--124. DOI:10.5121/ijwest.2011.2409Google ScholarGoogle ScholarCross RefCross Ref
  169. T. Talvensaari. 2008. Effects of aligned corpus quality and size in corpus-based CLIR. In Proceedings of the 30th European Conference on Information Retrieval (ECIR’08). 114--125. DOI:10.1007/978-3-540-78646-7_13 Google ScholarGoogle ScholarCross RefCross Ref
  170. J. Toivonen, A. Pirkola, H. Keskustalo, K. Visala, and K. Järvelin. 2005. Translating cross-lingual spelling variants using transformation rules. Information Processing and Management 41, 4, 859--872. DOI:http://dx.doi.org/10.1016/j.ipm.2004.02.001 Google ScholarGoogle ScholarDigital LibraryDigital Library
  171. S. Tomlinson. 2002. Experiments in named page finding and Arabic retrieval with Hummingbird SearchServerTM at TREC 2002. In Proceedings of the 11th Text Retrieval Conference (TREC’02). 248--259. http://trec.nist.gov/pubs/trec11/papers/hummingbird.tomlinson.pdf.Google ScholarGoogle Scholar
  172. F. Türe and E. Boschee. 2014. Learning to translate: A query-specific combination approach for cross-lingual information retrieval. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 589--599. http://emnlp2014.org/papers/pdf/EMNLP2014064.pdf.Google ScholarGoogle ScholarCross RefCross Ref
  173. F. Türe, J. Lin, and D. W. Oard. 2012. Combining statistical translation techniques for cross-language information retrieval. In Proceedings of the 24th International Conference on Computational Linguistics (COLING’12): Technical Papers. 2685--2702.Google ScholarGoogle Scholar
  174. R. Udupa, K. Saravanan, A. Bakalov, and A. Bhole. 2009. “They are out there, if you know where to look”: Mining transliterations of OOV query terms for cross-language information retrieval. In Proceedings of the 31st European Conference on Information Retrieval (ECIR’09). 437--448. DOI:10.1007/978-3-642-00958-7_39 Google ScholarGoogle ScholarDigital LibraryDigital Library
  175. E. M. Voorhees. 2002. Overview of TREC 2002. In Proceedings of the 11th Text Retrieval Conference (TREC’02). 1--15. http://trec.nist.gov/pubs/trec11/papers/OVERVIEW.11.pdf.Google ScholarGoogle Scholar
  176. E. M. Voorhees and D. Harman. 2001. Overview of TREC 2001. In Proceedings of the 10th Text Retrieval Conference (TREC’01). 1--15. http://trec.nist.gov/pubs/trec10/papers/overview_10.pdf.Google ScholarGoogle Scholar
  177. J. Wang and D. W. Oard. 2006. Combining bidirectional translation and synonymy for cross-language information retrieval. In Proceedings of the 29th ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 202--209. DOI:10.1145/1148170.1148208 Google ScholarGoogle ScholarDigital LibraryDigital Library
  178. D. Wu, D. He, H. Ji, and R. Grishman. 2008. A study of using an out-of-box commercial MT system for query translation in CLIR. In Proceedings the 2nd ACM Workshop on Improving Non English Web Searching (iNEWS’08). ACM, New York, NY, 71--76. DOI:10.1145/1460027.1460038 Google ScholarGoogle ScholarDigital LibraryDigital Library
  179. J. Xu, A. Fraser, J. Makhoul, M. Noamany, and G. Osman. 2001a. UN Arabic English Parallel Text Version 1.0 beta [CD-ROM]. Linguistic Data Consortium, Philadelphia, PA.Google ScholarGoogle Scholar
  180. J. Xu, A. Fraser, and R. M. Weischedel. 2001b. TREC 2001: Cross-lingual retrieval at BBN. In Proceedings of the 10th Text Retrieval Conference (TREC’01). 68--77. http://trec.nist.gov/pubs/trec11/ypapers/bbn.xu.cross.pdf.Google ScholarGoogle Scholar
  181. J. Xu, A. Fraser, and R. M. Weischedel. 2002. Empirical studies in strategies for Arabic retrieval. In Proceedings of the 25th ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 269--274. DOI:http://doi.acm.org/10.1145/564376.564424 Google ScholarGoogle ScholarDigital LibraryDigital Library
  182. J. Xu and R. M. Weischedel. 2005. Empirical studies on the impact of lexical resources on CLIR performance. Information Processing and Management 41, 3, 475--487. DOI:10.1016/j.ipm.2004.06.009 Google ScholarGoogle ScholarDigital LibraryDigital Library
  183. Z. Yahya, A. M. Taufik, A. Azreen, and A. K. Rabiah. 2013. Query translation using concepts similarity based on Quran ontology for cross-language information retrieval. Journal of Computer Science 9, 7, 889--897. DOI:10.3844/jcssp.2013.889.897Google ScholarGoogle ScholarCross RefCross Ref
  184. Y. Yang, M. Rogati, and B. Kisiel. 2005. Combining categorization-based and corpus-based approaches for CLIR. In Proceedings of the 2005 FLAIRS Conference. 295--300. http://www.aaai.org/Papers/FLAIRS/2005/Flairs05-049.pdf.Google ScholarGoogle Scholar
  185. R. Zajac, A. Malki, and A. Abdelali. 2001. Arabic-English NLP at CRL. In Proceedings of the Arabic NLP Workshop (ACL/EACL’01). http://www.elsnet.org/arabic2001/zajac.pdf.Google ScholarGoogle Scholar
  186. H. Zeng, M. A. Alhossaini, L. Ding, R. Fikes, and D. L. McGuinness. 2006. Computing trust from revision history. In Proceedings of the 2006 International Conference on Privacy, Security, and Trust: Bridge the Gap between PST Technologies and Business Services. Article No. 8. DOI:10.1145/1501434.1501445 Google ScholarGoogle ScholarDigital LibraryDigital Library
  187. Y. Zhang, P. Vines, and J. Zobel. 2005. Chinese OOV translation and post-translation query expansion in Chinese-English cross-lingual information retrieval. ACM Transactions on Asian Language Information Processing 4, 2, 57--77. DOI:http://doi.acm.org/10.1145/1105696.1105697 Google ScholarGoogle ScholarDigital LibraryDigital Library
  188. D. Zhou, M. Truran, T. J. Brailsford, and H. Ashman. 2008. A hybrid technique for English-Chinese cross language information retrieval. ACM Transactions on Asian Language Information Processing 7, 2, Article No. 5. DOI:http://doi.acm.org/10.1145/1362782.1362784. Google ScholarGoogle ScholarDigital LibraryDigital Library
  189. D. Zhou, M. Truran, T. J. Brailsford, V. Wade, and H. Ashman. 2012. Translation techniques in cross-language information retrieval. ACM Computing Surveys 45, 1, 1--44. DOI:http://doi.acm.org/10.1145/2379776.2379777 Google ScholarGoogle ScholarDigital LibraryDigital Library
  190. I. Zitouni (Ed.): 2014. Natural Language Processing of Semitic Languages. Springer-Verlag, Berlin, Germany. Google ScholarGoogle ScholarDigital LibraryDigital Library

        • Published in

          ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 15, Issue 3
          March 2016
          220 pages
          ISSN:2375-4699
          EISSN:2375-4702
          DOI:10.1145/2876004

          Copyright © 2016 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 28 January 2016
          • Accepted: 1 June 2015
          • Revised: 1 April 2015
          • Received: 1 April 2014

          Qualifiers

          • research-article
          • Research
          • Refereed
