2016 | Original Paper | Book chapter

Concatenation Technique for Extracted Arabic Characters for Efficient Content-based Indexing and Searching

Authors: Abdul Khader Jilani Saudagar, Habeeb Vulla Mohammed

Published in: Proceedings of the Second International Conference on Computer and Communication Technologies

Publisher: Springer India


Abstract

This paper presents the work completed in the final phase of an ongoing research project whose objective is to develop a system that extracts moving Arabic text from video for efficient content-based indexing and searching. The novelty of the paper is the technique used to concatenate the individual stand-alone Arabic characters that are extracted and recognized from image frames. The extracted characters are concatenated using their Unicode representation, which, according to the authors, has not been done before. The concatenated characters are written to a text file as a continuous sequence. These text files are indexed using Lucene, so that a search for a desired string is fast and precise.
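The pipeline in the abstract (concatenate recognized characters by Unicode value, write them to a text file, index the files, search) can be illustrated with a minimal sketch. This is not the authors' implementation: the `recognized` list, the `None` word-gap marker, the file name `frame_0001.txt`, and the helper names are all hypothetical, and a small in-memory inverted index stands in for Lucene, which is a Java library. The sketch relies on one general property of Unicode: Arabic letters are stored as base code points in logical order, and contextual joining is left to the rendering engine, so joining the base letters is enough to form a searchable word.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical recognizer output: one Unicode code point per recognized
# stand-alone character, in reading order. None marks a word gap
# (an assumption of this sketch, not something the abstract specifies).
recognized = [
    0x0633, 0x0644, 0x0627, 0x0645,          # seen, lam, alef, meem
    None,
    0x0639, 0x0644, 0x064A, 0x0643, 0x0645,  # ain, lam, yeh, kaf, meem
]


def concatenate(code_points):
    """Join recognized code points into one string; None becomes a space."""
    return "".join(" " if cp is None else chr(cp) for cp in code_points)


def build_index(paths):
    """Toy inverted index (stand-in for Lucene): token -> set of file names."""
    index = defaultdict(set)
    for path in paths:
        for token in path.read_text(encoding="utf-8").split():
            index[token].add(path.name)
    return index


text = concatenate(recognized)

with tempfile.TemporaryDirectory() as tmp:
    # Write the concatenated characters to a UTF-8 text file.
    out = Path(tmp) / "frame_0001.txt"
    out.write_text(text, encoding="utf-8")

    # Index the file, then look up the first word by exact token match.
    index = build_index([out])
    query = "\u0633\u0644\u0627\u0645"  # same four letters as the first word
    hits = sorted(index.get(query, set()))

print(hits)
```

Exact token matching is the simplest possible lookup; a real Lucene index would add analysis steps such as normalization and stemming, which this sketch leaves out.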

Metadata
Title
Concatenation Technique for Extracted Arabic Characters for Efficient Content-based Indexing and Searching
Authors
Abdul Khader Jilani Saudagar
Habeeb Vulla Mohammed
Copyright year
2016
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2517-1_55