2016 | OriginalPaper | Book chapter

A Natural Language Processing Tool for White Collar Crime Investigation

Written by: Maarten van Banerveld, Mohand-Tahar Kechadi, Nhien-An Le-Khac

Published in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XXIII

Publisher: Springer Berlin Heidelberg

Abstract

Every day we are confronted with increasing amounts of information from a wide variety of sources. People and corporations produce data on a large scale, and since the rise of the internet, e-mail and social media the amount of data produced has grown exponentially. From a law enforcement perspective, these huge amounts of data must be dealt with whenever a criminal investigation is launched against an individual or a company. Investigators need to answer questions such as: who committed the crime, who was involved, what happened and when, and who was communicating with whom and about what? Not only has the amount of data available for investigation increased enormously, but so has its complexity. When communication patterns have to be combined with, for instance, a seized financial administration or corporate document shares, a complex investigation problem arises. Criminal investigators now face a huge challenge when evidence of a crime has to be found in a Big Data environment, where they must deal with large and complex datasets, especially in financial and fraud investigations. To tackle this problem, a financial and fraud investigation unit of a European country has developed a new tool named LES that uses Natural Language Processing (NLP) techniques to help criminal investigators handle large amounts of textual information more efficiently and quickly. In this paper, we present this tool and focus on evaluating its performance against the requirements of forensic investigation: making the work faster, smarter and easier for investigators. We evaluate the LES tool using several performance metrics, and we report experimental results obtained on large and complex datasets from real-world applications.

Footnotes
1
The real name of the department, as well as the names of all its customers (banks, etc.), cannot be disclosed because of the project's confidentiality agreement.
2
Likewise, the real name of the tool cannot be disclosed because of the project's confidentiality agreement.
Metadata
Title
A Natural Language Processing Tool for White Collar Crime Investigation
Written by
Maarten van Banerveld
Mohand-Tahar Kechadi
Nhien-An Le-Khac
Copyright year
2016
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-49175-1_1
