06-07-2022

Machine Learning Algorithms for Crime Prediction under Indian Penal Code

Authors: Rabia Musheer Aziz, Prajwal Sharma, Aftab Hussain

Published in: Annals of Data Science | Issue 1/2024

Abstract

In this paper, the authors propose a data-driven approach for extracting insight from Indian crime data. The approach can help police and other law enforcement bodies in India control and prevent crime region-wise. After the data are pre-processed with MySQL Workbench and R, regression models are built with five algorithms: random forest regression (RFR), decision tree regression (DTR), multiple linear regression (MLR), simple linear regression (SLR), and support vector regression (SVR). Given the required inputs, these models predict counts for 28 different types of Indian Penal Code (IPC) cognizable crime, as well as the total IPC cognizable crime count, region-wise, state-wise, and year-wise (for the whole country). Data visualization techniques, namely chord diagrams and map plots, are used to visualize the pre-processed data (for the years 2014 to 2020) and the data predicted for 2022 by the relatively best regression model. For the chosen data, the RFR model predicting total IPC cognizable crime fits relatively best, with an adjusted R-squared value of 0.96 and a MAPE value of 0.2. Among the models predicting region-wise theft crime counts, the RFR-based model also fits relatively best, with an adjusted R-squared value of 0.96 and a MAPE value of 0.166. The models predict that Andhra Pradesh will have the highest crime counts, with Adilabad district at the top at 31,933 predicted crimes.
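The abstract ranks the models by adjusted R-squared and MAPE. As a rough illustration of how these two scores are commonly computed, here is a minimal sketch in plain Python. It is not the authors' code (the abstract says they worked in R), the function names are hypothetical, and the sample counts are invented for the example. The abstract also does not say whether its MAPE values are fractions or percentages; this sketch returns a fraction.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction (0.05 means 5%)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(
    actual: Sequence[float], predicted: Sequence[float], n_predictors: int
) -> float:
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    n = len(actual)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    r2 = r_squared(actual, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


if __name__ == "__main__":
    # Invented crime counts for four hypothetical districts (not from the paper).
    actual_counts = [100.0, 200.0, 300.0, 400.0]
    predicted_counts = [110.0, 190.0, 310.0, 390.0]
    print("MAPE:", mape(actual_counts, predicted_counts))
    print("Adjusted R^2:", adjusted_r_squared(actual_counts, predicted_counts, 1))
```

A lower MAPE and an adjusted R-squared closer to 1 both indicate a better fit, which is the sense in which the abstract calls the RFR model "relatively best".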


Metadata
Title
Machine Learning Algorithms for Crime Prediction under Indian Penal Code
Authors
Rabia Musheer Aziz
Prajwal Sharma
Aftab Hussain
Publication date
06-07-2022
Publisher
Springer Berlin Heidelberg
Published in
Annals of Data Science / Issue 1/2024
Print ISSN: 2198-5804
Electronic ISSN: 2198-5812
DOI
https://doi.org/10.1007/s40745-022-00424-6
