Many researchers have explored speech processing; however, most of their efforts concentrate on English, and few specifically target Arabic. Arabic speech nonetheless poses many challenges, particularly in the context of the Holy Quran. Reciting the Holy Quran correctly is one of these challenges, because recitation must follow a set of specific procedures and guidelines known as Tajweed rules. Ayyoub et al. [
15] addressed the problem of determining the correct use of Tajweed rules throughout the Holy Quran. In particular, they considered eight Tajweed rules that beginners in recitation have to deal with. They combined traditional features (such as Linear Predictive Coding (LPC) [
16], MFCCs [
4], Hidden Markov Model-based Spectral Peak Location (HMM-SPL) [
17], and Wavelet Packet Decomposition (WPD) [
18]) with those retrieved by a convolutional Deep Belief Network [
19] and used an SVM for classification. On an internal dataset of thousands of audio recordings, the obtained accuracy was 97.7%. Alagrami and Eljazzar [20] also performed automatic recognition of the Arabic recitation rules of the Holy Quran (smart Tajweed). They used filter banks [21] to extract features as a baseline method and an SVM for classification. This model achieved 99% validation accuracy, although for only four Tajweed rules. In addition to Tajweed, Makhraj (the part of the mouth from which an Arabic letter is uttered) is something Muslims should know in order to read the Holy Quran correctly [
22]. Recognizing the Makhraj used by a reciter is therefore another challenge in Quranic recitation. Hamid et al. [23] developed an approach for Makhraj recognition that employs MFCC features and the Mean Squared Error (MSE) for pattern matching of the hijaiyah letters. This model achieved 100% precision on a dataset collected from people aged 21 to 23 who are experts in Makhraj utterance. Classifying the specific melodies used by reciters of the Holy Quran, known as maqams in Arabic, is a further challenge. Shahriar and Tariq [14] used MFCC features and an Artificial Neural Network (ANN) with five deep layers to classify the maqams of Holy Quran recitations, achieving 95.7% accuracy on a dataset created from two Quranic reciters. Identifying the reciter of the Holy Quran is also challenging. Although each Quran reciter has a unique signal, these individual signals tend to converge under the influence of Tajweed rules and recitation style. Alkhateeb [
24] developed a machine learning algorithm for identifying Holy Quran reciters. MFCC features were extracted from recordings of ten reciters, and both a KNN classifier and an ANN classifier were used for classification. The ANN achieved 97.62% accuracy for Chapter 18 and 96.7% for Chapter 36, while KNN achieved 97.03% for Chapter 18 and 96.08% for Chapter 36. Anazi and Shahin [25] used the same model for reciter identification as [24], but with a different set of 10 reciters and different Quranic chapters to construct the MFCC feature dataset. The ANN achieved an average accuracy of 98.5% for Chapter 7 and 97.2% for Chapter 32, while KNN's average accuracy was 97.02% for Chapter 7 and 96.07% for Chapter 32. Nahar et al. [
26] proposed two models, an ANN and an SVM, to identify the Quranic reciter among 15 reciters using MFCC features. On a dataset of 230 verses for each of the 15 reciters, the SVM classifier reached an accuracy of 96.59% and the ANN 86.1%. In addition, Shah and Ahsan [27] introduced an Arabic speaker identification system that combines Discrete Wavelet Transform (DWT) [28] and LPC features [16], achieving 90.90% recognition accuracy on a dataset of five reciters.
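
Several of the reciter identification studies reviewed here follow the same general pattern: each utterance is reduced to a fixed-length feature vector (MFCC, or DWT and LPC features), and a classifier such as KNN assigns the reciter label. The following is a minimal sketch of the KNN step only, written in plain Python under stated assumptions. The three-dimensional vectors, the labels `reciter_A` and `reciter_B`, and the helper names `euclidean` and `knn_predict` are hypothetical and chosen for illustration; the sketch does not reproduce the implementation, features, or data of any cited work.

```python
from collections import Counter
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train: list of (feature_vector, label) pairs.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training samples")
    # Sort training samples by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: 3-dimensional vectors standing in for per-utterance feature
# averages (e.g. mean MFCCs). Values and labels are invented for illustration.
train = [
    ([1.0, 0.9, 1.1], "reciter_A"),
    ([0.9, 1.0, 1.0], "reciter_A"),
    ([1.1, 1.1, 0.9], "reciter_A"),
    ([3.0, 3.1, 2.9], "reciter_B"),
    ([2.9, 3.0, 3.1], "reciter_B"),
    ([3.1, 2.9, 3.0], "reciter_B"),
]

print(knn_predict(train, [1.05, 1.0, 0.95]))  # nearest neighbours are all reciter_A
print(knn_predict(train, [2.8, 3.0, 3.2]))    # nearest neighbours are all reciter_B
```

In practice the feature vectors would come from a real feature extraction stage, and the value of `k` and the distance metric would be tuned on held-out recordings; neither choice is specified in the text above.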