Sarcasm, a sentiment often used to express disdain, is the focus of this comprehensive study. We explore the effectiveness of several machine learning and deep learning models, namely Support Vector Machine (SVM), Recurrent Neural Networks (RNNs), Bidirectional Long Short-Term Memory (BiLSTM), and Bidirectional Encoder Representations from Transformers (BERT), for detecting sarcasm in the News Headlines dataset. Our framework investigates the impact of using DistilBERT for text embeddings on the accuracy of the deep learning models (RNN and BiLSTM) during training and classification. To evaluate the proposed models, we used four performance metrics: F1-score, recall, precision, and accuracy. The results show that the BERT model achieves outstanding performance, outperforming all other models with a state-of-the-art F1-score of 98% for sarcasm classification; the F1-scores for SVM, BiLSTM, and RNN are 93%, 95.05%, and 95.52%, respectively. Our experiments demonstrate that using DistilBERT to produce the word vectors enhances the performance of the RNN and BiLSTM models and notably improves their accuracy: with TF-IDF, Word2Vec, and GloVe embeddings, the BiLSTM and RNN models scored 93.9% and 93.8%, respectively, whereas these scores rose to 95.05% and 95.52% when the models used DistilBERT embeddings. This improvement can be attributed to DistilBERT's ability to capture contextual information and semantic relationships between words, thereby enriching the word vector representations.
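To illustrate the embedding pipeline described above, the following is a minimal sketch (not the authors' exact implementation) of feeding DistilBERT contextual embeddings into a BiLSTM classifier. It assumes the HuggingFace transformers library with the distilbert-base-uncased checkpoint, PyTorch, a frozen encoder, and a hidden size of 128; all of these choices are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch: DistilBERT token embeddings feeding a BiLSTM classifier.
# Checkpoint, hidden size, and frozen-encoder setup are assumptions.
import torch
import torch.nn as nn
from transformers import DistilBertTokenizer, DistilBertModel

class DistilBertBiLSTM(nn.Module):
    def __init__(self, hidden_size: int = 128, num_classes: int = 2):
        super().__init__()
        # Frozen DistilBERT used purely as a contextual embedder.
        self.encoder = DistilBertModel.from_pretrained("distilbert-base-uncased")
        for p in self.encoder.parameters():
            p.requires_grad = False
        # BiLSTM over the 768-dim DistilBERT token vectors.
        self.bilstm = nn.LSTM(input_size=768, hidden_size=hidden_size,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            # (batch, seq_len, 768) contextual embeddings for each token.
            embeddings = self.encoder(
                input_ids=input_ids,
                attention_mask=attention_mask).last_hidden_state
        _, (h_n, _) = self.bilstm(embeddings)
        # Concatenate the final forward and backward hidden states.
        sentence_vec = torch.cat((h_n[-2], h_n[-1]), dim=1)
        return self.classifier(sentence_vec)

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertBiLSTM()
# A sample headline; the classifier outputs logits for sarcastic vs. not.
batch = tokenizer(["thirtysomething scientists unveil doomsday clock of hair loss"],
                  padding=True, truncation=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])  # shape (1, 2)
```

Freezing the encoder keeps the sketch in line with using DistilBERT as a fixed embedding layer, analogous to the TF-IDF, Word2Vec, and GloVe baselines; fine-tuning the encoder end to end would be a different (and heavier) setup.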