Android Application-Based Security Surveillance Implementing Machine Learning

Home automation is usually built to let owners control appliances even when they are away from home. Our system is designed to be cost-effective yet powerful, using the Internet of Things, and it can be controlled through an Android mobile application that we developed. A camera inside the home provides automatic identification of intruders as part of an intelligent access control system. Once an infrared sensor detects a person, the system compares the person's face with an existing data set. If a match occurs, the system accepts the face; otherwise, an alert notification is sent to the owner's Android device. The owner can then switch on the appliances (lights and fans) using the mobile application [4]. This paper combines machine learning and image processing, both powerful modern technologies, and integrates three components: an Android application, machine learning, and home automation.

Henakshi Das, Preetisudha Meher

Realtime Object Distance Measurement Using Stereo Vision Image Processing

In recent years, great progress has been made on 2D and 3D image understanding tasks, such as object detection and instance segmentation. Driverless cars, a recent trend in technology, are making a difference in daily life. Their basic principle is object detection and localization using multiple video cameras and LIDAR, which is an active area of research and development; this work attempts to achieve the same on a small scale using the available resources. In the proposed method, first, stereo images are captured with a dual-lens camera; second, the RGB image is converted into a grayscale image. The third step applies a global threshold to separate the background and uses a morphological operation to obtain images of the same size. Blob detection is used to detect the points and regions in the image. The fourth step measures the object distance and size using the pinhole camera formula. The proposed work thus determines the linear distance between the camera and the object from pictures taken by the camera. Typically, stereo images are used for this computation. Binocular stereopsis, or stereo vision, is the ability to derive information about how far away objects are, based solely on the relative positions of the object in the two eyes. It depends on both sensory and motor capabilities. Using a principle similar to the one the human brain employs, the system takes two images of the same object from two linearly separated positions. The system reaches a frame rate of up to 15 frames per second, which can be considered acceptable for most autonomous systems and allows real-time operation. An effective convolutional matching technique between embeddings is used for localization, which leads LIDAR to increase centimeter-level accuracy by about 97%.

B. N. Arunakumari, R. Shashidhar, H. S. Naziya Farheen, M. Roopa
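
The distance and size step described in the abstract above can be illustrated with the standard rectified-stereo and pinhole relations, Z = f * B / d and size = pixels * Z / f. The following is only a minimal sketch: the function names, the focal length, the baseline, and the disparity value are all illustrative assumptions, not values from the paper.

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point from a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length in pixels (assumed known from calibration)
    baseline_m   -- distance between the two lenses in meters
    disparity_px -- horizontal shift of the object between left and right image
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def object_size(extent_px: float, depth_m: float, focal_px: float) -> float:
    """Pinhole back-projection: real-world size = pixel extent * Z / f."""
    return extent_px * depth_m / focal_px


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the formulas.
    z = stereo_depth(focal_px=700.0, baseline_m=0.06, disparity_px=21.0)
    print("distance (m):", z)
    print("width (m):", object_size(extent_px=100.0, depth_m=z, focal_px=700.0))
```

In practice the disparity would come from matching the blob centroids found in the left and right images.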

An Insight on Drone Applications in Surveillance Domain

The Unmanned Aerial Vehicle (UAV), also referred to as a drone, is a rapidly developing technology. A drone requires critical infrastructure elements: tools, a ground station, communication links, a server, and application services. Existing systems have drawbacks such as vulnerability, lack of safety, risk, and insecurity. To overcome these limitations, the application of drones in surveillance uses ground-to-ground, ground-to-air, and air-to-air communication. The significant features of a drone include takeoff, landing, traveling with a payload, recording data, functional operations, and application services. A system with a high-resolution camera can record data in a specific area using the Global Positioning System (GPS), embedded systems, controllers, and the concerned application(s). The inevitable usage of drones in surveillance includes evidence collection during forensic studies and police investigations. The work can be extended to intelligent navigation for mission-critical applications in military and defense.

M. Swami Das, Gunupudi Rajesh Kumar, R. P. Ram Kumar

Handwritten Mixed Numerals Classification System

Optical Character Recognition (OCR) is a growing field in pattern recognition, with a large body of work on Indic languages such as Tamil, Telugu, Gujarati, Bangla, and Kannada. Previous work on OCR for Indic languages shows the need for a compound OCR system that is able to deal with documents containing details in multiple languages. The proposed system for the classification of handwritten mixed numerals can overcome the limitations of existing monolingual OCR systems. We trained a model to identify numerals of the Hindi and Gujarati languages. We used the projection profile technique for feature extraction over 22,400 images of numerals to generate a feature set. We tried different classifiers, namely a Naïve Bayes classifier for multi-class classification, a Support Vector Machine, K-Nearest Neighbors, and a Multi-Layer Perceptron network, on train and test image datasets with a ratio of 70:30 to evaluate and select the model with the highest performance. We achieved accuracies of 66.94%, 90.78%, 84.61%, and 88.21%, respectively, using these classifiers for Gujarati-Hindi mixed handwritten numerals.

Krishn Limbachiya, Ankit Sharma
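
Projection profile features, as named in the abstract above, are commonly computed by counting foreground pixels along each row and each column of a binarized image. The sketch below shows that general idea under stated assumptions: the image is a list of rows of 0/1 values, and the function names and the concatenation into one feature vector are illustrative choices, not the authors' exact pipeline.

```python
def projection_profiles(image):
    """Return (horizontal, vertical) projection profiles of a binary image.

    image -- list of equal-length rows, each pixel 0 (background) or 1 (ink).
    The horizontal profile has one count per row, the vertical one per column.
    """
    horizontal = [sum(row) for row in image]
    vertical = [sum(col) for col in zip(*image)]
    return horizontal, vertical


def feature_vector(image):
    """Concatenate both profiles into a single feature vector (an assumption)."""
    horizontal, vertical = projection_profiles(image)
    return horizontal + vertical


if __name__ == "__main__":
    # A tiny 3x3 plus-shaped glyph used only as a toy input.
    glyph = [
        [0, 1, 0],
        [1, 1, 1],
        [0, 1, 0],
    ]
    print(feature_vector(glyph))
```

A vector like this could then be passed to any of the classifiers listed in the abstract.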

IoT Based Smart Farm Monitoring System

Farming is an occupation that plays a vital role in human survival, supplying most of what human beings need to live. Studies show that agriculture is a prime source of livelihood in countries like India, and obtaining bulk output with less effort will not only raise productivity but also advance the technology used in the field. The Internet of Things (IoT) plays a vital role in supplementing the facilities of the connected world through its various sensors and its ability to be molded to the choices of its creators and users. This paper creates an environment for smart farm technology: it records parameters such as temperature, humidity, and moisture; automates the operation of the motor pump and the opening and closing of the farm gate; and provides crop suggestions to the farmer.

Ankuran Das, Hridaydeep Bora, Jugasmita Kashyap, Chinmoy Bordoloi, Smriti Priya Medhi

An Extensive Review of the Supervised Learning Algorithms for Spiking Neural Networks

A Spiking Neural Network (SNN) processes neural information through the precise timing of spikes and is considered a brain-inspired computational model belonging to the third generation of artificial neural networks. An SNN has a set of biologically plausible spiking neurons that have proven effective in processing complex temporal and spatio-temporal data. In addition, SNNs are computationally powerful, energy-efficient, and dynamic systems. However, formulating efficient supervised learning algorithms for SNNs is a significant challenge in this field because of their inherently discontinuous and implicit non-linear mechanisms, and only a few efficient supervised learning algorithms have been developed for SNNs. This paper provides a thorough, categorized review of supervised learning algorithms developed for SNNs. We have divided the supervised learning algorithms into several categories based on their core principles for optimisation, such as gradient rule, asymmetric supervised Hebbian learning, remote supervision, and metaheuristics.

Irshed Hussain, Dalton Meitei Thounaojam

Multitask Learning-Based Simultaneous Facial Gender and Age Recognition with a Weighted Loss Function

Traditionally, researchers train facial gender and age recognition models separately using deep convolutional networks. However, in the real world, it is crucial to build a low-cost and time-efficient multitask learning system that can perform both tasks simultaneously. In multitask learning, the synergy among the tasks creates an imbalance in the loss functions and influences their individual performances. This imbalance among the task-specific loss functions leads to a drop in accuracy. To overcome this challenge and achieve better performance, we propose a novel weighted sum of loss functions that balances the loss of each task. We train our method for the recognition of gender and age on the publicly available Adience benchmark dataset. Finally, we test our method on the VGGFace and FaceNet architectures and evaluate it on the Adience test set, achieving better performance than previous architectures.

Abhilasha Nanda, Hyun-Seung Yang
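
A weighted sum of task losses, in its most general form, can be sketched as below. The abstract above does not state how the weights are chosen, so the fixed weights `w_gender` and `w_age`, the use of cross-entropy for both tasks, and all function names are assumptions made purely for illustration.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class given predicted probabilities."""
    return -math.log(probs[target])


def weighted_multitask_loss(gender_probs, gender_target,
                            age_probs, age_target,
                            w_gender=0.5, w_age=0.5):
    """Combine two task losses as w_gender * L_gender + w_age * L_age.

    The weights are hypothetical constants here; a real system might tune
    or learn them so that neither task dominates training.
    """
    loss_gender = cross_entropy(gender_probs, gender_target)
    loss_age = cross_entropy(age_probs, age_target)
    return w_gender * loss_gender + w_age * loss_age


if __name__ == "__main__":
    # Toy predictions: 2 gender classes, 4 age groups (illustrative sizes only).
    total = weighted_multitask_loss([0.8, 0.2], 0, [0.1, 0.6, 0.2, 0.1], 1)
    print("combined loss:", total)
```

Raising one weight shifts the optimisation toward that task, which is the balancing lever the abstract refers to.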

Visualizing Crime Hotspots by Analysing Online Newspaper Articles

With improvements in technology, India is growing at a fast pace, which has led to a great deal of urbanization. However, instead of falling, the crime rate has increased over the past couple of years. The general public must be informed about how safe an area is, so that they can take appropriate actions to protect themselves. Every day, many local crimes are reported in online news articles, but not everyone has the time to read them all. These articles contain information that can be used to determine the safety of a location. Thus, in this paper, we propose an end-to-end solution based on Natural Language Processing to inform users of the crime rate in their area. We create a model that analyzes crimes mentioned in local news articles and collects data such as location and incident type. The model uses Named Entity Recognition to extract the locations and the crimes that have occurred. To take advantage of transfer learning, we built the model using Google's BERT framework. It was trained on CoNLL-2003 with custom modifications and tested on real-time data gathered from the crime pieces of several online news outlets. Our model has an F1 score of 83.87% and a validation accuracy of 96%. The information collected from the internet was visualized on a heat map using the Bokeh package. We display metrics such as the name of the location, the number of crimes that occurred in that area, and the most recent crime, which provides a quick overview for our users.

M. Trupthi, Prerana Rajole, Neha Dinesh Prabhu

Applications of Machine Learning for Face Mask Detection During COVID-19 Pandemic

The Covid-19 pandemic has forced us to adapt to a new lifestyle. The World Health Organization (WHO) recommends that people adhere to public health experts’ guidelines to fight the spread of Covid-19. The most essential Covid-19 guideline has been the use of face masks, which has been enforced throughout the globe and has proven to contain the spread of the coronavirus. The proposed study examines the detection of mask usage through a machine learning approach. The research treats the task as a binary classification problem: distinguishing people wearing masks from people not wearing masks. Three machine learning models, namely InceptionV3, VGGNet and ResNet, have been adapted in this research for pre-processing the input images. Similarly, XGBoost, Random Forest and fully connected DNN models have been used for decoding and classification. The performance of the different models has been evaluated and compared as part of the research. The results showed that ResNet + Fully Connected DNN is the best among the developed models, with a precision of 99.73%, an accuracy of 99.7%, an F1 score of 99.69% and a recall of 99.66%.

Sarfraz Fayaz Khan, Mohammad Ahmar Khan, Rabiah Al-Quadah
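
The four metrics reported in the abstract above (precision, recall, F1 score, accuracy) have standard definitions for a binary mask / no-mask classifier. The following minimal sketch computes them from label lists; the function name, the label encoding (1 = mask, 0 = no mask), and the toy labels are assumptions for illustration and are unrelated to the paper's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and accuracy for a binary classifier."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Toy labels only: 1 = wearing a mask, 0 = not wearing a mask.
    truth = [1, 1, 1, 0, 0]
    predicted = [1, 1, 0, 1, 0]
    print(binary_metrics(truth, predicted))
```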

A Cascaded Deep Learning Approach for Detection and Localization of Crop-Weeds in RGB Images

Weeds compete with crops in the fields, lowering crop yield with losses of up to 80%. Efficient use of chemical herbicides is desired to reduce harmful effects on the environment, which requires the location of the weeds to be known. In this paper, we present a deep learning approach capable of detecting and localizing weeds in RGB images, trained using the publicly available Open Sprayer dataset. The adopted methodology consists of a classification step using a pre-trained 2D convolutional neural network and a Random Forest classifier, which predicts the presence of weeds in an RGB image. If weeds are predicted to be present, they are localized by a cascaded segmentation step using a U-Net architecture. The proposed architecture can classify the presence of weeds in an image with an accuracy of 91.19% and predict the location of weeds in the image by generating binary masks with a mean Dice score of 0.879 on the publicly available Open Sprayer dataset.

Rohit Agrawal, Jyoti Singh Kirar
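
The Dice score used above to evaluate the predicted weed masks is defined as 2 |A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. A minimal sketch follows; the list-of-rows mask representation, the function name, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient between two binary masks given as lists of 0/1 rows."""
    intersection = 0
    pred_total = 0
    true_total = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += p & t
            pred_total += p
            true_total += t
    if pred_total + true_total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * intersection / (pred_total + true_total)


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print("Dice:", dice_score(predicted, ground_truth))
```

A mean Dice score, as reported in the abstract, would average this value over all test images.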

Ensemble of Deep Learning Enabled Tamil Handwritten Character Recognition Model

Recently, digitization of handwritten characters has become a hot research topic with applications in different domains. At the same time, recognition of Tamil handwritten characters is a tedious task compared to other languages. This paper therefore presents a new ensemble deep learning-based Tamil handwritten character recognition (EDL-THCR) model, which recognizes and classifies Tamil handwritten characters. Data preprocessing uses a bilinear interpolation technique to normalize the images. An ensemble of capsule network (CapsNet) and VGGNet models is used for feature extraction. Finally, a softmax layer is employed to classify the Tamil characters. A comprehensive experimental analysis is carried out on a benchmark dataset, and the results show the better performance of the EDL-THCR technique.

R. Thanga Selvi

A Comparative Study of Loss Functions for Deep Neural Networks in Time Series Analysis

Currently, deep neural networks are widely used for analyzing temporal data. These networks can adapt their architecture to specific needs and deliver good performance. Researchers and developers frequently update their architecture to meet the requirements, but this process can be quite time-consuming. A crucial aspect of DNN architecture is the loss function, which plays a crucial role in calculating gradients. Most research and applications in time series analysis use the mean squared error (MSE) loss function. In this paper, we aim to explore existing loss functions to address the challenge of selecting the appropriate loss function for DNNs. We conduct experiments on time series datasets to evaluate the impact of different loss functions on DNN model performance. Our findings indicate that the Huber loss function outperforms other loss functions in time series analysis. Additionally, we discuss the potential for custom loss functions as future work, beyond the limitations of existing methods.

Rashi Jaiswal, Brijendra Singh
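
To make the comparison in the abstract above concrete, the standard definitions of the MSE and Huber losses can be sketched as follows. The threshold `delta=1.0` and the toy values are assumptions; the abstract does not state which settings the authors used.

```python
def mse_loss(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for |residual| <= delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)
        if r <= delta:
            total += 0.5 * r * r
        else:
            total += delta * (r - 0.5 * delta)
    return total / len(y_true)


if __name__ == "__main__":
    # One small residual and one outlier, to show how the two losses differ.
    actual = [0.0, 0.0]
    forecast = [0.5, 3.0]
    print("MSE:  ", mse_loss(actual, forecast))
    print("Huber:", huber_loss(actual, forecast))
```

Because the Huber loss grows only linearly for large residuals, a single outlier contributes far less to it than to the MSE, which is one commonly cited reason for its robustness on noisy series.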

Learning Algorithm for Threshold Softmax Layer to Handle Unknown Class Problem

In supervised learning, neural networks are mostly trained with training data from predefined classes. However, when unknown test data (belonging to none of the predefined classes) are classified by a trained neural network, they are always misclassified into one of the predefined classes, which increases the misclassification rate of the network. To tackle this problem, a Threshold Softmax Layer (TSM) and a learning algorithm are proposed. A normalized probability is calculated for each output class of the neural network, and a threshold value is updated for each class during the threshold learning process. If the maximum normalized probability of a test sample does not cross the threshold value of the corresponding class, the sample is classified into the unknown class. The TSM layer with a neural network is evaluated on three UCI benchmark datasets (Glass, Yeast and Wine Quality) and successfully handles the unknown class problem with reduced misclassification error.

Gaurav Jaiswal
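
The inference-time decision rule described in the abstract above can be sketched as follows. This covers only the thresholded prediction step; the threshold learning procedure is not specified in the abstract, so the per-class thresholds are passed in as fixed, hypothetical values, and the function names and the `-1` label for the unknown class are assumptions.

```python
import math

UNKNOWN = -1  # label used here for the unknown class (an assumption)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_threshold(logits, thresholds):
    """Return the arg-max class, or UNKNOWN if its probability
    does not exceed that class's threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] > thresholds[best] else UNKNOWN


if __name__ == "__main__":
    per_class_thresholds = [0.9, 0.9, 0.9]  # hypothetical learned values
    print(classify_with_threshold([5.0, 0.0, 0.0], per_class_thresholds))
    print(classify_with_threshold([1.0, 0.9, 0.8], per_class_thresholds))
```

A confident prediction passes through unchanged, while a flat probability distribution is routed to the unknown class instead of being forced into a predefined one.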

Traffic Monitoring and Violation Detection Using Deep Learning

Traffic density on roads has been increasing rapidly for the past few decades, which has in turn been reflected in an increase in traffic violations and accidents. Official reports from various governments and private entities support the fact that current methods for traffic monitoring are inadequate for the huge traffic density [1, 2]. These methods, which traditionally involve deploying traffic police personnel at a select few junctions where traffic density is high, ignore the majority of other roads. Traffic monitoring systems that exploit image processing, computer vision and deep learning techniques are thus a viable solution for monitoring traffic and detecting violations. These systems can easily be integrated with the architecture of law enforcement to penalize violators in real time. The proposed method, which uses YOLOv3 and SORT, is effective and accurate in detecting several violations, such as over-speeding, wrong-way driving, signal jumping, driving without a helmet and triple-seat riding. It also keeps track of the count of vehicles, their types and the number of axles for multi-axle vehicles, offering a novel and indigenous solution to a widely recognized problem.

Omkar Sargar, Saharsh Jain, Sravan Chittupalli, Aniket Tatipamula

Conjugate Gradient Method for finding Optimal Parameters in Linear Regression

Linear regression is one of the most celebrated approaches for modeling the relationship between independent and dependent variables in a prediction problem. It has applications in a number of domains, including weather data analysis, price estimation, and bioinformatics. Various computational approaches have been devised for finding the best model parameters. In this work, we explore and establish the possibility of applying the Conjugate Gradient Method to find the optimal parameters of a regression model, demonstrated on the house price prediction problem using the Boston dataset. The efficiency of the conjugate gradient method compared with the pseudo-inverse and gradient descent methods in terms of computational requirements is discussed. We show that the weights obtained by the conjugate gradient method are accurate and that the parameter vector converges to an optimal value in relatively fewer iterations than with gradient descent techniques. Hence, the Conjugate Gradient Method proves to be a faster approach for a linear regression problem in the ordinary least squares setting.

Vishal Menon, V. Ashwin, G. Gopakumar
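
In the ordinary least squares setting, the optimal weights solve the normal equations (X^T X) w = X^T y, a symmetric positive-definite system to which the conjugate gradient method applies. The sketch below illustrates that idea in plain Python on a toy dataset; it is not the authors' implementation, and the function names, the added bias column, the tolerance, and the data are all assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=None):
    """Solve A x = b for symmetric positive-definite A."""
    n = len(b)
    x = [0.0] * n
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter or 10 * n):
        if rs_old < tol:   # squared residual norm small enough: converged
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def fit_linear_regression(X, y):
    """Fit y ~ w0 + w1*x1 + ... by applying CG to the normal equations."""
    Xb = [[1.0] + list(row) for row in X]       # prepend a bias column
    cols = list(zip(*Xb))
    A = [[dot(ci, cj) for cj in cols] for ci in cols]   # X^T X
    b = [dot(ci, y) for ci in cols]                     # X^T y
    return conjugate_gradient(A, b)


if __name__ == "__main__":
    # Toy data generated from y = 1 + 2x (not the Boston dataset).
    print(fit_linear_regression([[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 5.0, 7.0]))
```

In exact arithmetic, conjugate gradient reaches the solution of an n-parameter problem in at most n iterations, which is the source of the iteration-count advantage over gradient descent mentioned in the abstract.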

Rugby Ball Detection, Tracking and Future Trajectory Prediction Algorithm

This paper presents a custom object detection and tracking algorithm for position estimation and trajectory prediction of a moving rugby ball. The algorithm combines the accuracy of object detection provided by a custom-trained YOLOv5 model with the speed of the KCF tracker to perform a linear trajectory prediction of the ball. A Kalman filter is used to ensure optimal estimation of the current position and to increase the accuracy of the predicted future trajectory. Multi-threading is implemented to concurrently detect and track the ball in consecutive frames, resulting in a computationally efficient approach.

Pranesh Nangare, Anagha Dangle

Early Detection of Heart Disease Using Feature Selection and Classification Techniques

Cardiovascular diseases are recognized as one of the major causes of death in humans. Most of the time, the increase in death rate is due to delay in detecting heart disease, so early detection would help save more lives. Since the early detection of heart disease involves many features and a large volume of data, machine learning techniques can play a significant role in predicting heart disease in the early stages. In this work, three major feature selection techniques have been deployed before each classifier to obtain better performance and accuracy. The dataset has been thoroughly examined and processed, and the subset of traits that have a significant role in the prediction of heart disease has been extracted. The classification methods used to classify the retrieved features helped improve accuracy.

R. S. Renju, P. S. Deepthi

Gun Detection System for Surveillance Cameras Using HOG-Assisted KNN Classifier

Mass shootings have become a norm in public places, claiming thousands of innocent civilian lives. Firearm-related violence has been rampant in the last few decades and therefore needs to be addressed immediately. This research proposes a weapon detection system that combines image processing techniques with the most suitable machine learning classifier. The experimental study is twofold: it provides a comparative analysis with a previous study and with other existing algorithms. The trained model achieves an accuracy of 98.7% on the dataset, which is significantly better than other recently published results.

Lucy Sumi, Shouvik Dey

Optimized Detection, Classification, and Tracking with YOLOV5, HSV Color Thresholding, and KCF Tracking

This paper presents a detection and tracking approach for estimating the position of pots and the angle of arrows on the ground. The pots are detected by HSV color thresholding and then classified on the basis of local positional parameters such as the distance of the pots from the robot and their position relative to the robot. This is a computationally efficient solution for simple regular objects like pots. The Kalman filter provides better depth estimates, and thereby better pot positions, even when the pot tables are overlapping. For small objects like arrows, the detection-tracking algorithm combines the accuracy of object detection provided by a custom-trained YOLOv5 model with the speed of the KCF tracker. Multithreading is used to concurrently detect and track the arrows in consecutive frames, producing a computationally efficient approach compared to standalone detection with YOLOv5. This paper also describes an approach to effectively obtain depth information on an object using an Intel RealSense D435i depth camera.

Aditya Yadav, Srushti Patil, Anagha Dangle, Pranesh Nangare
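
HSV color thresholding, the technique named above for pot detection, keeps only the pixels whose hue, saturation, and value fall inside chosen bounds. The sketch below uses Python's standard `colorsys` module on a tiny list-based image; the function name, the hue range, and the saturation and value limits are illustrative assumptions, not values from the paper (a real pipeline would more likely use an image library).

```python
import colorsys


def hsv_mask(image_rgb, hue_range, sat_min=0.5, val_min=0.5):
    """Return a binary mask (list of 0/1 rows) for pixels inside the HSV bounds.

    image_rgb -- list of rows of (r, g, b) tuples with components in 0..255
    hue_range -- (low, high) hue bounds on colorsys's 0..1 scale
    """
    low, high = hue_range
    mask = []
    for row in image_rgb:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = low <= h <= high and s >= sat_min and v >= val_min
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


if __name__ == "__main__":
    # One red, one blue, and one dark-blue pixel; the range below targets blue hues.
    image = [[(255, 0, 0), (0, 0, 255), (0, 0, 40)]]
    print(hsv_mask(image, hue_range=(0.55, 0.75)))
```

The resulting mask would typically feed a contour or blob step to recover each pot's pixel position.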

COVID-19 Detection Using Chest X-ray Images

COVID-19 is a respiratory infectious disease discovered in Wuhan, China, which later turned out to be a pandemic disease. The disease is spreading at a rate higher than what the world is prepared for, and hence, there is a huge shortage in testing and resources for it. To overcome this situation, the artificial intelligence community has been working hard to make use of some advanced technology to detect the presence of novel coronavirus. In our paper, we propose an ensemble 3-class classifier model with a stochastic hill-climbing optimisation algorithm for detecting infection in chest X-ray images. The novelty of our work involves the selection of optimal feature set from a feature set of handcrafted features and VGG-16 features using optimisation technique which is followed by a soft voting based ensemble classification. The proposed model achieved an overall F1-score of 0.997. Our dataset has Chest X-Ray images of all age groups and provides a more reliable and consistent result that can be used for the timely detection of COVID-19.

Gautham Santhosh, S. Adarsh, Lekha S. Nair

Comparative Analysis of LDA Algorithm for Low Resource Indian Languages with Its Translated English Documents

Nowadays, social media acts as an information medium that generates a huge amount of data. Irrespective of age group, many people across the globe use social media to share their thoughts and emotions in the form of tweets, comments, etc. This information can be used as a source of data for research, and it is generated not only in English but also in many other languages. In this paper, we focus our research on native Indian languages such as Kannada, Tamil, and Telugu. Latent Dirichlet Allocation (LDA) is an algorithm that can process a large set of text data to perform topic modeling, clustering the words and topics in a document. We have analyzed the performance of the LDA method for the native languages. With respect to coherence, the datasets in Kannada and Tamil work better, while the English-translated Telugu dataset is optimal when compared with the original Telugu data. With respect to perplexity, LDA works better for the native-language datasets.

D. K. Meghana, K. Kiran, Saleha Nida, T. B. Shilpa, P. Deepa Shenoy, K. R. Venugopal

Text Style Transfer: A Comprehensive Study on Methodologies and Evaluation

Text Style Transfer (TST) rewords a sentence from one style (e.g., polite) to another (e.g., impolite) while conserving the meaning and content. This domain has attracted the attention of many researchers as it makes natural language generation (NLG) tasks more user-oriented. TST finds its applications widely in industry such as conversational bots and writing assistance tools. With the success of deep learning, a plethora of research works on style transfer based on machine learning have been proposed, developed, and tested. This systematic review presents the past work on TST clustered into categories based on machine learning and deep learning algorithms. It briefly explains the various subtasks within TST and assembles its publicly available datasets. It also summarizes the automatic and manual evaluation practices used for style transfer tasks and finally, sheds some light on current challenges and points towards promising future directions for research in the TST domain.

Nirali Parekh, Siddharth Trivedi, Kriti Srivastava

Classification of Hindustani Musical Ragas Using One-Dimensional Convolutional Neural Networks

Ragas are a melodic progression of notes used in Indian classical music. They are believed to have mental and physiological enriching qualities and are used in Raga music therapy. Identification of Ragas necessitates a great deal of expertise since there are instances where two or more Ragas have very similar characteristics making them difficult to identify. An accurate classifier will be an indispensable tool for Indian classical music learners and enthusiasts alike. This paper proposes a One-Dimensional Convolutional Neural Network (1D-CNN) to classify Ragas in the Hindustani variant of the Indian classical music using raw audio waveform. We compare our model with an Artificial Neural Network (ANN) trained using audio features which were extracted using traditional signal processing techniques from the audio files. The original dataset generated and annotated by an expert consists of audio files for 12 Ragas played on the 4 instruments. An augmented dataset consisting of 12,000 samples was created from the original dataset using slight pitch variation. The ANN trained using audio features and the 1D-CNN trained using raw audio show an accuracy of 97.04% and 98.67%, respectively.

Rutuparn Pawar, Shubham Gujar, Anagha Bidkar, Yogesh Dandawate

W-Tree: A Concept Correlation Tree for Data Analysis and Annotations

As human beings develop and study new topics to understand and conceptualize their surroundings, the need for records and documentation arises. The internet in recent times has proven to be one of the significant contributors to this philosophy. With abundant data presented to the user online, it naturally prompts the user with W-questions. The what, when, who, which, where, and other W-questions stand as an inspiration to build a W-Tree. W-Tree is one such record book that stores such a large amount of data and provides the additional functionality of linking related topics and studying the relationship between various topics. By providing related topics to a specific concept, W-Tree’s aim is for the users to learn and understand the topic in connection and correlation with other concepts. Knowing all the topics related to the one in focus and its origin helps the user understand the specifics related to that and the domain in general. W-Tree also provides the user with the feature of annotations for each article in a three-tuple format. The paper presents the analysis of W-tree over five prominent domains analyzed on the Wikipedia data.

Prakash Hegade, Kishor Rao, Utkarsh Koppikar, Maltesh Kulkarni, Jinesh Nagda

Crawl Smart: A Domain-Specific Crawler

The internet, which consists of an estimated one billion websites and is used by billions of people, is explored by each individual with diverse needs and intent. The search engines that present results for user queries evaluate websites on numerous parameters to sort the links from most to least relevant. Extracting the most relevant information for a given concept or user query has become a pick-and-shovel task. Classical crawlers that use traditional crawling techniques pull irrelevant data along with the relevant data, resulting in ineffective use of CPU time, memory, and resources. This paper proposes a knowledge-aware crawling system, Crawl Smart, which learns from its own crawling experiences and improves the crawling process in future crawls. The project’s key focus is a methodology that deploys a unique data structure to overcome the challenges of tracking visited pages and finding relations between crawled pages once they are in the knowledge base, which helps the crawler preserve focus. The data structure design, annotations, similarity measures, and knowledge base supporting Crawl Smart are detailed in the paper. The paper presents results comparing the knowledge-aware crawler with the traditional crawler, showing better results when used on large-scale data.

Prakash Hegade, Ruturaj Chitragar, Raghavendra Kulkarni, Praveen Naik, A. S. Sanath

Evaluating the Effect of Leading Indicators in Customer Churn Prediction

Customer churn prediction is needed as it is one of the preventive solutions employed to retain customers that are of high value and better prospects for future sales. This churn is preventable if service providers can identify the root cause of churn using data analysis. However, since the objective is to retain those customers after prediction, it is imperative that a significant lead time is available for the service providers to engage with their customers and react in a positive way to retain them. So, early detection of churn candidates is critical for the success of such applications. It is our hypothesis that this additional lead time to engage can be derived from analyzing data sources that have the characteristic quality of being leading indicators rather than lagging indicators. In this paper, we attempt to address the issue, by modeling leading indicator sources of temporal information that are relevant to the customer namely sentiment data and socio-economic data. We also evaluate the importance of using such data sources to address the problem of having a longer time horizon to react and respond to customer churn prediction applications. We present the results of experiments using open datasets that have been adopted to evaluate our hypothesis. Our study shows that customer sentiment and socio-economic indicators are statistically significant (P-value < 0.05) and improve churn prediction accuracy up to 20% compared to conventional approaches.

Sharath Kumar, Nestor Mariyasagayam, Yuichi Nonaka

Classification of Skin Lesion Using Image Processing and ResNet50

One of the most critical concerns of healthcare is cancer and the prevalence of skin cancer in the population has been increasing gradually and ominously. In this paper, we have proposed a deep learning model using a pre-trained neural network architecture called ResNet-50, and classify the dermoscopic images collected from the International Skin Imaging Collaboration (ISIC) archive into either benign or malignant. Our objective here is to make the medical diagnosis of skin cancer much more cost-effective and as efficient as possible. First, the input images have to undergo image preprocessing to remove unwanted noise, then we perform image augmentation to increase the number of images. Next, by implementing Otsu’s method for thresholding coupled with masking using bitwise AND operation, we segment the lesion portion of the image. Our model acquired satisfying results with an accuracy of about 85% and the AUC (Area Under The Curve)–ROC (Receiver Operating Characteristics) value of about 0.852. The precision, recall and F1-score values for benign class stand at around 0.89, 0.82 and 0.85 respectively, whereas, for malignant class, it is about 0.80, 0.88 and 0.84 respectively.

Adarsh Pradhan, Subhojit Saha, Abhinay Das, Santanu Barman

Data Collection and Pre-processing for Machine Learning-Based Student Dropout Prediction

In this era of big data, a large amount of data is generated from various educational environments, which makes it possible to apply machine learning techniques to find valuable insights. Student dropout prediction is one application area in education where machine learning algorithms can be applied to predict which students are likely to drop out of their studies by extracting valuable patterns and key parameters from educational data. However, the performance of the algorithms is determined by the quality of the data fed into them. Educational data are usually noisy, which reduces the performance of machine learning algorithms and calls for careful data pre-processing. This work presents an overview of data collection and pre-processing steps, including data augmentation, data labeling, data/model enhancement, data cleaning, data transformation, feature extraction, feature transformation, and feature selection, for the analysis of statistical evidence from the source data for student dropout prediction. We also present some of the data pre-processing techniques that are suitable for machine-learning-based student dropout prediction.

Sheikh Wakie Masood, Shahin Ara Begum

Nested Named-Entity Recognition in Multilingual Code-Switched NLP

Named-Entity Recognition (NER) is a core part of information extraction. India is a multilingual country with 23 official languages and over 122 major languages, where a significant population is multilingual. Most conversations, whether online or in person, involve code-switching and transliteration. Code-switching is the practice of alternating back and forth between two languages or dialects during a conversation or in writing. Processing such multilingual code-switched text and speech is essential for building intelligent agents and systems that interact with users in multilingual communities. In Natural Language Processing, Indian languages are termed low-resource languages because they lack the large-scale supervised data and linguistic resources needed to make statistical Natural Language Processing viable. We propose a novel data augmentation technique for transfer learning from high-resource languages to low-resource Indian languages, with adaptive, behavioral and task-specific fine-tuning of existing pre-trained language representations such as mBERT, for Nested Named-Entity Recognition in multilingual code-switched natural language processing. This approach can overcome the shortcomings of traditional Named-Entity Recognition methods, which fare poorly in multilingual, code-switched and low-resource language contexts. Named-Entity Recognition is also computationally expensive; in our approach, we try to bring the computational costs down by employing a unified and robust model.

Ashwin Patil, Utkarsh Kolhe

Deep Learning-Based Semantic Segmentation of Blood Cells from Microscopic Images

Morphological analysis and differential cell counting are important in characterizing many diseases, including malaria and leukaemia. The basic building block involved is cell segmentation, a challenging but beneficial task in cytopathology. With microscopy being the gold standard for cell analysis, approaches ranging from traditional image processing operations to deep learning techniques have been developed for cell segmentation from microscopy images. In the last few years, classification networks have been extended to image segmentation using pixel-based classification, known as semantic segmentation. Convolutional neural networks exhibited good performance in image segmentation. However, the networks suffered from limitations due to fully connected layers and pooling layers, which restricted the size of the input images and resulted in the loss of spatial context. In this research, experiments were carried out with two popular CNN architectures, UNet and SegNet, traditionally used for semantic segmentation of natural images. Recognizing the capacity these networks have shown on natural images, we experimented with them on a custom-built RBC, WBC and platelet cell segmentation dataset based on ALL-IDB. We critically evaluate the performance of both architectures and give an intuitive explanation of their difference in performance. UNet outperformed SegNet, even with limited labelled training data, giving a promising Dice score of 0.97. Based on the experiments and analysis conducted in this work, we propose that UNet is a very good choice for cell segmentation in cytopathology applications.
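
For readers unfamiliar with the Dice score reported above, the following minimal sketch shows how it is commonly computed for a pair of binary masks. The function name and the flattened 0/1 mask representation are illustrative assumptions, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * intersection / size_sum


predicted_mask = [1, 1, 0, 0]
ground_truth = [1, 0, 0, 0]
score = dice_score(predicted_mask, ground_truth)  # 2*1 / (2+1)
```

A score of 1.0 means the predicted and reference masks coincide exactly; a score of 0.0 means they do not overlap at all.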

S. B. Asha, G. Gopakumar

A Partitioned Task Offloading Approach for Privacy Preservation at Edge

The Internet of Things (IoT) refers to the network of physical objects connected over the Internet for data collection and sharing, which has enormous positive impacts in fields such as health care, governance, and manufacturing. However, privacy threats have emerged as the downside of IoT due to the involvement of sensitive information. Most recent approaches have proposed adversarial training to enhance privacy, but assuming a particular adversary during training is unlikely to handle real-world threats. To address this issue, we propose a Deep Neural Network (DNN)-based approach that is employed at the edge. A partitioned Denoising Autoencoder (DAE) is deployed at the edge where data is collected; because the model is partitioned, it is difficult for the adversary to infer any information. Experiments are conducted on a handwritten digit recognition dataset. The accuracy of the adversary in re-identifying sensitive information did not exceed 42.60%, while digit recognition achieved 96.59%. The proposed technique also does not require heavy computation, unlike state-of-the-art techniques.

R. Ramprasad, S. Pradhiksha, K. Sundarakantham, Rajashree R. Harine, Shalinie S. Mercy

Artificial Intelligence in Radiological COVID-19 Detection: A State-of-the-Art Review

The requirement for the fast and accurate detection of COVID-19 is of high importance to control the spread of the disease. Recently, Artificial Intelligence and Deep Learning-based techniques have shown great promise in the domain of medical imaging. Several important research works have already been carried out to design automatic systems that can perform effective COVID-19 detection from radiological images like Chest X-Rays and CT Scans. In this work, we present a detailed state-of-the-art review of these research works, along with the techniques proposed and materials that can be used to further explore this area. With the identification of important future research directions, this work is expected to provide a good starting point for researchers exploring this area.

Abhishek Kumar, Pinki Roy, Arnab Kumar Mishra, Sujit Kumar Das

Anomaly Detection in SCADA Industrial Control Systems Using Bi-Directional Long Short-Term Memory

In the modern era, industries are migrating to Industry 4.0 to become smarter, faster and more efficient, increasing productivity and profit. Industrial systems thus become part of Cyber-Physical Systems (CPS) connected to the internet, which paves a path for intrusion into the networks carrying Industrial Control System (ICS) traffic. In this paper, we evaluate the ICS/Supervisory Control and Data Acquisition (SCADA) Cyber Attack New Gas Pipeline Dataset from Mississippi State University using two specialized time-series Recurrent Neural Network architectures for anomaly detection, which act as an Intrusion Detection System (IDS). The two time-series models, Bi-LSTM (Bi-directional Long Short-Term Memory) and Bi-GRU (Bi-directional Gated Recurrent Units), were used along with the LOCF (Last Observation Carried Forward) pre-processing technique to fill gaps in the data and SMOTE (Synthetic Minority Oversampling Technique) to balance the dataset. The main contribution of this paper is specific attack classification, which identifies the attack type and its location, in addition to binary and category attack classification; this helps in preventing process downtime caused by zero-day exploits. Bi-LSTM outperforms Bi-GRU and other deep learning methods in accuracy on all three classification tasks: Binary (97.78%), Category (95.56%), and Specific (95.40%). However, computation time is a trade-off between Bi-LSTM and Bi-GRU, and the two yield broadly similar results.
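
The LOCF pre-processing step mentioned above can be sketched as follows. This is an illustrative example only: the function name is hypothetical, missing readings are represented by `None`, and leading gaps (with no earlier observation to carry forward) are left unfilled by assumption.

```python
def locf(values):
    """Fill gaps with the most recent observed value (LOCF)."""
    filled = []
    last = None  # nothing observed yet
    for v in values:
        if v is not None:
            last = v          # remember the newest real observation
        filled.append(last)   # gaps receive the carried-forward value
    return filled


pressure = [None, 1.0, None, None, 2.5, None]
filled_pressure = locf(pressure)
```

Each column of a multivariate time series would be filled independently in the same way before the sequence models are trained.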

M. Nakkeeran, V. Anantha Narayanan

Implementing Autonomous Navigation on an Omni Wheeled Robot Using 2D LiDAR, Tracking Camera and ROS

This paper demonstrates the implementation and results of autonomous navigation algorithms on an Omni Wheel-based Robot using ROS (Robot Operating System). The basis of this application is autonomous navigation of the robot using Simultaneous Localization and Mapping (SLAM), specifically GMapping and Autonomous Path Planning Algorithms. The actual robot is Arduino-based, equipped with a tracking camera for Odometry data and a 2D LiDAR sensor for laser scan data of the environment. The robot is built on Omni wheels, making it possible to perform holonomic movements. The results of tuning the autonomous algorithms for this holonomic robot are also presented.
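
To illustrate why omni wheels allow holonomic motion, the sketch below maps a desired body velocity to individual wheel speeds. The abstract does not state the wheel layout, so a three-wheel base with wheels mounted at 90°, 210° and 330° and a 0.1 m centre-to-wheel distance is assumed purely for illustration; the function name is hypothetical.

```python
import math


def wheel_speeds(vx, vy, omega,
                 wheel_angles_deg=(90.0, 210.0, 330.0), radius=0.1):
    """Inverse kinematics for an omni base (assumed layout, see lead-in).

    vx, vy : desired body velocity in m/s (robot frame)
    omega  : desired yaw rate in rad/s
    Returns the linear rim speed each wheel must produce, in m/s.
    """
    speeds = []
    for deg in wheel_angles_deg:
        theta = math.radians(deg)
        # Each wheel drives tangentially to the circle it is mounted on.
        speeds.append(-math.sin(theta) * vx
                      + math.cos(theta) * vy
                      + radius * omega)
    return speeds


spin_in_place = wheel_speeds(0.0, 0.0, 1.0)
strafe_x = wheel_speeds(1.0, 0.0, 0.0)
```

Because any combination of `vx`, `vy` and `omega` yields a valid set of wheel speeds, the base can translate in any direction while rotating independently, which is the holonomic property the abstract refers to.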

Atharva Bhorpe, Pratik Padalkar, Pawan Kadam

Analysis of Deep Learning Models for Text Summarization of User Manuals

User manuals have an inconsistent structure, with data presented in multiple formats such as tables and images. This makes processing them a challenging task, as we need to account for these inconsistencies. In this work, we propose a pipeline for processing user manuals and analyze the abstractive model PEGASUS and the extractive models XLNet, BERT, and GPT-2 for summarization of user manuals. To evaluate the models, we generated extractive and abstractive datasets and used metrics such as hit ratio, overlap, and ROUGE score to compare their performance. We observed that the abstractive model gives more human-like summaries than the extractive models, which have higher ROUGE scores but suffer in readability. The system uses automatic text summarization along with multiple methods to process user manuals and extract the required information in summarized form.
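
As a rough illustration of the ROUGE metric mentioned above, the sketch below computes a unigram-overlap (ROUGE-1 style) precision, recall and F1. The abstract does not say which ROUGE variant or tokenization was used, so whitespace tokenization, lowercasing and the function name are assumptions.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Unigram-overlap precision, recall and F1 between two strings."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection clips each word to its minimum count in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


p, r, f1 = rouge1("press the button", "press the power button")
```

An extractive summary copies words from the source, so it tends to score well on overlap metrics like this one even when it reads less naturally, which is consistent with the observation reported in the abstract.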

Mihir Kayastha, Megh Khaire, Malhar Gate, Param Joshi, Sheetal Sonawane

Modelling Seismic Performance of Reinforced Concrete Buildings Within Response Spectrum Framework

This paper deals with the modelling and analysis of reinforced concrete buildings for seismic performance within the response spectrum framework using a deep learning toolbox in 64-bit MATLAB R2021a. The response of a building subjected to earthquake ground accelerations is of paramount importance for designing earthquake-resistant structures. Huge losses of life and property have prompted extensive research into accurate seismic prediction and analysis. Artificial Intelligence (AI) and Machine Learning (ML) techniques are thus finding a wide variety of applications in seismic analysis for gaining new insights. The volume of available seismic data has increased exponentially, and AI has emerged as a solution to the challenging task of processing such overwhelming time-history earthquake data sets. The response spectrum method of seismic analysis is widely used as it computes peak displacements and member forces. In the present work, ground motion recordings of the El Centro earthquake, one of the most studied earthquake records, are used as input data, along with two other earthquakes of the Indian subcontinent, namely the Bhuj earthquake and the India–Myanmar earthquake. The response spectra are developed for multi-degree-of-freedom (MDOF) systems based on Newmark’s method for linear systems. The ground acceleration data of the three earthquake records are used as inputs, and the peak displacement, base shear and strain energy are computed. The numerical examples presented illustrate the effectiveness of the deep learning toolbox in MATLAB for determining the seismic performance of reinforced concrete buildings.
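
Newmark’s method for linear systems, mentioned above, can be sketched for the simplest case of a single-degree-of-freedom oscillator. This is an illustrative Python sketch under stated assumptions, not the paper's MATLAB workflow: the function name is hypothetical, the constant-average-acceleration parameters (gamma = 1/2, beta = 1/4) are assumed, and the MDOF case would replace the scalars with mass, damping and stiffness matrices.

```python
def newmark_linear(m, c, k, p, dt, u0=0.0, v0=0.0, gamma=0.5, beta=0.25):
    """Displacement history of m*u'' + c*u' + k*u = p(t) by Newmark's method.

    p is a list of load samples spaced dt apart; for ground motion with
    acceleration ag(t), the effective load is p = -m * ag.
    """
    u, v = u0, v0
    a = (p[0] - c * v - k * u) / m                 # initial acceleration
    k_hat = k + gamma * c / (beta * dt) + m / (beta * dt * dt)
    coef_v = m / (beta * dt) + gamma * c / beta
    coef_a = m / (2 * beta) + dt * (gamma / (2 * beta) - 1) * c
    history = [u]
    for i in range(len(p) - 1):
        dp_hat = (p[i + 1] - p[i]) + coef_v * v + coef_a * a
        du = dp_hat / k_hat
        dv = (gamma * du / (beta * dt) - gamma * v / beta
              + dt * (1 - gamma / (2 * beta)) * a)
        da = du / (beta * dt * dt) - v / (beta * dt) - a / (2 * beta)
        u, v, a = u + du, v + dv, a + da
        history.append(u)
    return history


def peak_displacement(m, c, k, ground_accel, dt):
    """One response-spectrum ordinate: peak |u| for a given oscillator."""
    load = [-m * ag for ag in ground_accel]
    return max(abs(x) for x in newmark_linear(m, c, k, load, dt))
```

Repeating `peak_displacement` over a range of natural periods (by varying `k` for a fixed `m` and damping ratio) traces out a displacement response spectrum for a given ground acceleration record.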

Praveena Rao, Hemaraju Pollayi

A Survey on DDoS Detection Using Deep Learning in Software Defined Networking

In this era of the internet, cyber attacks are one of the most prominent issues all over the world. The distributed denial of service (DDoS) attack is one such attack; it has a catastrophic effect and is hard to detect, even in software-defined networking (SDN). SDN is an emerging field in the area of computer networks. In this paper, we discuss the current trends in detecting DDoS attacks with the help of deep learning in an SDN environment. Deep learning has gained popularity in recent years due to its efficient feature detection and dimensionality reduction when classifying data, which help maximize accuracy. We have analyzed the deep learning models and their mechanisms, the performance metrics, and the datasets used in various published papers.

M. Franckie Singha, Ripon Patgiri

Segmentation of Dentin and Enamel from Panoramic Dental Radiographic Image (OPG) to Detect Tooth Wear

The healthcare domain is a very important research field with rapid technological advancement. This study considers a specialized field of oral health care, dentistry, which is regarded as a subdivision of medicine dealing with the anatomy, development, and diseases of the teeth. In dentistry, dental panoramic radiography (DPR) images have attracted growing attention in the diagnosis process because they corroborate clinical findings. Conventionally, diagnosis is done with the help of dental radiographs and clinical examination of patients, performed manually by a dentist according to the available infrastructure and knowledge. This motivates researchers to use and develop new machine learning and image processing techniques to understand dental radiographs. To understand radiographs through an automatic process and to speed up diagnosis, segmentation and enhancement of the image play a very significant role in the initial phase of processing. Segmentation of a radiograph is important for separating the different anatomical parts of a tooth, but it is a major problem in image processing because the size, shape, and arrangement of teeth vary from one person to another. The main aim of this work is to apply different image enhancement and segmentation techniques to panoramic (OPG) X-rays so that dentin and enamel can be isolated. This is an essential first step for finding the tooth wear index and determining tooth structure loss. This paper also discusses several image enhancement and segmentation techniques applied to panoramic (OPG) radiographs and evaluates their results to check the performance, efficiency, and feasibility of the available techniques for the stated problem.

Priyanka Jaiswal, Sunil Bhirud

Revisiting Facial Key Point Detection—An Efficient Approach Using Deep Neural Networks

Facial landmark detection is a widely researched area of deep learning with a wide range of applications in many fields. Facial key points are distinguishing characteristic points on the face, such as the eye centres, the inner and outer corners of the eyes, the mouth centre, and the nose tip, from which human emotions and intent can be inferred. The focus of our work has been evaluating transfer learning models such as MobileNetV2 and NasNetMobile, as well as custom CNN architectures. The objective of the research has been to develop deep learning models that are efficient in terms of model size, parameters, and inference time, and to study the effect of augmentation, imputation, and fine-tuning on these models. It was found that while augmentation techniques produced lower RMSE scores than imputation techniques, they did not affect the inference time. The MobileNetV2 architecture produced the lowest RMSE and inference time. Moreover, our results indicate that manually optimized CNN architectures performed similarly to the Auto Keras-tuned architecture. However, manually optimized architectures yielded better inference time and training curves.

Prathima Dileep, Bharath Kumar Bolla, E. Sabeesh

A Hybrid Framework Using Natural Language Processing and Collaborative Filtering for Performance Efficient Feedback Mining and Recommendation

Product development insights may be found through user reviews on app stores, product forums, and social media. This feedback is often regarded as the “voice of the users”. It has been the subject of much recent research intended to create systems that can automatically extract, filter, analyze, and report the relevant feedback data in near real time. As per our survey results, this user feedback often does not reach the concerned organization promptly due to the volume, veracity, and velocity of feedback from multiple channels. In this research work, we propose an automatic engine, based on sentiment analysis and social media mining, which can be used for better product recommendation and automatic routing of relevant feedback to the product development teams. Our proposed solution is scheduled to run at regular intervals, pulling dynamic reviews in an optimized manner with lower time complexity and higher efficiency. The reviews are collated from distributed platforms, and a domain classification engine is then built on the principles of TF-IDF and a supervised classifier. This system is used to classify the reviews by their respective enterprises. A sentiment analysis system is built using combined rule-based mining and supervised learning models, which uses polarity to classify the feedback as positive or negative. If the polarity is negative, the feedback is routed to the concerned enterprise for immediate action; if the polarity is positive, it is passed to a user-based collaborative filtering engine, which acts as a recommendation system.
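
The polarity-based routing described above can be sketched in a few lines. Everything here is a hypothetical stand-in: the tiny word lists replace the paper's combined rule-based and supervised sentiment models, the destination labels are invented for illustration, and a neutral score is treated as positive by assumption.

```python
# Hypothetical mini-lexicons; a real system would use trained models.
NEGATIVE_WORDS = {"crash", "slow", "bug", "broken", "bad"}
POSITIVE_WORDS = {"great", "fast", "love", "good", "useful"}


def polarity(review):
    """Count positive words minus negative words in a review."""
    words = [w.strip(".,!?") for w in review.lower().split()]
    positives = sum(1 for w in words if w in POSITIVE_WORDS)
    negatives = sum(1 for w in words if w in NEGATIVE_WORDS)
    return positives - negatives


def route(review):
    """Send negative feedback to the enterprise, the rest to the recommender."""
    return "enterprise" if polarity(review) < 0 else "recommender"


complaint_target = route("The app is slow and has a bug.")
praise_target = route("I love this great app!")
```

The point of the sketch is the control flow: a single polarity decision determines whether a review becomes an action item for the product team or an input to the recommendation engine.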

Kathakali Mitra, P. D. Parthasarathy

Facial Recognition-Based Automatic Attendance Management System Using Deep Learning

Nowadays, computer-based face recognition, along with other biometric methods, is a mature and reliable mechanism that is widely used in many access control scenarios. Face recognition is divided into two parts: Face Verification and Face Identification. Face Verification evaluates whether two photographs are connected to one person or not by comparing them, whereas Face Identification must identify a specific face among a group of available faces in the database. Angle, illumination, position, facial expression, noise, resolution, occlusion, and the small amount of one-class samples with which to work are all obstacles in face identification. In this study, we use transfer learning in a siamese network, which comprises two comparable CNNs, to do facial recognition. A pair of two face photographs is supplied to the network as input, after which the network extracts the features of the pair of images, and eventually, using a similarity criterion, it assesses if the pair of images belongs to one person or not. The results suggest that the proposed model can compete with advanced models trained on datasets with a large number of samples. Furthermore, it enhances facial recognition accuracy when compared to algorithms that are trained using datasets with a small number of samples.
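
The final verification step described above, in which a similarity criterion decides whether two images show the same person, can be sketched as follows. The abstract does not name the criterion, so cosine similarity and the 0.8 threshold are assumptions; the embeddings stand in for the feature vectors produced by the two CNN branches.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for degenerate embeddings
    return dot / (norm_a * norm_b)


def same_person(embedding_a, embedding_b, threshold=0.8):
    """Hypothetical decision rule: similar enough means same identity."""
    return cosine_similarity(embedding_a, embedding_b) >= threshold


match = same_person([0.9, 0.1, 0.4], [0.8, 0.2, 0.5])
mismatch = same_person([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

In an attendance setting, the embedding of a captured face would be compared against stored embeddings of enrolled students, and the threshold would be tuned on validation pairs.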

Saranga Pani Nath, Manditjyoti Borah, Debojit Das, Nilam Kumar Kalita, Zakir Hussain, Malaya Dutta Borah

Application of Infrared Thermography in Assessment of Diabetic Foot Anomalies: A Treatise

Diabetic foot syndrome (DFS) is one of the most prevailing complications pertinent to subjects with long-term diabetes mellitus. DFS when not monitored periodically may end up in the onset of foot ulcers, which may eventually lead to infections and amputation, thereby affecting the quality of life of the subjects. The etiological aspects which are instrumental in the development of DFS are multifarious, namely Diabetic Peripheral Neuropathy (DPN), Peripheral Arterial Disease (PAD), and Neuroischaemia which can be characterized by the temperature distribution pattern in the feet of diabetic subjects. Hence, thermographic studies, that transform the infrared rays emitted from the skin surface of the affected region that are otherwise invisible to naked human eyes, into thermal images, can be deployed in the intervention of DFS, especially as a modality for mass screening in scenarios of strikingly high diabetic populations. This paper reviews the application of thermal imaging in understanding and analyzing the anomalies in diabetic foot and thereby evaluating the competence of medical infrared thermography (IRT).

N. Christy Evangeline, S. Srinivasan

A Survey and Classification on Recommendation Systems

In today’s world, data is growing exponentially and traditional systems are unable to fulfil users’ requirements. To meet the needs of users, companies such as Amazon and Netflix use recommender systems, which recommend content or various types of data on the basis of the user’s previous activities and interactions with the system. Recommender systems mainly follow three approaches: content-based, collaborative filtering and knowledge-based. Due to their wide applicability, recommender systems have become an area of active research, and in this context this paper provides a survey and comparative discussion of existing approaches. The survey draws conclusions on how different recommendation techniques fit with today’s growing technology trends and also discusses the challenges they face.

Manika Sharma, Raman Mittal, Ambuj Bharati, Deepika Saxena, Ashutosh Kumar Singh

Analysis of Synthetic Data Generation Techniques in Diabetes Prediction

Inadequate and class-imbalanced data is one of the major problems in classification tasks. Applying synthetic data generation (SDG) approaches to handle class imbalance can therefore be useful in improving the performance of Machine Learning (ML) classifiers. The aim of this work is to explore various SDG approaches to improve diabetes prediction using the Pima Indian Diabetes Dataset (PIDD). We also propose a hybrid SDG approach, named SSVMSMOTE, that combines two popular SDG techniques: Synthetic Minority Oversampling TEchnique (SMOTE) and SVM-SMOTE (Support Vector Machine-Synthetic Minority Oversampling TEchnique). The idea is to divide the training data into equal halves and apply SMOTE and SVM-SMOTE separately to the sub-training samples. The approach has successfully overcome the limitation of SMOTE and SVM-SMOTE. A set of classifiers, namely Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (KNN), Logistic Regression (LR), Gaussian Naive Bayes (GNB), AdaBoost (AB), Extreme Gradient Boosting (XGB), Gradient Boosting (GB), and Light Gradient Boosting (LGM), are trained on the combined resampled training data and tested on a held-out test set. The experiment shows that the boosting classifiers XGB and GB outperformed the other classifiers considered. Further, the XGB classifier, with the help of the proposed SDG technique, achieved the highest average accuracy of 0.9415. The proposed approach also achieved promising results in terms of other important evaluation metrics such as F-scores, AUC, sensitivity, and specificity. These results suggest that the proposed approach is applicable to real-life decision-making processes.
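
The interpolation idea at the core of SMOTE can be sketched as below. This covers only plain SMOTE, one of the two components the hybrid approach combines; SVM-SMOTE is not reproduced here. The function name, the neighbour count `k=2` and the fixed random seed are illustrative assumptions, and at least two minority samples are required.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x by squared Euclidean distance.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority_samples, n_new=5)
```

Following the description in the abstract, the hybrid approach would split the training data into two halves, resample one half with a routine like this and the other with SVM-SMOTE, and then train the classifiers on the combined result.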

Sujit Kumar Das, Pinki Roy, Arnab Kumar Mishra

Beyond Information Exchange: An Approach to Deploy Network Properties for Information Diffusion

Information diffusion in Online Social Networks is a new and crucial problem in the field of social network analysis and requires significant research attention. Efficient diffusion of information is of critical importance in diverse situations such as pandemic prevention, advertising and marketing. Although several mathematical models have been developed to date, previous works lacked systematic analysis and exploration of the influence of neighborhood on information diffusion. In this paper, we propose the Common Neighborhood Strategy (CNS) algorithm for information diffusion, which demonstrates the role of common neighborhood in information propagation throughout the network. The performance of the CNS algorithm is evaluated on several real-world datasets in terms of diffusion speed and diffusion outspread and compared with several widely used information diffusion models. Empirical results show that CNS enables better information diffusion in terms of both diffusion speed and diffusion outspread.

Soumita Das, Anupam Biswas, Ravi Kishore Devarapalli

Sentiment Analysis on Worldwide COVID-19 Outbreak

Sentiment analysis has proved to be an effective way to mine public opinion on issues, products, policies, etc. One way this is achieved is by extracting social media content. Data extracted from social media has proven time and again to be a powerful source material for sentiment analysis tasks. Twitter, which is widely used by the general public to express their concerns over daily affairs, can be a strong source of data for such analysis. In this paper, we use tweets posted regarding the COVID-19 pandemic for a sentiment analysis study and sentiment classification using the BERT model. Due to its transformer architecture and bidirectional approach, this deep learning model is a natural choice for our study. As expected, the model performed very well on all the classification metrics considered and achieved an overall accuracy of 92%.

Rakshatha Vasudev, Prathamesh Dahikar, Anshul Jain, Nagamma Patil

Post-Vaccination Risk Prediction of COVID-19: Machine Learning Approach

COVID-19 has warned everyone that viral infection can be serious and even lethal. Prior to the COVID-19 epidemic, people paid little attention to slight fevers, sore throats, or sneezing. With the passage of time, people became convinced of the fatality rate and were forced to confine themselves to their own homes. However, the discovery of vaccines and double dosages encouraged individuals to return to workplaces, banks, stores, marketplaces, and so on as needed. Because developing a vaccine is a tough undertaking owing to the unique characteristics of the COVID-19 virus, it is difficult to promise that a vaccination will give complete protection against infection. As a result, the post-vaccination risk of COVID-19 is always present. The goal of this work is to predict the post-vaccination risk of COVID-19 illness using machine learning approaches and real datasets. This research also shows that a small number of vaccinated persons become infected with COVID-19. It further indicates that if a few symptoms persist for more than 2 days, a doctor should be consulted for proper care.

Anjali Agarwal, Roshni Rupali Das, Ajanta Das

Offensive Language Detection in Under-Resourced Algerian Dialectal Arabic Language

This paper addresses the problem of detecting offensive and abusive content in Facebook comments, focusing on Algerian dialectal Arabic, which is one of the under-resourced languages. It has a variety of dialects mixed with different languages (i.e., Berber, French, and English). In addition, we deal with texts written in both Arabic and Roman scripts (i.e., Arabizi). Due to the scarcity of work on this language, we have built a new corpus comprising more than 8.7k texts manually annotated as normal, abusive, or offensive. We have conducted a series of experiments using state-of-the-art text categorization classifiers, namely BiLSTM, CNN, FastText, SVM, and NB. The results showed acceptable performance, but the problem requires further investigation of linguistic features to increase identification accuracy.

Oussama Boucherit, Kheireddine Abainia

A Comparative Analysis of Modern Machine Learning Approaches for Automatic Classification of Scientific Articles

Automatic classification of scientific articles is very beneficial for the scientific research community, as it helps determine whether a journal is appropriate for an article. Specifically, it helps editors pre-screen articles at the editor’s desk. In such a scenario, modern machine learning approaches can help automatically classify scientific articles based on their abstracts. In this work, we classify scientific articles by category and perform a comparative analysis of several deep learning and machine learning-based approaches. Our experimental results suggest that the domain-specific pre-trained model SciBERT improves classification performance significantly.

Kongkan Bora, Nihar Jyoti Baishya, Chinmoy Jyoti Talukdar, Deepali Jain, Malaya Dutta Borah

A Review of Machine Learning Algorithms on Different Breast Cancer Datasets

Machine Learning (ML) algorithms have been used widely in the domain of medical science, especially in classifying clinical data. Random Forest, Decision Tree, K-Nearest Neighbor, Support Vector Machine, Naive Bayes, Logistic Regression, and Multilayer Perceptron are some of the ML algorithms used for classification and prediction of various diseases. This paper reviews 40 recent ML algorithms published for breast cancer classification and breast cancer prediction, along with the associated data pre-processing and feature selection techniques. The paper identifies from the literature the pre-processing steps, feature selection steps, and ML algorithms used for classification and prediction of breast cancer and tabulates them according to accuracy. The paper also briefly describes three common clinical breast cancer datasets used to train most such ML algorithms. The review helps prospective researchers identify different aspects of research on ML solutions built from breast cancer datasets using suitable pre-processing and feature selection techniques.

E. Jenifer Sweetlin, S. Saudia

The Online Behaviour of the Algerian Abusers in Social Media Networks

Connecting to social media networks has become a daily task for the majority of people around the world, and the amount of shared information is growing exponentially. Controlling the way in which people communicate is therefore necessary in order to protect them from disorientation, conflict, aggression, etc. In this paper, we conduct a statistical study of cyber-bullying and abusive content in social media (i.e. Facebook), where we try to characterize the online behaviour of abusers in the Algerian community. More specifically, we selected 200 Facebook users from different regions, out of 600, to carry out this study. The aim of this investigation is to help automatic abuse detection systems make decisions by incorporating online activity. Abuse detection systems require a large amount of data to perform well on such texts (i.e. unstructured and informal texts) because of the lack of a standard orthography and the variety of Algerian dialects and languages spoken.

Kheireddine Abainia

Interactive Attention AI to Translate Low-Light Photos to Captions for Night Scene Understanding in Women Safety

There has been remarkable progress in deep learning-based models for image captioning and low-light image enhancement. For the first time in the literature, this paper develops a deep learning model that translates night scenes to sentences, opening new possibilities for AI applications in the safety of visually impaired women. Inspired by image captioning and visual question answering, a novel ‘Interactive Image Captioning’ is developed. A user can make the AI focus on any chosen person of interest by influencing the attention scoring. Attention context vectors are computed from CNN feature vectors and user-provided start words. The encoder–attention–decoder neural network learns to produce captions from low-brightness images. This paper demonstrates how women’s safety can be supported by researching a novel AI capability in an interactive vision–language model for perception of the environment at night.
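
The attention step described above, where a context vector is computed from CNN feature vectors, can be sketched with simple dot-product attention. This is an illustration under assumptions rather than the paper's model: the query vector stands in for whatever representation is derived from the user-provided start words, the scoring function is assumed to be a dot product, and the function name is hypothetical.

```python
import math


def attention_context(features, query):
    """Softmax-weighted sum of region features, scored against a query."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shifted for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    context = [sum(w * feat[d] for w, feat in zip(weights, features))
               for d in range(dim)]
    return weights, context


region_features = [[1.0, 0.0], [0.0, 1.0]]   # two image regions
user_query = [10.0, 0.0]                      # strongly prefers region 0
weights, context = attention_context(region_features, user_query)
```

Changing the query shifts the attention weights toward different image regions, which is the mechanism that lets a user steer the caption toward a chosen person of interest.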

A. Rajagopal, V. Nirmala, Arun Muthuraj Vedamanickam
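
The abstract above says that attention context vectors are computed from CNN feature vectors and user-provided start words, but it does not give the scoring function. As a rough illustration only, the sketch below uses plain dot-product attention. The names `attention_context`, `region_features`, and `query` are hypothetical, and treating the start words as a single query embedding is an assumption rather than the authors' design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(region_features, query):
    """Dot-product attention over image regions.

    region_features: one feature vector per image region (e.g. CNN outputs).
    query: a vector standing in for the embedded start words (assumption).
    Returns (weights, context), where context is the weighted sum of regions.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Two toy regions; the query is aligned with the first one.
weights, context = attention_context([[1.0, 0.0], [0.0, 1.0]], [5.0, 0.0])
```

In this toy setup a query aligned with one region pulls most of the attention weight onto that region, which is the mechanism by which a user's start words could steer the caption toward a chosen person of interest.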

AI Visualization in Nanoscale Microscopy

Artificial Intelligence (AI) and nanotechnology are promising areas for the future of humanity. While deep learning-based computer vision has found applications in many fields from medicine to automotive, its application in nanotechnology can open doors for new scientific discoveries. Can we apply AI to explore objects that our eyes can’t see, such as nanoscale-sized objects? An AI platform that visualizes nanoscale patterns learnt by a deep neural network can open new frontiers for nanotechnology. The objective of this paper is to develop a deep learning-based visualization system for images of nanomaterials obtained by scanning electron microscope (SEM). This paper contributes an AI platform that enables any nanoscience researcher to use AI in the visual exploration of nanoscale morphologies of nanomaterials. The platform is built by visualizing the intermediate activations of a Convolutional AutoEncoder (CAE). In this method, a nanoscale specimen image is transformed into its feature representations by a Convolutional Neural Network (CNN). The convolutional autoencoder is trained on the 100% SEM dataset from NFFA-EUROPE, and then CNN visualization is applied. This AI generates various conceptual feature representations of the nanomaterial. While deep learning-based image classification of SEM images is widely published in the literature, few publications have visualized deep neural networks for nanomaterials. Such visualization is significant for gaining insights from what the machine learning model has learnt. This paper unlocks the potential of applying deep learning-based visualization to electron microscopy to offer AI-extracted features and architectural patterns of various nanomaterials. This is a contribution to explainable AI for nanoscale objects, and to learning from otherwise black-box neural networks. This paper contributes an open-source AI with reproducible results at URL ( https://sites.google.com/view/aifornanotechnology ).

A. Rajagopal, V. Nirmala, J. Andrew, Arun Muthuraj Vedamanickam

Convolutional Gated MLP: Combining Convolutions and gMLP

To the best of our knowledge, this is the first paper to introduce Convolutions to the Gated Multi-Layer Perceptron (gMLP), and it contributes an implementation of this novel Deep Learning architecture. Google Brain introduced the gMLP in May 2021. Microsoft introduced Convolutions in Vision Transformer (CvT) in Mar 2021. Inspired by both gMLP and CvT, we introduce convolutional layers in gMLP. CvT combined the power of Convolutions and Attention. Our implementation combines the best of Convolutional learning with the spatial gated MLP. Further, the paper visualizes how CgMLP learns. Visualizations show how CgMLP learns from features such as the outline of a car. While Attention was the basis of much of the recent progress in Deep Learning, gMLP proposed an approach that doesn’t use Attention computation. In Transformer-based approaches, a large number of Attention matrices need to be learnt using vast amounts of training data. In gMLP, fine-tuning for new tasks through transfer learning with smaller datasets can be challenging. We implement CgMLP and compare it with gMLP on the CIFAR dataset. Experimental results explore the generalization power of CgMLP, while gMLP tends to drastically over-fit the training data. To summarize, the paper contributes a novel Deep Learning architecture and demonstrates the learning mechanism of CgMLP through visualizations, for the first time in the literature.

A. Rajagopal, V. Nirmala

Unique Covariate Identity (UCI) Detection for Emotion Recognition Through EEG Signals

Affective computing has become one of the emerging technologies in the current arena, as most industries depend on consumers and their feedback. Opinions and emotional feedback play a major role in improving the quality of services provided by industry, and affective computing plays a vital role in analyzing that emotional feedback. Human emotions can be derived in various ways, including from facial expressions and from text. Recent research uses biological signals to detect human emotions. These emotions include anger, sadness, happiness, joy, disgust, and surprise, and biological signals from electroencephalography (EEG) yield promising parameters for detecting them. EEG-based unique subject identification is evaluated using the presented system. Biological signals are prone to motion artifacts from inside the body that disturb the recordings. EEG signals are complex, with numerous oscillating points that are unique in certain cases. The proposed system focuses on capturing the impacted component in the brain wave data that produces the unique identification of subjects. The brain wave bands (alpha, beta, gamma, theta, and delta) are closely monitored with the help of frequency-domain analysis through discrete wavelet transforms (DWT). In this paper, subject-impacted factors are detected using a novel multinomial regression-based unique covariate identity (UCI) detection algorithm. The proposed system is also compared with state-of-the-art approaches in terms of accuracy, precision, and error rate.

V. S. Bakkialakshmi, T. Sudalaimuthu
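
The abstract above mentions monitoring the alpha, beta, gamma, theta, and delta bands through discrete wavelet transforms, without naming the wavelet or the sampling rate. The sketch below is therefore only illustrative: it assumes the Haar wavelet, four decomposition levels, and a hypothetical 128 Hz sampling rate, under which the sub-bands line up only nominally with the EEG rhythms. The function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    Samples are processed in pairs; a trailing unpaired sample is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level decomposition: returns (final approximation, [D1, D2, ...])."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def band_energies(signal, levels=4):
    """Energy (sum of squared coefficients) in each sub-band.

    With an assumed 128 Hz sampling rate the nominal ranges are:
    D1 32-64 Hz (gamma), D2 16-32 Hz (beta), D3 8-16 Hz (alpha),
    D4 4-8 Hz (theta), A4 0-4 Hz (delta).
    """
    approx, details = haar_dwt(signal, levels)
    energies = {f"D{j}": sum(c * c for c in d) for j, d in enumerate(details, start=1)}
    energies[f"A{levels}"] = sum(c * c for c in approx)
    return energies


# Toy input: 32 samples, so every level sees an even-length signal.
example = band_energies([math.sin(0.7 * i) for i in range(32)], levels=4)
```

Because the Haar transform is orthonormal, the sub-band energies add up to the energy of the input when the signal length is a multiple of 2 to the power of `levels`. Per-band energies of this kind are one common choice of feature to pass to a downstream model such as a regression; whether the authors use them is not stated.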

A Simple and Effective Method for Segmenting Lung Regions from CT Scan Images Using K-Means

Proposed here is a simple and effective method for segmenting lung regions from CT scan images. In this method, the CT scan image in DICOM format is converted into an RGB image, which is then further converted into a grayscale image. The resulting grayscale image is then binarized using K-means, which automatically groups the pixels into two clusters: one containing the background pixels and the other containing the pixels of the lung regions. From the cluster of pixels belonging to the lung regions, the left and right lungs can be properly separated. The K-means clustering used in the method is based on recursive averaging to avoid overflow errors while computing and updating cluster centers. Compared with other traditional methods of lung region segmentation, it is much simpler and gives better results. It is also much simpler and faster than deep learning methods for lung region segmentation, as it does not require rigorous training with a large number of images.

Yumnam Kirani Singh
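
The two-cluster K-means binarization with recursive averaging described in the abstract above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the min/max initialization of the centers, and the toy pixel values are assumptions. Each center is updated with the running-mean rule m_k = m_{k-1} + (x_k - m_{k-1}) / k, which avoids accumulating a large sum of pixel values.

```python
def running_mean(values):
    """Mean via the recursive update m_k = m_{k-1} + (x_k - m_{k-1}) / k.

    No large running sum is kept, so the intermediate value always stays
    within the range of the data.
    """
    mean = 0.0
    for k, x in enumerate(values, start=1):
        mean += (x - mean) / k
    return mean


def kmeans_binarize(pixels, max_iter=100):
    """Group grayscale values into two clusters; returns (labels, centers).

    Centers start at the minimum and maximum intensity (an assumption),
    so label 0 is the darker cluster and label 1 the brighter one.
    """
    centers = [float(min(pixels)), float(max(pixels))]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center by absolute intensity difference.
        labels = [
            0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            for p in pixels
        ]
        # Update step: recompute each center with the recursive average.
        new_centers = []
        for c in (0, 1):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            new_centers.append(running_mean(members) if members else centers[c])
        if new_centers == centers:
            break  # converged: centers no longer change
        centers = new_centers
    return labels, centers


# Toy flattened "image": three dark and three bright pixels.
labels, centers = kmeans_binarize([10, 12, 11, 200, 210, 205])
```

For a real slice, the flattened grayscale pixel values would be passed in and the labels reshaped back to the image dimensions. Which of the two clusters corresponds to the lung regions depends on how the DICOM intensities were mapped to grayscale.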

Risk-Based Portfolio Optimization on Some Selected Sectors of the Indian Stock Market

Designing portfolios with optimum future return and risk has always proved to be a very difficult research problem, since precisely estimating the future returns and volatilities of stocks poses a great challenge. This paper presents an approach to portfolio design using two risk-based methods: hierarchical risk parity (HRP) and hierarchical equal risk contribution (HERC). These two methods are applied to five important sectors of the National Stock Exchange (NSE) of India. The portfolios are built on the stock prices for the period January 1, 2016–December 31, 2020, and their performances are evaluated for the period January 1, 2021–November 1, 2021. The results show that the HRP portfolio outperforms its HERC counterpart in all five sectors.

Jaydip Sen, Abhishek Dutta
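
The abstract above names HRP but does not describe its allocation rule. The sketch below shows only the recursive-bisection step as it is commonly described for HRP: capital is split between the two halves of an ordered asset list in inverse proportion to each half's variance. It assumes the asset ordering has already been produced by the hierarchical clustering step, which is not shown. The function names and the toy covariance matrix are hypothetical and do not come from the paper.

```python
def inverse_variance_weights(cov, idx):
    """Weights proportional to 1 / variance for the assets in idx."""
    inv = [1.0 / cov[i][i] for i in idx]
    total = sum(inv)
    return [v / total for v in inv]


def cluster_variance(cov, idx):
    """Variance of the inverse-variance portfolio built from the assets in idx."""
    w = inverse_variance_weights(cov, idx)
    return sum(
        w[a] * w[b] * cov[i][j]
        for a, i in enumerate(idx)
        for b, j in enumerate(idx)
    )


def recursive_bisection(cov, order):
    """Split capital top-down along an ordered asset list.

    cov: covariance matrix as a list of lists.
    order: asset indices, assumed already sorted by the clustering step.
    Returns a dict mapping asset index to portfolio weight.
    """
    weights = {i: 1.0 for i in order}
    stack = [list(order)]
    while stack:
        cluster = stack.pop()
        if len(cluster) < 2:
            continue
        mid = len(cluster) // 2
        left, right = cluster[:mid], cluster[mid:]
        var_left = cluster_variance(cov, left)
        var_right = cluster_variance(cov, right)
        # The lower-variance half receives the larger share of capital.
        alpha = 1.0 - var_left / (var_left + var_right)
        for i in left:
            weights[i] *= alpha
        for i in right:
            weights[i] *= 1.0 - alpha
        stack.extend([left, right])
    return weights


# Toy example: two uncorrelated assets with variances 0.04 and 0.01.
toy_weights = recursive_bisection([[0.04, 0.0], [0.0, 0.01]], [0, 1])
```

In this toy case the lower-variance asset receives the larger weight, and the weights always sum to one because each split divides a cluster's capital between its two halves.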

Springer Professional

Frontmatter