ITNG 2022 19th International Conference on Information Technology-New Generations
- 2022
- Book
- Editor: Dr. Shahram Latifi
- Book Series: Advances in Intelligent Systems and Computing
- Publisher: Springer International Publishing
About this book
This volume represents the 19th International Conference on Information Technology - New Generations (ITNG), 2022. ITNG is an annual event focusing on state-of-the-art technologies pertaining to digital information and communications. The applications of advanced information technology to such domains as astronomy, biology, education, geosciences, security, and health care are among the topics of relevance to ITNG. Visionary ideas, theoretical and experimental results, as well as prototypes, designs, and tools that help information flow readily to the user are of special interest. Machine Learning, Robotics, High Performance Computing, and Innovative Methods of Computing are examples of related topics. The conference features keynote speakers, a best student award, a poster award, and a service award. This publication is unique as it captures modern trends in IT with a balance of theoretical and experimental work. Most other works focus on either theoretical or experimental work, but not both. Accordingly, we do not know of any competing literature.
Table of Contents
-
Cybersecurity
-
Frontmatter
-
Chapter 17. Gesturing with Smart Wearables: An Alternate Way to User Authentication
Khandaker Abir Rahman, Avishek Mukherjee, Kristina Mullen
This chapter delves into the innovative use of smart wearable gestures for user authentication, leveraging motion sensors in devices like smartwatches. It discusses the viability of using gesture recognition methods for security purposes, focusing on the advantages of user-created gestures over predetermined ones. The authors present a standalone authentication system that can be used across multiple devices, showcasing its resilience to spoof attacks. The chapter details the data collection process, pre-processing methods, and experimental results, demonstrating the system's effectiveness in distinguishing between different gesture patterns. The research highlights the potential for enhancing security through unique hand gestures, offering a promising future for wearable technology in authentication.
This summary of the content was generated with the help of AI.
Abstract
This paper explores a method of alternate user authentication that relies on sensor data from a smartwatch. The aim is to strengthen authentication security by taking into account a user-defined hand gesture performed while wearing a smartwatch. Eventually, the preset hand gesture would work in a similar way to a password-based authentication scheme. In our experiment, we recorded the 3D coordinate values measured by the accelerometer and gyroscope over a set of gestures. We experimented with 50 gesture samples comprising five different gesture patterns and ten repeated samples for each pattern. We developed an Android WearOS smartwatch app for sensor data collection, implemented our method of sensor data processing, and performed a series of experiments to demonstrate the potential of this method to achieve high accuracy.
-
Chapter 18. Software Optimization of Rijndael for Modern x86-64 Platforms
Nir Drucker, Shay Gueron
The chapter delves into the optimization of the Rijndael 256-bit block cipher for modern x86-64 platforms, focusing on the use of new vector AES-NI instructions. It provides a detailed background on the Advanced Encryption Standard (AES) and its significance in various applications. The authors present their implementation of Rijndael 256 in Electronic CodeBook (ECB) and Counter (CTR) modes, measuring performance on different CPUs with varying microarchitectures. The results show that the optimized implementation achieves a throughput of 0.27 cycles per byte (C/B) on modern processors, comparable to the performance of the standard 128-bit block size AES. The chapter also discusses the implications of block size on security and the potential benefits of a 256-bit block cipher in cloud-scale applications. The authors conclude by highlighting the importance of considering specific processor characteristics and compiler behavior in achieving optimal performance.
This summary of the content was generated with the help of AI.
Abstract
The Advanced Encryption Standard (AES) was standardized in 2001 by NIST and has become the de facto block cipher used today. AES is a block cipher with a block size of 128 bits and is based on the proposal by Rijmen and Daemen, named “Rijndael”. The Rijndael proposal includes a definition for a block cipher with a 256-bit block size (and a 256-bit key), which we call here Rijndael256. This variant has not been standardized. This paper describes software optimization methods for fast computations of Rijndael256 on modern x86-64 platforms equipped with AES-NI and with vector AES-NI instructions. We explore several implementation methods and report a speed record for Rijndael256 at 0.27 cycles per byte.
-
Chapter 19. Cybersecurity Ethics Education: A Curriculum Proposal
Ping Wang
The chapter addresses the critical need for qualified cybersecurity professionals in a digitally connected world facing increasing cyber threats. It highlights the ethical dilemmas and challenges in the field, emphasizing the lack of guidance on cybersecurity ethics. The core of the chapter is a detailed curriculum proposal for a cybersecurity ethics course, mapped to the NICE Cybersecurity Workforce Framework and CAE-CDE designation program. The proposal includes learning outcomes, topics, and assessment methods. Additionally, it underscores the importance of mentoring in student success, outlining a comprehensive mentoring model that includes ethical guidance. The chapter concludes with potential future research directions, making it a valuable resource for educators and professionals seeking to enhance cybersecurity education.
This summary of the content was generated with the help of AI.
Abstract
Cybersecurity ethics has emerged as a new and increasingly significant area for research and education. New and complex ethical dilemmas and conflicts in values and judgments arise as new cybersecurity technologies and policies are constantly explored and implemented to defend our cyberspace for a safe and secure living and work environment. As various cyber threats, attacks, and risks pose increasing challenges to the diverse and interconnected world we live in, there is an increasing demand for quality cybersecurity education to prepare and produce qualified and ethically competent professionals to address the cybersecurity challenges. The national Centers of Academic Excellence in Cyber Defense Education (CAE-CDE) designation by the U.S. National Security Agency and Department of Homeland Security (NSA/DHS) is a high-quality program that promotes excellence in cybersecurity education for developing qualified cyber talent. Strong curriculum and courses supported by regular mentoring are essential to successful preparation of cybersecurity professionals. This research paper proposes a new credit course in cybersecurity ethics supported by an adopted comprehensive model of mentoring for an undergraduate cybersecurity education program at a CAE-CDE designated university in the United States. The curriculum proposal will present the rationale, course description, mappings of learning outcomes and topics to the CAE-CDE knowledge unit, suggested methods of assessment, and mentoring activities. The goal of this research is to contribute a new course design with ethical mentoring to enrich and enhance national and international cybersecurity curriculum and education.
-
Chapter 20. Performance Evaluation of Online Website Safeguarding Tools Against Phishing Attacks; a Comparative Assessment
Rama Al-Share, Fatima Abu-Akleek, Ahmed S. Shatnawi, Eyad Taqieddin
The chapter delves into the critical issue of phishing attacks, focusing on the OWASP's broken authentication vulnerability. It explores the various types of phishing attacks, including vishing and smishing, and their severe impacts on both individuals and organizations. The study evaluates six prominent online website safeguarding tools using a rigorous evaluation procedure and two datasets of legitimate and malicious URLs. The performance of these tools is compared based on metrics such as precision, recall, F1-score, and accuracy. The results highlight the strengths and weaknesses of each tool, providing valuable insights into their effectiveness in combating phishing attacks. This detailed analysis offers a comprehensive overview of the current state of online website safeguarding tools and their role in protecting against phishing threats.
This summary of the content was generated with the help of AI.
Abstract
Despite the security policies that organizations follow to defend against cyber crimes, phishing attacks are still among the most popular ways criminals use to steal users' credentials. Spear phishing, fake websites, fraudulent emails, smishing, and vishing all fall under the umbrella of phishing attacks. Many organizations now tend to develop anti-phishing tools that identify fraudulent emails and websites; these tools are either embedded implicitly within web browsers and email applications or offered explicitly as an online service. In this research, we evaluated the effectiveness of six online checking tools in detecting potentially malicious websites. Six of the best-known URL scanning engines on the VirusTotal website were tested on a list of legitimate and malicious URLs, which were collected from well-known anti-phishing frameworks, including PhishTank. To find the most efficient anti-phishing tool, the detection accuracy, precision, recall, and F1-score were calculated for each engine. The results showed that among website checking tools, Sophos achieved a detection accuracy of 99.23% and a precision value of 100%.
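The four metrics named in this abstract can be computed from a scanner's confusion counts. The following is a minimal sketch using the standard textbook definitions; the class name, function name, and the example counts are hypothetical illustrations and are not taken from the chapter or its datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for one URL scanning engine (hypothetical container)."""
    tp: int  # malicious URLs correctly flagged as malicious
    fp: int  # legitimate URLs wrongly flagged as malicious
    tn: int  # legitimate URLs correctly passed
    fn: int  # malicious URLs wrongly passed


def detection_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, and F1 using the standard definitions."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for illustration only (not results from the chapter).
example = ConfusionCounts(tp=90, fp=0, tn=100, fn=10)
metrics = detection_metrics(example)
print(metrics)
```

A precision of 100% only means that no legitimate URL was flagged; an engine can reach it while still missing malicious URLs, which is why recall and F1 are reported alongside it.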
-
-
Blockchain Technology
-
Frontmatter
-
Chapter 21. Blockchain Based Trust for the Internet of Things: A Review
Dina Shehada, Maryam Amour, Suadad Muammar, Amjad Gawanmeh
This chapter delves into the challenges of trust and security in IoT systems, highlighting the potential of Blockchain technology to address these issues. It explores various Blockchain-based trust management solutions, categorizing them based on security functions, suitability to IoT environments, feasibility, main features, and limitations. The chapter also presents a taxonomy to assess these solutions, providing a side-by-side comparison of state-of-the-art methods. This detailed analysis offers valuable insights into the integration of Blockchain technology within IoT, arousing interest in exploring the full chapter for a deeper understanding of these innovative solutions.
This summary of the content was generated with the help of AI.
Abstract
Establishing trust between Internet of Things (IoT) devices is crucial to ensuring the quality and the functionality of the system. However, given the dynamism and distributed nature of IoT systems, finding a solution that not only provides trust among IoT systems but is also suitable to their nature of operation is considered a challenge. In recent years, Blockchain technology has attracted significant scientific interest in research areas such as IoT. A Blockchain is a distributed ledger capable of maintaining an immutable log of transactions happening in a network. Blockchain is seen as the missing link towards building a truly decentralized and secure environment for the IoT. This paper gives a taxonomy and a side-by-side comparison of the state-of-the-art methods for securing IoT systems with Blockchain technology. The taxonomy aims to evaluate the methods with respect to security functions, suitability to IoT, viability, main features, and limitations.
-
Chapter 22. The Use of Blockchain Technology in Electronic Health Record Management: An Analysis of State of the Art and Practice
Henrique Couto, André Araújo, Rendrikson Soares, Gabriel Rodrigues
The chapter delves into the transformative impact of Blockchain technology on Electronic Health Record (EHR) management within the healthcare sector. It begins by discussing the digital transformation of healthcare through information and communication technologies, emphasizing the need for secure and interoperable data solutions. The study analyzes the state of the art in Blockchain technology applications, focusing on data modeling, storage, interoperability, and visualization. It also examines practical industry solutions that leverage Blockchain to enhance data security, traceability, and access control. The chapter concludes with a discussion on the future directions and potential advancements in this field, providing valuable insights for healthcare professionals and data management experts.
This summary of the content was generated with the help of AI.
Abstract
Driven by the need to offer digital solutions to the population, the healthcare sector requires computational solutions with features of security, immutability, and traceability for data transactions on the Electronic Health Record (EHR). An EHR is defined as a repository of healthcare information stored and transmitted in a secure and accessible way by authorized users. To address this important area of research, this work investigated state-of-the-art practice and studies that addressed the development and validation of computational solutions with Blockchain technology applied to the following areas of an EHR lifecycle: (i) modeling and standardization, (ii) data storage techniques, (iii) standards for data interoperability, and (iv) data retrieval and visualization solutions. Based on the results found, this study presents an analysis of the main advances and opportunities identified in the use of Blockchain technology in the development of healthcare applications.
-
Chapter 23. Blockchain for Security and Privacy of Healthcare Systems: A Protocol for Systematic Literature Review
Saadia Azemour, Meryeme Ayache, Hanane El Bakkali, Amjad Gawanmeh
The chapter outlines a systematic literature review protocol to assess the security and privacy challenges of blockchain technology in healthcare. It follows the PRISMA-P 2015 guidelines and includes a detailed search strategy, inclusion/exclusion criteria, and data extraction methods. The study aims to gather knowledge on blockchain's impact on healthcare security and privacy, identify limitations and challenges, and propose future research directions. The protocol ensures transparency, clarity, and future reproducibility, making it a valuable resource for researchers and healthcare IT specialists.
This summary of the content was generated with the help of AI.
Abstract
Patient privacy, the confidentiality and integrity of electronic health records, and all related e-health security issues are the most critical elements of successful digital health, as they support building trust between patients and healthcare stakeholders. Blockchain technology appears to cover a wide range of these elements. However, the use of this emergent technology in the healthcare domain still faces many security and privacy challenges that need to be overcome. Recently, many research works have focused on such issues, leading to a growing literature. With a view to reviewing this literature systematically and investigating in depth the use of blockchain technology in healthcare for security enhancement and privacy protection, this paper proposes a protocol that could be used to successfully conduct this systematic literature review (SLR). The proposed protocol follows the PRISMA-P 2015 Guidelines. Specifically, we indicate the use of snowballing search and automated search (on eight electronic data sources) to carry out the intended SLR, identify five pertinent research questions, and specify the related inclusion/exclusion criteria. All methods for the selection process, data collection, and data analysis that will be used in the intended SLR are described in this protocol.
-
Chapter 24. Single Sign-On (SSO) Fingerprint Authentication Using Blockchain
Abhijeet Thakurdesai, Marian Sorin Nistor, Doina Bein, Stefan Pickl, Wolfgang Bein
The chapter delves into the concept of Single Sign-On (SSO) and its integration with blockchain technology for secure fingerprint authentication. It begins by discussing the limitations of traditional centralized authentication systems and the vulnerabilities they present. The authors then introduce the benefits of using blockchain for authentication, such as data tamper resistance, decentralization, and high concurrency. The core of the chapter focuses on the implementation of a web application that uses Ethereum for user management and authentication. The application employs fingerprint images as biometric data, with a similarity score threshold for successful authentication. The backend of the application is built using Spring Boot and communicates with the Ethereum network through smart contracts. The frontend supports user registration and authentication, demonstrating the potential of blockchain to enhance security in biometric-based authentication systems. The chapter concludes with a discussion on future directions and potential improvements, such as implementing microservices architecture and enterprise blockchain solutions.
This summary of the content was generated with the help of AI.
Abstract
The objective of this paper is to describe the front end and the back end of an open-source web application that can be integrated in any website and for which the storage of single sign-on (SSO) authentication is provided in an Ethereum network. The back end of this application is shared with a browser-based platform or Android platform. The Ethereum network facilitated the implementation of a peer-to-peer multi-node blockchain in distributed ledger technology. We use smart contract code for user creation and authentication. A contract is a collection of code (its functions) and data (its state) that resides at a specific address on the Ethereum blockchain. The smart contract made our proposed web app self-verifying, self-executing, and tamper resistant. The proposed software system can be used as two-factor authentication in combination with passwords for servers, for payment authorizations, and in the banking and automotive industries.
-
-
Health Informatics
-
Frontmatter
-
Chapter 25. A Detection Method for Early-Stage Colorectal Cancer Using Dual-Tree Complex Wavelet Packet Transform
Daigo Takano, Teruya Minamoto
The chapter introduces a groundbreaking method for detecting early-stage colorectal cancer using Dual-Tree Complex Wavelet Packet Transform (2D-CWPT) and Principal Component Analysis (PCA). Unlike traditional methods that require extensive labeled data, this approach leverages the directional selectivity of 2D-CWPT to extract features from endoscopic images effectively. The method involves preprocessing endoscopic images, applying 2D-CWPT to capture high-frequency components, and using PCA to diagnose cancer based on the first principal component values. The authors demonstrate the method's effectiveness through preliminary and comparison experiments, highlighting its superior accuracy compared to existing methods. The chapter concludes by discussing the method's limitations and suggesting future directions for improving the detection of early-stage colorectal cancer.
This summary of the content was generated with the help of AI.
Abstract
Colorectal cancer is a major cause of death. As a result, cancer detection using supervised learning methods from endoscopic images is an active research area. Regarding early-stage colorectal cancer, preparing a significant number of labeled endoscopic images is impractical. We devise a technique for detecting early-stage colorectal cancer in this study. This technique consists of a 2D complex discrete wavelet packet transform and principal component analysis. As this technique does not require supervised learning, detection is feasible even in the absence of labeled data. In the endoscopic image, this technique correctly classifies early-stage colorectal cancer and normal regions with 92% accuracy. This approach outperforms the local binary pattern method.
-
Chapter 26. Visualizing 3D Human Organs for Medical Training
Joshua Chen, Paul J. Cuaresma, Jennifer S. Chen, Fangyang Shen, Yun Tian
The chapter delves into the application of 3D visualization techniques for medical training, specifically focusing on the recognition of human organs by health science students. It compares the effectiveness of 3D models against traditional 2D images, highlighting the potential advantages of 3D visualizations in enhancing learning outcomes and confidence levels among medical trainees. The study employs a controlled experiment using 3D models of human organs visualized with Blender software, and surveys health science students to evaluate their accuracy and confidence in identifying these models. The results indicate that 3D models are generally more recognizable and can potentially replace 2D images as primary learning tools in medical education. The chapter concludes with recommendations for further improving 3D visualizations and suggests future research directions.
This summary of the content was generated with the help of AI.
Abstract
Three-dimensional (3D) models have been used as essential tools in medical training. In this study, we visualize 3D models of human organs with graphics software for the purpose of training medical students. This study investigates whether 3D organ visualizations will be more recognizable to medical students than two-dimensional (2D) organ images. In our experiments, the models were shown to health science students to determine how useful they were in training, and we compared the use of 3D models with 2D images. We conclude that the 3D organ models we used are more likely to be recognized by the students.
-
Chapter 27. An Information Management System for the COVID-19 Pandemic Using Blockchain
Marcelo Alexandre M. da Conceicao, Oswaldo S. C. Neto, Andre B. Baccarin, Luan H. S. Dantas, Joao P. S. Mendes, Vinicius P. Lippi, Gildarcio S. Gonçalves, Adilson M. Da Cunha, Luiz A. Vieira Dias, Johnny C. Marques, Paulo M. Tasinaffo
The chapter details the development of an Information Management System for COVID-19 by students at ITA, leveraging Blockchain, Big Data, and other emerging technologies. The project, named STEPES-BD, was designed to monitor patients and manage data sharing between healthcare stakeholders. It involved the use of a 3-tier architecture, RESTful APIs, and databases like MySQL and BigchainDB to ensure data interoperability and immutability. The project was conducted remotely using the Scrum Framework, demonstrating the feasibility of interdisciplinary problem-based learning and the practical application of advanced technologies in a real-world scenario.
This summary of the content was generated with the help of AI.
Abstract
During the 1st Semester of 2020, 25 students from the Aeronautics Institute of Technology (Instituto Tecnológico de Aeronáutica – ITA) in São José dos Campos, SP, Brazil developed the academic project “Specific Technological Solutions for Special Patients with Big Data”, in Portuguese Projeto STEPES-BD: Soluções Tecnológicas Específicas para Pacientes Especiais e Sistemas em Bancos de Dados. They accepted the challenge of using a technological approach to help manage and combat the SARS-CoV-2 Virus Pandemic (COVID-19). At that time, the lack of shared data between public and private agencies and the need for faster information flow were considered the main issues in combating the spread of the disease, and these guided the development of an Information Management System for the COVID-19 Pandemic, involving the essential data from Electronic Health Records (EHRs). The combination of emerging technologies such as Big Data and Blockchain, together with the Scrum Framework (SF) and Interdisciplinary Problem-Based Learning (IPBL), enabled those students from three different academic courses to develop a computer system prototype based on the pressing needs caused by this disease. This article describes the development of the main deliverables made by those graduate students in just 17 academic weeks, right from the beginning of the COVID-19 Pandemic crisis in the 1st Semester of 2020.
-
Chapter 28. Machine Learning for Classification of Cancer Dataset for Gene Mutation Based Treatment
Jai Santosh Mandava, Abhishek Verma, Fulya Kocaman, Marian Sorin Nistor, Doina Bein, Stefan Pickl
The chapter discusses the significant impact of gene mutations on cancer treatment and the potential of machine learning to automate and enhance the classification of cancer datasets. It provides a comprehensive overview of the historical context and current practices in cancer treatment, focusing on the use of gene mutation-based treatments. The authors present a proposed system architecture that leverages machine learning algorithms to classify genetic variations, significantly reducing the time and effort required for manual analysis. The chapter also includes a detailed comparison of various machine learning classification algorithms and their performance metrics. The experimental results show promising accuracy levels, highlighting the potential of machine learning to revolutionize cancer diagnosis and treatment. The conclusion emphasizes the need for further research to improve model accuracy and expand the dataset to achieve real-world applicability.
This summary of the content was generated with the help of AI.
Abstract
The objective of this paper is to develop a machine learning model that can classify cancer patients. Gene mutation-based treatment has a very good success ratio, but only a few cancer institutes follow it. This research uses natural language processing techniques to remove unwanted text and converts the categorical data into numerical data using response coding and one-hot encoding. Then we apply various classification algorithms to classify the training data. The proposed system has the advantage of reducing the time to analyze and classify clinical data of patients, which translates into less wait time for patients to get results from pathologists. The results of our experiment demonstrate that the Stacking Classifier algorithm with one-hot encoding and Term Frequency – Inverse Document Frequency (TF-IDF) techniques performs better than other machine learning methods, with around 67% accuracy on the test data.
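The two feature-encoding steps named in this abstract, one-hot encoding of categorical values and TF-IDF weighting of text, can be illustrated with a small standard-library sketch. It assumes the plain textbook formulas (term frequency as count divided by document length, inverse document frequency as log(N / df)); library implementations often use smoothed variants, and this is not the authors' pipeline. The function names and the toy inputs are hypothetical.

```python
import math
from collections import Counter


def one_hot(values):
    """Map each categorical value to a 0/1 vector over the sorted categories."""
    categories = sorted(set(values))
    position = {cat: i for i, cat in enumerate(categories)}
    vectors = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1
        vectors.append(row)
    return categories, vectors


def tf_idf(documents):
    """Unsmoothed TF-IDF: tf = count / doc length, idf = log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        rows.append([(counts[tok] / length) * math.log(n_docs / doc_freq[tok])
                     for tok in vocabulary])
    return vocabulary, rows


# Toy inputs for illustration only (not data from the chapter).
categories, encoded = one_hot(["BRCA1", "TP53", "BRCA1"])
vocabulary, weights = tf_idf(["mutation found in tumor", "mutation absent"])
print(categories, encoded)
print(vocabulary, weights)
```

A token that occurs in every document, such as "mutation" in the toy input, receives zero weight under this unsmoothed formula, which is how TF-IDF down-weights uninformative terms before the vectors reach a classifier.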
-
-
Machine Learning
-
Frontmatter
-
Chapter 29. Performance Comparison Between Deep Learning and Machine Learning Models for Gene Mutation-Based Text Classification of Cancer
Fulya Kocaman, Stefan Pickl, Doina Bein, Marian Sorin Nistor
The chapter delves into the critical issue of cancer diagnosis and treatment, focusing on the role of gene mutation-based text classification. It introduces the use of deep learning models such as Embedding Layer and Bidirectional LSTM, as well as machine learning classifiers like Random Forest and Stacking Classifiers, to analyze genetic mutations from clinical text. The study employs advanced techniques like BERT text augmentation to enhance data quality and model performance. The results highlight the challenges and potential of these methods in improving cancer diagnosis and personalized medicine. The paper concludes with a call for further research into pre-trained word embeddings and combining text analysis with medical image processing.
This summary of the content was generated with the help of AI.
Abstract
Identifying genetic mutations that contribute to cancer tumors is the key to diagnosing cancer and finding specific gene mutation-based treatment. It is a very challenging problem and a time-consuming job. Currently, clinical pathologists classify cancer manually, and they need to analyze and organize every single genetic mutation in cancer tumors from clinical text. The text data analysis can be automated using Deep Learning and Machine Learning classification techniques to ease the manual work needed to extract information from clinical text. This paper aims to analyze the performance of Machine Learning and Deep Learning methods to classify cancer from gene mutation-based clinical text data. This paper uses Natural Language Processing techniques, namely, CountVectorizer and TfidfVectorizer, and the Keras API's One-Hot encoding and to_categorical utility, to vectorize the categorical and text data and transform them into numerical vectors. Machine Learning classification algorithms and Deep Learning methods are then applied to the extracted features, and the most accurate combination of feature extraction and classifier is identified. The Keras API's Embedding Layer (Word Embeddings) and Bidirectional Long Short-Term Memory (Bidirectional LSTM) techniques, using original and augmented text data from the NLPAug library, are applied as Deep Learning methods. The Keras Word Embeddings using augmented text data performed best, with an accuracy of 80.67%, a weighted average precision of 0.81, recall of 0.81, F1 score of 0.81, and a log loss of 0.6391. As for the Machine Learning classification algorithms, Random Forest and Stacking classifiers are explored in this paper, and the highest accuracy of 67.02% is achieved by the Random Forest classifier, with a weighted average precision of 0.70, recall of 0.67, F1 score of 0.65, and a log loss of 1.0523.
-
Chapter 30. Stock Backtesting Engine Using Pairs Trading
Rahul Chauhan, Marian Sorin Nistor, Doina Bein, Stefan Pickl, Wolfgang Bein
The chapter introduces a Stock Backtesting Engine designed to test historical data using pairs trading strategy, a popular method in statistical arbitrage. Pairs trading involves finding stocks with similar historical price behaviors and betting on their convergence. The engine identifies cointegrated pairs using statistical methods, runs backtesting algorithms, and provides detailed analysis and visualization of trade effectiveness. The system architecture, including modules for data collection, pair finding, backtesting, and analysis, is explained. The chapter highlights the importance of cointegration tests, such as the Engle-Granger method, and demonstrates the engine's capabilities with real-world examples. Results from backtesting eight major companies are presented, showcasing the engine's potential to enhance trading strategies. The chapter concludes with limitations and future work, suggesting improvements like machine learning integration and web service conversion.
This summary of the content was generated with the help of AI.
Abstract
In this paper we present a Stock Backtesting Engine that tests historical data using a pairs trading strategy. We implemented the pairs trading strategy and ran it on historical data. We collect the S&P 500 data from the Internet and store it in a database. We then allow users to enter a set of stocks to find cointegrated pairs among them. We also provide an option to find the cointegrated pairs among all S&P 500 stocks. From a given set of stocks, the program finds all pairs that exhibit cointegration properties. Once such pairs are identified, we run the backtesting algorithm on them, and the program uses pairs trading methods to calculate trades for each stock. Finally, we provide an analysis of the trades executed by the algorithm with average and daily data, and plot a chart of daily profit and loss to showcase the effectiveness of the pairs trading strategy.
-
Chapter 31. Classifying Sincerity Using Machine Learning
Rachana Chittari, Marian Sorin Nistor, Doina Bein, Stefan Pickl, Abhishek Verma
The chapter delves into the challenge of classifying sincere and insincere questions on online forums, highlighting the limitations of human review systems. It explores various machine learning techniques, including traditional algorithms like Naïve Bayes and Support Vector Machines, and deep learning models like Recurrent Neural Networks and Long Short-Term Memory networks. The authors present a detailed methodology for preprocessing text data, feature extraction using word embeddings, and training a bidirectional LSTM model. The results and future improvements are discussed, making the chapter a valuable resource for those interested in applying machine learning to natural language processing tasks.
This summary of the content was generated with the help of AI.
Abstract
Quora is an online platform that empowers people to learn from each other. On Quora, users can post questions and connect with others who contribute unique insights and quality answers. But as with any other social media or online platform, there is the potential for misuse. A key challenge in maintaining the integrity of such an online platform is to classify and flag negative content. On Quora, the challenge is to identify insincere questions. Insincere questions may be founded upon false premises, be disparaging or inflammatory, be intended to arouse discrimination in any form, or be intended to make a statement rather than to seek helpful answers. We propose to develop a text classification model that correctly labels questions as sincere or insincere on the Quora platform. For this purpose, we used the Quora Insincere Questions Classification dataset, which is available on Kaggle. We first trained classical machine learning models such as logistic regression and SVMs to establish a performance baseline. However, to leverage the large dataset, we used neural network-based models. We trained several models, including standard neural networks and LSTM-based models. The best model that we obtained is a two-layer bidirectional LSTM network that takes word embeddings as inputs. The classification accuracy and F1-score for this model were 96% and 0.64, respectively.
-
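A high accuracy next to a much lower F1-score, as reported in the abstract, is typical when one class is rare. The sketch below computes both metrics from confusion-matrix counts to show how this can happen. The counts are made up for illustration and are not taken from the chapter or the dataset; the function name is likewise hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Here "positive" stands for "insincere".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts: 1000 questions, of which 60 are insincere.
    metrics = classification_metrics(tp=40, fp=20, fn=20, tn=920)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

With these hypothetical counts the classifier is right on 960 of 1000 questions, yet it misses a third of the insincere ones, so F1 is a more informative measure than accuracy for this kind of task.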
Chapter 32. Recommendation System Using MixPMF
Rohit Gund, James Andro-Vasko, Doina Bein, Wolfgang Bein
The chapter introduces a novel recommendation system using MixPMF, a hybrid model that integrates Probabilistic Matrix Factorization (PMF), Constrained PMF (CPMF), and Kernelized PMF (KPMF). It addresses the challenges faced by existing recommendation systems in handling large, sparse, and imbalanced datasets, particularly in the context of music and movie recommendations. The MixPMF model leverages user network information and artist tag data to enhance recommendation accuracy. The chapter provides a comprehensive overview of the architecture, background work, model training, and evaluation of the MixPMF system. It demonstrates the superior performance of MixPMF compared to PMF, CPMF, and KPMF models through RMSE evaluations. The chapter concludes with potential future enhancements, including the development of a GUI with automatic playlist generation and the exploration of parallel processing techniques for scalability.
This summary of the content was generated with the help of AI.
Abstract
The objective of this paper is to use the Probabilistic Matrix Factorization (PMF) model, which scales linearly with the number of observations and performs well on large, sparse, and imbalanced music/movie datasets. In this project, we compare various PMF-based models and apply them to the recommendation system. We design and develop a Mix Probabilistic Matrix Factorization (MixPMF) model for music recommendation. This new model takes advantage of user network mapping and artist tag information to form the effective rating matrix, and is thus efficient in recommending music/movies to new users. Simulation results show the advantage of our model.
-
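To make the basic PMF idea concrete, the following is a minimal sketch of plain matrix factorization trained by stochastic gradient descent with L2 regularization, which corresponds to the usual MAP formulation of PMF. It does not implement MixPMF, CPMF, or KPMF, and it is not the chapter's code. The function names, hyperparameters, and toy ratings are assumptions made for this example.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def init_factors(n_rows, rank, rng):
    """Small random latent vectors, one per user or item."""
    return [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_rows)]


def train_pmf(ratings, n_users, n_items, rank=2, lr=0.05, reg=0.02,
              epochs=300, seed=0):
    """Fit latent factors U (users) and V (items) to observed ratings.

    ratings is a list of (user_index, item_index, rating) triples.
    Each epoch visits every observed rating once, so the cost per epoch
    grows linearly with the number of observations.
    """
    rng = random.Random(seed)
    U = init_factors(n_users, rank, rng)
    V = init_factors(n_items, rank, rng)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(U[u], V[i])
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                # Gradient step on squared error plus L2 penalty.
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V


def rmse(ratings, U, V):
    """Root-mean-square error of predictions on the given ratings."""
    se = sum((r - dot(U[u], V[i])) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


if __name__ == "__main__":
    # Made-up ratings for 3 hypothetical users and 3 hypothetical items.
    toy = [(0, 0, 5.0), (0, 1, 3.0), (0, 2, 1.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
    U, V = train_pmf(toy, n_users=3, n_items=3)
    print("training RMSE:", rmse(toy, U, V))
```

The RMSE here is computed on the training ratings only; a model comparison like the one the chapter summary mentions would evaluate RMSE on held-out ratings.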
Chapter 33. Abstractive Text Summarization Using Machine Learning
Aditya Dingare, Doina Bein, Wolfgang Bein, Abhishek Verma
The chapter delves into the various types of text summarization, including single document, multi-document, informative summary, and query-focused summary. It differentiates between abstractive and extractive text summarization methods, highlighting the challenges and advantages of each. The authors apply three text summarization algorithms (extractive summarization using NLTK, extractive summarization using TextRank, and abstractive summarization using Seq-to-Seq) to the Amazon Product Review dataset. The chapter presents a detailed analysis of the algorithms' performances, limitations, and potential future improvements, offering valuable insights for professionals in the field of natural language processing and machine learning.
This summary of the content was generated with the help of AI.
Abstract
Text summarization creates a brief and succinct summary of the original text. The summarized text highlights the main text’s most interesting points without omitting crucial details. There is a plethora of applications on the market that include news summaries, such as Inshort and Blinklist, which save not only time but also effort. Manually summarizing a text can be time-consuming. Fortunately, the process can be automated using algorithms. We apply three text summarization algorithms to the Amazon Product Review dataset from Kaggle: extractive text summarization using NLTK, extractive text summarization using TextRank, and abstractive text summarization using Seq-to-Seq. We present the advantages and disadvantages of these three methods.
-
Chapter 34. Intelligent System for Detection and Identification of Ground Anomalies for Rescue
Antonio Dantas, Leandro Diniz, Maurício Almeida, Ella Olsson, Peter Funk, Rickard Sohlberg, Alexandre Ramos
The chapter delves into the importance of capturing images of the earth’s surface for various purposes, particularly in search and rescue (SAR) activities. It discusses the development of methods and tools by Brazilian and Swedish researchers to interpret soil images obtained by sensors embedded in unmanned aerial vehicles (UAVs). The challenges faced by these systems, such as adverse conditions and lighting changes, are highlighted. The chapter also presents recent works that use artificial intelligence for human detection, focusing on techniques like convolutional neural networks (CNN) and algorithms such as Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO). Additionally, it explores the methodology for identifying anomalies in images, including pre-processing techniques and the use of CNNs for efficient processing. The chapter concludes by proposing a decision support system for search and rescue, emphasizing the need for a dataset that reflects real-world scenarios.
This summary of the content was generated with the help of AI.
Abstract
The search for and identification of people lost in an emergency is a very important activity, carried out to assist people whose lives are in danger when support technologies are unavailable. Unmanned aerial vehicles (UAVs) contribute directly to this activity by covering a larger area in a shorter operating time. A brief state-of-the-art survey is presented, and specific needs are raised for accuracy and practical application. In this joint work, image recognition alternatives with low processing requirements for use on UAVs are explored to face challenges such as location in large areas, small targets, and orientation. Assessments are performed on real images using Inception, SSD, and YOLO. For tracking, MIL, KCF, and Boosting techniques are applied to empirically observe the results of test missions. Preliminary results show that the proposed method is viable for the search and rescue of people.
-
- Title
- ITNG 2022 19th International Conference on Information Technology-New Generations
- Editor
- Dr. Shahram Latifi
- Copyright Year
- 2022
- Publisher
- Springer International Publishing
- Electronic ISBN
- 978-3-030-97652-1
- Print ISBN
- 978-3-030-97651-4
- DOI
- https://doi.org/10.1007/978-3-030-97652-1