
Open Access 01.12.2019 | Research

An analytical study of information extraction from unstructured and multidimensional big data

Authors: Kiran Adnan, Rehan Akbar

Published in: Journal of Big Data | Issue 1/2019



Abstract

The process of information extraction (IE) is used to extract useful information from unstructured or semi-structured data. Big data raise new challenges for IE techniques with the rapid growth of multifaceted, also called multidimensional, unstructured data. Traditional IE systems are inefficient at dealing with this huge deluge of unstructured big data. The volume and variety of big data demand improved computational capabilities from these IE systems. It is necessary to understand the competency and limitations of the existing IE techniques related to data pre-processing, data extraction and transformation, and representation for huge volumes of multidimensional unstructured data. Numerous studies have been conducted on IE, addressing the challenges and issues for different data types such as text, image, audio and video. Very limited consolidated research has investigated the task-dependent and task-independent limitations of IE covering all data types in a single study. This research work addresses this limitation and presents a systematic literature review of state-of-the-art techniques for a variety of big data, consolidating all data types. Recent challenges of IE are also identified and summarized. Potential solutions are proposed, giving future research directions in big data IE. The research is significant in terms of recent trends and challenges related to big data analytics. The outcome of the research and the recommendations will help to improve big data analytics by making it more productive.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Abbreviations
AED
acoustic event detection
ANN
artificial neural network
ASR
automatic speech recognition
AVS
automatic video summarization
BFM
Bayesian fusion model
CNN
convolutional neural network
CRF
conditional random field
CS
code switching
DBN
deep belief network
DL
deep learning
DPP
determinantal point process
DTW
dynamic time warping
ECR
error classification rate
EE
event extraction
EHR
electronic health record
FR
face recognition
GMM
Gaussian mixture model
GRNN
general regression neural network
HMM
hidden Markov model
IDC
International Data Corporation
IE
information extraction
LBM
learning based method
LDA
Latent Dirichlet allocation
LOD
linked open data
LSTM
Long Short Term Memory
MEMM
maximum entropy Markov model
MFCC
Mel frequency cepstral coefficient
ML
machine learning
MSR
minimum sparse representation
NER
named entity recognition
NLP
natural language processing
NMS
non-maximum suppression
NN
neural network
OCR
optical character recognition
PDN
public domain network
RBM
rule based methods
RCNN
Region Convolutional Neural Network
RDF
Resource Description Framework
RE
relation extraction
RL
reinforcement learning
SLR
systematic literature review
STT
speech to text
SVM
support vector machine
TF-IDF
term frequency-inverse document frequency
TIE
text information extraction
TR
text recognition
UBM
Universal Background Model
UCI
University of California, Irvine
VGG
Visual Geometry Group
VRD
visual relationship detection
VRL
variation structured reinforcement learning
WER
word error rate

Introduction

The information extraction (IE) process extracts useful structured information from unstructured data in the form of entities, relations, objects, events and many other types. The information extracted from unstructured data is used to prepare data for analysis. Therefore, efficient and accurate transformation of unstructured data during IE improves data analysis. Numerous techniques have been introduced for different data types, i.e. text, image, audio, and video.
Advances in technology have promoted rapid growth in data volume in recent years. The volume, variety (structured, unstructured, and semi-structured data) and velocity of big data have also changed the paradigm of computational capabilities of systems. IBM estimated that more than 2.5 quintillion bytes of data are generated every day. It has also been predicted that unstructured data from diverse sources will grow to 90% of all data within a few years. IDC estimated that unstructured data will constitute 95% of global data in 2020, with an estimated 65% annual growth rate [1]. The common characteristics of unstructured data are: (i) it comes in multiple formats [2-5] (text, images, audio, video, blogs, websites, etc.); (ii) it is schema-less due to non-standardization [2-4]; (iii) it comes from diverse sources (e.g. social media, clouds, sensors, etc.) [2-4, 6].
Due to the huge volume and complexity of unstructured data, it has become a tedious task to extract useful information from different types of data. In this regard, a systematic literature review has been conducted to identify state-of-the-art challenges. The primary contribution of this work is twofold. First, a systematic review of existing techniques for IE subtasks for each data type, i.e. text, image, audio and video. The systematically extracted and synthesized knowledge can be leveraged by researchers to understand the concept of IE, its subtasks for each data type and state-of-the-art techniques. Second, a taxonomy of IE research is designed to identify and classify the challenges of IE in the big data environment. The main categories include task-related challenges and unstructured data-related challenges. Finally, an IE improvement model is designed to overcome the identified limitations of existing IE techniques for multidimensional unstructured big data.
The remainder of the document is organized as follows: the research methodology with all phases and activities is presented in “Research methodology” section. “Information extraction from text” section presents a detailed discussion of IE subtasks such as NER, RE and EE, their techniques and a comparison of techniques for text data. In “IE from images” section, visual relationship detection, text recognition and face recognition techniques as IE subtasks, recent work, and limitations are described. “Audio IE” section presents a detailed discussion of IE from audio and its subtasks, such as AED and ASR, with state-of-the-art techniques and challenges. Text recognition and automatic video summarization are elaborated in “Video IE” section. Results and discussion of this systematic literature review are presented in “Results and discussion” section, whereas “Conclusion” and “Future work” sections present the conclusion and future work, respectively.

Research methodology

A systematic literature review (SLR) is a process to identify, select and critically analyze research in order to answer identified research questions. Transparency, clarity, integration, focus, equality, accessibility and coverage are key principles of an SLR. It is a comprehensive investigation of existing literature on the identified research question. Therefore, the SLR method has been selected for this review of IE solutions for unstructured big data, following well-established guidelines [7, 8]. An SLR is suitable for this study because it provides guidelines to conduct the review and present findings in a systematic way. Generally, the SLR process is divided into three main phases: planning, conducting and reporting the review. These phases and their corresponding activities followed in this review are depicted in Fig. 1.

Planning the review

The activities performed during the planning phase of the SLR are as follows:
A.
Research questions
The research questions and their rationale have been given in Table 1.
Table 1
Research questions and rationale

RQ1: What are the state-of-the-art approaches for IE from unstructured big data?
Rationale: To explore the state-of-the-art approaches for IE in big data environment for text, images, audio, and video data

RQ2: What are the issues related to the unstructured big data IE for different types of data?
Rationale: To investigate the impact of unstructured big data on IE techniques

RQ3: What are the common challenges of IE from a variety of big data?
Rationale: To identify the common challenges for IE from the variety of unstructured data types i.e. text, images, audio, and video

B.
Search string and data sources
The following search strings have been used to search the most relevant literature to address the research questions.
TITLE-ABS-KEY ((“information extraction” OR “information extraction system” OR “visual relationship” OR “named entity” OR “relation extraction” OR “event extraction” OR “summarization” OR “speech recognition”) AND (“big data” OR “large-scale data” OR “large data” OR “volume”) AND (“unstructured data” OR “nonstructured data” OR “nonrelational data” OR “free text” OR “image” OR “audio” OR “video”)).
ACM, IEEE Xplore, Springer, ScienceDirect, Scopus, and Wiley online library were selected as data sources for this review. The search was conducted in April 2019 using advanced search on the identified data sources. The details of searched and selected articles from each data source are presented in Table 2.
Table 2
Data sources and publication counts for each step of phase 2 of the SLR

Data source | Searched results | Selected based on title | Selected based on abstract | Selected based on full study + duplicate removal
Wiley Online Library | 1012 | 531 | 24 | 3
Scopus | 461 | 146 | 31 | 12
Springer | 548 | 204 | 47 | 22
IEEE Xplore | 203 | 124 | 68 | 36
ACM | 633 | 183 | 42 | 10
ScienceDirect | 281 | 122 | 36 | 8
Total | 3138 | 1310 | 248 | 91

C.
Inclusion conditions
The inclusion criteria have been defined to select the most relevant research studies according to the research questions. The inclusion criteria for this study are as follows:
i. Research work published between January 2013 and April 2019 inclusive.
ii. Studies conducted in the English language.
iii. Studies related to IE for text, images, audio and/or video.
iv. Research work on unstructured data.
v. Research work on data analytics.
vi. Research work related to IE techniques for big data, implicitly or explicitly.

D.
Exclusion conditions
i. Studies in languages other than English.
ii. Short papers, presentations, keynotes, and articles.
iii. Duplicate or redundant studies.
iv. Studies not relevant to the research questions.
v. Research work older than January 2013.


Conducting the review

After planning the review, studies were refined and selected based on the inclusion and exclusion criteria and filtered according to their relevance to the study objectives. The selection process started with reading the “title” of the retrieved studies; studies were then filtered on the basis of “abstract” and “keywords” and finally selected on the basis of “full article reading”. The publication count at each step of selecting the most relevant studies for this review is presented in Table 2.

Reporting the review

Figure 2 illustrates the publication venues for each data type from 2013 to 2018, and Fig. 3 illustrates the selected studies distribution over data sources.
Table 3 presents a summary of the categorization of selected studies according to each data type.
Table 3
Distribution of selected studies w.r.t. study type and data types

Category | Subcategory | Selected studies | J | Ch | C | Total
Related to text IE | Named entity recognition | [9-16] | 4 | 3 | 1 | 8
Related to text IE | Relation extraction | [17-23] | 3 | 3 | 1 | 7
Related to text IE | Entity + relation extraction | [24-29] | 4 | 1 | 1 | 6
Related to text IE | Event extraction | [30-35] | 3 | 0 | 3 | 6
Related to text IE | Total selected studies for text | | 14 | 7 | 6 | 27
Related to images IE | Visual relationship detection | [36-45] | 3 | 0 | 7 | 10
Related to images IE | Text extraction from images | [46-57] | 3 | 1 | 8 | 12
Related to images IE | Face recognition | [58-61] | 2 | 2 | 0 | 4
Related to images IE | Total selected studies for images | | 8 | 3 | 15 | 26
Related to audio IE | Acoustic event detection | [62-68] | 4 | 0 | 3 | 7
Related to audio IE | Automatic speech recognition | [69-79] | 9 | 0 | 2 | 11
Related to audio IE | Total selected studies for audio data | | 13 | 0 | 5 | 18
Related to video IE | General information extraction from video | [80-82] | 0 | 0 | 3 | 3
Related to video IE | Text recognition | [83-92] | 4 | 1 | 5 | 10
Related to video IE | Automatic video summarization | [93-99] | 1 | 2 | 4 | 7
Related to video IE | Total selected studies for video data | | 5 | 3 | 12 | 20
Total selected studies | | | 40 | 13 | 38 | 91

J journal article, Ch chapter, C conference
A.
Process validation
The key threats to the validity of the SLR process concern “study selection”, “inaccurate data extraction”, “inaccurate classification” and “potential author bias”. To ensure process validity for this SLR, two authors were involved in the “selection” and “classification” of each study, and a mutual understanding was developed for conflict resolution between the authors.
 

Information extraction from text

The term NLP refers to methods for interpreting data spoken or written by humans. In processing human languages with NLP, several tasks such as machine translation, question answering, information retrieval, information extraction and natural language understanding are considered high-level tasks. The process of information extraction (IE) is one of the important tasks in data analysis, KDD and data mining [100], which extracts structured information from unstructured data. IE is defined as “extract instances of predefined categories from unstructured data, building a structured and unambiguous representation of the entities and the relations between them” [101].
One of the aims of IE is to populate knowledge bases to organize and access useful information. IE takes a collection of documents as input and generates different representations of the relevant information satisfying different criteria. IE techniques efficiently analyze free-form text by extracting the most valuable and relevant information in a structured format. Hence, the ultimate goal of IE techniques is to identify salient facts in the text to enrich databases or knowledge bases. The following subsections discuss the literature selected in the SLR process according to the IE subtasks for text data.

Named entity recognition (NER)

Named entity recognition is one of the important tasks of IE systems and is used to extract descriptive entities. It helps to identify generic or domain-independent entities such as locations, persons and organizations, and domain-specific entities such as diseases, drugs, chemicals, proteins, etc. In this process, entities are identified and semantically classified into pre-defined classes [102]. Traditional NER systems used rule-based methods (RBM), learning-based methods (LBM) or hybrid approaches [103]. IE together with NLP plays a significant role in language modeling and contextual IE using morphological, syntactic, phonetic, and semantic analysis of languages. Morphologically rich languages like Russian and English make the IE process easier. IE is harder for morphologically poor languages because extra effort is needed to define morphological rules for noun extraction when a complete dictionary is not available [104].
Question answering, machine translation, automatic text summarization, text mining, information retrieval, opinion mining and knowledge-base population are major applications of NER [105]. Hence, high efficiency and accuracy of NER systems are very important, but big data brings new challenges to these systems, i.e. volume, variety and velocity. In this regard, this review investigates these challenges and explores the latest trends. Table 4 presents related work on NER using unstructured big data sets. It summarizes the techniques, the motivation behind the research, the domain, the dataset used and the evaluation of the proposed solutions in order to identify the limitations of traditional techniques, the impact of big data on NER systems and the latest trends. Evaluation of proposed IE techniques is performed using precision, recall and F1-score. Precision and recall measure correctness and completeness, respectively. The F1-score measures the accuracy of the system as the harmonic mean of precision and recall [106, 107].
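For reference, a compact restatement of these standard measures in terms of true positives (TP), false positives (FP) and false negatives (FN) is given below (this is the textbook formulation, not quoted verbatim from [106, 107]):

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot P \cdot R}{P + R}
```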
Table 4
Named entity recognition
 
Technique
Purpose
Domain
Dataset
Results
P%
R%
F%
[9]
Self-training with CNN-LSTM-CRF
To improve the performance and accuracy of NER for large scale unlabeled clinical documents
Medical (clinical text)
19,378 patient data
84.2
85.5
84.4
[10]
CNN + Bi-LSTM and CRF
To improve the performance of NE extraction on OMD large size data and complex data structure without manual rules or features i.e. to deal with volume and variety
Online medical diagnosis text (EMR)
untagged corpus of 320,000 Q&A records of the online Q&A website
Trained with 1/3, 2/3 and all of the data to compare experimental performance, yielding 87.26%, 88.79% and 90.31% F-measure, respectively
[11]
Comparison of BioNLP task with 3 sequence labeling techniques: CRF, MEMM, SVMhmm using one classifier SVMmulticlass
To evaluate the performance of ML methods and to identify best features for automatic extraction of habitat entities
Features Used: orthographic, morphological, syntactic, semantic
Bacterial Biotope entities
BioNLP 2 datasets BB2013, BB2016
CRFs and SVMhmm have comparable performance, but CRFs achieve higher precision whereas SVMhmm has better recall
CRFs and MEMM are shown to be more robust than SVMhmm under poor feature conditions
[12]
SML based pi-CASTLE: Crowd assisted IE system
To store text annotation in database and addresses the challenges of probabilistic data model, selection of uncertain entities, integration of human entities
 
For NER: CoNLL 2003 corpus, TwitterNLP dataset with 2400 unstructured tweets
pi-CASTLE achieves an optimal balance between cost, speed and accuracy for IE problems
[13]
Hybrid method to automatically generate rule
To automatically extract and structured patient related entities from large scale data
Diagnosis extraction
EHR clinical notes of 9.5M patient records
5 use cases applied to prove the modularity, extensibility, scalability, and flexibility
[14]
Unsupervised ML (clustering)
To examine the impact of volume on three unsupervised ML methods (spectral, agglomerative, and K-Means clustering)
Facebook posts
314,773 posts by companies and 1,427,178 posts by users for these companies
40.7
83.5
56.3
Spectral clustering performed better on larger datasets
[15]
Grammar rules + MapReduce
To handle large amount of data with parallelization
Suitable for incomplete datasets
Free text
3 different text datasets with 1293, 689, 1654 sentences resp.
The results show better recall on 3 text datasets but low precision
It has been identified that text ambiguity, lack of resources, complex nested entities, identification of contextual information, noise in the form of homonyms, language variability and missing data are important challenges in entity recognition from unstructured big data [11, 16, 105]. It has also been found that the volume of unstructured big data has shifted the technological paradigm from traditional rule-based or learning-based techniques to more advanced techniques. Variants of deep learning techniques such as CNN perform better for these NER systems [9, 10].
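As a minimal, hedged illustration of what an off-the-shelf NER component produces (using the spaCy library and its small English model as an assumed setup, not a technique evaluated in the reviewed studies):

```python
# Minimal NER sketch using spaCy. Assumes: pip install spacy
# and: python -m spacy download en_core_web_sm. Illustrative only.
import spacy

nlp = spacy.load("en_core_web_sm")          # small pre-trained English pipeline
doc = nlp("Springer published the Journal of Big Data in Berlin in 2019.")

# Each recognized entity carries its surface text and a pre-defined class label.
for ent in doc.ents:
    print(ent.text, ent.label_)             # e.g. "Berlin GPE", "2019 DATE"
```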

Relation extraction (RE)

Relation extraction (RE) is a subtask of IE that extracts meaningful relationships between entities. Entities and relations are used to correctly annotate data by analyzing its semantic and contextual properties. Supervised approaches use feature-based and kernel-based techniques for RE. DIPRE, Snowball and KnowItAll are examples of semi-supervised RE [108]. Several supervised, weakly supervised and self-supervised approaches have been introduced to extract one-to-one and many-to-many relationships between entities. In the reviewed studies, various lexical, semantic, syntactic and morphological features are extracted, and relationships between entities are then identified using learning-based techniques. Table 5 summarizes the work on relation extraction and entity-relationship pairs.
Table 5
Relation extraction
 
Technique
Purpose
Domain
Dataset
Results
P%
R%
F%
[17]
CRF
To generate relationship knowledge base and annotation
Lexical, POS and semantic features used
Chinese encyclopedia
52,975 web pages
The model was trained for 9 attributes; the accuracy of global training was higher than that of local training, whereas the recall rate was low
[18]
Knowledge oriented CNN with clustering using word filters (WordNet)
To overcome the limitations of RBM and LBM and to reduce the dimensionality
Text
3 datasets were used: SemEval-2010 task 8 with 10,717 annotated samples, Causal-TimeBank dataset, Event StoryLine dataset
With max clustering achieved 91.34, 76.21, 81.84% macro averaged F1 on SemEval, Casual-TB, Event-SL resp., whereas, with average clustering, it achieved 91.20, 75.43, 81.96% F1 resp.
[19]
Pattern-based method to build info network
To extract large-scale treatment drug-disease pairs and inducement drug-disease pairs
Medical literature for drug repurposing
27M abstracts and titles from PubMed
Algorithm has shown high precision but low recall
[20]
Weakly supervised method without man-made annotation and SVM to train model
To reduce the manual annotation effort and expand the relation types using semantic and syntactic features
News text
Baidu encyclopedia, 50,000 entry pages of 10 GB size
83.61
82.63
83.12
Results showed that entity ambiguity and poor universality affect performance
[21]
Multi-class SVM and syntactic model development
To detect semantic relation, model architecture with preprocessing phase to build feature vector using lexical, semantic and syntactic features, training phase and RE phase
News Text
ReACE
80.18
70.89
75.25
Traditional learning-based or rule-based techniques are insufficient to handle the volume and dimensionality of unstructured big data [18]. Supervised LBM need large annotated corpora, and it is a very laborious task to annotate large data sets manually. To reduce the manual annotation effort, weakly supervised methods are more effective [20]. Semantic RE with appropriate features [17, 21] and semantic annotation [17, 20] are two critical challenges of RE.
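As a minimal sketch of the supervised, feature-based RE setting described above (the tiny labeled corpus and relation labels are hypothetical, and bag-of-words features are deliberately simpler than the lexical, syntactic and semantic features used in the reviewed studies):

```python
# Minimal supervised relation-classification sketch with scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

sentences = [
    "Aspirin is used to treat headache.",
    "Smoking causes lung cancer.",
]
labels = ["treats", "causes"]               # hypothetical relation types

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(sentences, labels)

print(model.predict(["Ibuprofen is used to treat fever."]))  # -> ['treats']
```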
Table 6 presents research work on extracting entities and their relationships from free-text corpora. Most traditional RE techniques extracted only one-to-one relationships between entities due to limited text input. In this regard, many-to-many relations have been identified from large-scale datasets, which reduces time and increases performance efficiency. Apache Hadoop provides a platform to parallelize many-to-many relation extraction using MapReduce; such a system was evaluated on 100 GB of free text and many-to-many relationships were identified [24]. Traditional methods are ineffective at handling data sparsity and scalability [24]. Distant supervised learning, CNN and transfer learning have outperformed existing traditional methods [23, 25, 26]. A small MapReduce-style sketch is given below, before the table.
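This is a minimal, hedged sketch of the MapReduce-style parallelization mentioned above, written as Hadoop-streaming-like mapper and reducer functions that count co-occurring entity pairs per sentence (the entity list and input format are assumptions, not the pipeline of [24]):

```python
# MapReduce-style sketch: count entity-pair co-occurrences per line of text.
# In Hadoop streaming, mapper() and reducer() would run as separate scripts
# reading stdin; here they are plain functions for illustration.
from collections import defaultdict
from itertools import combinations

ENTITIES = {"aspirin", "headache", "smoking", "cancer"}   # assumed gazetteer

def mapper(line):
    """Emit ((entity1, entity2), 1) for every entity pair found in the line."""
    found = sorted({tok.strip(".,").lower() for tok in line.split()} & ENTITIES)
    for pair in combinations(found, 2):
        yield pair, 1

def reducer(pairs):
    """Sum the counts emitted by all mappers for each entity pair."""
    counts = defaultdict(int)
    for pair, value in pairs:
        counts[pair] += value
    return counts

corpus = ["Aspirin relieves headache.", "Smoking causes cancer.",
          "Headache patients often take aspirin."]
print(reducer(kv for line in corpus for kv in mapper(line)))
```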
Table 6
Entity and relation pair extraction
 
Technique
Purpose
Domain
Dataset
Results
P%
R%
F%
[25]
Transfer learning for domain dependent clustering
To adapt the world knowledge to the domain-dependent tasks by using semantic parsing and semantic filtering
News text
20 Newsgroups, RCV1
Case studies conducted to prove that conceptualization based semantic filter can produce more accurate indirect supervision
[26]
Distant supervised learning (deep learning)
To overcome the limitations of text mining methods such as clustering or rule-based etc. in keyword and information extraction with technology dependency graph
Scientific literature
473,935 articles, labeled 38 relation instances from 20 articles and expanded to 573 instances by bootstrapping
Case study: Technology driven graph to analyze the technology architecture of DSSC
[22]
MapReduce + semantic methods (attribute based, isA based and class based) + logistic regression
To overcome the long tail challenge using Sparse IE approach
To deal with scalability and effectiveness
Web pages
1.68 B web pages
Many entity pairs were identified and classified as good and bad pairs. Precision, recall and F-measure for each entity pair are reported
[27]
Supervised Kernel methods
To extract morpho-syntactic information from mined text
To deal with challenges of data prioritization and curation
Biomedical
EU-ADR
The proposed method using morpho-syntactic and dependency information outperforms others in identifying entity relationships
[28]
Use of declarative rules in contextual exploration
Automatic detection and extraction of meaning from unstructured web using RDF WordNet, DBpedia, etc.
To bypass the limitation of lack of annotated data semantically and automatically usable using LOD
Free text
Large text corpuses provided by the labex OBVIL and the BNF (National Library of France)
The EC3 software was implemented and showed a considerable contribution to detecting the real meaning of text
[29]
CRF and dictionary for NER, word clustering through Unsupervised training
Chemistry aware NLP pipeline with tokenization, POS tagging, NRE and phrase parsing
To populate chemical databases with minimal time, effort and expense
Scientific documents
50 open access chemistry articles
89.1
86.6
87.8
[24]
Hadoop (MapReduce)
To identify many to many relationships with less training data
Free text
100 GB-sized corpus, baike.baidu.com: big encyclopedia having 700M entries
Proposed Snowball++ achieved higher positive pairs as compared to snowball and PROSPERA
[23]
CNN (weakly supervised)
To obtain high-precision data and automatically generate annotated training sample set
Medical
Experiment selected seven medical sites, generate a total of 20,000 labeled samples at last and five categories of directional relations
91.87
91.58
89.08

Event extraction (EE) and salient facts extraction

An event consists of a trigger and arguments. A trigger is a verb or nominalized verb that denotes the presence of an event, whereas the arguments are usually entities that are assigned semantic roles describing their contribution to the event description [30]. The literature on event extraction and other salient fact extraction is summarized in Table 7 (a small representation sketch is given below, before the table).
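A minimal sketch of this trigger-plus-arguments representation as a data structure (the example event and role names are illustrative assumptions):

```python
# Illustrative event representation: a trigger plus role-labeled arguments.
from dataclasses import dataclass, field

@dataclass
class Event:
    trigger: str                      # verb or nominalized verb, e.g. "acquired"
    event_type: str                   # pre-defined event class
    arguments: dict = field(default_factory=dict)  # semantic role -> entity

event = Event(
    trigger="acquired",
    event_type="Business.Acquisition",
    arguments={"Acquirer": "Company A", "Acquired": "Company B", "Time": "2019"},
)
print(event)
```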
Table 7
Event extraction and salient fact extraction
Task
Approach
Dataset
Results
Remarks
Term context understanding to deal with homonyms [31]
Semi-automated approach combining automated content analysis and ANN
26,259 research articles from Web of science
Proposed solution evaluated with different sparsity parameters. Results showed different effects of different modeling terms on error rate
The proposed solution outperformed with manual classification in some instances that could not automatically be classified. Hence, improvement is required to automate the sifting process of homonyms context identification
IE from heterogeneous unstructured big data [32]
Unsupervised deep learning (multiple Kernel)
13 different datasets from UCI Machine Learning Repository
Performance of the proposed system was better in speed from other competitors and same in accuracy
Accuracy of heterogeneous data can be improved with unsupervised learning but advancement in approach is required to handle the dynamicity of such data
Deep semantic IE for big data mining from geoscience data [33]
convolutional neural networks (CNN) for classification and TF-IDF for word statistics
Multivariate and heterogeneous data of 16,098 PDN, 130 LAN’s
classification accuracy of 99.9% and 99.8% at the sentence and paragraph levels, respectively
Insufficient comprehensiveness, poor correlation and inconsistent formats are problems of heterogeneous data
Open domain event extraction [34]
Schema discovery based on probabilistic generative models i.e. LinkLDA
Set of events generated and extracted from Twitter
Unlike related work, the proposed approach can handle complex queries and structured data browsing
The sparsity of unstructured big data can decrease the performance and scalability of the solution, so these are important factors in investigating the effectiveness of the approach
Biomedical Event extraction [35]
Syntactic and semantic features to identify event trigger + Phrase Structure Tree
BioNLP-ST 2013
The solution was evaluated and shown 52.23% precision, 26.38% recall, and 35.06% F1-score
The proposed approach uses ML features and therefore inherits the limitations of ML feature-based techniques
The present study identifies several challenges in IE from unstructured big data related to volume, variety and IE techniques. Unstructured big data comes with heterogeneity of data types, different representations and complex semantic interpretation. These intrinsic problems of unstructured data create challenges for big data analysis. To make unstructured data ready for analysis, it must be transformed into structured content; the IE process must therefore be efficient enough to improve the effectiveness of big data analysis. Heterogeneity, dimensionality and diversity of data are important to handle for IE over big data [32, 33]. Moreover, as the volume of unstructured data doubles roughly every year [1], it is becoming more critical to extract semantic information from such a huge deluge of unstructured data. Big data also bring challenges for learning-based approaches, namely dimensionality of data, scalability, distributed computing, adaptability and usability [109-111]. In this regard, advances in learning-based approaches aim to handle the complexity of big data.

State-of-the-art IE techniques

Two major categories of IE techniques are rule-based methods (RBM) and learning-based methods (LBM). It is difficult to determine which method is more popular and effective for IE; two studies [112, 113] have shown quite different analyses. First, according to a systematic literature review comparing the popularity of the two methods, more than 60% of the studies included in that review used purely rule-based IE systems, even though rule-based IE techniques are often considered obsolete in the academic research domain [112]. A second comparison demonstrated quite different results by examining 177 research papers from four specific NLP conferences: among these 177 papers, only 6 relied on a purely rule-based IE approach [113]. It was also observed that the IE systems of large vendors, i.e. IBM, SAP and Microsoft, are purely rule-based [113]. This review identifies that LBM are more popular in the academic research domain compared to RBM, but the importance of RBM cannot be neglected. However, the comparison of these two approaches is subjective and depends on factors such as cost, benefits and task specifications. Table 8 presents a comparison of the two approaches in general.
Table 8
Rule-based vs learning-based techniques
Rule-based approaches
Learning-based approaches
Interpretable and suitable for rapid development and domain transfer [114]
The performance of machine learning approaches is better in terms of precision and recall but appropriate feature selection is important [115]
Humans and machines can contribute to the same model. So it is easy to incorporate domain knowledge [114]
Heavily rely on domain thesauri [11]
Generating training data is time consuming in learning-based approaches whereas rule-based approaches require pre-defined vocabularies [116]
Although rule-based systems require domain knowledge and are time consuming, results proved that these are more reliable and useful for automated processing [117]
No experts are required and system can be developed quickly with relatively low cost [118]
Declarative [119]
Adaptable [119]
Requires tiresome manual work [118]
Less manual effort [118]
Highly transparent and expressive
Higher portability than rule-based [9]
The comparative analysis explores the pros and cons of both approaches, but the selection of an approach for any task is highly dependent on the user needs and the task at hand because IE is a community-based process [100]. In general, learning-based approaches are divided into supervised, semi-supervised and unsupervised techniques. These techniques also have limitations in handling large-scale big datasets and the complexity of huge volumes of unstructured data. Supervised techniques require manually labeled training data, which is one of their major drawbacks; constructing a large-scale labeled corpus is a laborious and time-consuming task [9]. These techniques are effective for domain-specific IE where specific information is required to be extracted, and their efficiency also depends on the selected features, such as morphological, syntactic, semantic and lexical features. Unsupervised IE techniques, in contrast, do not need labeled data: they extract entity mentions from the text, cluster similar entities and identify relations [120]. In this case, intensive data preprocessing is required for big data because unstructured big datasets have missing values, noise and other errors [16] that produce uninformative as well as incoherent extractions. Semi-supervised techniques use both labeled and unlabeled corpora with a small degree of supervision [121]. For large-scale data, distant supervised learning [26], deep learning (CNN, RNN, DNN) [9, 10, 18, 23, 31-33] and transfer learning [25] techniques are more suitable for IE from free-text data.
Deep learning approaches show better results for large datasets despite their own limitations and challenges. They have the ability to generalize learning and the unique characteristic of utilizing unlabeled data during training. Deep learning can learn different features because it has multiple hidden layers, which makes these techniques well suited to pattern recognition [122]. Unsupervised deep learning offers large model capacity/complexity and high learning speed [32]. Feature-learning-based systems are computationally expensive for large-scale data [123]. For the selection of an appropriate technique for large-scale datasets, computational cost, scalability and accuracy are the key factors [124]. More advanced algorithms and techniques are required to achieve higher accuracy and efficiency [125]. Over-fitting can be mitigated with self-training [18], and to overcome the limited availability of large annotated datasets, reinforcement learning or distant supervision can be used because these techniques require only small labeled datasets [26, 126]. Timeliness of data distribution [126]; balance of informativeness, representativeness and diversity [127]; data modeling performance for heterogeneous, high-dimensional, sparse and imbalanced data [16]; and structuring the unstructured data [10] remain open challenges for IE over unstructured big data sets.

Unstructured big data barriers for IE

With the huge volume and complexity of unstructured big data, natural language free-text data raises various issues for users trying to extract the most relevant and required information. Noisy and low-quality data is one of the major challenges in IE from big data [16, 31, 128, 129]. It causes difficulties in identifying semantic relatedness among entities and terms [130], improving the effectiveness and performance of IE systems [128], extracting contextually relevant information [31], data modeling [16] and structuring the data [10].
IE from text also faces the natural language barrier. Data diversity [124], ambiguities in text, nested entities [105], heterogeneity [131], automatic format identification [13], sparsity, dimensionality [16], and homonym identification and removal [31] are some important challenges for IE from unstructured free text. The exponential growth of unstructured big data is making the IE task more arduous. However, MapReduce can deal with large-scale datasets by distributing the data across clusters, which increases time efficiency [15, 22, 24]. Hence, volume can be handled effectively using Apache Hadoop, whereas the issues related to the variety of data need more attention. Unstructured big data adds further challenges to IE from natural language text; hence, advanced and adaptive preprocessing techniques are required to improve the quality and usability of unstructured big data. After preprocessing, IE techniques, i.e. RBM or LBM, will be able to produce more effective and efficient results.

IE from images

IE from images is a field with great opportunities and challenges, such as extracting linguistic descriptions, semantic, visual and tag features, context understanding and face recognition. Content- and context-level IE from different types of images could improve image analytics, mining and processing. The following sections review IE from images with respect to its different subtasks.

Visual relationship detection

Visual relationship detection extracts interaction information about objects in images. These semantic representations of object relationships are expressed as triples (subject, predicate, object). Extracting semantic triples from images would benefit various real-world applications such as content-based information retrieval [132], visual question answering [133], sentence-to-image retrieval [134] and fine-grained recognition [135]. Object classification and detection, and context or interaction recognition, are the main tasks of visual relationship detection in image understanding.
In object detection and classification, objects are recognized based on appearance, and their class labels have a clear association. CNN-based solutions such as VGG [36] and ResNet [37] excel at object classification, whereas Faster R-CNN and R-CNN have achieved great success in deep-learning-based detection [38, 39]. Unlike object detection, visual relationship detection extracts the interaction of objects. For example, “horse eating grass” and “person eating bread” are two visually dissimilar sentences, but both share the same interaction type, “eating”. Thus, the subject, the object and the interaction, as well as the context of the interaction, are all important in relationship detection. One approach treats each interaction and its context as a single class, and images are classified according to these interaction classes [40]. Single-class modeling has poor generalization and scalability because it requires training images for each combination of interaction and context. Language priors [41] or structured language [38] are used to overcome the limitations of single-class modeling. Intraclass variance, long-tail distribution and class overlapping are three major challenges of visual relationship detection [41]. The long-tail distribution challenge has been addressed by introducing a spatial vector for the imbalanced distribution of triples [41]; this problem makes it difficult to collect enough training images for all relationships. In this regard, incorporating linguistic knowledge into DNNs can regularize performance [42]. Several modified state-of-the-art deep-learning-based techniques for context and interaction detection are discussed in Table 9; a brief illustrative sketch is given below, before the table.
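This is a minimal, hedged sketch of how a language prior can be combined with a visual score when ranking candidate (subject, predicate, object) triples (the scores and prior values are made-up toy numbers, not taken from any reviewed model):

```python
# Toy re-ranking of candidate visual-relationship triples.
# visual_scores: what a detector/classifier might assign from image features.
# language_prior: how plausible the triple is as text (toy values here).
visual_scores = {
    ("person", "ride", "horse"): 0.60,
    ("person", "wear", "horse"): 0.55,   # visually confusable, semantically odd
}
language_prior = {
    ("person", "ride", "horse"): 0.30,
    ("person", "wear", "horse"): 0.01,
}

def score(triple, alpha=0.5):
    """Combine visual evidence with a language prior (simple weighted product)."""
    return visual_scores[triple] * (language_prior[triple] ** alpha)

ranked = sorted(visual_scores, key=score, reverse=True)
print(ranked[0])   # -> ('person', 'ride', 'horse')
```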
Table 9
Visual relationship detection
 
Purpose
Technique
Dataset
Results
Limitations/benefits
[43]
To map the images with associated scene triples (subject, predicate, object)
Conditional multiway model with implicitly learned latent representation. Both semantic tensor model and object detection used RCNN model
Stanford visual relationship dataset
Model achieved better performance to predict unobserved triples
Results are comparable to Bayesian fusion model. Proposed approach has used implicit learned prior for semantic triples whereas BFM needs explicit
[38]
To improve global context cues extraction
Variation structured Reinforcement Learning (VRL): Directed semantic graph using language prior + variation structured traversal to construct action set + make sequential predictions using deep RL
VRD dataset with 5000 images and visual genome dataset with 87,398 images
The proposed approach outperforms baseline methods for attribute recall @100 and @50, i.e. 26.43 and 24.87 respectively; other results are also notable compared to other methods
Although the proposed approach outperformed the baselines, its results are almost the same as VRL with LSTM; however, it was claimed that VRL with LSTM takes more training time
[39]
To identify unseen context interaction relationship
Context aware interaction classification i.e. Faster-RCNN + AP + C + CAT
VRD dataset and visual phrase dataset
The proposed approach has performed better than baseline methods using spatial and appearance features
Spatial feature representation produced better results than appearance based representation
Adding language prior to proposed approach does not bring benefit
[44]
1. To infuse semantic information and improve predicate detection
2. NMS was used to reduce redundancy and boost detection speed
To include spatial, classification and appearance information, feature extraction used + bidirectional RNN + paired non-maximum suppression (NMS)
Visual genome having 108,077 images, VRD having 5000 images
Results are compared with other existing methods for predicate, phrase and relationship detection for recall @ 50 and 100. Proposed solution gave better results for both datasets
Superfluous regions are filtered using NMS improves the performance
[41]
To overcome long tail distribution challenge
To handle widely spread and imbalanced distribution of triples
Visual module using VGG16, Language module using softmax, the contribution was spatial vector using normalized relative location of object and intersection over union
VRD and VG dataset
The results showed that the proposed vector improved performance by 2% and 4% on Recall@50 compared to others. The proposed solution is capable of detecting unseen visual relationships
The research only addressed the long tail distribution challenges
It can be concluded that deep learning techniques excel at IE from large-scale unstructured images. CNN, R-CNN and reinforcement learning achieved better recall, and Faster R-CNN and R-CNN have achieved remarkable results in object detection [38, 39], whereas language priors and language structures also improve the performance of relationship detection [38]. CNN-based VRD techniques extract features from the subject and object union box before classification. The training samples contain the same predicate categories used in different contexts with different entities, and CNN-based models are limited in learning common features within the same predicate category [45]. So, intraclass variance is a challenge for CNN-based VRD. To overcome this limitation of CNN models in VRD, the visual appearance gap between instances of the same predicate and visual relationship should be reduced; context and visual appearance features can be used for this purpose [44, 45]. Further, modified deep learning techniques are required to overcome the challenges of visual relationship detection for large-scale unstructured data. To the best of our knowledge, the impact of the volume, variety and velocity of big data has not been addressed well in visual relationship detection techniques.

Text recognition

A vast amount of information can be extracted from the text content of images. Text within images and videos describes useful information about the visual content and also improves the efficiency of keyword-based searching, indexing, information retrieval and automatic image captioning. Text information extraction (TIE) systems detect, localize and recognize text in visual data such as images and videos. Visual content can be categorized into perceptual content and semantic content: perceptual content includes color, shape, texture and temporal attributes, whereas semantic content deals with the identification and recognition of objects, entities and events [136]. TIE systems follow detection, localization, tracking, extraction or enhancement, and recognition phases to detect and identify text in visual data. Each subtask of TIE systems has different techniques, challenges and limitations. In TIE systems, the text detection and localization tasks use different features such as color-based, edge-based, texture-based and text-specific features [136, 137]. All these subtasks are important for extracting useful information from visual data, but the recognition task is the most relevant to the identification of objects, entities and characters. Text recognition is the process of identifying the characters forming meaningful words. Recent literature is therefore discussed in this section to identify the potential challenges of the text recognition task for IE from images.
The text recognition task is tightly coupled with optical character recognition (OCR) approaches to recognize characters from images or scanned documents. Character recognition of Tamil text in ancient documents and palm manuscripts, aimed at extracting useful information from document images using OCR, involved a segmentation technique with different stages: image preprocessing, feature extraction, character recognition and digital text conversion. According to the experimental results, the conversion accuracy was 91.57% for Brahmi and 89.75% for Vattezhuthu [46]. Character recognition from handwritten text using neural networks has shown different results: a radial basis function (RBF) network with one input layer and one output layer was trained, and compared to a back-propagation neural network, gradient feature extraction resulted in lower accuracy with RBF using directional group values [47]. OCR systems perform better on scanned documents, but variations in images produce poor results [137]. The underlying reasons include geometric variation, complex backgrounds, variation of text layout and font, uneven illumination, multilingual content, low resolution and low quality [138].
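As a minimal, hedged sketch of the classical OCR route discussed above (using the Tesseract engine via the pytesseract wrapper as an assumed toolchain; the image path is a placeholder):

```python
# Minimal OCR sketch: binarize a document image and run Tesseract on it.
# Assumes Tesseract is installed and: pip install pytesseract pillow
from PIL import Image
import pytesseract

image = Image.open("scanned_page.png")      # placeholder path
gray = image.convert("L")                   # grayscale as simple preprocessing
binary = gray.point(lambda p: 255 if p > 128 else 0)  # crude global threshold

text = pytesseract.image_to_string(binary)  # character recognition
print(text)
```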
To extract text and semantic features from visual data, learning-based approaches, both supervised and unsupervised, are used. Supervised learning methods such as the support vector machine (SVM) and the Bayesian classifier are used to learn structure or concepts from features; these classifiers are trained to learn the structure and are tested on unlabeled regions. In this regard, distorted character recognition using an Exemplar SVM beat the existing state of the art by over 10% for English and 24% for Kannada on the benchmark Chars74k and ICDAR datasets [48]. Similarly, a CRF classifier was used in a framework to recognize characters using scores, spatial constraints and linguistic knowledge, achieving 79.3% accuracy on ICDAR2003 and 82.79% on ICDAR2011 [49]. Another system, based on Strokelets, was designed to detect and recognize characters in images using histogram features, i.e. a bag of strokelets, to learn the structure of letters, with a random forest classifier for training. The system was trained and tested on English letters and Arabic numerals and showed 80% and 75% accuracy on ICDAR2003 and SVT, respectively [50]. However, robustness to distortion and generality across languages remain challenging for these systems. To explore the advancement in TIE techniques, Table 10 summarizes the literature on state-of-the-art TIE techniques for high-dimensional or large-scale datasets.
Table 10
Text recognition from images
 
Purpose
Technique
Dataset
Results
Limitations/benefits
[51]
To possess high learning capacity
To handle high dimensional data
CNN based OCR
Scanned Sanskrit document images (11,230)
The proposed approach outperformed existing ones, with 93.32% accuracy
Training time was 1 h with a GPU
[52]
To automatic recognition of handwritten text from images
CNN based OCR
MNIST
98.11% accuracy rate
DL should apply on large datasets
[53]
To compare the results of proposed DBN and CNN ECR
Unsupervised feature learning with DBN
HACDB dataset containing 6600 images
Experiments shown 3.64% and 14.71% for DBN and CNN resp.
DBN with unsupervised feature learning outperform CNN for high dimensional data
[54]
To develop end to end mechanism for Scene TR
FANet using resnet as encoder and seq2seq attention mechanism as decoder
5000 authentic seal dataset, 3660 real time train ticket dataset
Although, proposed approach could not achieve outperforming results but angular and horizontal TR was improved
Full attention mechanism was proposed to replace detect, slice, and recognize process with end to end recognition
Ineffective for long text recognition
[55]
To recognize text from handwritten and printed text images
TMIXT: tessetact for machine printed text recognition and LSTM for handwritten text recognition
IAM handwriting database
Achieved 80% average transcription accuracy
Heavy preprocessing is required for combined text recognition with proposed solution
[56]
To recognize text using attention mechanism
CAN (Convolutional Attention Network), 2D CNN as encoder and one dimensional CNN decoder
Street View text SVT, IIIT5K, and ICDAR 03, ICDAR 13 dataset
The proposed model performed better than others on SVT and ICDAR 03 datasets
Improvement in proposed method is required for promising results
[57]
Semantic based text recognition to extract useful information from images
CNN and bidirectional LSTM where convolutional part uses VGG and recurrent part uses bidirectional LSTM
Interior Design Dataset with 7708 images
Achieved 90% accuracy in word recognition
Generality improved but the text recognition from protest images is relatively an easy task. Evaluation of the system with complex and diverse datasets should be promising
Unlike traditional OCR techniques, CNN, RNN and LSTM models achieve high performance in text recognition from images, and deep learning techniques show prevalent results to date. CNN has been used as a feature extractor in detect-slice-recognize pipelines [57] and, as an encoder in attention mechanisms, has outperformed other approaches [56]. Although these techniques show promising results, diversity in data sources makes such systems complex [55]. The effectiveness of these techniques for complex, diverse, high-dimensional and heterogeneous datasets must be investigated. The huge volume of unstructured data produces noisy and low-quality images, and multilingual text in images should be addressed to improve IE from images [58, 59]. CNN-based OCR has also shown good results, but its performance on unstructured big datasets is still to be investigated. The attention mechanism is a new approach in text recognition [54, 56]; initial results are satisfactory, but there is considerable room for improvement in terms of unstructured and multidimensional big data. It is predicted that OCR with attention mechanisms will be an emerging phenomenon in the near future for text recognition [54]. In this regard, robust and adaptive techniques are required for the semantic understanding of text in images from unstructured big datasets.
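A minimal, hedged sketch of the CNN-plus-recurrent design mentioned above: a toy CRNN in PyTorch that maps an image to per-timestep character scores suitable for CTC decoding (layer sizes and the class count are illustrative, not taken from any reviewed system):

```python
# Toy CRNN for scene-text recognition: CNN features -> BiLSTM -> per-step classes.
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, num_classes, img_height=32):
        super().__init__()
        self.cnn = nn.Sequential(                      # downsamples H and W by 4
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat_dim = 128 * (img_height // 4)             # channels x remaining height
        self.rnn = nn.LSTM(feat_dim, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * 256, num_classes)      # classes incl. a CTC blank

    def forward(self, x):                              # x: (B, 1, H, W)
        f = self.cnn(x)                                # (B, C, H/4, W/4)
        b, c, h, w = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # one step per image column
        out, _ = self.rnn(seq)
        return self.fc(out)                            # (B, W/4, num_classes)

logits = CRNN(num_classes=37)(torch.randn(2, 1, 32, 128))
print(logits.shape)                                    # torch.Size([2, 32, 37])
```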

Face recognition

Recognizing similar faces is a computational challenge. Humans have very strong face recognition abilities for familiar faces, but their ability to recognize unfamiliar faces is error-prone [139]. This distinction in human face recognition led to the finding that face recognition depends on different sets of facial features for familiar and unfamiliar faces, categorized into internal and external features respectively [140]. In this regard, [60] examined the role of high-PS and low-PS features in the recognition of familiar and unfamiliar faces and the role of these critical features in DNN-based face recognition. That review concluded that high-PS features are critical for human face recognition and are also used by DNNs trained on unconstrained faces.
In computer vision, face recognition is a holistic method that analyzes face images. Various techniques have been proposed for face recognition on different datasets, but traditional techniques are inadequate for handling large-scale datasets efficiently. A comparative analysis shows that traditional techniques are limited in handling low-quality, large-scale image datasets, whereas deep learning methods produce better results for these datasets, although only with optimal architectures and hyper-parameters [58]. Face recognition performance degrades on low-quality, i.e. blurred and low-resolution, images; sparse representation and deep learning methods combined with handcrafted features performed best for low-resolution images [59]. Face recognition techniques should be able to recognize faces with different expressions and poses under different lighting conditions [58]. Various deep-learning-based solutions have been proposed to address the limitations of traditional techniques. A deep CNN face recognition technique without extensive feature engineering reduces the effort of selecting the most appropriate features; evaluated on the UJ face database of 50 images, validation accuracy rose from 22% to 80% after 10 epochs and to 100% after 80 iterations [52]. Certain limitations were also associated with this solution, such as overfitting and the very small dataset; reducing overfitting through early stopping requires extra effort. The VGG-face architecture and a modified VGG-face architecture with 5 convolutional layers, 3 pooling layers, 3 fully connected layers and a softmax layer were evaluated using several image datasets, i.e. the ORL face database with 400 images, the Yale face database with 165 images, the extended Yale-B cropped face database with 2470 images, Faces94, FERET with 11,338 images and the CVL face database. For all datasets, the proposed approach performed better than traditional methods [58]. Although the proposed technique performed well on these datasets, they were neither complex nor large-scale. Deep-learning-based face recognition techniques such as deep convolutional networks, VGG-face and lightened CNN are able to handle huge amounts of data in the wild [61].
Deep-learning-based face representations are more robust in handling misaligned images [61]. Deep CNN can perform better in recognizing objects from partially observed data, but image enhancement is important before the convolutional operation for low-quality images [58]. Although deep learning techniques can improve the performance of face recognition, certain challenges associated with them should be considered beforehand. Image quality, missing data in images and noise should be handled, because these factors degrade the performance of deep-learning-based face recognition techniques [58, 59]. Different facial expressions, illumination and the use of accessories cause partial occlusion [61]; detecting partial occlusion requires new optimal deep learning architectures and hyper-parameters. However, the selection of an appropriate technique depends highly on the data size and quality. Further, more robust and optimal solutions are required for large-scale datasets with high accuracy and low latency.
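A minimal, hedged sketch of how deep face representations are typically compared once a network has produced an embedding per face (the embeddings here are random stand-ins; in practice they would come from a model such as a VGG-face-style network, and the threshold is a guess):

```python
# Compare two face embeddings by cosine similarity and threshold the decision.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(emb_a, emb_b, threshold=0.5):
    """Declare a match if the embeddings are similar enough (threshold assumed)."""
    return cosine_similarity(emb_a, emb_b) >= threshold

rng = np.random.default_rng(0)
emb1 = rng.normal(size=128)                      # stand-in for a CNN face embedding
emb2 = emb1 + rng.normal(scale=0.1, size=128)    # slightly perturbed "same" face

print(same_person(emb1, emb2))                   # True for this toy perturbation
```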

Audio IE

Call centers and music files are among the major sources generating a huge volume of audio data. Different types of information can be extracted from these data to support predictive and descriptive analytics. The subtasks of IE from audio data are classified as acoustic event detection and automatic speech recognition.

Acoustic event detection

Sound event detection, or acoustic event detection, is an emerging field that aims to process continuous acoustic signals and convert them into symbolic descriptions. Applications of automatic sound event detection include multimedia indexing and retrieval [141], pattern recognition [62], surveillance [142] and other monitoring applications. The symbolic representation of sound events is used in automatic tagging and segmentation [143]. These sounds come from diverse sources and contain overlapping events and background noise [63, 64]. Moreover, accurate model parameters are difficult to estimate from limited training data [62].
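Most AED pipelines start by turning the raw waveform into a time-frequency representation. A minimal sketch using librosa is shown below (the file path is a placeholder, and log-mel features are just one common choice, not the front end of any specific reviewed system):

```python
# Extract log-mel spectrogram features from an audio clip as AED input.
# Assumes: pip install librosa, and a local WAV file.
import librosa

y, sr = librosa.load("recording.wav", sr=16000)          # waveform + sample rate
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
log_mel = librosa.power_to_db(mel)                       # shape: (64, num_frames)

print(log_mel.shape)   # each column is one analysis frame fed to the classifier
```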
As presented in Table 11, data scarcity and overfitting are common limitations of AED solutions. In this regard, modified data augmentation achieved better results due to modification of frequency characteristics within particular frequency bands [65]. Context recognition is one way to overcome the overlap issue and improve the accuracy of AED, but identifying context-specific sound events is one of the critical challenges; adding a language or knowledge prior can help to extract context sound events [64]. In recent work on AED, deep neural networks are outperforming traditional techniques, and the capability to jointly learn feature representations is one of their major advantages. Supported by large amounts of training data, DNNs are progressing well in the field of computer vision, but the non-availability of public large-scale datasets slows progress in AED research [64]. Creating large-scale annotated data can be time-consuming, so weakly supervised or self-supervised training data can perform better. In this context, a CNN-based weakly supervised technique was compared to a technique trained with fully supervised data: evaluated on the UrbanSound and UrbanSound8K datasets, the weakly supervised approach performed better for arbitrary durations without human labor for segmentation and cleaning [67]. For large-scale implementations of AED techniques, high computational power, efficient parallelism and support for training large models are important factors [68]. Research on automatic AED is hindered by the complexity of overlapping sound events. Improved accuracy in handling overlapping sound events, efficient solutions for obtaining labeled datasets, and improved processing time through parallelism for large-scale data are important dimensions for developing optimal AED solutions for unstructured big data.
Table 11
Acoustic event detection
 
Each entry lists the purpose, technique, dataset, results and limitations or benefits reported in the cited study.

[62] Purpose: to model data with exemplars and to explicitly model background events. Technique: exemplar-based method with NMF. Dataset: Office Live recordings of 1 to 3 min and Office Synthetic with background noise. Results: with time warping, the F-score improved from 50.2 to 65.2% on the Office Live dataset, whereas results on the Office Synthetic dataset were not promising. Limitations/benefits: the proposed solution suffers from data scarcity and overfitting.

[65] Purpose: to overcome the overfitting limitation and improve performance for large-scale input. Technique: CNN trained end to end for AED, with a data augmentation method to prevent overfitting. Dataset: acoustic event classification database. Results: achieved a 16% improvement compared with Bag of Audio Words (BoAW) and a classical CNN. Limitations/benefits: results presented with and without data augmentation showed that augmentation improves performance.

[63] Purpose: to explore the impact of feature extraction in AED and the effectiveness of deep learning approaches. Technique: multiple single-resolution recognizers, selection of an optimal set of events, and merging or removing repeated labels. Dataset: CHIL2007. Results: CNN performed better with the combination scheme of the multi-resolution approach. Limitations/benefits: DNNs have the ability to model high-dimensional data.

[64] Purpose: to improve detection accuracy by extracting context information. Technique: a context recognition phase using a UBM to capture unknown events, followed by a sound event detection stage. Dataset: audio database of 103 recordings with a total duration of 1133 min. Results: knowledge of context, used as a context-dependent event prior, can improve accuracy. Limitations/benefits: context-dependent event selection and accurate sound event modeling are two important factors for improving AED.

[66] Purpose: to improve the efficiency of acoustic scene classification and acoustic event detection. Technique: gated recurrent neural networks (GRNN) with linear discriminant analysis (LDA). Dataset: DCASE2016 task 1. Results: overall accuracy of 79.1% on the DCASE2016 challenge, a relative improvement of 19.8% compared with GMM. Limitations/benefits: LDA minimizes inner-class variance but is not efficient for high-dimensional data.

Automatic speech recognition (ASR)

Automatic speech recognition (ASR) is the task of recognizing speech and converting it into another medium such as text, which is why it is also known as speech to text (STT). Voice dialing, call routing, voice command and control, computer-aided language learning, spoken search and robotics are major applications of ASR [144]. In the speech recognition process, the sound waves of the speaker's speech are converted into an electrical signal and then into digital signals, which are represented as a discrete sequence of feature vectors [145]. The pipeline of a speech recognition system consists of feature extraction, acoustic modeling, pronunciation modeling and decoding. These systems are generally divided into five categories according to the classification method: template-based, knowledge-based, dynamic time warping (DTW), hidden Markov model (HMM) and artificial neural network (ANN) based approaches [146]. With the exponential growth of unstructured big data and of computational power, ASR is moving towards more advanced and challenging applications such as voice-based mobile interaction, voice control in smart systems and communicative assistance [147]. For such large-scale, real-world applications, Table 12 presents recent literature on ASR, discussing state-of-the-art classification approaches, their variants, evaluation results and remarks on the proposed solutions.
Table 12
Automatic speech recognition
 
Each entry lists the purpose, approach, technique, dataset and results or limitations reported in the cited study.

[68] Purpose: to improve computational power, enhance the capability to train larger models and ease the process. Approach: ANN. Technique: Mariana, GPU and CPU clusters for parallelism; three frameworks were developed (multi-GPU for DNN, multi-GPU for DCNN, and a CPU cluster for large-scale DNN). Results/limitations: with 6 GPUs, a 4.6-fold speedup over one GPU was achieved and the character error rate decreased by 10% compared with existing techniques; the DNN framework with GPUs performed better for ASR.

[72] Purpose: to investigate the noise robustness of DNN based models. Approach: ANN. Technique: DNN-HMM with DNN based noise-aware training. Dataset: Aurora 4, without explicit noise compensation. Results/limitations: 7.5% relative improvement; dropout training in the DNN raises an overlapping concern compared with feature-space and model-space noise adaptive training.

[73] Purpose: a bilingual ASR system for the Frisian and Dutch languages. Approach: ANN. Technique: DNN with language-dependent and language-independent phones. Dataset: FAME speech database. Results/limitations: the bilingual DNN trained on phones of both languages achieved the best performance, yielding a CS-WER of 59.5% and a WER of 38.8%; code-switching ASR that combines phones of two languages performed best on WER, although the latency of switching is also an important factor for such systems.

[75] Purpose: to improve performance for large-vocabulary speech recognition. Approach: ANN. Technique: LSTM RNN. Dataset: 2800 utterances, each distorted once with held-out noise samples. Results/limitations: on a 25 k word vocabulary, 19.5% WER and 14.5% in-vocabulary WER; word-level acoustic models without a language model can achieve reasonable accuracy.

[74] Purpose: to compare the performance of DNN-HMM with CNN-HMM. Approach: ANN. Technique: CNN with a limited weight sharing scheme to model speech features. Dataset: small-scale phone recognition on TIMIT and a large-vocabulary voice search task. Results/limitations: the CNN reduced the error rate by 6% to 10% compared with the DNN; ASR performance is sensitive to pooling size but insensitive to overlap between pooling units; results were better for the voice search experiment but not for phone recognition.

[71] Purpose: to develop ASR for the Amazigh language. Approach: HMM. Technique: GMM and tied states; MFCC for feature extraction, a phonetic dictionary, a language model built with the CMU-Cambridge Statistical Language Modeling Toolkit, and an HMM based large-vocabulary system. Dataset: a new corpus of 187 distinct isolated-word speech recordings by 50 speakers. Results/limitations: reduced the WER to 8.20%; a new corpus was collected, but results are not compared with existing state-of-the-art techniques.

[69] Purpose: LMS adaptive filters are introduced to preprocess the speech signals and to identify the speaker. Approach: template based. Technique: adaptive filtering, feature extraction, dimensionality reduction and an ensemble classification model using LSTM, ICNN and SVM. Dataset: IITG multi-variability speaker recognition database. Results/limitations: achieved 95.69% accuracy for noisy data; follows sequential processing, requires memory-bandwidth-bound computation and requires a large amount of training data for each new speaker.

[70] Purpose: ASR for the Tunisian dialect. Approach: rule based. Technique: G2P rules were defined to build pronunciation dictionaries. Dataset: TARIC, with 9.5 h of speech for training and 43 min for testing. Results/limitations: WER of 22.6%, validated on a manually annotated dataset; higher-quality pronunciation dictionaries can be built using expert knowledge, but strong linguistic skills are required.
ANN based approaches are followed in most research studies because they can handle complex interactions and are easier to use than statistical methods. ASR systems can be speaker-dependent or speaker-independent. For speaker-dependent recognition, template-based methods perform better because an individual reference template is kept for each speaker, which requires large training data from each individual [69]. Owing to the separate template per individual, high accuracy can be achieved even in noise, but these methods suit small-scale data because, at large scale, it is impractical to collect large training data from every individual. Rather than collecting large training data, reinforcement learning can be adopted to automate speaker identification, and Apache Hadoop can be used to parallelize speaker-dependent recognition at large scale and make it computationally efficient. Speaker-independent systems, in contrast, do not achieve as high accuracy because of noise, overlapping speech and the language used. Rule-based approaches in speaker-independent recognition require linguistic skills to implement the rules, which is laborious, but they provide quality pronunciation dictionaries [70]; their poor generalizability limits multilingual recognition and switching between languages. HMM based speech recognition uses statistical data modeling [71] and requires large training data for the huge number of HMM parameters. In contrast, ANN based methods such as DNN [72, 73], CNN [69, 74] and RNN [75] are more flexible and nonlinear; they generalize better, adapt to changing environments, and produce informative, nonlinear data models. Several ANN based solutions have been developed for languages other than English, such as Punjabi [76], Tunisian [70], Chhattisgarhi [77], Tamil [78], Amazigh [71] and Russian [79]. The evaluation of an LSTM RNN based ASR system showed that word-level acoustic models without a language model are efficient at improving accuracy [75]. With a CNN implementation, ASR performance is sensitive to pooling size but insensitive to overlap between pooling units [74]. Although ANN based ASR systems achieve better overall performance, they also have limitations: the quality of results is unpredictable due to their black-box and empirical nature. To improve computational power, a cluster-based solution with a DNN framework was proposed that speeds up processing 4.6 times and reduces the error rate by 10% [68]. Overall, ANN based ASR systems perform better than the other classification approaches; hence, modified ANN based ASR systems are required to further improve accuracy.
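To ground the pipeline and the template-based approach discussed above, the sketch below combines two classical ingredients mentioned in this section: MFCC feature extraction and DTW-based template matching for an isolated-word, speaker-dependent recognizer. The synthetic tones stand in for recorded utterances; this is a minimal sketch under those assumptions, not one of the systems reviewed in Table 12.

```python
# Minimal sketch of isolated-word, template-based recognition:
# MFCC features + dynamic time warping (DTW) against stored templates.
import numpy as np
import librosa

SR = 16000

def mfcc_features(y):
    """13 MFCCs per frame, shape (13, n_frames)."""
    return librosa.feature.mfcc(y=y, sr=SR, n_mfcc=13)

def dtw_distance(feat_a, feat_b):
    """Accumulated DTW cost between two MFCC sequences."""
    D, _ = librosa.sequence.dtw(X=feat_a, Y=feat_b, metric="euclidean")
    return float(D[-1, -1])

def recognise(utterance, templates):
    """Return the word whose stored template is closest under DTW."""
    feat = mfcc_features(utterance)
    return min(templates, key=lambda w: dtw_distance(feat, templates[w]))

if __name__ == "__main__":
    t = np.linspace(0, 0.5, int(0.5 * SR), endpoint=False)
    # Synthetic "words": tones of different frequencies stand in for recordings.
    templates = {"yes": mfcc_features(np.sin(2 * np.pi * 300 * t)),
                 "no":  mfcc_features(np.sin(2 * np.pi * 800 * t))}
    test = np.sin(2 * np.pi * 310 * t)      # spectrally closer to the "yes" template
    print(recognise(test, templates))       # expected: yes
```

A speaker-dependent system of this kind would store one set of templates per speaker, which is exactly the data collection burden that makes the approach hard to scale.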

Video IE

The primary goal of IE from video is to understand and extract relevant information carried in video content. The applications of IE from video include semantic indexing [148], content-based analysis and retrieval, content-oriented video coding, assistance for visually impaired people and automation in supermarkets [149]. In the era of big data, social media and many other platforms produce digital videos at very high speed. It is not only the size of the data that matters; high computational power and speed are also essential to extract useful information from these digital videos. In this regard, Apache Hadoop has been used to implement an extensible distributed video processing framework in a cloud environment [80]: FFmpeg and OpenCV were implemented on MapReduce for video coding and image processing respectively, showing 75% scalability.
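The distributed framework in [80] runs FFmpeg and OpenCV inside MapReduce tasks; the sketch below shows only the per-task part of that idea, sampling roughly one frame per second from a clip with OpenCV. It is a minimal sketch, not the cited framework: the file name is a placeholder and the one-second sampling interval is an assumption.

```python
# Minimal sketch: sample roughly one frame per second from a video with OpenCV.
# In a MapReduce-style setup, each mapper would run this over its own video split.
import cv2

def sample_frames(path, every_seconds=1.0):
    """Yield (timestamp_seconds, frame) pairs sampled at the given interval."""
    cap = cv2.VideoCapture(path)
    if not cap.isOpened():
        raise IOError(f"Cannot open video: {path}")
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0          # fall back if FPS metadata is missing
    step = max(int(round(fps * every_seconds)), 1)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            yield index / fps, frame
        index += 1
    cap.release()

if __name__ == "__main__":
    # 'clip.mp4' is a placeholder path for illustration.
    for ts, frame in sample_frames("clip.mp4"):
        print(f"frame at {ts:.1f}s, shape {frame.shape}")
```

The sampled frames are the input to the text recognition and summarization subtasks discussed below.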
Generally, perceptual and semantic content can be extracted from videos. Semantic content deals with objects and their relationships [149]. The spatial and temporal associations among objects and entities have been used to reduce the semantic gap between visual appearance and semantics with the help of fuzzy logic and RBM [81]; the proposed system achieved high precision but relatively low recall. Similarly, an event extraction approach for audio-visual content was developed, consisting of CNN based audio-visual multimodal recognition, and knowledge from websites was incorporated through an HHMM to improve efficiency. The approach outperformed alternatives in terms of accuracy and showed that CNN provides robustness to noise and occlusion [82]. The following subsections discuss the issues and state-of-the-art techniques for the subtasks of IE from video content.

Text recognition

A large volume of video data is produced and shared every day on social media. Text in videos plays an important role in extracting rich information and provides semantic clues about the video content. Text extraction and analysis in video has shown considerable performance in image understanding, and a wide variety of methods have been proposed in this regard. Caption text and scene text are the two categories of text that can be extracted from videos [150]. Caption text provides high-level semantic information in captions, overlays and subtitles, whereas scene text is embedded in the images themselves, such as on sign boards and trademarks. Caption (artificial) text recognition is easier than scene text recognition because caption text is added over the video to improve understandability, while scene text recognition is complex due to low contrast, background complexity, and differences in font size, orientation, type and language [83]. Besides, low-quality video frames, blurred frames and high computation time are specific challenges of the video text extraction process [84].
The pipeline of text detection and extraction consists of text detection, text localization, text tracking, text binarization and text recognition stages. Focusing on IE techniques, this review presents only state-of-the-art techniques for text recognition. A text recognition system to extract semantic content from an Arabic TV channel was developed using a CNN with an autoencoder, achieving a character recognition accuracy of 94.6% [85]. A similar system for Arabic news video was developed for video indexing using the OCR engine ABBYY FineReader with linguistic analysis and achieved an F-measure of 80.52% [86]. Another text recognition system was developed for overlay text extraction and person information extraction, using a rule-based approach for NER to extract person, organization and location information; ABBYY FineReader was again used to extract the text [148]. These text recognition systems deal only with printed and artificial text, which is comparatively easy to extract. On the other hand, text binarization with filtering and iterative variance-based threshold calculation is important to segment natural scene text [87]. DNNs have the ability to provide robust end-to-end text recognition in videos; in this regard, Faster R-CNN [88], CNN [89, 90] and LSTM based methods [91] have shown comparatively better performance on scene text recognition. In general, temporal redundancy can be exploited in tracking for text detection and recognition from complex videos [92].
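As a concrete illustration of the binarization and recognition stages discussed above, the sketch below applies Otsu thresholding to a frame and passes the result to the open-source Tesseract engine via pytesseract. It is a hedged stand-in, not the commercial OCR engines used in the cited systems: the frame path is a placeholder, Tesseract must be installed separately, and simple global thresholding like this is mainly adequate for overlay or caption text rather than complex scene text.

```python
# Minimal sketch of the binarization + recognition stages for overlay text,
# using Otsu thresholding and the Tesseract OCR engine via pytesseract.
import cv2
import pytesseract

def read_overlay_text(frame_path):
    """Binarize a frame and return the recognised text (works best on caption text)."""
    image = cv2.imread(frame_path)
    if image is None:
        raise IOError(f"Cannot read image: {frame_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu's method picks the binarization threshold automatically from the histogram.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)

if __name__ == "__main__":
    # 'frame_000123.png' is a placeholder for a frame extracted from a video.
    print(read_overlay_text("frame_000123.png"))
```

Scene text would additionally require detection and localization (e.g., with the deep detectors cited above) before recognition.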
Traditional systems are not capable of managing and efficiently analyzing complex big data. A MapReduce based parallel processing system has been proposed to detect text in videos; it achieved high-speed performance on YouTube videos but only detects text using texture-based features [84]. Text recognition plays an important role in understanding multimedia data, multimedia retrieval, assistance for visually impaired people and content-based multimedia analysis [151]. As multimedia big data grows very fast, in both batch and streaming form, more advanced and computationally powerful techniques are required for text recognition from multimedia big data. More robust algorithms are needed to recognize the variety of scene and artificial text in low-quality videos while addressing space and speed performance.

Automatic video summarization

Automatic tools are essential to analyze and understand visual content. People generate huge volumes of video using mobile phones, wearable cameras, Google Glass, etc. Some examples of this explosive growth: 144,000 h of video are uploaded daily to YouTube, lifeloggers generate gigabytes of video using wearable cameras, and 422,000 CCTV cameras generate video 24/7 in London [93]. This explosive daily growth of video data highlights the need for fast and efficient automatic video summarization algorithms. AVS has many real-life applications such as surveillance, social media and monitoring [152]. It provides a summary of the video content either as a skim video, a short video of the semantic content of the original long video (skim-based or dynamic video summarization), or as key frames, where frames and audio-visual features are extracted (key-frame based or static video summarization) [94]. Selecting the most relevant or important frames or subshots for the summary is a critical task. Several supervised, unsupervised and other techniques have been introduced in the computer vision and multimedia literature. Selection and prioritization criteria for frames and skims are designed manually in unsupervised approaches [95, 96], whereas supervised techniques leverage user-generated summaries for learning [94, 97, 98]. Each technique has different properties regarding representativeness, diversity and interestingness [93]. Recently, supervised techniques have achieved promising results compared with traditional unsupervised techniques [94]. Recent literature on user-generated videos is presented in Table 13.
Table 13
Automatic video summarization
 
Each entry lists the approach, technique, purpose, dataset and results or limitations reported in the cited study.

[97] Approach: supervised, with prior segmentation. Technique: SVM based kernel video segmentation. Purpose: category-specific video summarization. Dataset: MED summaries, with a training set of 12,249 videos and a test set of 60 videos. Results/limitations: higher-quality video summaries can be produced with known categories than with an unsupervised approach.

[95] Approach: unsupervised, with web-based prior information. Technique: four baseline algorithms (random and uniform sampling, k-means and spectral clustering) followed by crowdsourcing. Purpose: to deal with content sparsity and large-scale evaluation. Dataset: 180 videos, 25 for training and 155 for evaluation. Results/limitations: content sparsity and the poor quality of user-generated videos are major challenges; expert evaluation is not possible for large-scale data, so crowdsourcing was used; adding web images of a category to incorporate knowledge is time consuming, especially for unknown categories.

[98] Approach: supervised. Technique: linear combination of submodular maximization for each objective using structured learning. Purpose: to implement interestingness, representativeness and uniformity. Dataset: an egocentric dataset and the SumMe dataset. Results/limitations: shortage of large datasets for summarization.

[94] Approach: supervised. Technique: vsLSTM to model variable-range temporal dependency. Purpose: to address the need for a large amount of annotated data. Dataset: SumMe and TVSum. Results/limitations: domain adaptation can improve learning and reduce discrepancies.

[93] Approach: supervised. Technique: sequential determinantal point process, a supervised DPP coupled with an NN representation. Purpose: to incorporate human-created summaries for the selection of informative and diverse subsets. Dataset: Open Video Project (50 videos), YouTube (39 videos) and Kodak consumer videos (18 videos). Results/limitations: the supervised approach with a linear representation performed better.

[99] Approach: not specified. Technique: MSR (minimum sparse representation) based summarization. Purpose: to use the minimum number of keyframes and to provide flexibility for practical applications. Dataset: Open Video Project (50 videos) and a dataset of several genres (50 videos). Results/limitations: two variants were proposed for off-line and on-line applications; the work focuses on the selection of key frames.

[96] Approach: unsupervised. Technique: generative adversarial framework with a summarizer (autoencoder LSTM) and a discriminator (LSTM). Purpose: to regularize summary length, diversity and keyframes. Dataset: SumMe, TVSum, Open Video Project and YouTube. Results/limitations: different performance on different datasets; deep features perform better than shallow features; frames with very slow motion and no scene change gave poor results.
Poor quality (e.g., erratic camera motion and variable illumination) and content sparsity (i.e., difficulty in finding representative frames) are two important challenges for AVS with user-generated videos [95]. Despite the limitations of unsupervised techniques, modifications such as incorporating prior information about the category [95] and selecting deep rather than shallow features [96] have been presented; unfortunately, these systems were unable to show promising improvement. Furthermore, it is difficult to define an optimized joint criterion for frame selection because of the complexity of choosing frames among a large number of possible subsets. In contrast, supervised techniques require large annotated data, and the shortage of large datasets is one of their major limitations [98]. Overall, supervised techniques outperform unsupervised ones. However, more efficient and fast AVS algorithms are required, especially to deal with the variety and velocity of big data.
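To make the key-frame (static) summarization idea concrete, the sketch below clusters colour-histogram features of sampled frames with k-means and keeps the frame closest to each cluster centre. This is a simple unsupervised baseline in the spirit of the clustering baselines in [95], not one of the supervised methods reviewed above; the video path, subsampling stride and number of key frames are assumptions.

```python
# Minimal sketch of static (key-frame) video summarization:
# colour-histogram features + k-means, keeping the frame nearest each centroid.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def colour_histogram(frame):
    """8x8x8 BGR histogram, flattened and L1-normalised."""
    hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                        [0, 256, 0, 256, 0, 256]).flatten()
    return hist / (hist.sum() + 1e-9)

def keyframe_summary(path, n_keyframes=5, stride=30):
    cap = cv2.VideoCapture(path)
    frames, feats = [], []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % stride == 0:                 # subsample frames to keep it cheap
            frames.append(frame)
            feats.append(colour_histogram(frame))
        index += 1
    cap.release()
    feats = np.array(feats)
    k = min(n_keyframes, len(frames))
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(feats)
    summary = []
    for c in range(k):                          # keep the frame closest to each centroid
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(feats[members] - km.cluster_centers_[c], axis=1)
        summary.append(frames[members[np.argmin(dists)]])
    return summary

if __name__ == "__main__":
    # 'user_video.mp4' is a placeholder path for illustration.
    print(len(keyframe_summary("user_video.mp4")), "key frames selected")
```

Supervised approaches replace the hand-designed histogram features and clustering criterion with representations and selection criteria learned from human-created summaries.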

Results and discussion

This SLR distills the key insights from a comprehensive overview of IE techniques for a variety of data types and takes a fresh look at older problems that are nevertheless still highly relevant today. Big data brings a computational paradigm shift to IE techniques. In this regard, this SLR presents a comprehensive review of existing IE techniques for a variety of data types. To the best of our knowledge, IE techniques for a variety of unstructured big data have not previously been addressed on a single platform. To achieve this goal, the SLR methodology was followed to explore recent advancements in IE techniques. To meet the objectives of the study, the most relevant and up-to-date literature on IE techniques for text, image, audio and video data has been discussed. The selected studies have been classified according to the IE subtasks for each data type, as shown in Fig. 4.
The big data value chain defines the high-level activities needed to derive useful information from big data, where the IE process belongs to data analysis. The inefficiencies of IE techniques therefore ultimately decrease the performance of big data analytics and decision making. To improve big data analytics and decision making, this SLR investigated the challenges of the IE process for a variety of data types in the age of big data. The objective of combining IE techniques for a variety of data types on a single platform was twofold: first, to identify state-of-the-art IE techniques for a variety of big data and, second, to investigate the major challenges of IE associated with unstructured big data. Further, the need for new consolidated IE systems is highlighted and some preconditions are proposed to improve the IE process for the variety of data types in big data. The identified challenges of IE associated with unstructured big data are discussed in the following subsection.

Unstructured big data challenges for IE

The challenges of IE from unstructured big data are categorized as task-dependent and task-independent. The task-dependent challenges have been discussed, together with the state-of-the-art techniques, in their corresponding sections; task-independent challenges are discussed here. Table 14 presents a summary of the challenges identified from the selected studies.
Table 14
Independent challenges identified from selected studies
Each row lists a challenge of unstructured big data IE, the studies in which it was identified, and its frequency.

Data quality: [31, 58, 59, 63, 64, 84, 95, 138]; frequency 9
Data sparsity: [16, 22, 31, 34, 95]; frequency 6
Data volume: [10, 13, 15, 19, 65]; frequency 5
Data usability: [11–13, 27, 28]; frequency 5
Context understanding: [27, 31, 39, 64]; frequency 4
Computational requirements: [15, 68, 84, 124]; frequency 4
Data dimensionality: [16, 18, 66]; frequency 3
Heterogeneity: [33, 131]; frequency 2
Diversity: [55, 124]; frequency 2
Semantic understanding: [43, 44]; frequency 2
Data modeling: [16, 68]; frequency 2
Ambiguities in data: [31, 105]; frequency 2
Data scarcity: [62]; frequency 1
Balance among informativeness, representativeness and diversity: [127]; frequency 1
A. Quality of unstructured big data
Noise [31, 63, 64], missing data [59], incomplete data [15] and low-quality data [58, 59, 84, 95, 138] are major quality issues of unstructured big data that degrade the performance of the IE process. These quality issues are huge barriers to extracting useful and relevant information and make the IE process arduous. Quality improvement early in the process is therefore the foremost requirement of IE from unstructured big data.
 
B. Data sparsity
The enormous growth of user-generated content has increased data sparsity (also known as data sparseness or data paucity), where only a small fraction of the data contains interesting and useful information [16, 22, 31, 95]. Text analysis of social media data and summarization of visual data are directly associated with user-generated content. Because of content sparsity, it becomes difficult to find the most relevant representative data to produce semantically rich results. There is a false assumption that frequent extractions from large datasets produce better results [22]. Extracting the small amount of evidence in a corpus that carries useful information is a challenge for unstructured big datasets. Sparse IE for large-scale, user-generated big data therefore offers great opportunities along with challenges to improve the IE process.
 
C. Volume of unstructured big data
People and machines are great producers of unstructured big data. The volume of data brings opportunities as well as challenges for IE from the huge deluge of user- and machine-generated content. Existing techniques must adapt to new size and time requirements to deal with IE from big data [15, 84]. Automatic IE and structuring of unstructured big data require scaling methods designed for very small data so that they can process millions of records [10, 13]. Distributed and parallel computing should therefore be adopted to improve the efficacy of IE from unstructured big data.
 
D. Dimensionality and heterogeneity
Unstructured big data comes with high dimensionality [16, 18, 66], diversity [55, 124], dynamicity [32] and heterogeneity [33, 131]. Dimensionality reduction [18] and semantic annotation [131] can improve IE performance on high-dimensional and heterogeneous data respectively. Techniques with high representational power are appropriate for high-dimensional data [66]. With the influx of data from increasingly diverse sources, big data IE and analytics require advanced techniques that handle more than mere data accessibility.
 
E. Data usability
Unstructured big data is a rich source of information, but exploiting the relevant information is one of the major challenges [27, 28]; it relates to optimal data selection that balances cost, speed and accuracy [12]. The main problem with unstructured big data is that a huge deluge of data is available but not usable. Usability of data is the capacity of data to fulfill the requirements of a user for a given purpose, area and epoch; according to [153], "usability is the degree to which each stakeholder is able to effectively access and use the data". Data usability helps to understand data and its usage. Usability therefore varies with the different interpretations of data values and the different nature of tasks, which ties improvement of the IE process to improvement of data usability.
 
F. Context and semantic understanding
Identifying the context of interaction among entities and objects is a crucial task in IE [39, 64], especially with high-dimensional, heterogeneous, complex and poor-quality data. Data ambiguities add further challenges to contextual IE [31, 105]. Semantics are important for finding relationships among entities and objects [44]; entity and object extraction from text and visual data cannot provide accurate information unless the context and semantics of the interaction are identified [43]. Efficient data prioritization and curation are important in this regard [27]. Semantic and context understanding is therefore as important as it is challenging for big data IE, owing to quality and usability issues.
 
G. Data modeling
As discussed earlier, learning-based techniques are popular for IE because they reduce intensive manual labor. Efficient data modeling is an important task in learning-based IE techniques. High dimensionality, heterogeneity and low quality of unstructured big data add complexity to the data modeling process [16]. Efficient parallelism and computational power are required to support large data models [68].
 

Need for consolidated IE systems for multidimensional unstructured big data

The critical analysis of the existing literature selected in this SLR has identified various task-specific and data-specific challenges for big data IE. Based on these findings, the variety of big data poses challenges to extracting useful information. Every field uses IE systems on a variety of data to perform mining and analysis. New consolidated systems that extract useful information from a variety of data types can improve the efficiency of big data analytics by integrating the extracted information. For example, healthcare uses a variety of big data in different systems such as decision support, disease identification, pharmacovigilance and healthcare analytics; consolidated IE systems would help improve these systems by extracting useful information from a variety of unstructured data. The analysis of existing IE techniques and their limitations gives rise to the need for consolidation of IE techniques for a variety of data types, as depicted in Fig. 5.
As shown in Fig. 5, the identified task-specific and data-specific limitations of IE systems should be considered when designing an IE system for more than one data type, and the proposed improvement preconditions should also be considered in the development of such systems. The identified challenges and proposed preconditions will help to extract relevant and useful information from a variety of big data. The following improvement preconditions are proposed for these new consolidated IE systems for multidimensional unstructured big data.

Precondition 1: Advanced preprocessing

Most of the challenges identified in this SLR are related to the quality and usability of unstructured big data. Data and process standardization, efficient data cleaning and quality improvement techniques are required for unstructured big data. Further, advanced and adaptive preprocessing prior to IE is required to improve the effectiveness of big data analytics.

Precondition 2: Pragmatic IE

Pragmatics is a field of study related to the usefulness and usability of data [154]; it deals with the dimensions of data that are important for improving its usefulness and usability. As IE is a community-based process, it depends on user needs and the available data sources [100]. IE equipped with pragmatics will therefore help improve unstructured data analysis, as it will extract and select data according to user needs. Pragmatic IE solutions are required to improve big data analytics and big data IE.

Precondition 3: Context and semantics are more important

Context and semantics play an important role in understanding the relations among entities or objects. Extracting the most relevant data from unstructured big data is difficult because of its complexity and quality. Contextually and semantically rich IE techniques will therefore increase the robustness of big data IE.

Precondition 4: Selection of technique

The selection of appropriate techniques according to the data has a strong impact on the results of the IE process, especially for unstructured big data, because of its complexity and large size. Traditional IE techniques are inadequate to handle unstructured big data efficiently. It has been observed that the selection of appropriate techniques depends strongly on data characteristics: weakly supervised or distantly supervised learning techniques suit large-scale, multi-domain datasets because they require small training samples [17]; unsupervised techniques suit heterogeneous data [32]; and deep CNNs have performed better on high-dimensional data [36]. Understanding the data is therefore an important factor in selecting an IE technique.

Conclusion

This systematic literature review has explored state-of-the-art techniques for IE from unstructured big data types such as text, image, audio and video, and has investigated the limitations of these techniques. The challenges of IE in a big data environment have also been identified. Analysis and mining of data are becoming more complex with the massive growth of unstructured big data. Deep learning, with its generalizability, adaptability and reduced need for human involvement, plays a key role in this regard. However, to process exponentially growing data, new flexible and scalable techniques are required to deal with the dynamicity and sparsity of unstructured data. Quality, usability and sparsity of unstructured big data are major obstacles to deriving useful information. To improve IE techniques, mine useful information and support the versatility of unstructured data, new techniques must be introduced and existing techniques improved and enhanced. Overall, existing IE techniques outperform traditional techniques on comparatively larger datasets but are inadequate to deal effectively with the rapid growth of unstructured big data, especially streaming data. Scalability, accuracy and latency are important factors in implementing these IE techniques on big data platforms, and Apache MapReduce also faces scalability issues in big data IE. To overcome these challenges, MapReduce-based deep learning solutions are the future of big data IE systems; such systems will be helpful for healthcare analytics, surveillance, e-government systems, social media analytics and business analytics. The outcome of the study shows that highly scalable, computationally efficient and consolidated IE techniques are required to deal with the dynamicity of unstructured big data. The study contributes significantly to identifying the challenges on the way to more scalable and flexible IE systems: quality, usability, sparsity, dimensionality, heterogeneity, context and semantic understanding, scarcity, modeling complexity and diversity of unstructured big data are the major challenges in this field. Advanced data preparation prior to extracting information from unstructured data, semantically and contextually rich IE systems, the adoption of pragmatics and advanced IE techniques are essential for IE systems in an unstructured big data environment. Hence, scalable, computationally efficient and consolidated IE systems are required that can overcome the challenges of multidimensional unstructured big data.

Future work

The major focus of this review was to investigate the challenges of IE systems for multidimensional unstructured big data. The detailed discussion of IE techniques for a variety of data types concluded that data preparation is equally important to the efficiency of IE systems, and advanced data improvement techniques will further increase that efficiency. Therefore, the findings of the review will be used to develop a usability improvement model for unstructured big data to extract the maximum of useful information from these data.

Acknowledgements

This work was produced under the Universiti Tunku Abdul Rahman Research Fund (UTARRF), project IPSR/RMC/UTARRF/2017-C1/R02.

Competing interests

Not applicable.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Literatur
1.
Zurück zum Zitat Gantz J, Reinsel D. The digital universe in 2020: big data, bigger digital shadows, and biggest growth in the far east. IDC iView IDC Analyze Future. 2012;2007(2012):1–16. Gantz J, Reinsel D. The digital universe in 2020: big data, bigger digital shadows, and biggest growth in the far east. IDC iView IDC Analyze Future. 2012;2007(2012):1–16.
2.
Zurück zum Zitat Wang Y, Kung LA, Byrd TA. Big data analytics: understanding its capabilities and potential benefits for healthcare organizations. Technol Forecast Soc Change. 2018;126:3–13.CrossRef Wang Y, Kung LA, Byrd TA. Big data analytics: understanding its capabilities and potential benefits for healthcare organizations. Technol Forecast Soc Change. 2018;126:3–13.CrossRef
3.
Zurück zum Zitat Lomotey RK, Deters R. Topics and terms mining in unstructured data stores. In: 2013 IEEE 16th international conference on computational science and engineering, 2013. p. 854–61. Lomotey RK, Deters R. Topics and terms mining in unstructured data stores. In: 2013 IEEE 16th international conference on computational science and engineering, 2013. p. 854–61.
4.
Zurück zum Zitat Lomotey RK, Deters R. RSenter: terms mining tool from unstructured data sources. Int J Bus Process Integr Manag. 2013;6(4):298.CrossRef Lomotey RK, Deters R. RSenter: terms mining tool from unstructured data sources. Int J Bus Process Integr Manag. 2013;6(4):298.CrossRef
5.
Zurück zum Zitat Scheffer T, Decomain C, Wrobel S. Mining the Web with active hidden Markov models. In: International conference on data mining. New York: IEEE; 2001; p. 645–6. Scheffer T, Decomain C, Wrobel S. Mining the Web with active hidden Markov models. In: International conference on data mining. New York: IEEE; 2001; p. 645–6.
6.
Zurück zum Zitat Lomotey RK, Jamal S, Deters R. SOPHRA: a mobile web services hosting infrastructure in mHealth. In: First international conference on mobile services. New York: IEEE; 2012; p. 88–95. Lomotey RK, Jamal S, Deters R. SOPHRA: a mobile web services hosting infrastructure in mHealth. In: First international conference on mobile services. New York: IEEE; 2012; p. 88–95.
7.
Zurück zum Zitat Brereton P, Kitchenham BA, Budgen D, Turner M, Khalil M. Lessons from applying the systematic literature review process within the software engineering domain. J Syst Softw. 2007;80(4):571–83.CrossRef Brereton P, Kitchenham BA, Budgen D, Turner M, Khalil M. Lessons from applying the systematic literature review process within the software engineering domain. J Syst Softw. 2007;80(4):571–83.CrossRef
8.
Zurück zum Zitat Borrego M, Foster MJ, Froyd JE. Systematic literature reviews in engineering education and other developing interdisciplinary fields. J Eng Educ. 2014;103(1):45–76.CrossRef Borrego M, Foster MJ, Froyd JE. Systematic literature reviews in engineering education and other developing interdisciplinary fields. J Eng Educ. 2014;103(1):45–76.CrossRef
9.
Zurück zum Zitat Che N, Chen D, Le J. Entity recognition approach of clinical documents based on self-training framework. In: Recent developments in intelligent computing, communication and devices. Singapore: Springer; 2019; p. 259–65.CrossRef Che N, Chen D, Le J. Entity recognition approach of clinical documents based on self-training framework. In: Recent developments in intelligent computing, communication and devices. Singapore: Springer; 2019; p. 259–65.CrossRef
10.
Zurück zum Zitat Liu X, Zhou Y, Wang Z. Recognition and extraction of named entities in online medical diagnosis data based on a deep neural network. J Vis Commun Image Represent. 2019;60:1–15.CrossRef Liu X, Zhou Y, Wang Z. Recognition and extraction of named entities in online medical diagnosis data based on a deep neural network. J Vis Commun Image Represent. 2019;60:1–15.CrossRef
11.
Zurück zum Zitat Mao J, Cui H. Identifying bacterial biotope entities using sequence labeling: performance and feature analysis. J Assoc Inf Sci Technol. 2018;69(9):1134–47.CrossRef Mao J, Cui H. Identifying bacterial biotope entities using sequence labeling: performance and feature analysis. J Assoc Inf Sci Technol. 2018;69(9):1134–47.CrossRef
12.
Zurück zum Zitat Goldberg S, Wang DZ, Grant C. A probabilistically integrated system for crowd-assisted text labeling and extraction. J Data Inf Qual. 2017;8(2):1–23.CrossRef Goldberg S, Wang DZ, Grant C. A probabilistically integrated system for crowd-assisted text labeling and extraction. J Data Inf Qual. 2017;8(2):1–23.CrossRef
13.
Zurück zum Zitat Boytcheva S, Angelova G, Angelov Z, Tcharaktchiev D. Text mining and big data analytics for retrospective analysis of clinical texts from outpatient care. Cybern Inf Technol. 2015;15(4):58–77. Boytcheva S, Angelova G, Angelov Z, Tcharaktchiev D. Text mining and big data analytics for retrospective analysis of clinical texts from outpatient care. Cybern Inf Technol. 2015;15(4):58–77.
14.
Zurück zum Zitat Pogrebnyakov N. Unsupervised domain-agnostic identification of product names in social media posts. In: International conference on big data. New York: IEEE; 2018; p. 3711–6. Pogrebnyakov N. Unsupervised domain-agnostic identification of product names in social media posts. In: International conference on big data. New York: IEEE; 2018; p. 3711–6.
15.
Zurück zum Zitat Napoli C, Tramontana E, Verga G. Extracting location names from unstructured italian texts using grammar rules and MapReduce. In: International conference on information and software technologies. Cham: Springer; 2016; p. 593–601. Napoli C, Tramontana E, Verga G. Extracting location names from unstructured italian texts using grammar rules and MapReduce. In: International conference on information and software technologies. Cham: Springer; 2016; p. 593–601.
16.
Zurück zum Zitat Feldman K, Faust L, Wu X, Huang C, Chawla NV. Beyond volume: the impact of complex healthcare data on the machine learning pipeline. In: Towards integrative machine learning and knowledge extraction. Cham: Springer; 2017; p. 150–69.CrossRef Feldman K, Faust L, Wu X, Huang C, Chawla NV. Beyond volume: the impact of complex healthcare data on the machine learning pipeline. In: Towards integrative machine learning and knowledge extraction. Cham: Springer; 2017; p. 150–69.CrossRef
17.
Zurück zum Zitat Wang K, Shi Y. User information extraction in big data environment. In: 3rd IEEE international conference on computer and communications (ICCC). New York: IEEE; 2017; p. 2315–8. Wang K, Shi Y. User information extraction in big data environment. In: 3rd IEEE international conference on computer and communications (ICCC). New York: IEEE; 2017; p. 2315–8.
18.
Zurück zum Zitat Li P, Mao K. Knowledge-oriented convolutional neural network for causal relation extraction from natural language texts. Expert Syst Appl. 2019;115:512–23.CrossRef Li P, Mao K. Knowledge-oriented convolutional neural network for causal relation extraction from natural language texts. Expert Syst Appl. 2019;115:512–23.CrossRef
19.
Zurück zum Zitat Wang P, Hao T, Yan J, Jin L. Large-scale extraction of drug-disease pairs from the medical literature. J Assoc Inf Sci Technol. 2017;68(11):2649–61.CrossRef Wang P, Hao T, Yan J, Jin L. Large-scale extraction of drug-disease pairs from the medical literature. J Assoc Inf Sci Technol. 2017;68(11):2649–61.CrossRef
20.
Zurück zum Zitat Guo X, He T. Leveraging Chinese encyclopedia for weakly supervised relation extraction. In: Joint international semantic technology conference. Cham: Springer; 2015; p. 127–40.CrossRef Guo X, He T. Leveraging Chinese encyclopedia for weakly supervised relation extraction. In: Joint international semantic technology conference. Cham: Springer; 2015; p. 127–40.CrossRef
21.
Zurück zum Zitat Torres JP, de Piñerez Reyes RG, Bucheli VA. Support vector machines for semantic relation extraction in Spanish language. In: Advances in computing. Cham: Springer; 2018; p. 326–37. Torres JP, de Piñerez Reyes RG, Bucheli VA. Support vector machines for semantic relation extraction in Spanish language. In: Advances in computing. Cham: Springer; 2018; p. 326–37.
22.
Zurück zum Zitat Li P, Wang H, Li H, Wu X. Employing semantic context for sparse information extraction assessment. ACM Trans Knowl Discov Data. 2018;12(5):1–36. Li P, Wang H, Li H, Wu X. Employing semantic context for sparse information extraction assessment. ACM Trans Knowl Discov Data. 2018;12(5):1–36.
23.
Zurück zum Zitat Liu Z, Tong J, Gu J, Liu K, Hu B. A Semi-automated entity relation extraction mechanism with weakly supervised learning for Chinese medical webpages. In: International conference on smart health. Cham: Springer; 2016; p. 44–56.CrossRef Liu Z, Tong J, Gu J, Liu K, Hu B. A Semi-automated entity relation extraction mechanism with weakly supervised learning for Chinese medical webpages. In: International conference on smart health. Cham: Springer; 2016; p. 44–56.CrossRef
24.
Zurück zum Zitat Li J, Cai Y, Wang Q, Hu S, Wang T, Min H. Entity relation mining in large-scale data. In: Database systems for advanced applications. Cham: Springer; 2015; p. 109–121.CrossRef Li J, Cai Y, Wang Q, Hu S, Wang T, Min H. Entity relation mining in large-scale data. In: Database systems for advanced applications. Cham: Springer; 2015; p. 109–121.CrossRef
25.
Zurück zum Zitat Wang C, Song Y, Roth D, Zhang M, Han J. World knowledge as indirect supervision for document clustering. ACM Trans Knowl Discov Data. 2016;11(2):1–36. Wang C, Song Y, Roth D, Zhang M, Han J. World knowledge as indirect supervision for document clustering. ACM Trans Knowl Discov Data. 2016;11(2):1–36.
26.
Zurück zum Zitat Gao H, Gui L, Luo W. Scientific literature based big data analysis for technology insight. J Phys Conf Ser. 2019;1168(3):032007.CrossRef Gao H, Gui L, Luo W. Scientific literature based big data analysis for technology insight. J Phys Conf Ser. 2019;1168(3):032007.CrossRef
27.
Zurück zum Zitat Bravo À, Piñero J, Queralt-Rosinach N, Rautschka M, Furlong LI. Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC Bioinform. 2015;16(1):55.CrossRef Bravo À, Piñero J, Queralt-Rosinach N, Rautschka M, Furlong LI. Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC Bioinform. 2015;16(1):55.CrossRef
28.
Zurück zum Zitat Fadili H, Jouis C. Towards an automatic analyze and standardization of unstructured data in the context of big and linked data. In: Proceedings of the 8th international conference on management of digital ecosystems—MEDES. New York: ACM Press; 2016; p. 223–30. Fadili H, Jouis C. Towards an automatic analyze and standardization of unstructured data in the context of big and linked data. In: Proceedings of the 8th international conference on management of digital ecosystems—MEDES. New York: ACM Press; 2016; p. 223–30.
29.
Zurück zum Zitat Swain MC, Cole JM. ChemDataExtractor: a toolkit for automated extraction of chemical information from the scientific literature. J Chem Inf Model. 2016;56(10):1894–904.CrossRef Swain MC, Cole JM. ChemDataExtractor: a toolkit for automated extraction of chemical information from the scientific literature. J Chem Inf Model. 2016;56(10):1894–904.CrossRef
30.
Zurück zum Zitat Miwa M, Thompson P, Korkontzelos Y, Ananiadou S. Comparable study of event extraction in newswire and biomedical domains. In: 25th international conference on computational linguistics. 2014; p. 2270–9. Miwa M, Thompson P, Korkontzelos Y, Ananiadou S. Comparable study of event extraction in newswire and biomedical domains. In: 25th international conference on computational linguistics. 2014; p. 2270–9.
31.
Zurück zum Zitat Roll U, Correia RA, Berger-Tal O. Using machine learning to disentangle homonyms in large text corpora. Conserv Biol. 2018;32(3):716–24.CrossRef Roll U, Correia RA, Berger-Tal O. Using machine learning to disentangle homonyms in large text corpora. Conserv Biol. 2018;32(3):716–24.CrossRef
32.
Zurück zum Zitat Xiang L, Zhao G, Li Q, Hao W, Li F. TUMK-ELM: a fast unsupervised heterogeneous data learning approach. IEEE Access. 2018;6:35305–15.CrossRef Xiang L, Zhao G, Li Q, Hao W, Li F. TUMK-ELM: a fast unsupervised heterogeneous data learning approach. IEEE Access. 2018;6:35305–15.CrossRef
33.
Zurück zum Zitat Shi L, Jianping C, Jie X. Prospecting information extraction by text mining based on convolutional neural networks–a case study of the Lala copper deposit, China. IEEE Access. 2018;6:52286–97.CrossRef Shi L, Jianping C, Jie X. Prospecting information extraction by text mining based on convolutional neural networks–a case study of the Lala copper deposit, China. IEEE Access. 2018;6:52286–97.CrossRef
34.
Zurück zum Zitat Mezhar A, Ramdani M, Elmzabi A. A novel approach for open domain event schema discovery from twitter. In: 2015 10th international conference on intelligent systems: theories and applications (SITA). New York: IEEE; 2015; p. 1–7. Mezhar A, Ramdani M, Elmzabi A. A novel approach for open domain event schema discovery from twitter. In: 2015 10th international conference on intelligent systems: theories and applications (SITA). New York: IEEE; 2015; p. 1–7.
35.
Zurück zum Zitat Gong L, Zhang Z, Yang X, Huang D, Yang R, Yang G. A biomedical events extracted approach based on phrase structure tree. In: 2017 13th international conference on natural computation, fuzzy systems and knowledge discovery (ICNC-FSKD). New York: IEEE; 2017; p. 1984–88. Gong L, Zhang Z, Yang X, Huang D, Yang R, Yang G. A biomedical events extracted approach based on phrase structure tree. In: 2017 13th international conference on natural computation, fuzzy systems and knowledge discovery (ICNC-FSKD). New York: IEEE; 2017; p. 1984–88.
36.
Zurück zum Zitat Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. 2014. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:​1409.​1556. 2014.
37.
Zurück zum Zitat KHe K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2016; p. 770–8. KHe K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2016; p. 770–8.
38.
Zurück zum Zitat Liang X, Lee L, Xing EP. Deep variation-structured reinforcement learning for visual relationship and attribute detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2017; p. 4408–17. Liang X, Lee L, Xing EP. Deep variation-structured reinforcement learning for visual relationship and attribute detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2017; p. 4408–17.
39.
Zurück zum Zitat Zhuang B, Liu L, Shen C, Reid I. Towards context-aware interaction recognition for visual relationship detection. In: Proceedings of the IEEE international conference on computer vision (ICCV). 2017; p. 589–98. Zhuang B, Liu L, Shen C, Reid I. Towards context-aware interaction recognition for visual relationship detection. In: Proceedings of the IEEE international conference on computer vision (ICCV). 2017; p. 589–98.
40.
Zurück zum Zitat Ramanathan V et al. Learning semantic relationships for better action retrieval in images. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2015; p. 1100–9. Ramanathan V et al. Learning semantic relationships for better action retrieval in images. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2015; p. 1100–9.
41.
Zurück zum Zitat Jung J, Park J. Visual relationship detection with language prior and softmax. In: 2018 IEEE international conference on image processing, applications and systems (IPAS). 2018; p. 143–8. Jung J, Park J. Visual relationship detection with language prior and softmax. In: 2018 IEEE international conference on image processing, applications and systems (IPAS). 2018; p. 143–8.
42.
Zurück zum Zitat Yu R, Li A, Morariu VI, Davis LS. Visual relationship detection with internal and external linguistic knowledge distillation. In: Proceedings of the IEEE international conference on computer vision (ICCV). 2017; p. 1068–76. Yu R, Li A, Morariu VI, Davis LS. Visual relationship detection with internal and external linguistic knowledge distillation. In: Proceedings of the IEEE international conference on computer vision (ICCV). 2017; p. 1068–76.
43.
Zurück zum Zitat Baier S, Ma Y, Tresp V. Improving information extraction from images with learned semantic models. arXiv preprint arXiv:1808.08941 2018. Baier S, Ma Y, Tresp V. Improving information extraction from images with learned semantic models. arXiv preprint arXiv:​1808.​08941 2018.
45.
Zurück zum Zitat Han Y, Xu Y, Liu S, Gao S, Li S. Visual relationship detection based on local feature and context feature. In: 2018 International conference on network infrastructure and digital content (IC-NIDC). New York: IEEE; 2018; p. 420–4. Han Y, Xu Y, Liu S, Gao S, Li S. Visual relationship detection based on local feature and context feature. In: 2018 International conference on network infrastructure and digital content (IC-NIDC). New York: IEEE; 2018; p. 420–4.
46.
Zurück zum Zitat Vellingiriraj EK, Balamurugan M, Balasubramanie P. Information extraction and text mining of Ancient Vattezhuthu characters in historical documents using image zoning. In: 2016 international conference on Asian language processing (IALP). New York: IEEE; 2016; p. 37–40. Vellingiriraj EK, Balamurugan M, Balasubramanie P. Information extraction and text mining of Ancient Vattezhuthu characters in historical documents using image zoning. In: 2016 international conference on Asian language processing (IALP). New York: IEEE; 2016; p. 37–40.
47.
Zurück zum Zitat Singh D, Saini JP, Chauhan DS. Hindi character recognition using RBF neural network and directional group feature extraction technique. In: 2015 International conference on cognitive computing and information processing (CCIP). New York: IEEE; 2015; p. 1–4. Singh D, Saini JP, Chauhan DS. Hindi character recognition using RBF neural network and directional group feature extraction technique. In: 2015 International conference on cognitive computing and information processing (CCIP). New York: IEEE; 2015; p. 1–4.
48.
Zurück zum Zitat Sheshadri K, Divvala SK. Exemplar driven character recognition in the wild. In: Proceedings of the British Machine Vision Conference (BMVC). 2012; p. 13.1–13.10. Sheshadri K, Divvala SK. Exemplar driven character recognition in the wild. In: Proceedings of the British Machine Vision Conference (BMVC). 2012; p. 13.1–13.10.
49.
Zurück zum Zitat Shi Cun-Zhao, Wang Chun-Heng, Xiao Bai-Hua, Gao Song, Jin-Long Hu. Scene text recognition using structure-guided character detection and linguistic knowledge. IEEE Trans Circuits Syst Video Technol. 2014;24(7):1235–50.CrossRef Shi Cun-Zhao, Wang Chun-Heng, Xiao Bai-Hua, Gao Song, Jin-Long Hu. Scene text recognition using structure-guided character detection and linguistic knowledge. IEEE Trans Circuits Syst Video Technol. 2014;24(7):1235–50.CrossRef
50.
Zurück zum Zitat Yao C, Bai X, Shi B, Liu W. Strokelets: a learned multi-scale representation for scene text recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2014; p. 4042–49. Yao C, Bai X, Shi B, Liu W. Strokelets: a learned multi-scale representation for scene text recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2014; p. 4042–49.
51.
Zurück zum Zitat Avadesh M, Goyal N. Optical character recognition for Sanskrit using convolution neural networks. In: 2018 13th IAPR international workshop on document analysis systems (DAS). New York: IEEE; 2018. p. 447–52. Avadesh M, Goyal N. Optical character recognition for Sanskrit using convolution neural networks. In: 2018 13th IAPR international workshop on document analysis systems (DAS). New York: IEEE; 2018. p. 447–52.
52.
Zurück zum Zitat Younis KS, Alkhateeb AA. A new implementation of deep neural networks for optical character recognition and face recognition. Jordan: Proc New Trends Inf Technol; 2017. p. 157–62. Younis KS, Alkhateeb AA. A new implementation of deep neural networks for optical character recognition and face recognition. Jordan: Proc New Trends Inf Technol; 2017. p. 157–62.
53.
Zurück zum Zitat Elleuch M, Tagougui N, Kherallah M. Towards unsupervised learning for Arabic handwritten recognition using deep architectures. In: International conference on neural information processing. Cham: Springer; 2015; p. 363–372.CrossRef Elleuch M, Tagougui N, Kherallah M. Towards unsupervised learning for Arabic handwritten recognition using deep architectures. In: International conference on neural information processing. Cham: Springer; 2015; p. 363–372.CrossRef
54.
Zurück zum Zitat Ding Z, Chen Z, Wang S. FANet: an end-to-end full attention mechanism model for multi-oriented scene text recognition. In: 2019 5th international conference on big data and information analytics (BigDIA). New York: IEEE; 2019; p. 97–102. Ding Z, Chen Z, Wang S. FANet: an end-to-end full attention mechanism model for multi-oriented scene text recognition. In: 2019 5th international conference on big data and information analytics (BigDIA). New York: IEEE; 2019; p. 97–102.
55.
Medhat F, Theodoropoulos G, Obara B. TMIXT: a process flow for Transcribing MIXed handwritten and machine-printed text. In: 2018 IEEE international conference on big data (Big Data). 2018; p. 2986–94.
56.
Xie H, Fang S, Zha Z-J, Yang Y, Li Y, Zhang Y. Convolutional attention networks for scene text recognition. ACM Trans Multimedia Comput Commun Appl. 2019;15(1s):1–17.
57.
Zheng Y, Wang Q, Betke M. Deep neural network for semantic-based text recognition in images. Computer vision and pattern recognition, arXiv:1908.01403. 2019.
58.
Wani MA, Bhat FA, Afzal S, Khan AI. Supervised deep learning in face recognition. Singapore: Springer; 2020. p. 95–110.
59.
Heinsohn D, Villalobos E, Prieto L, Mery D. Face recognition in low-quality images using adaptive sparse representations. Image Vis Comput. 2019;85:46–58.
60.
Abudarham N, Shkiller L, Yovel G. Critical features for face recognition. Cognition. 2019;182:73–83.
61.
Prasad PS, Pathak R, Gunjan VK, Rao HR. Deep learning based representation for face recognition. In: ICCCE 2019. Singapore: Springer; 2019; p. 419–4.
62.
Gemmeke JF, Vuegen L, Karsmakers P, Vanrumste B. An exemplar-based NMF approach to audio event detection. In: 2013 IEEE workshop on applications of signal processing to audio and acoustics. 2013; p. 1–4.
63.
Espi M, Fujimoto M, Kinoshita K, Nakatani T. Exploiting spectro-temporal locality in deep learning based acoustic event detection. EURASIP J Audio Speech Music Process. 2015;2015(1):26.
64.
Heittola T, Mesaros A, Eronen A, Virtanen T. Context-dependent sound event detection. EURASIP J Audio Speech Music Process. 2013;2013(1):1.
65.
Takahashi N, Gygli M, Pfister B, Van Gool L. Deep convolutional neural networks and data augmentation for acoustic event detection. In: InterSpeech. arXiv:1604.07160. 2016.
66.
Zöhrer M, Pernkopf F. Gated recurrent networks applied to acoustic scene classification and acoustic event detection. In: Proceedings of the detection and classification of acoustic scenes and events workshop (DCASE2016), Budapest, Hungary, 3 Sept 2016; p. 115–9.
67.
Su TW, Liu JY, Yang YH. Weakly-supervised audio event detection using event-specific Gaussian filters and fully convolutional networks. In: 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP). 2017; p. 791–5.
68.
Zou Y, Jin X, Li Y, Guo Z, Wang E, Xiao B. Mariana: Tencent deep learning platform and its applications. Proc VLDB Endow. 2014;7(13):1772–7.
70.
Masmoudi A, Bougares F, Ellouze M, Estève Y, Belguith L. Automatic speech recognition system for Tunisian dialect. Lang Resour Eval. 2018;52(1):249–67.
71.
El Ouahabi S, Atounti M, Bellouki M. Toward an automatic speech recognition system for amazigh-tarifit language. Int J Speech Technol. 2019;22(2):421–32.
72.
Seltzer ML, Yu D, Wang Y. An investigation of deep neural networks for noise robust speech recognition. In: 2013 IEEE international conference on acoustics, speech and signal processing. 2013; p. 7398–402.
73.
Yılmaz E, van den Heuvel H, van Leeuwen D. Investigating bilingual deep neural networks for automatic recognition of code-switching Frisian speech. Procedia Comput Sci. 2016;81:159–66.
74.
Abdel-Hamid O, Mohamed A, Jiang H, Deng L, Penn G, Yu D. Convolutional neural networks for speech recognition. IEEE/ACM Trans Audio Speech Lang Process. 2014;22(10):1533–45.
75.
Sak H, Senior A, Rao K, Beaufays F. Fast and accurate recurrent neural network acoustic models for speech recognition. Computation and language, arXiv:1507.06947. 2015.
76.
Kumar Y, Singh N. An automatic speech recognition system for spontaneous Punjabi speech corpus. Int J Speech Technol. 2017;20(2):297–303.
77.
Londhe ND, Kshirsagar GB. Chhattisgarhi speech corpus for research and development in automatic speech recognition. Int J Speech Technol. 2018;21(2):193–210.
78.
Lokesh S, Kumar PM, Devi MR, Parthasarathy P, Gokulnath C. An automatic Tamil speech recognition system by using bidirectional recurrent neural network with self-organizing map. Neural Comput Appl. 2019;31(5):1521–31.
79.
Karpukhin IA. Contribution from the accuracy of phoneme recognition to the quality of automatic recognition of Russian speech. Moscow Univ Comput Math Cybern. 2016;40(2):89–95.
80.
Ryu C, Lee D, Jang M, Kim C, Seo E. Extensible video processing framework in Apache Hadoop. In: 2013 IEEE 5th international conference on cloud computing technology and science. 2013; p. 305–310.
81.
Manju A, Valarmathie P. Organizing multimedia big data using semantic based video content extraction technique. In: 2015 International conference on soft-computing and networks security (ICSNS). New York: IEEE; 2015; p. 1–4.
82.
Kojima R, Sugiyama O, Nakadai K. Audio-visual scene understanding utilizing text information for a cooking support robot. In: 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS). 2015; p. 4210–5.
83.
Risnumawan A, Shivakumara P, Chan CS, Tan CL. A robust arbitrary text detection system for natural scene images. Expert Syst Appl. 2014;41(18):8027–48.
84.
Ben Ayed A, Ben Halima M, Alimi AM. MapReduce based text detection in big data natural scene videos. Procedia Comput Sci. 2015;53:216–23.
85.
Yousfi S, Berrani SA, Garcia C. Deep learning and recurrent connectionist-based approaches for Arabic text recognition in videos. In: 2015 13th international conference on document analysis and recognition (ICDAR). New York: IEEE; 2015; p. 1026–30.
86.
Mansouri S, Charhad M, Rekik A, Zrigui M. A framework for semantic video content indexing using textual information. In: 2018 IEEE second international conference on data stream mining & processing (DSMP). 2018; p. 107–10.
87.
Sudir P, Ravishankar M. An effective approach towards video text recognition. In: Advances in signal processing and intelligent recognition systems. Cham: Springer; 2014; p. 323–33.
88.
Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst. 2015;28:91–9.
89.
Wang X et al. End-to-end scene text recognition in videos based on multi frame tracking. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR). New York: IEEE; 2017; p. 1255–60.
90.
Ali A, Pickering M, Shafi K. Urdu natural scene character recognition using convolutional neural networks. In: 2018 IEEE 2nd international workshop on Arabic and derived script analysis and recognition (ASAR). 2018; p. 29–34.
91.
Shi B, Bai X, Yao C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans Pattern Anal Mach Intell. 2017;39(11):2298–304.
92.
Tian S, Yin X-C, Su Y, Hao H-W. A unified framework for tracking based text detection and recognition from web videos. IEEE Trans Pattern Anal Mach Intell. 2018;40(3):542–54.
93.
Gong B, Chao WL, Grauman K, Sha F. Diverse sequential subset selection for supervised video summarization. Adv Neural Inf Process Syst. 2014;27:2069–77.
94.
Zhang K, Chao WL, Sha F, Grauman K. Video summarization with long short-term memory. In: European conference on computer vision. Cham: Springer; 2016; p. 766–82.
95.
Khosla A, Hamid R, Lin CJ, Sundaresan N. Large-scale video summarization using web-image priors. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2013; p. 2698–705.
96.
Mahasseni B, Lam M, Todorovic S. Unsupervised video summarization with adversarial LSTM networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2017; p. 2982–91.
97.
Potapov D, Douze M, Harchaoui Z, Schmid C. Category-specific video summarization. In: European conference on computer vision. Cham: Springer; 2014; p. 540–55.
98.
Gygli M, Grabner H, Van Gool L. Video summarization by learning submodular mixtures of objectives. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR). 2015; p. 3090–8.
99.
Mei S, Guan G, Wang Z, Wan S, He M, Feng DD. Video summarization via minimum sparse reconstruction. Pattern Recognit. 2015;48(2):522–33.
100.
Lomotey RK, Deters R. Real-time effective framework for unstructured data mining. In: 2013 12th IEEE international conference on trust, security and privacy in computing and communications. 2013; p. 1081–8.
101.
Nadeau D, Sekine S. A survey of named entity recognition and classification. Lingvisticae Investig. 2007;30(1):3–26.
102.
Marrero M, Urbano J, Sánchez-Cuadrado S, Morato J, Gómez-Berbís JM. Named Entity recognition: fallacies, challenges and opportunities. Comput Stand Interfaces. 2013;35(5):482–9.
103.
Abdallah ZS, Carman M, Haffari G. Multi-domain evaluation framework for named entity recognition tools. Comput Speech Lang. 2017;43:34–55.
104.
Sazali SS, Rahman NA, Bakar ZA. Information extraction: evaluating named entity recognition from classical Malay documents. In: 2016 third international conference on information retrieval and knowledge management (CAMP). 2016; p. 48–53.
105.
Goyal A, Gupta V, Kumar M. Recent named entity recognition and classification techniques: a systematic review. Comput Sci Rev. 2018;29:21–43.
106.
Piskorski J, Yangarber R. Information extraction: past, present and future. In: Multi-source, multilingual information extraction and summarization. Berlin: Springer; 2013; p. 23–49.
107.
Goutte C, Gaussier E. A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In: European conference on information retrieval. 2005; p. 345–59.
108.
Konstantinova N. Review of relation extraction methods: what is new out there? In: International conference on analysis of images, social networks and texts. Cham: Springer; 2014; p. 15–28.
109.
Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E. Deep learning applications and challenges in big data analytics. J Big Data. 2015;2(1):1.
110.
Zhou L, Pan S, Wang J, Vasilakos AV. Machine learning on big data: opportunities and challenges. Neurocomputing. 2017;237:350–61.
111.
Wang W, et al. Deep learning at scale and at ease. ACM Trans Multimedia Comput Commun Appl. 2016;12(4s):1–25.
112.
Wang Y, et al. Clinical information extraction applications: a literature review. J Biomed Inform. 2018;77:34–49.
113.
Chiticariu L, Li Y, Reiss FR. Rule-based information extraction is dead! Long live rule-based information extraction systems! In: Proceedings of the 2013 conference on empirical methods in natural language processing. 2013; p. 827–32.
114.
Valenzuela-Escárcega MA, Hahn-Powell G, Surdeanu M, Hicks T. A domain-independent rule-based framework for event extraction. In: Proceedings of ACL-IJCNLP 2015 system demonstrations. 2015; p. 127–32.
115.
Patel R, Tanwani S. Application of machine learning techniques in clinical information extraction. In: Smart techniques for a smarter planet. Cham: Springer; 2019; p. 145–65.
116.
Topaz M, et al. Mining fall-related information in clinical notes: comparison of rule-based and novel word embedding-based machine learning approaches. J Biomed Inform. 2019;90:103103.
117.
Mykowiecka A, Marciniak M, Kupść A. Rule-based information extraction from patients’ clinical data. J Biomed Inform. 2009;42(5):923–36.
118.
Gorinski PJ et al. Named entity recognition for electronic health records: a comparison of rule-based and machine learning approaches. Computation and language. 2019.
119.
Atzmueller M, Kluegl P, Puppe F. Rule-based information extraction for structured data acquisition using TextMarker. In: LWA. 2008; p. 1–7.
120.
Fader A, Soderland S, Etzioni O. Identifying relations for open information extraction. In: Proceedings of the conference on empirical methods in natural language processing. 2011; p. 1535–45.
121.
Kanya N, Ravi T. Modelings and techniques in named entity recognition: an information extraction task. In: IET Chennai 3rd international conference on sustainable energy and intelligent systems (SEISCON 2012). 2012; p. 104–8.
122.
Wani MA, Bhat FA, Afzal S, Khan AI. Introduction to deep learning. In: Advances in deep learning. Singapore: Springer; 2020; p. 1–11.
123.
Coates A, Carpenter B, Case C, Satheesh S, Suresh B, Wang T, Wu DJ, Ng AY. Text detection and character recognition in scene images with unsupervised feature learning. In: ICDAR. 2011; p. 440–5.
124.
Wang H, Nie F, Huang H. Large-scale cross-language web page classification via dual knowledge transfer using fast nonnegative matrix trifactorization. ACM Trans Knowl Discov Data. 2015;10(1):1–29.
125.
Jan B et al. Deep learning in big data analytics: a comparative study. Comput Electr Eng. 2019;75:275–87.
126.
Gheisari M, Wang G, Bhuiyan MZ. A survey on deep learning in big data. In: 2017 IEEE international conference on computational science and engineering (CSE) and IEEE international conference on embedded and ubiquitous computing (EUC). 2017; p. 173–80.
127.
Reyes O, Ventura S. Evolutionary strategy to perform batch-mode active learning on multi-label data. ACM Trans Intell Syst Technol. 2018;9(4):1–26.
128.
Berndt DJ, McCart JA, Finch DK, Luther SL. A case study of data quality in text mining clinical progress notes. ACM Trans Manag Inf Syst. 2015;6(1):1–21.
129.
Nuray-Turan R, Kalashnikov DV, Mehrotra S. Adaptive connection strength models for relationship-based entity resolution. J Data Inf Qual. 2013;4(2):1–22.
130.
Zhang Z, Gao J, Ciravegna F. SemRe-rank: improving automatic term extraction by incorporating semantic relatedness with personalised pagerank. ACM Trans Knowl Discov Data. 2018;12(5):1–41.
131.
Adrian WT, Leone N, Manna M, Marte C. Document layout analysis for semantic information extraction. In: Conference of the Italian association for artificial intelligence. Cham: Springer; 2017; p. 269–81.
132.
Lu C, Krishna R, Bernstein M, Fei-Fei L. Visual relationship detection with language priors. In: Computer vision, ECCV 2016. Cham: Springer; 2016; p. 852–69.
133.
Antol S et al. VQA: Visual question answering. In: Proceedings of the IEEE international conference on computer vision. 2015; p. 2425–33.
134.
Ma L, Lu Z, Shang L, Li H. Multimodal convolutional neural networks for matching image and sentence. In: Proceedings of the IEEE international conference on computer vision. 2015; p. 2623–31.
135.
Yatskar M, Zettlemoyer L, Farhadi A. Situation recognition: visual semantic role labeling for image understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2016; p. 5534–42.
136.
Joan SF, Valli S. A survey on text information extraction from born-digital and scene text images. Proc Natl Acad Sci India Sect A Phys Sci. 2019;89(1):77–101.
137.
Jung K, Kim KI, Jain AK. Text information extraction in images and video: a survey. Pattern Recognit. 2004;37(5):977–97.
138.
Zhang H, Zhao K, Song Y-Z, Guo J. Text extraction from natural scene image: a survey. Neurocomputing. 2013;122:310–23.
139.
Young AW, Burton AM. Recognizing faces. Curr Direct Psychol Sci. 2017;26(3):212–7.
140.
Young AW, Burton AM. Are we face experts? Trends Cognit Sci. 2018;22(2):100–10.
141.
Peng YT, Lin CY, Sun MT, Tsai KC. Healthcare audio event classification using hidden Markov models and hierarchical hidden Markov models. In: 2009 IEEE International conference on multimedia and expo. 2009; p. 1218–21.
142.
Harma A, McKinney MF, Skowronek J. Automatic surveillance of the acoustic activity in our living environment. In: 2005 IEEE international conference on multimedia and expo. 2005; p. 634–7.
143.
Zhuang X, Zhou X, Hasegawa-Johnson MA, Huang TS. Real-world acoustic event detection. Pattern Recognit Lett. 2010;31(12):1543–51.
144.
Li J, Deng L, Gong Y, Haeb-Umbach R. An overview of noise-robust automatic speech recognition. IEEE/ACM Trans Audio Speech Lang Process. 2014;22(4):745–77.
145.
Saini P, Kaur P. Automatic speech recognition: a review. Int J Eng Trends Technol. 2013;4(2):1–5.
146.
Cutajar M, Gatt E, Grech I, Casha O, Micallef J. Comparative study of automatic speech recognition techniques. IET Signal Process. 2013;7(1):25–46.
147.
He X, Deng L. Speech-centric information processing: an optimization-oriented approach. Proc IEEE. 2013;101(5):1116–35.
148.
Lee S, Jo K. Automatic person information extraction using overlay text in television news interview videos. In: 2017 IEEE 15th international conference on industrial informatics (INDIN). 2017; p. 583–8.
149.
Lu T, Palaiahnakote S, Tan CL, Liu W. Introduction to video text detection. In: Video text detection. London: Springer; 2014; p. 1–18.
150.
Ye Q, Doermann D. Text detection and recognition in imagery: a survey. IEEE Trans Pattern Anal Mach Intell. 2015;37(7):1480–500.
151.
Zhu Y, Yao C, Bai X. Scene text detection and recognition: recent advances and future trends. Front Comput Sci. 2016;10(1):19–36.
152.
Rajpoot V, Girase S. A study on application scenario of video summarization. In: 2018 Second international conference on electronics, communication and aerospace technology (ICECA). New York: IEEE; 2018; p. 936–43.
153.
Shanks G, Corbitt B. Understanding data quality: social and cultural aspects. In: Proceedings of the 10th Australasian conference on information systems. 1999; p. 785–96.
154.
Price R, Shanks G. A semiotic information quality framework: development and comparative analysis. In: Enacting research methods in information systems. Cham: Springer; 2016; p. 219–50.