
About this Book

This book presents a contemporary view of the role of information quality in information fusion and decision making, and provides the formal foundation and implementation strategies required for dealing with insufficient information quality when building fusion systems for decision making. Information fusion is the process of gathering, processing, and combining large amounts of information from multiple and diverse sources, ranging from physical sensors to human intelligence reports and social media. These data and information may be unreliable, of low fidelity or insufficient resolution, contradictory, fake, and/or redundant. Sources may provide unverified reports obtained from other sources, resulting in correlations and biases. The success of fusion processing depends on how well the knowledge produced by the processing chain represents reality, which in turn depends on the adequacy of the data, the quality and suitability of the models used, and the accuracy, appropriateness, and applicability of prior and contextual knowledge.

With contributions from leading experts, this book provides an unparalleled understanding of the problem of information quality in information fusion and decision making for researchers and professionals in the field.



Information Quality: Concepts, Models and Dimensions


Chapter 1. Information Quality in Fusion-Driven Human-Machine Environments

Effective decision making in complex dynamic situations calls for designing fusion-based human-machine information systems that gather and fuse large amounts of heterogeneous multimedia and multispectral information of variable quality from geographically distributed sources. Successful collection and processing of such information depend strongly on being aware of, and compensating for, insufficient information quality at each step of information exchange. Designing methods of representing and incorporating information quality into fusion processing is a relatively new and rather difficult problem. The chapter discusses major challenges and suggests some approaches to address this problem.
Galina L. Rogova

Chapter 2. Quality of Information Sources in Information Fusion

Pieces of information can only be evaluated if knowledge about the quality of the sources of information is available. Typically, this knowledge pertains to the source relevance. In this chapter, other facets of source quality are considered, leading to a general approach to information correction and fusion for belief functions. In particular, the case where sources may partially lack truthfulness is investigated in depth. As a result, Shafer’s discounting operation and the unnormalised Dempster’s rule, which deal only with source relevance, are considerably extended. Most notably, the unnormalised Dempster’s rule is generalised to all Boolean connectives. The proposed approach also subsumes other important correction and fusion schemes, such as contextual discounting and Smets’ α-junctions. We also study the case where pieces of knowledge about the quality of the sources are independent. Finally, some means to obtain knowledge about source quality are reviewed.
Frédéric Pichon, Didier Dubois, Thierry Denœux
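The two operations this chapter extends can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's generalisation: the frame, mass functions, and reliability value below are invented for the example.

```python
from itertools import product

FRAME = frozenset({"a", "b", "c"})  # illustrative frame of discernment

def discount(m, reliability):
    """Shafer's discounting: scale each mass by the source's reliability
    and transfer the remainder to total ignorance (the whole frame)."""
    out = {A: reliability * v for A, v in m.items()}
    out[FRAME] = out.get(FRAME, 0.0) + (1.0 - reliability)
    return out

def conjunctive(m1, m2):
    """Unnormalised Dempster's rule (conjunctive combination): intersect
    focal sets and multiply masses; mass may land on the empty set."""
    out = {}
    for (A, vA), (B, vB) in product(m1.items(), m2.items()):
        C = A & B
        out[C] = out.get(C, 0.0) + vA * vB
    return out

# Two sources; the second is judged only 60% reliable.
m1 = {frozenset({"a"}): 0.8, FRAME: 0.2}
m2 = {frozenset({"b"}): 0.9, FRAME: 0.1}
fused = conjunctive(m1, discount(m2, 0.6))
print(fused[frozenset()])  # mass assigned to the empty set: 0.432
```

Discounting the second source before combination softens the clash between the two sources; with full reliability the empty-set mass would be 0.72 instead of 0.432.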

Chapter 3. Using Quality Measures in the Intelligent Fusion of Probabilistic Information

Our objective here is to obtain quality-fused values from multiple sources of probabilistic distributions, where quality is related to the lack of uncertainty in the fused value and the use of credible sources. We first introduce a vector representation for a probability distribution. With the aid of the Gini formulation of entropy, we show how the norm of the vector provides a measure of the certainty, i.e., information, associated with a probability distribution. We look at two special cases of fusion for source inputs, those that are maximally uncertain and certain. We provide a measure of credibility associated with subsets of sources. We look at the issue of finding the highest-quality fused value from the weighted aggregation of source-provided probability distributions.
Ronald R. Yager, Frederick E. Petry
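The link between the vector view and the Gini formulation of entropy is direct: the certainty of a distribution is its squared Euclidean norm, and the Gini entropy is one minus that quantity. A minimal sketch, with invented example distributions:

```python
import numpy as np

def certainty(p):
    """Squared Euclidean norm of a probability vector: 1 - Gini entropy.
    Ranges from 1/n (uniform, maximally uncertain) to 1 (a sure outcome)."""
    p = np.asarray(p, dtype=float)
    return float(p @ p)

uniform = [0.25, 0.25, 0.25, 0.25]   # maximally uncertain over 4 outcomes
sure = [1.0, 0.0, 0.0, 0.0]          # maximally certain

print(certainty(uniform))  # 0.25 = 1/n
print(certainty(sure))     # 1.0
print(1 - certainty(uniform))  # Gini entropy of the uniform distribution
```

A quality-driven fusion scheme along the chapter's lines would then prefer weighted aggregations of source distributions whose fused vector has a larger norm, i.e., less Gini uncertainty.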

Chapter 4. Conflict Management in Information Fusion with Belief Functions

In information fusion, conflict is an important concept: combining several imperfect experts or sources can produce conflict. In the theory of belief functions, this notion has been widely discussed. The mass assigned to the empty set by the conjunctive combination rule is generally interpreted as conflict, but it is not really a conflict. Several measures of conflict have been proposed, along with approaches to manage this conflict or to make decisions with conflicting mass functions. In this chapter we recall some of them and discuss how conflict should be considered in information fusion with the theory of belief functions.
Arnaud Martin
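The claim that empty-set mass is not, by itself, conflict can be illustrated by "auto-conflict": combining a source with itself under the unnormalised conjunctive rule still produces mass on the empty set. A small sketch with an invented mass function:

```python
from itertools import product

def conjunctive(m1, m2):
    """Unnormalised Dempster's rule: intersect focal sets, multiply masses;
    whatever lands on the empty set is the usual 'conflict' mass."""
    out = {}
    for (A, vA), (B, vB) in product(m1.items(), m2.items()):
        C = A & B
        out[C] = out.get(C, 0.0) + vA * vB
    return out

# A single source, equally split between two disjoint hypotheses...
m = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}

# ...combined with itself still yields mass 0.5 on the empty set,
# even though a source cannot meaningfully conflict with itself.
self_fused = conjunctive(m, m)
print(self_fused[frozenset()])  # 0.5
```

This is one of the reasons the chapter argues that m(∅) alone is a questionable measure of inter-source conflict.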

Chapter 5. Basic Properties for Total Uncertainty Measures in the Theory of Evidence

The theory of evidence is a generalization of probability theory that has been used in many applications. This generalization permits the representation of a wider variety of types of uncertainty. Several measures have been proposed over the last decades to quantify the total uncertainty of information in the theory of evidence. From the axiomatic point of view, any uncertainty measure must satisfy a set of important properties to guarantee correct behavior; for instance, a total measure in the theory of evidence must preserve the total amount of information and must not decrease when uncertainty increases. In this chapter we review and revise the properties of a total measure of uncertainty considered in the literature.
Joaquín Abellán, Carlos J. Mantas, Éloi Bossé

Chapter 6. Uncertainty Characterization and Fusion of Information from Unreliable Sources

Intelligent systems collect information from various sources to support their decision-making. However, misleading information may lead to wrong decisions with significant losses. Therefore, it is crucial to develop mechanisms that will make such systems immune to misleading information. This chapter presents a framework to exploit reports from possibly unreliable sources to generate fused information, i.e., an estimate of the ground truth, and characterize the uncertainty of that estimate as a facet of the quality of the information. First, the basic mechanisms to estimate the reliability of the sources and appropriately fuse the information are reviewed when using personal observations of the decision-maker and known types of source behaviors. Then, we propose new mechanisms for the decision-maker to establish fused information and its quality when it does not have personal observations and knowledge about source behaviors.
Lance Kaplan, Murat Şensoy

Chapter 7. Assessing the Usefulness of Information in the Context of Coalition Operations

This chapter presents the results of a study aiming at restricting the flow of information exchanged between various agents in a coalition. More precisely, when an agent expresses a need for information, we suggest sending only the information that is the most useful for this particular agent to act. This requires characterizing "the most useful information." The model described in this chapter defines a degree of usefulness of a piece of information as an aggregation of several usefulness degrees, each of them representing a particular point of view of what useful information might be. Specifically, the degree of usefulness of a piece of information is a multifaceted notion which takes into account whether the information is of potential interest to the user with respect to his request, matches the user's security clearance level, can be accessed in time and understood by the user, and can be trusted by the user at a given level.
Claire Saurel, Olivier Poitou, Laurence Cholvy

Chapter 8. Fact, Conjecture, Hearsay and Lies: Issues of Uncertainty in Natural Language Communications

Humans are very important sources of information for intelligence purposes. They are multi-modal: they see, hear, smell, and feel. However, the information which they relay is not simply that which they personally experience. They may pass on hearsay, they form opinions, they analyze and interpret what they hear or see or feel. Sometimes they pass on ambiguous, vague, misleading or even false information, whether intentionally or not. However, whether imprecise or vague, when humans communicate information, they often embed clues, in the form of lexical elements, in that which they pass on; these clues allow the receiver to interpret where the informational content originated and how strongly the speaker herself believes in its veracity. In this chapter, we look at the ways in which human communications are uncertain, both within the content and about the content. We illustrate a methodology which helps us to make an initial evaluation of the evidential quality of information based upon lexical clues.
Kellyn Rein

Chapter 9. Fake or Fact? Theoretical and Practical Aspects of Fake News

The phenomenon of fake news is nothing new. It has been around as long as people have had a vested interest in manipulating opinions and images, dating back to historical times for which written accounts exist and probably well beyond. Referring to the present as post-truth seems futile, as there has probably never been an era of truth when it comes to news. More recently, however, the technical means and the widespread use of social media have propelled the phenomenon onto a new level altogether. Individuals, organizations, and state actors actively engage in propaganda and the use of fake news to create insecurity, confusion, and doubt and to promote their own agenda – frequently of a financial or political nature. We discuss the history of fake news and some reasons as to why people are bound to fall for it. We address signs of fake news and ways to detect it, or at least to become more aware of it, and discuss the subject of truthfulness of messages and the perceived information quality of platforms. Some examples from the recent past demonstrate how fake news has played a role in a variety of scenarios. We conclude with remarks on how to tackle the phenomenon – eradicating it will not be possible in the near term, but employing a few sound strategies might mitigate some of the harmful effects.
George Bara, Gerhard Backfried, Dorothea Thomas-Aniola

Chapter 10. Information Quality and Social Networks

Decision-making requires accurate situation awareness by the decision-maker, be it a human or a computer. The goal of high-level fusion is to help achieve this by building situation representations. These situation representations often take the form of graphs or networks, i.e., they consist of nodes, edges, and attributes attached to the nodes or edges. In addition to these situation representation networks, fusion systems may also contain computational networks, which represent the computations being performed by the fusion system. Yet another relation between networks and fusion is that today much information comes from sources that are inherently organised as a network. The first example of this that comes to mind is the use of information from social media in fusion processes. Social media are also networks, where the links are formed by follow/reading/friend relations. There can also be implicit links between information sources. It is vital for the fusion process and the ensuing decision-making to ensure that we have accurate estimates of the quality of various kinds of information. The quality of an information element has several components, for instance, the degree to which we trust the source and the accuracy of the information. Note that the source could be a high-level processing system itself: a fusion node that processes information from, e.g., sensors and outputs a result. In this case, the quality determination must also take account of the way the fusion node processed the data. In this chapter, we describe how social network analysis can help with these problems. First, a brief introduction to social network analysis is given. We then discuss the problem of quality assessment and how social network analysis measures could be used to provide quantitative estimates of the reliability of a source, based on its earlier behaviour as well as that of other sources.
Pontus Svenson

Chapter 11. Quality, Context, and Information Fusion

Context has received significant attention in recent years within the information fusion community as it can bring several advantages to fusion processes by allowing for refining estimates, explaining observations, constraining computations, and thereby improving the quality of inferences. At the same time, context utilization involves concerns about the quality of the contextual information and its relationship with the quality of information obtained from observations and estimations that may be of low fidelity, contradictory, or redundant. Knowledge of the quality of this information and its effect on the quality of context characterization can improve contextual knowledge. At the same time, knowledge about a current context can improve the quality of observations and fusion results. This chapter discusses the issues associated with context exploitation in information fusion, understanding and evaluating information quality in context and formal context representation, as well as the interrelationships among context, context quality, and quality of information. The chapter also presents examples of utilization of context and information quality in fusion applications.
Galina L. Rogova, Lauro Snidaro

Chapter 12. Analyzing Uncertain Tabular Data

It is common practice to spend considerable time refining source data to address issues of data quality before beginning any data analysis. For example, an analyst might impute missing values or detect and fuse duplicate records representing the same real-world entity. However, there are many situations where there are multiple possible candidate resolutions for a data quality issue, but there is not sufficient evidence for determining which of the resolutions is the most appropriate. In this case, the only way forward is to make assumptions to restrict the space of solutions and/or to heuristically choose a resolution based on characteristics that are deemed predictive of “good” resolutions. Although it is important for the analyst to understand the impact of these assumptions and heuristic choices on her results, evaluating this impact can be highly nontrivial and time-consuming. For several decades now, the fields of probabilistic, incomplete, and fuzzy databases have developed strategies for analyzing the impact of uncertainty on the outcome of analyses. This general family of uncertainty-aware databases aims to model ambiguity in the results of analyses expressed in standard languages like SQL, SPARQL, R, or Spark. An uncertainty-aware database uses descriptions of potential errors and ambiguities in source data to derive a corresponding description of potential errors or ambiguities in the result of an analysis accessing this source data. Depending on the technique, these descriptions of uncertainty may be either quantitative (bounds, probabilities) or qualitative (certain outcomes, unknown values, explanations of uncertainty). In this chapter, we explore the types of problems that techniques from uncertainty-aware databases address, survey solutions to these problems, and highlight their application to fixing data quality issues.
Oliver Kennedy, Boris Glavic
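The possible-worlds idea behind incomplete and uncertainty-aware databases can be sketched directly: enumerate every resolution of the ambiguous values and evaluate the query in each resulting world, reporting the spread of answers. The toy table and query below are hypothetical, not from the chapter:

```python
from itertools import product

# Hypothetical incomplete table: each person has a set of candidate ages
# (e.g. two conflicting source records that could not be resolved).
candidates = {"alice": {34, 43}, "bob": {29}}

# One possible world per choice of a candidate value for every person.
worlds = [dict(zip(candidates, combo))
          for combo in product(*candidates.values())]

# Query: average age. Evaluate it in every possible world.
answers = {sum(w.values()) / len(w) for w in worlds}

# Qualitative/quantitative description of the result's uncertainty:
print(sorted(answers))           # all achievable answers
print(min(answers), max(answers))  # bounds on the query result
```

Real systems avoid materialising the (exponentially many) worlds and instead propagate compact uncertainty descriptions through the query, but the semantics they approximate is the one enumerated here.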

Chapter 13. Evaluation of Information in the Context of Decision-Making

In a broad sense, solving a problem can be treated as deciding what the solution to that problem is. In particular, decision-making with respect to a question means finding an answer to that question. Thus, the solution of any problem can be treated as decision-making. Traditionally, however, decision-making is understood as making a choice from a set of alternatives, which are usually alternatives of actions. Here we consider decision-making in this traditional form, exploring the role and features of information in the process. In section “The Process of Decision-Making”, we consider existing models and elaborate a more detailed model of decision-making. In section “Properties of Information and Their Evaluation”, we demonstrate that each stage and each step of decision-making involve work with information: information search, acquisition, processing, evaluation, and application. Evaluation of information is especially important for decision-making because utilization of false or incorrect information can result in wrong and even disastrous decisions. We show how to evaluate the quality of information in the context of decision-making, what properties are important for information quality, and what measures can be useful for information evaluation. The obtained results are aimed at improving the quality of information in decision-making by people and at the development of better computer decision support systems and expert systems.
Mark Burgin

Chapter 14. Evaluating and Improving Data Fusion Accuracy

Information fusion is the process of combining different sources of information for use in a particular application. The production of almost every information product incorporates some level of data fusion. Poor implementation of data and information fusion will have an impact on many other key data processes, most particularly data quality management, data governance, and data analytics. In this chapter we focus on a particular type of data fusion process called entity-based data fusion (EBDF) and on the application of EBDF in high-risk applications where accuracy of the fusion must be very high. One of the foremost examples is in healthcare. Fusing information belonging to different patients or failing to bring together all of the information for the same patient can both have dire, even life-threatening, implications.
John R. Talburt, Daniel Pullen, Melody Penning

Aspects of Information Quality in Various Domains of Application


Chapter 15. Decision-Aid Methods Based on Belief Function Theory with Application to Torrent Protection

In mountainous areas, decision-makers must find the best solutions to protect elements at risk from torrential hazards. The decision process involves several criteria and is based on imperfect information. Classical Multi-Criteria Decision-Aiding methods (MCDAs) are restricted to precise criteria evaluations for decision-making in a risky environment and suffer from rank reversal problems. To bridge these gaps, several MCDAs have recently been developed within the belief function theory framework. The aims of this chapter are to introduce the general principles of these methods and to show how they can be applied in practice. To demonstrate their applicability to a real-life problem, we apply them to a Decision-Making Problem (DMP) comprising the comparison of several protective alternatives against torrential floods and the selection of the most efficient one. We finally discuss method improvements that would promote their practical implementation.
Simon Carladous, Jean-Marc Tacnet, Jean Dezert, Mireille Batton-Hubert

Chapter 16. An Epistemological Model for a Data Analysis Process in Support of Verification and Validation

The verification and validation (V&V) of the data analysis process is critical for establishing the objective correctness of an analytic workflow. Yet, problems, mechanisms, and shortfalls for verifying and validating data analysis processes have not been investigated, understood, or well defined by the data analysis community. The processes of verification and validation evaluate the correctness of a logical mechanism, either computational or cognitive. Verification establishes whether the object of the evaluation performs as it was designed to perform. (“Does it do the thing right?”) Validation establishes whether the object of the evaluation performs accurately with respect to the real world. (“Does it do the right thing?”) Computational mechanisms producing numerical or statistical results are used by human analysts to gain an understanding about the real world from which the data came. The results of the computational mechanisms motivate cognitive associations that further drive the data analysis process. The combination of computational and cognitive analytical methods into a workflow defines the data analysis process. People do not typically consider the V&V of the data analysis process. The V&V of the cognitive assumptions, reasons, and/or mechanisms that connect analytical elements must also be considered and evaluated for correctness. Data Analysis Process Verification and Validation (DAP-V&V) defines a framework and processes that may be applied to identify, structure, and associate logical elements. DAP-V&V is a way of establishing correctness of individual steps along an analytical workflow and ensuring integrity of conceptual associations that are composed into an aggregate analysis.
Alicia Ruvinsky, LaKenya Walker, Warith Abdullah, Maria Seale, William G. Bond, Leslie Leonard, Janet Wedgwood, Michael Krein, Timothy Siedlecki

Chapter 17. Data and Information Quality in Remote Sensing

Remote sensing datasets are characterized by multiple types of imperfections that alter extracted information and the decisions made to a variable degree, depending on data acquisition conditions, processing, and final product requirements. Therefore, regardless of the sensors, type of data, extracted information, and complementary algorithms, the quality assessment question is a pervading and particularly complex one. This chapter summarizes relevant quality assessment approaches that have been proposed for the data acquisition, information extraction, and data and information fusion stages of the remote sensing acquisition-decision process. The case of quality evaluation for geographic information systems, which make use of remote sensing products, is also described. Aspects of a comprehensive quality model for remote sensing, and problems that remain to be addressed, offer a perspective on possible evolutions in the field.
John Puentes, Laurent Lecornu, Basel Solaiman

Chapter 18. Reliability-Aware and Robust Multi-sensor Fusion Toward Ego-Lane Estimation Using Artificial Neural Networks

In the field of road estimation, incorporating multiple sensors is essential to achieve robust performance. However, the reliability of each sensor changes with environmental conditions. Thus, we propose a reliability-aware fusion concept which takes the sensor reliabilities into account. The reliabilities are estimated explicitly or implicitly by classification algorithms, which are trained on information extracted from the sensors and on their past performance compared to ground-truth data. During fusion, these estimated reliabilities are then exploited to limit the impact of unreliable sensors. In order to prove our concept, we apply our fusion approach to a redundant sensor setup for intelligent vehicles containing three camera systems, several lidars, and radar sensors. Since artificial neural networks (ANNs) have produced great results for many applications, we explore two ways of incorporating them into our fusion concept. On the one hand, we use ANNs as classifiers to explicitly estimate the sensors’ reliabilities. On the other hand, we utilize ANNs to directly predict the ego-lane from sensor information, where the reliabilities are learned implicitly. In an evaluation on real-world recordings, the direct ANN approach leads to satisfactory road estimation.
Tran Tuan Nguyen, Jan-Ole Perschewski, Fabian Engel, Jonas Kruesemann, Jonas Sitzmann, Jens Spehr, Sebastian Zug, Rudolf Kruse

Chapter 19. Analytics and Quality in Medical Encoding Systems

Medical practice support aims to provide important complementary information for diagnosis by preprocessing voluminous data available in the separate, distributed, and commonly noninteroperable applications of complex existing medical information systems. Such technology is being investigated to support medical encoding, the manual identification of groups of patients with equivalent diagnoses to determine healthcare expenses, billing, and reimbursement. Medical encoding is expensive, takes considerable time, and depends on multiple scattered and heterogeneous data sources. This chapter summarizes relevant approaches and findings that illustrate how considerations of information quality and analytics technologies may improve medical practice. Essential components of a conceived medical encoding support system are described, followed by the associated data analysis, information fusion, and information quality measurement. Results show that it is possible to process, generate, and qualify pertinent medical encoding information in this manner, meeting physicians’ requirements and making use of data available in existing systems and clinical workflows.
John Puentes, Laurent Lecornu, Clara Le Guillou, Jean-Michel Cauvin

Chapter 20. Information Quality: The Nexus of Actionable Intelligence

Information quality (IQ) plays a critical role in the ability of a high-level information fusion (HLIF) system of systems (SoS) to achieve actionable intelligence (AI). Whereas the need for information quality management in traditional information systems has been understood for some time, and its issues are fairly well mitigated, the challenges pertaining to the relatively new field of high-level information fusion remain significant. Principal and unique among these challenges are the multitude of issues which arise from the inherent complexity of a high-level information fusion system of systems and which permeate the various interdependent phases of its life cycle. Actively managing information quality in HLIF is essential to ensuring that quality issues do not adversely impact decision-making and the ability to determine the best course of action (COA). Accordingly, in an effort to advance this critical facet of high-level information fusion, this chapter proposes an end-to-end framework that enables (a) the development of an information quality meta-model (IQMM), (b) the characterization of information quality elements, (c) the assessment of impacts of information quality elements and their corresponding mitigation, and (d) the integration of these objectives within the HLIF processes and life cycle.
Marco Antonio Solano

Chapter 21. Ranking Algorithms: Application for Patent Citation Network

How do technologies evolve in time? One way of answering this is by studying the US patent citation network. We begin this exploration by looking at macroscopic temporal behavior of classes of patents. Next, we quantify the influence of a patent by examining two major methods of ranking of nodes in networks: the celebrated “PageRank” and one of its extensions, reinforcement learning. A short history and a detailed explanation of the algorithms are given. We also discuss the influence of the damping factor when using PageRank on the patent network specifically in the context of rank reversal. These algorithms can be used to give more insight into the dynamics of the patent citation network. Finally, we provide a case study which combines the use of clustering algorithms with ranking algorithms to show the emergence of the opioid crisis. There is a great deal of data contained within the patent citation network. Our work enhances the usefulness of this data, which represents one of the important information quality characteristics. We do this by focusing on the structure and dynamics of the patent network, which allows us to determine the importance of individual patents without using any information about the patent except the citations to and from the patent.
Hayley Beltz, Timothy Rutledge, Raoul R. Wadhwa, Péter Bruck, Jan Tobochnik, Anikó Fülöp, György Fenyvesi, Péter Érdi
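The PageRank computation discussed above can be sketched with plain power iteration, including the damping factor whose influence the chapter examines. The three-node graph below is illustrative, not patent data:

```python
import numpy as np

def pagerank(adj, d=0.85, tol=1e-10):
    """Power-iteration PageRank with damping factor d.
    adj[i] lists the nodes that node i links to (here: patents it cites);
    dangling nodes spread their rank uniformly over all nodes."""
    n = len(adj)
    r = np.full(n, 1.0 / n)
    while True:
        new = np.full(n, (1.0 - d) / n)  # teleportation term
        for i, outs in enumerate(adj):
            if outs:
                for j in outs:
                    new[j] += d * r[i] / len(outs)
            else:
                new += d * r[i] / n      # dangling node
        if np.abs(new - r).sum() < tol:
            return new
        r = new

# Tiny hypothetical citation graph: nodes 0 and 1 both cite node 2,
# which cites nothing (dangling). Node 2 should accumulate the most rank.
ranks = pagerank([[2], [2], []])
print(ranks)
```

Re-running with different values of `d` shows the sensitivity to the damping factor that underlies the chapter's rank-reversal discussion: as `d` shrinks, the ranking flattens toward uniform.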

Chapter 22. Conflict Measures and Importance Weighting for Information Fusion Applied to Industry 4.0

Information sources such as sensors, databases, and human experts serve as inputs for condition monitoring and predictive maintenance in Industry 4.0 scenarios. Complex technical systems create large amounts of data which cannot be analysed manually; thus, information fusion mechanisms gain increasing importance. Besides the management of large amounts of data, further challenges for the fusion algorithms arise from epistemic uncertainties (incomplete knowledge) and — mostly overlooked — conflicts in the input signals. This contribution describes the multilayered information fusion system MACRO (multilayer attribute-based conflict-reducing observation) employing the BalTLCS (balanced two-layer conflict solving) fusion algorithm, which reduces the impact of conflicts on the fusion result by means of a quality measure denoted importance. Furthermore, we show that numerical stability under heavy conflict is a key factor in real-world applications. Several examples conclude this contribution.
Uwe Mönks, Volker Lohweg, Helene Dörksen

Chapter 23. Quantify: An Information Fusion Model Based on Syntactic and Semantic Analysis and Quality Assessments to Enhance Situation Awareness

Situation awareness is a concept especially important in the area of criminal data analysis and refers to the level of consciousness that an individual or team has about a situation, in this case a criminal event. Being unaware of crime situations can cause decision-makers to fail, affecting resource allocation for crime mitigation and jeopardizing human safety and property. Data and information fusion present opportunities to enrich the knowledge about crime situations by integrating heterogeneous and synergistic data from different sources. However, the problem is complicated by poor quality of information, especially when humans are the main sources of data. Motivated by the challenges in analyzing complex crime data and by the limitations of the state of the art in critical situation assessment approaches, this chapter presents Quantify, a new information fusion model. Its main contribution is the use of information quality management throughout syntactic and semantic fusion routines to parameterize and guide the work of humans and systems. To validate the new features of the model, a case study with real crime data was conducted. Crime reports were submitted to the modules of the model and had situations depicted and represented by an Emergency Situation Assessment System. Results highlighted the limitations of using only lexical and syntactical variations to support data and information fusion, and the demand for, and benefits provided by, quality and semantic means to assess crime situations.
Leonardo Castro Botega, Allan Cesar Moreira de Oliveira, Valdir Amancio Pereira Junior, Jordan Ferreira Saran, Lucas Zanco Ladeira, Gustavo Marttos Cáceres Pereira, Seiji Isotani

Chapter 24. Adaptive Fusion

This chapter describes a methodology for adaptively incorporating the reliability of information provided by multiple sensors to improve the quality of the data fusion result. The adaptivity is achieved by dynamically utilizing auxiliary information, comprising the measure of performance of each sensor or contextual information when it is available. A comparative study of the results obtained by using either source of auxiliary information for adaptivity is presented in this chapter.
Vincent Nimier, Kaouthar Benameur
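In its simplest form, reliability-weighted fusion of the kind described above reduces to a normalised weighted combination of the sensor estimates, with weights derived from the auxiliary reliability information. A minimal sketch; the sensor readings and reliability values are invented for illustration:

```python
import numpy as np

def fuse(estimates, reliabilities):
    """Reliability-weighted fusion: normalise the reliabilities into
    convex weights, then take the weighted combination of the estimates."""
    w = np.asarray(reliabilities, dtype=float)
    w = w / w.sum()
    return float(w @ np.asarray(estimates, dtype=float))

# Two range sensors; the second is currently degraded (e.g. bad weather),
# so its reliability, estimated from past performance or context, is low.
print(fuse([10.2, 14.0], [0.9, 0.1]))  # pulled toward the reliable sensor
```

Adaptivity enters through the reliability vector: as a sensor's measured performance or the operating context changes, its weight — and hence its influence on the fused estimate — changes with it.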

