
About this book

This book presents the features and advantages offered by complex networks in the machine learning domain. The first part gives an overview of complex networks and network-based machine learning, providing the necessary background material. The second part describes in detail specific techniques based on complex networks for supervised, unsupervised, and semi-supervised learning. In particular, a stochastic particle competition technique for both unsupervised and semi-supervised learning, built on a stochastic nonlinear dynamical system, is described in detail. A theoretical analysis is also supplied, which enables one to predict the behavior of the proposed technique. In addition, data reliability issues are explored in semi-supervised learning; this matter is of practical importance and is not often addressed in the literature. To validate these techniques on real problems, simulations on broadly accepted databases are conducted. The book also presents a hybrid supervised classification technique that combines low and high orders of learning. The low-level term can be implemented by any classification technique, while the high-level term is realized by extracting features of the underlying network constructed from the input data. Thus, the former classifies test instances by their physical features, while the latter measures the compliance of the test instances with the pattern formation of the data. We show that the high-level technique can perform classification according to the semantic meaning of the data. This book combines two widely studied research areas, machine learning and complex networks, and should therefore be of broad interest to the scientific community, mainly in computer science and engineering.

Table of Contents

Frontmatter

Chapter 1. Introduction

When we say “learning,” one of the words that comes to mind may be “mystery”; when we talk about “large-scale networks,” we may associate them with the word “complexity.” What happens when we put these two concepts together? In this chapter, we present an overview of complex network-based machine learning. Throughout the book, we show the diversity of approaches for treating this subject.
Thiago Christiano Silva, Liang Zhao

Chapter 2. Complex Networks

Complex networks constitute an emerging interdisciplinary research area that attracts much attention from physicists, mathematicians, biologists, engineers, computer scientists, and many others. Complex network structures describe a wide variety of systems of high technological and intellectual importance, such as the Internet, the World Wide Web, coupled biological and chemical systems, and financial, social, neural, and communication networks. The desire to understand such interwoven systems, together with their inherent complexity, explains the increasing interest in enhancing complex network tools. Representing data as complex networks permits us to unify structural complexity with the diversity of vertices and connections. Several relevant questions arise when investigating dynamics in complex networks, such as how large ensembles of dynamical systems that interact through a complex wiring topology can behave collectively. The network topology plays an important role here because it affects the functions of the represented system. For example, the structure of social networks affects the speed at which information and diseases propagate, the topology of a financial network may amplify shocks in different ways, and the arrangement of power grids may affect the robustness and stability of power transmission. Because of the rapid evolution of the field and the large number of theories and techniques developed, a comprehensive review of this topic is prohibitive. In this chapter, we present the basic concepts and ideas of complex networks that are useful in machine learning. We start by presenting the main concepts of networks. Since complex networks and graphs share the same definition, we first present the basic notation of graph theory. Afterwards, we explore the evolution and milestones of complex network research. Following that, a comprehensive list of network measurements is discussed, which enables us to capture structural features of networks in a systematic manner. Finally, we present some well-known dynamical processes that are defined within the complex networks framework.
Thiago Christiano Silva, Liang Zhao
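
As a small illustration of what a network measurement captures, the sketch below computes two standard quantities, vertex degree and the local clustering coefficient, on a toy undirected graph. The adjacency-dictionary representation, the function names, and the toy graph are assumptions made for this example and are not taken from the book.

```python
from itertools import combinations


def degree(adj, v):
    """Number of neighbors of vertex v in an undirected graph."""
    return len(adj[v])


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        # With fewer than two neighbors no triangle is possible.
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy graph: a triangle 0-1-2 plus a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

for v in sorted(adj):
    print(v, degree(adj, v), round(local_clustering(adj, v), 3))
```

Vertex 2 has the highest degree but the lowest nonzero clustering, because only one of its three neighbor pairs is connected.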

Chapter 3. Machine Learning

Machine learning concerns the study, design, and development of algorithms that give computers the capability to learn without being explicitly programmed. Machine learning techniques are fairly generic and can be applied in various settings. To use such algorithms, one has to translate the problem to the domain of machine learning, which usually expects a set of features and a desired output or grouping criterion. In this chapter, we introduce the three machine learning paradigms most often employed in the literature: supervised, unsupervised, and semi-supervised. We show that supervised algorithms exclusively utilize external information to induce or train their hypotheses. In contrast, unsupervised learning methods are guided exclusively by the intrinsic structure of the data items throughout the learning process, i.e., without any sort of external knowledge. In between these two paradigms lies semi-supervised learning, which employs both labeled and unlabeled data in the learning process. Here, we focus on the shortcomings and strengths of traditional and representative techniques that are well known in the machine learning community. We do not go into the technical details of traditional machine learning techniques in this chapter, because these are not the focus of this book.
Thiago Christiano Silva, Liang Zhao

Chapter 4. Network Construction Techniques

In many areas of machine learning, networks are used to model local relationships between data points and to build global structures from local information. Building networks is often a necessary step when dealing with problems arising from applications in machine learning or data mining. This step becomes crucial when we want to apply network-based learning methods to vector-based data sets, for which a network must be constructed from the input data using some convenient network formation criterion. In this chapter, we review the main ingredients needed to construct a graph from non-networked data. In particular, we discuss the transformation of vector-based and time series data. Several similarity functions are also discussed.
Thiago Christiano Silva, Liang Zhao
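
To make the idea of constructing a network from vector-based data concrete, here is a minimal sketch of one common formation criterion, a symmetrized k-nearest-neighbor graph under Euclidean distance. The choice of k-NN, the function name `knn_graph`, and the sample points are assumptions for illustration; the chapter itself may cover different or additional criteria.

```python
import math


def knn_graph(points, k):
    """Connect every point to its k nearest neighbors (Euclidean distance).

    Edges are symmetrized, so the result is an undirected graph stored as
    a dictionary mapping each vertex index to a set of neighbor indices.
    """
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Sort all other points by distance to point i.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrize the edge
    return adj


# Hypothetical data: two well-separated groups of three points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
graph = knn_graph(points, k=2)
print(graph)
```

With k=2 the two groups end up as disconnected components, which shows how a local connection rule can expose global structure in the data.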

Chapter 5. Network-Based Supervised Learning

In this chapter, we focus on supervised learning algorithms that act on networked environments. These methods utilize external information in the form of labels to induce or train their models. Generally, the learning process is composed of two sequential steps, called the training and classification phases. In the former, the algorithm attempts to learn from the data with some external aid, such as that of a human expert; in the latter, the algorithm is tested against unseen data to verify its generalization power. In network-based methods, both phases take place in a network, either by navigating through it or by updating its structure according to new information provided by the human expert. In the test phase, the network structure normally remains static as new data items are classified. However, some algorithms attempt to update the learned network structure in a process known as self-learning. In this chapter, we present some of the shortcomings and advantages of using the network-based approach to conduct supervised learning. Representative network-based methods are discussed.
Thiago Christiano Silva, Liang Zhao

Chapter 6. Network-Based Unsupervised Learning

In this chapter, we present representative state-of-the-art unsupervised learning techniques that rely on networked environments to conduct the learning process. In a typical unsupervised task, no external knowledge is presented to the algorithm. As such, the learning process is guided by the provided data alone, since no prior knowledge about the existing groups is supplied. For network-based methods, the learning procedure is performed by navigating in networks that are constructed from the input data set according to some similarity criterion. Because networks naturally embody topological information about data relationships, network-based methods have an advantage over algorithms that use raw, vector-based data. Moreover, network-based methods can be conceived as a general solution for unsupervised learning tasks, even for data sets that are not represented by networks. In this case, we can apply network formation techniques to generate a network from the input data. Once the network is constructed, all of the network-based techniques described in this chapter can be employed.
Thiago Christiano Silva, Liang Zhao

Chapter 7. Network-Based Semi-Supervised Learning

In this chapter, we present network-based algorithms that run in the semi-supervised learning scheme. The semi-supervised learning paradigm lies between the unsupervised learning paradigm, which does not employ any external information to infer knowledge, and the supervised learning paradigm, which in contrast makes use of a fully labeled set to train models. Semi-supervised learning aims, among other things, to reduce the work of human experts in the labeling process. This is especially attractive when the labeling process is expensive and time-consuming, as in video indexing, classification of audio signals, text categorization, medical diagnostics, and genome data, among many other applications. In network-based methods, the graph structure is the main driver in propagating labels from labeled vertices to unlabeled vertices. We show that different techniques apply different criteria in their label diffusion processes and, as a result, generate distinct outcomes. In addition, we discuss some of the shortcomings and benefits of the within-graph semi-supervised learning process, also called transductive learning.
Thiago Christiano Silva, Liang Zhao
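
The way a graph structure can drive labels from labeled to unlabeled vertices can be sketched with a generic iterative propagation scheme: labeled vertices are clamped, and each unlabeled vertex repeatedly takes the average of its neighbors' class scores. This is only one of many possible diffusion criteria and is not claimed to be any specific method from the chapter; the function name, parameters, and toy graph are assumptions.

```python
def propagate_labels(adj, seeds, n_classes, n_iter=100):
    """Spread class scores from seed vertices over an undirected graph.

    adj:   dict mapping vertex -> set of neighbor vertices
    seeds: dict mapping labeled vertex -> class index (kept fixed)
    """
    # Unlabeled vertices start with a uniform score over the classes.
    scores = {v: [1.0 / n_classes] * n_classes for v in adj}
    for v, c in seeds.items():
        scores[v] = [1.0 if i == c else 0.0 for i in range(n_classes)]

    for _ in range(n_iter):
        new_scores = {}
        for v in adj:
            if v in seeds or not adj[v]:
                # Labeled or isolated vertices keep their current scores.
                new_scores[v] = scores[v]
            else:
                # Average the neighbors' scores for every class.
                new_scores[v] = [
                    sum(scores[u][c] for u in adj[v]) / len(adj[v])
                    for c in range(n_classes)
                ]
        scores = new_scores

    # Each vertex receives the class with the highest score.
    return {v: max(range(n_classes), key=lambda c: scores[v][c]) for v in adj}


# Hypothetical path graph 0-1-2-3-4-5 with one labeled vertex at each end.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
labels = propagate_labels(adj, seeds={0: 0, 5: 1}, n_classes=2)
print(labels)
```

On this path, each unlabeled vertex ends up with the label of the nearer seed, which is the behavior one would expect from a purely topology-driven diffusion.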

Chapter 8. Case Study of Network-Based Supervised Learning: High-Level Data Classification

The power of computers to generalize to unseen data is intriguing. Computers have been used successfully to predict prices of non-catalogued houses and trends in financial time series, and even to classify whether tumors are benign or malignant. One thing that all these tasks have in common is that computers are asked to output answers that they have not been explicitly programmed to produce. A natural computational solution for estimating unseen data is to rely on the knowledge bases to which computers have been exposed, effectively mimicking past behaviors. This chapter deals with supervised learning from a new perspective: a hybrid classification framework is presented that combines the decisions of low- and high-level classifiers. The low-level classifier performs the classification task considering physical features of the input data, such as geometrical or statistical characteristics. In contrast, the high-level classification process checks the compliance of new test instances with the characteristic patterns formed by each of the classes that compose the training data. Test instances are declared members of those classes whose formed patterns are maintained after the introduction of those test instances. To this end, the high-level classifier extracts suitable organizational and topological descriptors of the network constructed from the input data. Using these network-based descriptors in a convenient collective manner, the high-level term is expected to promote the detection of data patterns with semantic and global meanings. The way we extract the patterns using these descriptors gives rise to several strategies for building the high-level framework. In this book, we show two pattern extraction strategies: using classical network measurements and employing dynamic information generated by several tourist walk processes. The ability to discover high-level features formed by the data relationships is investigated using several artificial and real-world data sets. Here, we focus on situations in which the high-level term is able to identify intrinsic data patterns but the low-level term alone fails to do so. This provides a clear motivation for employing a dual classification procedure (low + high). The obtained results reveal that the hybrid classification technique is able to improve the already optimized performance of traditional classification techniques. Finally, the hybrid classification approach is applied to recognize images of handwritten digits.
Thiago Christiano Silva, Liang Zhao

Chapter 9. Case Study of Network-Based Unsupervised Learning: Stochastic Competitive Learning in Networks

Many business and day-to-day problems must be dealt with under several constraints, such as the prohibition of external human intervention. This may be due to high operational costs or to physical or economic impossibilities that are inherent in the process. Unsupervised learning, one of the existing machine learning paradigms, can be employed to address these issues and is the main topic of this chapter. For instance, possible unsupervised tasks include discovering communities in social networks and finding groups of proteins with the same biological functions, among many others. In this chapter, unsupervised learning is investigated with a focus on methods relying on complex network theory. In particular, a type of competitive learning mechanism based on a stochastic nonlinear dynamical system is discussed. This model possesses interesting properties, runs in roughly linear time for sparse networks, and performs well on artificial and real-world networks. In the initial setup, a set of particles is released into vertices of a network at random. As time progresses, they move across the network in accordance with a convex stochastic combination of random and preferential walks, which are related to the offensive and defensive behaviors of the particles, respectively. The competitive walking process reaches a dynamic equilibrium when each community or data cluster is dominated by a single particle. Straightforward applications are community detection and data clustering. In essence, data clustering can be considered a community detection problem once a network is constructed from the original data set. In this case, each vertex corresponds to a data item and pairwise connections are established using a suitable network formation process.
Thiago Christiano Silva, Liang Zhao
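
The movement rule described above, a convex combination of a random walk and a preferential walk, can be sketched as follows. This is a heavily simplified illustration and not the model analyzed in the chapter: the visit-count definition of domination, the mixing parameter name `lam`, the fixed number of steps, and the omission of any further particle dynamics are all assumptions made here.

```python
import random


def transition_probs(adj, visits, particle, v, lam):
    """Mix a uniform random walk with a preferential walk for one particle.

    The preferential term favors neighbors where this particle already holds
    a large share of the visits (an assumed proxy for domination).
    Every vertex is assumed to have at least one neighbor.
    """
    neighbors = sorted(adj[v])
    p_rand = 1.0 / len(neighbors)
    shares = [visits[u][particle] / sum(visits[u]) for u in neighbors]
    total = sum(shares)
    # Convex combination: lam weights the preferential (defensive) walk,
    # 1 - lam weights the random (exploratory) walk.
    probs = [lam * s / total + (1.0 - lam) * p_rand for s in shares]
    return neighbors, probs


def particle_competition(adj, n_particles, lam=0.6, n_steps=2000, seed=0):
    """Let particles walk and return, per vertex, the most frequent visitor."""
    rng = random.Random(seed)
    vertices = sorted(adj)
    # Visit counts start at 1 so that every share is well defined.
    visits = {v: [1] * n_particles for v in vertices}
    positions = [rng.choice(vertices) for _ in range(n_particles)]
    for _ in range(n_steps):
        for k in range(n_particles):
            neighbors, probs = transition_probs(adj, visits, k, positions[k], lam)
            positions[k] = rng.choices(neighbors, weights=probs)[0]
            visits[positions[k]][k] += 1
    return {v: max(range(n_particles), key=lambda k: visits[v][k]) for v in vertices}


# Hypothetical network: two 4-cliques (0-3 and 4-7) joined by the edge 3-4.
adj = {v: set() for v in range(8)}
for group in (range(0, 4), range(4, 8)):
    for a in group:
        for b in group:
            if a != b:
                adj[a].add(b)
adj[3].add(4)
adj[4].add(3)

owners = particle_competition(adj, n_particles=2)
print(owners)
```

Setting `lam` to 0 reduces the rule to a pure random walk, while values close to 1 make particles stay almost exclusively in territory they already dominate; the outcome for a given network depends on the random seed and on the simplifications noted above.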

Chapter 10. Case Study of Network-Based Semi-Supervised Learning: Stochastic Competitive-Cooperative Learning in Networks

Information reaches us at a remarkable speed and the amount of data it brings is unprecedented. In many situations, only a small subset of data items can be effectively labeled, because the labeling process is often expensive, time-consuming, and requires intensive human involvement. As a result, partially labeled data sets are encountered more and more frequently. To better characterize partially labeled data sets, semi-supervised classifiers are designed to learn from both labeled and unlabeled data. Semi-supervised learning has become a topic of machine learning research that has received increasing attention in recent years. In this chapter, semi-supervised classification is explored with a focus on methods based on complex networks. In particular, the particle competition model introduced in the previous chapter is adapted to this learning paradigm. Specifically, this enhancement is achieved by introducing the idea of cooperation among the particles and by changing the inner mechanisms of the original algorithm so as to fit it into a semi-supervised environment. In contrast to the unsupervised learning model, where the particles are randomly spawned in the network because no prior knowledge of the groups is available, the semi-supervised version does have some external knowledge by definition. This knowledge is represented by the labeled data items, usually a small fraction of the entire data set. In this scenario, the objective is to propagate the labels from the labeled set to the unlabeled set. As in the previous chapter, a mathematical formalization of the model and a theoretical analysis are provided. A large portion of this analysis is based on the model studied in the previous chapter. A validation linking the numerical and theoretical results is also presented. Finally, an application in imperfect data learning is presented, where the particle competition model is employed to detect and prevent error propagation in the learning process due to noisy or wrongly labeled data.
Thiago Christiano Silva, Liang Zhao

Backmatter
