
2005 | Book


Soft Computing for Information Processing and Analysis

Edited by: Prof. Masoud Nikravesh, Prof. Lotfi A. Zadeh, Prof. Janusz Kacprzyk

Publisher: Springer Berlin Heidelberg

Book series: Studies in Fuzziness and Soft Computing

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

Search engines, with Google at the top, have become the most heavily used online service, with millions of searches performed every day and many remarkable capabilities. Soft Computing for Information Processing and Analysis includes reports from the front of soft computing in the internet industry and imparts knowledge and understanding of the significance of the field's accomplishments, new developments and future directions. This carefully edited book has evolved from presentations made by the participants of a meeting entitled "Fuzzy Logic and the Internet: Enhancing the Power of the Internet", organized by the Berkeley Initiative in Soft Computing (BISC), University of California, Berkeley. It addresses important topics for modern search engines, such as fuzzy query, decision analysis and support systems, and includes articles on Web Intelligence, World Knowledge and Fuzzy Logic (by Lotfi A. Zadeh) and on perception-based information processing.

Table of Contents

Frontmatter

Web Intelligence, World Knowledge and Fuzzy Logic

Abstract

Existing search engines—with Google at the top—have many remarkable capabilities; but what is not among them is deduction capability—the capability to synthesize an answer to a query from bodies of information which reside in various parts of the knowledge base. In recent years, impressive progress has been made in enhancing performance of search engines through the use of methods based on bivalent logic and bivalent-logic-based probability theory. But can such methods be used to add nontrivial deduction capability to search engines, that is, to upgrade search engines to question-answering systems? A view which is articulated in this note is that the answer is “No.” The problem is rooted in the nature of world knowledge, the kind of knowledge that humans acquire through experience and education.

It is widely recognized that world knowledge plays an essential role in assessment of relevance, summarization, search and deduction. But a basic issue which is not addressed is that much of world knowledge is perception-based, e.g., “it is hard to find parking in Paris,” “most professors are not rich,” and “it is unlikely to rain in midsummer in San Francisco.” The problem is that (a) perception-based information is intrinsically fuzzy; and (b) bivalent logic is intrinsically unsuited to deal with fuzziness and partial truth.

To come to grips with the fuzziness of world knowledge, new tools are needed. The principal new tool, which is briefly described in this note, is Precisiated Natural Language (PNL). PNL is based on fuzzy logic and has the capability to deal with partiality of certainty, partiality of possibility and partiality of truth. These are the capabilities that are needed to be able to draw on world knowledge for assessment of relevance, and for summarization, search and deduction.

Lotfi A. Zadeh

Towards More Powerful Information Technology via Computing with Words and Perceptions: Precisiated Natural Language, Protoforms and Linguistic Data Summaries

Summary

We show how Zadeh’s idea of computing with words and perceptions, based on his concept of a precisiated natural language (PNL), can lead to a new direction in the use of natural language in data mining, namely linguistic data(base) summaries. We emphasize the relevance of another of Zadeh’s ideas, that of a protoform, and show that various types of linguistic data summaries may be viewed as items in a hierarchy of protoforms of summaries. We briefly present an implementation for a sales database of a computer retailer as a convincing example that these tools and techniques are implementable and functional. These summaries involve both data from an internal database of the company and data downloaded from external databases via the Internet.

Janusz Kacprzyk, Sławomir Zadrożny
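
As a rough illustration of what a linguistic data summary such as "most sales are high" can mean numerically, the following sketch computes a truth degree by applying a fuzzy quantifier to the mean membership of the records, in the style of Zadeh's calculus of linguistically quantified propositions. The membership shapes, thresholds and sample values are assumptions made for this example and are not taken from the chapter.

```python
def mu_high(amount, low=500.0, high=1500.0):
    """Membership of a sale amount in the fuzzy set 'high' (assumed linear ramp)."""
    if amount <= low:
        return 0.0
    if amount >= high:
        return 1.0
    return (amount - low) / (high - low)


def mu_most(proportion):
    """Membership of a proportion in the fuzzy quantifier 'most' (assumed shape)."""
    if proportion <= 0.3:
        return 0.0
    if proportion >= 0.8:
        return 1.0
    return (proportion - 0.3) / 0.5


def truth_of_summary(values, summarizer, quantifier):
    """Truth degree of 'Q records are S': quantifier applied to the mean membership."""
    if not values:
        raise ValueError("need at least one record")
    proportion = sum(summarizer(v) for v in values) / len(values)
    return quantifier(proportion)


# Hypothetical sale amounts; memberships in 'high' are 1, 1, 0.5 and 0.
sales = [1600.0, 1500.0, 1000.0, 400.0]
truth = truth_of_summary(sales, mu_high, mu_most)
print(f"T('most sales are high') = {truth:.2f}")
```

Swapping in a different summarizer or quantifier changes the summary without changing the evaluation routine, which is roughly the sense in which such summaries share a common abstract form.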

Enhancing the Power of Search Engines and Navigations Based on Conceptual Model: Web Intelligence

Abstract

Retrieving relevant information is a crucial component of case-based reasoning systems for Internet applications such as search engines. The task is to use user-defined queries to retrieve useful information according to certain measures. Even though techniques exist for locating exact matches, finding relevant partial matches might be a problem. It may also not be easy to specify query requests precisely and completely, resulting in a situation known as fuzzy querying. This is usually not a problem for small domains, but for large repositories such as the World Wide Web, request specification becomes a bottleneck. Thus, a flexible retrieval algorithm is required, allowing for imprecise or fuzzy query specification or search. In this chapter, we will first present the role of fuzzy logic in the Internet. Then we will present an intelligent model that can mine the Internet to conceptually match and rank homepages based on predefined linguistic formulations and rules defined by experts or based on a set of known homepages. The Fuzzy Conceptual Matching (FCM) model will be used for intelligent information and knowledge retrieval through conceptual matching of both text and images (here defined as “Concept”). The FCM can also be used for constructing a fuzzy ontology or terms related to the context of the query and search to resolve ambiguity. This model can be used to calculate conceptually the degree of match to the object or query. We will also present the integration of our technology into commercial search engines such as Google™ and Yahoo! as a framework that can be used to integrate our model into any other commercial search engine, or to develop the next generation of search engines.

Masoud Nikravesh, Tomohiro Takagi, Masanori Tajima, Akiyoshi Shinmura, Ryosuke Ohgaya, Koji Taniguchi, Kazuyosi Kawahara, Kouta Fukano, Akiko Aizawa

Soft Computing for Perception-Based Decision Processing and Analysis: Web-Based BISC-DSS

Abstract

Searching database records and ranking the results based on multicriteria queries is central to many database applications used within organizations in finance, business, industry and other fields. For example, the process of ranking (scoring) has been used to make billions of financing decisions each year, serving an industry worth hundreds of billions of dollars. To a lesser extent, ranking has also been used to process hundreds of millions of applications by U.S. universities, resulting in over 15 million college admissions in the year 2000 for a total revenue of over $250 billion. College admissions are expected to reach over 17 million by the year 2010 for a total revenue of over $280 billion. In this paper, we will introduce fuzzy query and fuzzy aggregation as an alternative for ranking and predicting the risk for credit scoring and university admissions, which currently utilize an imprecise and subjective process. In addition, we will introduce the BISC Decision Support System. The key features of the BISC Decision Support System for Internet applications are 1) to use intelligently the vast amounts of important data in organizations in an optimum way as a decision support system and 2) to share intelligently and securely a company’s data internally and with business partners and customers, in a form that can be processed quickly by end users. The model consists of five major parts: the Fuzzy Search Engine (FSE), the Application Templates, the User Interface, the database and the Evolutionary Computing (EC).

Masoud Nikravesh, Souad Bensafi
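
To make the idea of ranking by fuzzy query and fuzzy aggregation more concrete, here is a minimal sketch in which each criterion returns a degree of satisfaction in [0, 1] and an aggregation operator combines the degrees into one score. The criteria, thresholds, weights and applicant records are hypothetical, and the two operators shown (weighted mean and minimum) are common textbook choices rather than the ones used in the BISC Decision Support System.

```python
def ramp_up(x, lo, hi):
    """Degree rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def ramp_down(x, lo, hi):
    """Degree falling linearly from 1 at lo to 0 at hi."""
    return 1.0 - ramp_up(x, lo, hi)


# (label, membership function, weight); all numbers are illustrative assumptions.
CRITERIA = [
    ("high income", lambda r: ramp_up(r["income"], 20_000, 80_000), 0.6),
    ("low debt ratio", lambda r: ramp_down(r["debt_ratio"], 0.1, 0.5), 0.4),
]


def weighted_mean(degrees, weights):
    """Compensatory aggregation: a strong criterion can offset a weak one."""
    return sum(d * w for d, w in zip(degrees, weights)) / sum(weights)


def strict_and(degrees, weights):
    """Non-compensatory aggregation (minimum t-norm); weights are ignored."""
    return min(degrees)


def rank(records, criteria, aggregate):
    """Return (score, name) pairs sorted from best to worst."""
    weights = [w for _, _, w in criteria]
    scored = []
    for record in records:
        degrees = [mu(record) for _, mu, _ in criteria]
        scored.append((aggregate(degrees, weights), record["name"]))
    return sorted(scored, reverse=True)


applicants = [
    {"name": "A", "income": 80_000, "debt_ratio": 0.5},
    {"name": "B", "income": 50_000, "debt_ratio": 0.3},
    {"name": "C", "income": 20_000, "debt_ratio": 0.1},
]

for score, name in rank(applicants, CRITERIA, weighted_mean):
    print(f"{name}: {score:.2f}")
```

With the weighted mean, applicant A ranks first because a high income compensates for a poor debt ratio; with `strict_and`, the balanced applicant B ranks first. The choice of aggregation operator is therefore part of the query itself.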

Evaluating Ontology Based Search Strategies

Abstract

We present a framework system for evaluating the effectiveness of various types of “ontologies” to improve information retrieval. We use the system to demonstrate the effectiveness of simple natural language-based ontologies in improving search results and have made provisions for using this framework to test more advanced ontological systems, with the eventual goal of implementing these systems to produce better search results, either in restricted search domains or in a more generalized domain such as the World Wide Web.

Chris Loer, Harman Singh, Allen Cheung, Sergio Guadarrama, Masoud Nikravesh

Soft Computing for Perception Based Information Processing

Abstract

Humans have a remarkable capability (perception) to perform a wide variety of physical and mental tasks without any measurements or computations. Familiar examples of such tasks are playing golf, assessing wine, recognizing distorted speech, and summarizing a story. The question is whether a special type of information retrieval strategy can be designed that builds in perception. Commercial Web search engines manage information only in a crisp way. Their query languages do not allow the expression of preferences or vagueness. Even though techniques exist for locating exact matches, finding relevant partial matches might be a problem. It may also not be easy to specify query requests precisely and completely, resulting in a situation known as fuzzy querying. This is usually not a problem for small domains, but for large repositories such as the World Wide Web, request specification becomes a bottleneck. Thus, a flexible retrieval algorithm is required, allowing for imprecise or fuzzy query specification or search. In addition, these search engines have the following problems: (1) large answer sets; (2) low precision; (3) inability to preserve the hypertext structures of matching hyperdocuments; (4) ineffectiveness for general-concept queries. The task is to use user-defined queries to retrieve useful information according to certain measures. In order to handle these problems, we propose the Perception Index (PI), which contains attributes associated with a focal keyword restricted by the fuzzy term(s) used in fuzzy queries on the Internet. If we integrate the Document Index (DI) used in commercial Web search engines with the proposed PI, we can handle both crisp terms (keyword-based) and fuzzy terms (perception-based). In this respect, the proposed approach is softer than the keyword-based approach. The PI brings querying somewhat closer to natural language. It is a further step toward a truly human-friendly, natural language-based interface for the Internet.
It should greatly help the user retrieve relevant information with relative ease. In other words, the PI helps users reflect their perceptions in the query process. Consequently, Internet users can narrow thousands of hits to the few that they really want. In this respect, the PI provides a new tool for targeting the queries that users really want, and an invaluable personalized search. In this chapter, we also present the search mechanism and the fuzzy query based on the integrated index (DI + PI). Moreover, we describe some features of the proposed method and suggest some considerations for implementing it. The main goal of the perception-based information processing and retrieval system is to design a model for the Internet based on user profiles, with the capability of exchanging and updating rules dynamically, of “do what I mean, not as I say”, and of programming with “human common sense capability”.

Masoud Nikravesh, Dae-Young Choi

Distributed Architecture for Modeling and Simulation of Autonomous Multi-agent Multi-Physics Systems

Abstract

The need for Modeling and Simulation (M&S) is seen in many diverse applications such as multi-agent systems, robotics, control systems, software engineering, complex adaptive systems, homeland security, and many others. In this paper we introduce an architecture for distributed simulation of multi-agent systems called Virtual Laboratory (V-Lab®), based on discrete event system specification (DEVS). V-Lab® is a test bed for many control algorithms and allows the user to demonstrate the working of several soft-computing methodologies like fuzzy logic, learning automata, neural networks, genetic algorithms, etc. applied to multi-agent systems. DEVS defines a framework for discrete event simulation and V-Lab® defines a framework for distributed simulation for multi-agent autonomous systems.

Prasanna Sridhar, Mo Jamshidi

Fuzzy Thesauri for and from the WWW

Abstract

We revisit some “old” strategies for the automatic construction of fuzzy relations between terms. Enriching them with new insights from the mathematical machinery behind fuzzy set theory, we are able to put them in the same general framework, thereby showing that they carry the same basic idea.

Martine De Cock, Sergio Guadarrama, Masoud Nikravesh

Consumer Profiling Using Fuzzy Query and Social Network Techniques

Abstract

Web communities possess the unprecedented ability to map out with ease the networks of communication linking their users. This social connectivity information combined with the instant-feedback nature of web interactivity creates the potential for advanced, automated consumer profiling systems to be used for targeted advertisements and other commercial purposes. This research proposes one method for consumer profiling inspired by social network theory that is based on the BISC Decision Support System. Real-world applications and possible ethical concerns are explored in some detail.

F. Olcay Cirit, Masoud Nikravesh, Sema E. Alptekin

A Trial to Represent Dynamic Concepts

Abstract

We consider the expression and recognition of dynamic concepts by assigning the movement patterns learned in a recurrent neural net as symbols. We then develop a method to express more abstract dynamic concepts by combining them with symbols and connecting several recurrent neural networks. Application of the method to actual recognition cases, such as ball bouncing and dance movement (i.e. dancing), demonstrated its effectiveness. These experiments showed the ability of the method to deal with dynamic concepts that are difficult to describe because of vagueness.

Kazushi Kawase, Tomohiro Takagi, Masoud Nikravesh

SORE (Self Organizable Regulating Engine) - An Example of a Possible Building Block for a “Biologizing” Control System

Abstract

The goals of this paper are threefold: (1) to introduce SORE to the biocontrol systems research community, describe how it works and explain why it could be a successful basic building block for a biocontrol system, (2) to present the basic characteristics of SORE and Boolean networks (BN) in a modern control language, with emphasis on their mathematical bases, (3) to illustrate, using some simple examples, why SORE’s inherent properties enable it to realize many of the desired basic requirements for a “biologizing” control system. SORE also exhibits self-organizing, reproducing, colonization and grouping actions, which are essential traits of life. This paper does not report detailed research results; rather, it studies the feasibility of SORE in biocontrol systems based upon computer simulations. Rigorous research results will be presented in the future.

Paul P. Wang, Joshua Robinson, Byung-Jae Choi

Multivariate Non-Linear Feature Selection with Kernel Methods

Abstract

We address problems of classification in which the number of input components (variables, features) is very large compared to the number of training samples. Such problems are encountered in Internet applications such as text filtering, and in biomedical applications such as medical diagnosis from genomic or proteomic data and drug screening from combinatorial chemistry data. In this setting, it is often desirable to perform feature selection to reduce the number of inputs, whether for efficiency, for performance, or to gain understanding of the data and the classifiers. We compare a number of methods on mass-spectrometric data of human protein sera from asymptomatic patients and prostate cancer patients. We show empirical evidence that, in spite of the high danger of overfitting, non-linear methods can outperform linear methods, both in performance and number of features selected.

Isabelle Guyon, Hans-Marcus Bitter, Zulfikar Ahmed, Michael Brown, Jonathan Heller

A New Fuzzy Spectral Approach to Information Integration in a Search Engine

Abstract

The problem of information integration is important for upgrading a search engine to a question-answering system. In this paper we consider a new fuzzy spectral approach to information integration in a search engine. The approach employs a series of variance-covariance matrices and suggests the eigenvalue spectra of the matrices as important characteristics of information integration. We show that the characteristics can be described in terms of eigenvalue dynamics. Through computational experiments we have identified an eigenvalue dynamics that can be efficiently computed by using the quadratic trace of the variance-covariance matrix. Moreover, this dynamics shows a property that can be interpreted as eigenvalue integration. This suggests that the spectral characteristics are connected with an integration mechanism. The fuzzification of eigenvalues plays a key role in the observation of the dynamics. This may support the idea that fuzziness is an integral part of information integration.

Galina Korotkikh

Towards Irreducible Modeling of Structures and Functions of Protein Sequences

Abstract

A major aim of bioinformatics is to contribute to our understanding of the relationship between protein sequence and its structure and function. In the paper we present an approach that allows us to derive a new type of hierarchical structures and formation processes from sequences. These structures and formation processes are irreducible, because they are based only on the integers and develop within existing rules of arithmetic. Therefore, a key feature of the approach is that it may model structures and functions of protein sequences in an irreducible way.

Victor Korotkikh

Mining Fuzzy Association Rules: An Overview

Abstract

The main aim of this paper is to present a review of the most relevant results about the use of Fuzzy Sets in Data Mining, specifically in relation to the discovery of Association Rules. Fuzzy Set Theory has been shown to be a very useful tool in Data Mining for representing the so-called Association Rules in a natural and human-understandable way.

First of all we will introduce the basic concepts of Data Mining to justify the need for Fuzzy Sets. A historical review of developments in this field is given too.

Next we will present our research on Fuzzy Association Rules, starting with the formulation of a general model to discover association rules among items in a (crisp) set of fuzzy transactions. This general model can be particularized in several ways so that each particular instance allows one to represent and mine a different kind of pattern on some kind of data. We describe some applications of this scheme, paying special attention to its application in Text Mining.

The paper finishes with some suggestions about future research and problems to be solved.

M. Delgado, N. Marín, M. J. Martín-Bautista, D. Sánchez, M.-A. Vila
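
To illustrate what mining association rules over a crisp set of fuzzy transactions can look like, the sketch below treats each transaction as a mapping from items to membership degrees and computes support and confidence with the minimum t-norm and a sigma-count (sum of degrees). This is a simple and widely used variant chosen for illustration; it is an assumption of this example and not necessarily the general model formulated in the chapter. The toy transactions stand for documents with term-relevance degrees, in the spirit of the text mining application.

```python
def itemset_degree(transaction, itemset):
    """Degree to which a fuzzy transaction contains every item (minimum t-norm)."""
    return min(transaction.get(item, 0.0) for item in itemset)


def support(transactions, itemset):
    """Sigma-count support: mean containment degree over all transactions."""
    total = sum(itemset_degree(t, itemset) for t in transactions)
    return total / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent as a ratio of supports."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


# Hypothetical documents: term -> degree to which the term is relevant.
transactions = [
    {"fuzzy": 1.0, "logic": 0.5},
    {"fuzzy": 0.5, "logic": 1.0},
    {"fuzzy": 0.5},
    {"logic": 1.0},
]

rule_support = support(transactions, {"fuzzy", "logic"})
rule_confidence = confidence(transactions, {"fuzzy"}, {"logic"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

When every degree is 0 or 1, these definitions reduce to the usual crisp support and confidence.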

A Foundation for Computing with Words: Meta-Linguistic Axioms

Abstract

As a foundation for Computing With Words (CWW), meta-linguistic axioms are proposed in analogy to the axioms of classical theory. Consequences of these meta-linguistic expressions are explored in the light of Interval-valued Type 2 Fuzzy Sets. This once again demonstrates that fuzzy set theories, and hence CWW, have a richer and more expressive power than classical theory.

I. Burhan Türkşen

Augmented Fuzzy Cognitive Maps Supplemented with Case Based Reasoning for Advanced Medical Decision Support

Abstract

Fuzzy Cognitive Maps (FCMs) have been used to design Decision Support Systems, particularly in medical informatics to develop Intelligent Diagnosis Systems. Even though they have been successfully used in many different areas, there are situations where incomplete and vague input information may present difficulty in reaching a decision. In this chapter the idea of using the Case Based Reasoning technique to augment FCMs is presented, leading to the development of an Advanced Medical Decision Support System. This system is applied in the speech pathology area to diagnose language impairments.

Voula Georgopoulos, Chrysostomos Stylios

Pruning, Selective Binding and Emergence of Internal Models: Applications to ICA and Analogical Reasoning

Abstract

Pruning of multi input/output neural networks is discussed and a pruning algorithm called CSDF is described. CSDF acts to induce internal models as a result of redundancy elimination and selective bindings. CSDF is used in a new ICA method based on an auto-encoder performing sensor-signal identity mapping. An internal model of the external signal-mixing situation emerges due to the CSDF pruning, and the hidden units that survive the CSDF pruning reconstruct the blind source signals. This ICA method, which requires no pre-processing such as whitening, is characterized by its high adaptability and robustness, as is demonstrated by difficult cases such as a sudden increase in the number of source signals, sudden failure of sensors and so on. As another example, CSDF is applied in a neural network for analogical learning/inference. Internal abstraction models together with abstraction/de-abstraction bindings are generated as a result of the CSDF structural learning coupled with the backpropagation training. The internal abstraction model acts as an attractor for a new relevant dataset, a process corresponding to analogical memory retrieval.

Syozo Yasui

Evolution of the Laws That Deal with the Utilization of Information Networks

Abstract

Three laws are used to explain how the potential value of a network increases as the network expands: Sarnoff’s Law, Metcalfe’s Law, and Reed’s Law. How accurately do these laws predict the actual value of information networks? We will take a closer look at the application of the laws to information networks and derive corollaries, based upon which we shall propose certain attributes that will increase the value of an information network much more profoundly than the number of nodes, which is the primary concern of the laws mentioned above.

Babak Hodjat, Adam Cheyer
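
For orientation, the three laws named in this abstract are usually stated as proportionalities in the number of nodes n: Sarnoff's Law (value grows like n), Metcalfe's Law (value grows like n squared) and Reed's Law (value grows like 2 to the power n). The sketch below uses one common counting interpretation of each law, namely audience size, number of distinct pairs, and number of subgroups with at least two members. The function names and the choice of proportionality constants are assumptions of this example, not definitions from the chapter.

```python
def sarnoff_value(n):
    """Broadcast network: value proportional to the audience size n."""
    return n


def metcalfe_value(n):
    """Peer-to-peer network: number of distinct pairs, which grows like n**2."""
    return n * (n - 1) // 2


def reed_value(n):
    """Group-forming network: subgroups with at least two members, 2**n - n - 1."""
    return 2 ** n - n - 1


for n in (2, 10, 20):
    print(n, sarnoff_value(n), metcalfe_value(n), reed_value(n))
```

All three measures depend only on the node count, which is the limitation the abstract points to when it proposes attributes beyond the number of nodes.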

Intelligent Type-2 Fuzzy Inference for Web Information Search Task

Abstract

This chapter focuses on using interval TSK type-2 fuzzy inference to execute a Web Information Search Task (WIST). Type-2 fuzzy inference helps to address the “rule uncertainty problem” and so improves the performance of a WIST, because a lower prediction error can be achieved. On the other hand, type-2 fuzzy inference is generally computationally intensive; this chapter proposes a simple idea to simplify the computation for interval TSK type-2 fuzzy inference.

Yuchun Tang, Yan-Qing Zhang

Causality In An Inherently Ill Defined World

Abstract

Commonsense causal reasoning occupies a central position in human reasoning. It plays an essential role in both informal and formal human decision-making. Causality itself as well as human understanding of causality is imprecise, sometimes necessarily so. Our common sense understanding of the world tells us that we have to deal with imprecision, uncertainty and imperfect knowledge. A difficulty is striking a good balance between precise formalism and commonsense imprecise reality. Clearly, an algorithmic method of handling imprecision is needed. Today, data mining holds the promise of extracting unsuspected information from very large databases. In many ways, the interest is the promise (or illusion) of causal, or at least, predictive relationships. However, the most common data mining rule forms only calculate a joint occurrence frequency; they do not express a causal relationship. Without understanding the underlying causality, a naïve use of data mining rules can lead to undesirable actions.

Lawrence J. Mazlack

Title: Soft Computing for Information Processing and Analysis
Edited by: Prof. Masoud Nikravesh
Prof. Lotfi A. Zadeh
Prof. Janusz Kacprzyk
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-540-32365-5
Print ISBN: 978-3-540-22930-8
DOI: https://doi.org/10.1007/3-540-32365-1