Skip to main content

2006 | Buch

Computational Intelligence and Bioinformatics

International Conference on Intelligent Computing, ICIC 2006, Kunming, China, August 16-19, 2006. Proceedings, Part III

herausgegeben von: De-Shuang Huang, Kang Li, George William Irwin

Verlag: Springer Berlin Heidelberg

Buchreihe : Lecture Notes in Computer Science

insite
SUCHEN

Über dieses Buch

The International Conference on Intelligent Computing (ICIC) was formed to provide an annual forum dedicated to the emerging and challenging topics in artificial intelligence, machine learning, bioinformatics, and computational biology, etc. It aims to bring together researchers and practitioners from both academia and industry to share ideas, problems and solutions related to the multifaceted aspects of intelligent computing. ICIC 2006 held in Kunming, Yunnan, China, August 16-19, 2006, was the second International Conference on Intelligent Computing, built upon the success of ICIC 2005 held in Hefei, China, 2005. This year, the conference concentrated mainly on the theories and methodologies as well as the emerging applications of intelligent computing. It intended to unify the contemporary intelligent computing techniques within an integral framework that highlights the trends in advanced computational intelligence and bridges theoretical research with applications. In particular, bio-inspired computing emerged as having a key role in pursuing for novel technology in recent years. The resulting techniques vitalize life science engineering and daily life applications. In light of this trend, the theme for this conference was “Emerging Intelligent Computing Technology and Applications”. Papers related to this theme were especially solicited, including theories, methodologies, and applications in science and technology.

Inhaltsverzeichnis

Frontmatter

Ant Colony Optimisation

A Constrained Ant Colony Algorithm for Image Registration

Ant Colony optimization takes inspiration from the behavior of real ant colony to solve optimization problems. We attach some constraints to ant colony model and present a parallel constrained ant colony model to solve the image registration problem. The problem is represented by a directed graph so that the objective of the original problem becomes to find the shortest closed circuit on the graph under the problem-specific constraints. A number of artificial ants are distributed on the graph and communicate with one another through the pheromone trails which are a form of the long-term memory guiding the future exploration of the graph. The algorithm supports the parallel computation and facilitates quick convergence to the optimal solution. The performance of the proposed method as compared to those of the genetic-based approaches is very promising.

Wen Peng, Ruofeng Tong, Guiping Qian, Jinxiang Dong
A Novel ACO Algorithm with Adaptive Parameter

ACO has been proved to be one of the best performing algorithms for NP-hard problems as TSP. Many strategies for ACO have been studied, but little theoretical work has been done on ACO’s parameters

α

and

β

, which control the relative weight of pheromone trail and heuristic value. This paper describes the importance and functioning of

α

and

β

, and draws a conclusion that a fixed

β

may not enable ACO to use both heuristic and pheromone information for solution when

α

= 1. Later, following the analysis, an adaptive

β

strategy is designed for improvement. Finally, a new ACO called adaptive weight ant colony system (AWACS) with the adaptive

β

and

α

= 1 is introduced, and proved to be more effective and steady than traditional ACS through the experiment based on TSPLIB test.

Han Huang, Xiaowei Yang, Zhifeng Hao, Ruichu Cai
Study of Parametric Relation in Ant Colony Optimization Approach to Traveling Salesman Problem

Presetting control parameters of algorithms are important to ant colony optimization (ACO). This paper presents an investigation into the relationship of algorithms performance and the different control parameter settings. Two tour building methods are used in this paper including the max probability selection and the roulette wheel selection. Four parameters are used, which are two control parameters of transition probability

α

and

β

, pheromone decrease factor

ρ

, and proportion factor

q

0

in building methods. By simulated result analysis, the parameter selection rule will be given.

Xuyao Luo, Fang Yu, Jun Zhang
Ant Colony System for Optimizing Vehicle Routing Problem with Time Windows (VRPTW)

Research on the optimization of Vehicle Routing Problem with Time Windows (VRPTW) is a significant investigation area of ant colony system (ACS). This paper proposes an enhanced ACS, which embeds the sequential insertion heuristic method, to solve VRPTW. The main idea is to organize two respective ant colonies to successively achieve a multiple objective minimization. Experiments on a series of benchmark problems demonstrate the excellent performance of ACS when compared with other optimization methods.

Xuan Tan, Xiaolan Zhuo, Jun Zhang

Particle Swarm Optimisation

A Hybrid Particle Swarm Optimization for Binary CSPs

The target of solving constraint satisfaction problems(CSP) is to satisfy all constraints simultaneously. The CSP model is transformed into a discrete optimization problem with boundary constraints and is solved by particle swarm optimization(PSO) in this paper. To improve the performance of the proposed PSO algorithm, ERA(Environment, Reactive rules, Agent) model is used to proceed with local search after the process of boundary constraints. Further improvement including nohope and tabu list are also combined with PSO. When particles can not explore more search space, nohope is introduced to improve the activities of particles. Tabu list is used to avoid cycling in the global best particle. We experiment with random constraint satisfaction problem instances based on phase transition theory. Experimental results indicate that the hybrid algorithm has advantages on the search capability and the iterative number.

Qingyun Yang, Jigui Sun, Juyang Zhang, Chunjie Wang
A New Hybrid Algorithm of Particle Swarm Optimization

This paper presents a new hybrid algorithm of particle swarm optimization (PSO) called PSOSA, in which the mechanism of modified simulated annealing (SA) is embedded into standard PSO algorithm. The proposed algorithm not only keeps the characters of simple and easy to be implemented, but also enhances the ability of getting rid of local optimum and improves the speed and precision of convergence. The testing results of several benchmark functions with different dimensions show that the proposed algorithm is superior to standard PSO and the other PSO algorithms.

Guangyou Yang, Dingfang Chen, Guozhu Zhou
A Novel Particle Swarm Optimizer Using Optimal Foraging Theory

Based on the research of optimal foraging theory (OFT), we present a novel particle swarm optimizer (PSO) to improve the performance of standard PSO (SPSO). The resulting algorithm is known as PSOOFT that makes use of two mechanisms of OFT: a reproduction strategy to enhance the ability to converge rapidly to good solutions and a patch-choice based scheme to keep a right balance of exploration and exploitation. In the simulation studies, several benchmark functions are performed, and the performance of the proposed algorithm is compared to the standard PSO (SPSO). The experimental results show that the PSOOFT prevents premature convergence to a high degree, but still has a more rapid convergence rate than SPSO.

Ben Niu, Yunlong Zhu, Kunyuan Hu, Sufen Li, Xiaoxian He
A Smart Particle Swarm Optimization Algorithm for Multi-objective Problems

Maintaining the diversity and convergence of Pareto optimal solutions is a desired task of optimization methods for multi-objective optimization problems(MOP). While accelerating the computing speed is important for algorithms to solve real-life MOP also. A Smart Particle Swarm Optimization algorithm for MOP(SMOPSO) is proposed. By setting the cooperative action of all the objective functions as the global best guide of swarm and selecting the closest or farthest archive member as the personal best guide of each particle, the SMOPSO method can find many Pareto optimal solutions in less iteration steps. Three well-known test functions have been used to validate our approach. Results show that the SMOPSO method is available and rapid.

Xiaohua Huo, Lincheng Shen, Huayong Zhu
Adaptive Particle Swarm Optimization with Feedback Control of Diversity

Swarm-diversity is an important factor influencing the global convergence of particle swarm optimization (PSO). In order to overcome the premature convergence, the paper introduced a negative feedback mechanism into particle swarm optimization and developed an adaptive PSO. The improved method takes advantage of the swarm-diversity to control the tuning of the inertia weight (PSO-DCIW), which in turn can adjust the swarm-diversity adaptively and contribute to a successful global search. The proposed PSO-DCIW was applied to some well-known benchmarks and compared with the other notable improved methods for PSO. The relative experimental results show PSO-DCIW is a robust global optimization method for the complex multimodal functions, which can improve the performance of the standard PSO and alleviate the premature convergence validly.

Jing Jie, Jianchao Zeng, Chongzhao Han
An Improved Particle Swarm Algorithm and Its Application to Power System Transfer Capability Optimization

In this paper, an algorithm based on PSO (Particle Swarm Optimization) for power system transfer capability calculation is presented. A dual fitness scheme that takes both objective and constraint into account is adopted to evaluate the survival chance of any particle, thus avoid the drawbacks of traditional penalty method. In the evolution process, if the population best particle has no update during a prescribed number of consecutive generations, it is regarded as a local optimum solution and the searching space around this particle is locked to prevent other particles flying into it. And this particle is saved as one of the candidate solution. In the end, by comparing the fitness of all saved particles and the current population best particle the optimum value can be obtained. This improved particle swarm algorithm is then successfully applied to IEEE118 bus system optimization problem. Compared with a traditional well-known method, sequential quadratic programming, our proposal obtains better solutions for this problem.

Si-jun Peng, Chang-hua Zhang, Liang Tang
An Improved Particle Swarm Optimization Algorithm with Disturbance Term

The standard particle swarm optimization (PSO) algorithm, existing improvements and their influence to the performance of standard PSO are introduced. The framework of PSO basic formula is analyzed. Implied by its three-term structure, the inherent shortcoming that trends to local optima is indicated. Then a modified velocity updating formula of particle swarm optimization algorithm is declared. The addition of the disturbance term based on existing structure effectively mends the defects. The convergence of the improved algorithm is analyzed. Simulation results demonstrated that the improved algorithm have a better performance than the standard one.

Qingyuan He, Chuanjiu Han
Blending Scheduling Under Uncertainty Based on Particle Swarm Optimization with Hypothesis Test

Blending is an important unit operation in process industry. As a nonlinear optimization problem with constraints, it is difficult to obtain optimal solution for blending scheduling, especially under uncertainty. As a novel evolutionary computing technique, particle swarm optimization (PSO) has powerful ability to solve nonlinear optimization problems with both continuous and discrete variables. In this paper, the performance of PSO under uncertainty for blending scheduling problem is investigated, and a new hybrid approach (namely PSOHT) that combines PSO and hypothesis test (HT) is proposed. The simulation results based on an example of gasoline blending problem show that the proposed PSOHT algorithm is valid and effective for solving problem under uncertainty.

Hui Pan, Ling Wang
Fixed Parameter Estimation Method Using Gaussian Particle Filter

The degeneracy problems of general particle filtering frequently occur. Although this kind of problems can be mitigated by resampling, but the diversity characteristic between particles may be lost because the higher weighted particles will be replicated and the lower weighted particles will be discarded. For parameter-fixed application cases, the standard particle filter is invalid as no importance density function can be sampled for new particles required by the predictive distribution, and particles will quickly be exhausted. This paper proposes a new method for the parameter-fixed estimation by use of Gaussian particle filter, which can avoid making particles exhausted and can improve the estimation performance. Refer to a practical example of Direction of Arrived (DOA) estimation for coherent signals propagated in space with multi-path fading, the computer simulation has been performed. The simulation results have indicated that the performance of the new method is rather than general particle filtering.

Lixin Wang
Improving Quantum-Behaved Particle Swarm Optimization by Simulated Annealing

Quantum-behaved Particle Swarm Optimization (QPSO) is a global convergence guaranteed search method, which introduced quantum theory into original Particle Swarm Optimization (PSO). While Simulated Annealing (SA) is another important stochastic optimization with the ability of probabilistic hill-climbing. In this paper, the mechanism of Simulated Annealing is introduced into the weak selection implicit in our QPSO algorithm, which effectively employs both the ability to jump out of the local minima in Simulated Annealing and the capacity of searching the global optimum in QPSO algorithm. The experimental results show that the proposed hybrid algorithm increases the diversity of the population in the search process and improves its precision in the latter period of the search.

Jing Liu, Jun Sun, Wenbo Xu
Optimization of a Child Restraint System by Using a Particle Swarm Algorithm

Child restraint system (CRS) is a system in automotive vehicles for the protection of child occupants in traffic accidents. Design of appropriate CRS has been one of the major subjects for both the research community and the automotive industry. In this paper, a CRS, which includes a child booster and an adult seatbelt with load limiting function, is optimized for a ten-year child dummy. The model is built and simulated using MADYMO. Several key parameters of the system are optimized to minimize the injury to child passengers under the crash test circumstance in accordance with the ECE Regulation 44 by using a recently emerged optimization scheme, particle swarm algorithm. In order to validate this optimization approach, another optimization method, AutoDOE, a built-in subroutine of MADYMO, is also utilized for comparison. The results indicate that the particle swarm algorithm has certain advantages over the AutoDOE method in terms of the solution quality. Moreover, regarding the computational efficiency, for this particular problem the particle swarm algorithm outperforms AutoDOE.

Liang Tang, Meng Luo, Qing Zhou
Predicted-Velocity Particle Swarm Optimization Using Game-Theoretic Approach

In standard particle swarm optimization, velocity information only provides a moving direction of each particle of the swarm, though it also can be considered as one point if there is no limitation restriction. Predicted-velocity particle swarm optimization is a new modified version using velocity and position to search the domain space equality. In some cases, velocity information may be effectively, but fails in others. This paper presents a game-theoretic approach for designing particle swarm optimization with a mixed strategy. The approach is applied to design a mixed strategy using velocity and position vectors. The experimental results show the mixed strategy can obtain the better performance than the best of pure strategy.

Zhihua Cui, Xingjuan Cai, Jianchao Zeng, Guoji Sun
Solving the Hard Knapsack Problems with a Binary Particle Swarm Approach

Knapsack problems are important NP-Complete combinatorial optimization problems. Although nearly all the classical instances can be solved in pseudo-polynomial time nowadays, yet there are a variety of test problems which are hard to solve for the existing algorithms. In this paper we propose a new approach based upon binary particle swarm optimization algorithm (BPSO) to find solutions of these hard knapsack problems. The standard PSO iteration equations are modified to operate in discrete space. Furthermore, a heuristic operator based on the total-value greedy algorithm is employed into the BPSO approach to deal with constrains. Numerical experiments show that the proposed algorithm outperforms both the existing exact approaches and recent state-of-the-art search heuristics on most of the hard knapsack problems.

Bin Ye, Jun Sun, Wen-Bo Xu

Swarm Intelligence

Collective Behavior of an Anisotropic Swarm Model Based on Unbounded Repulsion in Social Potential Fields

Swarm system with flexible structures adapts well to variable environment. In this article, we propose an anisotropic swarm model based on unbounded repulsion and social potential fields. The unbounded repulsion ensures the independence among autonomous agents in social potential fields, which consist of obstacles to avoid and targets to move towards. Simulation results show that the aggregating swarm can construct various formations by changing its anisotropy coefficient, and the collective behavior of mass individuals emerges from combination of the inter-individual interactions and the interaction of the individual with outer circumstances.

Liang Chen, Li Xu
Combining Particle Swarm Optimization and Neural Network for Diagnosis of Unexplained Syncope

Given the relative limitations of BP and GA based leaning algorithms, Particle Swarm Optimization (PSO) is proposed to train Artificial Neural Networks (ANN) for the diagnosis of unexplained syncope. Compared with BP and GA based training techniques, PSO based learning method improves the diagnosis accuracy and speeds up the convergence process. Experimental results show that PSO is a robust training algorithm and should be extended to other real-world pattern classification applications.

Liang Gao, Chi Zhou, Hai-Bing Gao, Yong-Ren Shi
Parameter Estimation Approach in Groundwater Hydrology Using Hybrid Ant Colony System

A new approach to parameter estimation in groundwater hydrology is developed using hybrid ant colony system with simulated annealing. Based on the information from the observed water heads and calculated water heads, an objective function for inverse problem is proposed. The inverse problem of parameter identification is formulated as an optimization problem. Simulated annealing has the ability of probabilistic hill-climbing and is combined with ant colony system to produce an adaptive algorithm. A hybrid ant colony optimization is presented to identify the transmissivity and storage coefficient for a two-dimensional, unsteady state groundwater flow model. The ill-posedness of the inverse problem as characterized by instability and non-uniqueness is overcome by using computational intelligence. Compared with gradient-based optimization methods, hybrid ant colony system is a global search algorithm and can find parameter set in a stable manner. A numerical example is used to demonstrate the efficiency of hybrid ant colony system.

Shouju Li, Yingxi Liu, He Yu
Route-Exchange Algorithm for Combinatorial Optimization Based on Swarm Intelligence

Inspired by the information interaction of individuals in swarm intelligence, a new algorithm for combinatorial optimization is proposed, which is called as Route-Exchange Algorithm (REA). This is a heuristic approach, in which the individuals of the swarm search the state space independently and simultaneously. When one encounters another in the process, they would interact with each other, exchange the information of routes toured, and utilize the more valuable experiences to improve their own search efficiency. An elite strategy is designed to avoid vibrations. The algorithm has been applied to Traveling Salesman Problem (TSP) and assignment problem in this paper. Some benchmark functions are tested in the experiments. The results indicate the algorithm can quickly converge to the optimal solution with quite low cost.

Xiaoxian He, Yunlong Zhu, Kunyuan Hu, Ben Niu
Stability Analysis of Swarm Based on Double Integrator Model

In this paper, we consider an

M

-member individual-based continuous-time double integrator swarm model in an

n

-dimensional space. The swarm model based on the Newton’s law is more suitable to describe the swarm aggregation and has wide practical applications. We present stability analysis for the case of linear attraction and bounded repulsion to characterize swarm cohesiveness, size and ultimate motions while in a cohesive group. Finally, some numerical examples and simulation results are presented to illustrate the effectiveness of the swarm model.

Dan Jin, Lixin Gao

Autonomy-Oriented Computing

Interest Based Negotiation Automation

The negotiation in general sense, as one of the most fundamental and powerful interaction of human beings, represents the dynamic process of exchanging information and perspectives towards mutual understanding and agreements. Interest based negotiation allows negotiators to discuss the concerns behind the negotiation issues so that a mutually acceptable win-win solution is more likely to be reached. This paper, for the first time, proposes a computational model for interest based negotiation automation which enables the automation of the fundamental elements of negotiation. Based on the model, algorithms are designated to automate the fundamental elements with practical computational complexity. This model provides not only a theoretical foundation for software agent based negotiation automation, but also a practical approach.

Xuehong Tao, Yuan Miao, ZhiQi Shen, ChunYan Miao, Nicola Yelland

Quantum and Molecular Computations

Phase Transition of a Skeleton Model for Surfaces

A spherical model of skeleton with junctions is investigated by Monte Carlo simulations. The model is governed by one-dimensional bending energy. The results indicate that the model undergoes a first-order transition, which separates the smooth phase from the crumpled phase. The existence of phase transition indicates that junctions play a non-trivial role in the transition.

Hiroshi Koibuchi

Biological and DNA Computing

A Novel Approach Model for Chinese Postman Problem

Molecular programming(MP) has been proposed as an evolutionary computation algorithm at the molecular level. MP is different from other evolutionary algorithms in its representation of solutions using DNA molecular structures and its use of bio-lab techniques for recombination of partial solution. In this paper, Chinese Postman Problem(CPP) has been solved by means of molecular programming. We make use of the encoding scheme of S.Y.Shin for encoding real values. The new method is biologically plausible and has a fixed code length.

Jiang Bo, Shi Xiaoying, Xu Zhibang
DNA Computing Model of the Integer Linear Programming Problem Based on Molecular Beacon

Biological chip technology and DNA computing are new research areas in biology science and information science separately. The essential characteristic of both is the massive parallel of obtaining and managing information. The integer linear programming problem is an important problem in opsearch and it is an NP-complete problem. But up to now, there does not exist any good algorithm yet. A new DNA computing model is provided to solve a integer linear programming problem based on Molecular Beacon chip. In the method, the integer linear programming problem is solved with molecular beacon by fluorescing upon hybridization to their complementary DNA targets. The method has some significant advantages such as simple encoding, excellent sensitivity, high selectivity, low cost, low error, short operating time, reusable surface and simple experimental steps. The result suggest s the potential of Molecular Beacon used as a DNA computer chip.

Zhi-xiang Yin, Jian-zhong Cui, Jin Yang, Jin Xu
DNA Computing Processor: An Integrated Scheme Based on Biochip Technology for Performing DNA Computing

An integrated scheme based on biochip technology for performing DNA computing is proposed here. This work is motivated by the goal of integrating all the steps of DNA computing into one machine called DNA computing processor. The basic structure of processor consists of making DNA micro-arrays unit, encoding DNA sequences unit, micro-reaction unit, solution extraction unit and micro-control unit. The functions of each unit are discussed in detail, especially for the solution extraction unit, where the optimal solution spaces are extracted. Finally, conclusions are drawn and future studies are discussed.

Yan-Feng Wang, Guang-Zhao Cui, Bu-Yi Huang, Lin-Qiang Pan, Xun-Cai Zhang
General DNA Automaton Model with R/W Tape

Since Turing machine is consisted of R/W Tape head and Turing Tape, that means a DNA based Turing machine must realize the read and write processes on Turing Tape with DNA molecules, and make the head move in two directions either. Based on our previous works on DNA computing, especially on DNA automaton and the relationship between enzymes and their computation ability with DNA automaton model. We combined different enzymes together to modify the cut and recognition sites with a programmable Tape head to embody the read and write symbol procedure on Turing tape. And this article will describe how the programmable Tape head moves in two direction and read/write procedure of Turing Tape in details.

Shi Xiaolong, Pan linqiang, Xu Jin
Interpolated Hidden Markov Models Estimated Using Conditional ML for Eukaryotic Gene Annotation

To improve the performance of computational gene annotation programs, we introduced the well known interpolated Markov Chain (IMC) technology to the class Hidden Markov models (CHMM). CHMM was applied in one of the best eukaryotic gene prediction systems: HMMgene. The conditional Maximum Likelihood estimation (CMLE) algorithm was educed to estimate the interpolation parameters. The resulting gene prediction program improves exon level sensitivity by 3% and specificity by about 1% compared to HMMgene as trained and tested on some standard human DNA sequence dataset.

Hongmei Zhu, Jiaxin Wang, Zehong Yang, Yixu Song
Predicting Melting Temperature (Tm) of DNA Duplex Based on Neural Network

In DNA computing, similar thermodynamic stability of the encoding DNA sequences is conduced to improve the reliability and precision of the computing process. The melting temperature is a suitable parameter used to evaluating the stability of DNA duplex. Traditional method to predict Tm in biological engineering may exist lager error for a few sequences. Thus it misfits the lager amount of DNA sequences in DNA computing. In this paper, we introduced artificial neural network to predict the Tm based on Next-Nearest-Neighbor model. Our result shows that the methods have a higher precision than TP methods based on nearest-neighbor model.

Xiangrong Liu, Wenbin Liu, Juan Liu, Linqiang Pan, Jin Xu
Programmable Pushdown Store Base on DNA Computing

DNA computing aims at using nucleic acids for computing. Micro-molar DNA solutions can act as billions of parallel nanoprocessors with few consume. Finite automation is a general compute model. Now, A programmable finite automation based on DNA computing is available[1]. It’s act as a basic features and processes of a finite automaton with two internal states and an alphabet comprising two input symbols. Here we describe a new finite automata made of biomolecules, which can be used as a programmable pushdown store.

Zheng Zhang, Jin Xu, Jie Liu, Linqiang Pan
Research on the Counting Problem Based on Linear Constructions for DNA Coding

This article advances linear coding construction method for DNA codes based on three-letter alphabets. Further more it also advances the new conception of Constrain Strength and algorithm of Construction and Search. We use the former to evaluate the coding amount of combinatorial constraints and use the latter to validate the evaluated results. Experiments show that experiment result accords with the theoretically evaluated value very well.

Xiangou Zhu, Chuan Sun, Wenbing Liu, Wenguo Wu
RNA Secondary Structure Prediction with Simple Pseudoknots Based on Dynamic Programming

RNA molecules are sequences of nucleotides that serve as more than mere intermediaries between DNA and proteins, e.g. as catalytic molecules. The sequence of nucleotides of an RNA molecule constitutes its primary structure, and the pattern of pairing between nucleotides determines the secondary structure of an RNA. Computational prediction of RNA secondary structure is among the few structure prediction problems that can be solved satisfactory in polynomial time. Most work has been done to predict structures that do not contain pseudoknots. Pseudoknots have generally been excluded from the prediction of RNA secondary structures due to its difficulty in modelling. In this paper, we present a computation the maximum number of base pairs of an RNA sequence with simple pseudoknots. Our approach is based on pseudoknot technique proposed by Akutsu. We show that a structure with the maximum possible number of base pairs could be deduced by a improved Nussinov’s trace-back procedure. In our approach we also considered wobble base pairings (G·U). We introduce an implementation of RNA secondary structure prediction with simple pseudoknots based on dynamic programming algorithm. To evaluate our method we use the 15 sequences with simple pseudoknots of variable size from 19 to 25 nucleotides. We get our experimental data set from PseudoBase. Our program predicts simple pseudoknots with correct or almost correct structure for 53% sequences.

Oyun-Erdene Namsrai, Kwang Su Jung, Sunshin Kim, Keun Ho Ryu
Template Frame for DNA Computing

The encoding problem is very crucial for the information processing based on DNA computing. The main difficulty lies in that it is hard to control the shift hybridization between DNA sequences. In this paper, a new data structure, named Template Frame, is proposed based on the template strategy. As the Template Frame provides an effective tool to map the various concatenations of code sequences with the template set, thus it can be used to tackle the shift distance problem.

Wenbin Liu, Xiangou Zhu
A New DNA-Based Approach to Solve the Maximum Weight Clique Problem

Given an undirected graph with weights on the vertices, the maximum weight clique problem is to find a subset of mutually adjacent vertices, i.e. a clique, which have the largest total weight. We devised a new

DNA

encoding method to solve the maximum weight clique problem whose basic idea is that each vertex on weighted graph is encoded by two

DNA

strands of different length and each edge is encoded by one

DNA

strand with a length of 20. The longer

DNA

strand corresponding to vertex

v

i

consists of three parts and its center part is with a length of

w

i

; the shorter

DNA

strand is the reverse complementation of the longer one’s center part. We also gave the correspond- ing molecule algorithm and its biological implementation. The proposed

DNA

computing method can be expanded to solve other

NP

-hard problems, and it provides further evidence for the ability of

DNA

computing to solve numerical optimization problems.

Aili Han, Daming Zhu
A New DNA Encoding Method for Traveling Salesman Problem

We have devised a new

DNA

encoding method to represent weight and apply it to solve the traveling salesman problem, an instance of optimization problems on weighted graphs. For any weighted graph

G

=(

V

,

E

),

v

i

V

, 1≤

i

n

, where exists weight

w

ij

on edge

v

i

v

j

, we use two

DNA

strands with different lengths to encode each of the edges. The longer

DNA

strand consists of three parts: one for the departure vertex, another for the weight, and the last for the arrival vertex. The shorter

DNA

strand is the

reverse complementation

of the center part of the longer one. The proposed weight encoding method is an improvement on the previous weight encoding methods, and it can more easily find the optimal solutions than the former ones. This work extends the capability of

DNA

computing to solving numerical optimization problems, which is contrasted with other

DNA

computing methods focusing on decision problems.

Aili Han, Daming Zhu
Computational Design Approach to Hydrodynamic Focusing in a Flow Cytometer

A two-fluid theoretical model was developed to simulate the hydrodynamic focusing process formed by the coflowing sample and sheath fluids in a flow cytometer. The analysis consists of two groups of time-dependent three-dimensional conservation equations of mass and momentum. To validate the computer code, the predicted focused width in the two-dimensional test configuration was compared with Lee et al.’s measured data. The present study also examines the pressure distribution of the three-dimensional hydrodynamic focusing flowfield. For the

u

Sh

/u

S

ratio ranging from 10 to 80, the focused width was determined to explore the applicability of the proposed flow cytometer.

An-Shik Yang, Chun-Yao Wu

Intelligent Computing in Bioinformatics

A Biometric Encryption Approach Incorporating Fingerprint Indexing in Key Generation

This paper presents a new biometric encryption protocol in which encryption key is incorporated with fingerprint indexing. Based on the extended chaotic Baker map, the pixel permutation and gray level value substitution are performed to shuffle pixel positions in the original fingerprint images. The encryption key of the pixel permutation is generated by the combination of the random pixel distribution of a fingerprint imprint as well as features used for the fingerprint indexing. In addition to the advantage enjoyed by biometric keys over traditional password/PINs, the proposed biometric encryption approach is very efficient in identity identification within a large database due to the pre-filtering feature of the fingerprint indexing incorporated in the keys. This approach is applicable to the centralized matching scenario where fingerprints need to be encrypted before transmitted. Simulation results have validated the proposed schemes.

Fengling Han, Jiankun Hu, Xinghuo Yu
A General Solution for the Optimal Superimposition of Protein Structures

The theorem presented in this paper is a general solution for the optimal superimposition between two protein structures, which is actually the problem of the weighted optimal rigid superimposition between two vector sets with the same size. The theorem gives not only the rotational and translational parameters, but also the minimal value of the mean squared deviation of the optimal superimposition.

Qishen Li, Jian Shu, Zhaojun Shi, Dandan Zhang
A Personalized Biological Data Management System Based on BSML

A mass of biological sequence and structural information have been produced in biological laboratories since the techniques to get the sequences of genomes or proteins have been improved in HGP (Human Genome Project). Unfortunately, there are scarcely software packages available to deal with the biological data in most of biological laboratories and they are just stored in file formats. An integrated management system of biological data is required to manage the sequence and annotation data taken from other open databases to improve the analysis of sequence data in biological laboratories. We therefore suggest a personalized system to edit, store and retrieve biological information, and convert the formats of sequence data, as well as to integrate and manage data.

Kwang Su Jung, Sunshin Kim, Keun Ho Ryu
An Automatic Nematode Identification Method Based on Locomotion Patterns

Nematodes are primitive creatures that are endangering and devouring many of the essential resources beneficial to human beings. For effective inspection and quarantine, we have devised an image based system for quantitatively characterizing and identifying nematodes, and achieved average successful identification rate of 71.2% for the Uncoordinated (Unc) mutant types and 91.2% for other types. To enhance system performance, here we introduce the worm-body Trunk Coordinate System for defining and characterizing the locomotion patterns of representative mutants. At least 60,000 frames for each species, totally 480,000 frames, representing wild type and 7 other mutant types, were analyzed. The average correct classification rate, measured by Classification and Regression Tree (CART) algorithm, was 79.3% for Unc types. The scheme devised and the features extracted are good supplements for the previous automated nematode identification system.

Bai-Tao Zhou, Joong-Hwan Baek
An Efficient Attribute Ordering Optimization in Bayesian Networks for Prognostic Modeling of the Metabolic Syndrome

The metabolic syndrome has become a significant problem in Asian countries recently due to the change in dietary habit and life style. Bayesian networks provide a robust formalism for probabilistic modeling, so they have been used as a method for prognostic model in medical domain. Since K2 algorithm is influenced by an input order of the attributes, optimization of BN attribute ordering is studied. This paper proposes an evolutionary optimization of attribute ordering in BN to solve this problem using a genetic algorithm with medical knowledge. Experiments have been conducted with the dataset obtained in Yonchon County of Korea, and the proposed model provides better performance than the comparable models.

Han-Saem Park, Sung-Bae Cho
Analysis and Simulation of Synchronization for Large Scale Networks

Networks of coupled large scale oscillators have been studied in biology for a number of years. It has been recognized that transient in the nearest neighbor connected networks may take far too long to die out. In the model of mammalian rhythm, it is considered that a few long distance interconnections exist. Typically, these long distance interconnections are considered to occur in a random way. In this study, we discuss the synchronization problem for coupled oscillator networks which can model the mammalian rhythm. Then, the distribution model for the random long distance connections is proposed and is demonstrated by simulation. Furthermore, simulation also shows that synchronization still holds even a large part of the network is destroyed.

Xinkai Chen, Guisheng Zhai
Detection of Basal Cell Carcinoma by Automatic Classification of Confocal Raman Spectra

Raman spectroscopy has strong potential for providing noninvasive dermatological diagnosis of skin cancer. In this study, we investigated various classification methods with confocal Raman spectra for the detection of basal cell carcinoma (BCC), which is one of the most common skin cancer. The methods include maximum a posteriori (MAP) probability, probabilistic neural networks (PNN),

k

-nearest neighbor (KNN), multilayer perceptron networks (MLP), and support vector machine (SVM). The classification framework consists of preprocessing of Raman spectra, feature extraction, and classification. In the preprocessing step, a simple half Hanning method is adopted to obtain robust features. Classification results involving 216 spectra gave about 97% true classification rate in case of MLP and SVM, which is an evident proof of the effectiveness of confocal Raman spectra for BCC detection. In addition to it, spectral regions, which are important for classification, are examined by sensitivity analysis.

Seong-Joon Baek, Aaron Park, Jin-Young Kim, Seung Yu Na, Yonggwan Won, Jaebum Choo
Clustering Gene Expression Data for Periodic Genes Based on INMF

In this paper, we have explored the use of improved non – negative matrix factorization (INMF) to analyze gene expression data. Firstly, the mathematical principle of INMF algorithm is analyzed; Secondly, we proposed an INMF - based method for clustering periodic genes, which can provide valuable information for gene network research. Using simulated data, our approach is able to extract periodic genes subsets even when the signal-to-noise ratio is low. Subsequently, our approach is tested by real gene expression datasets from Yeast and is compared with the related other approaches. Our results showed that our scheme is feasible and effective.

Nini Rao, Simon J. Shepherd
Feature Selection for Microarray Data Analysis Using Mutual Information and Rough Set Theory

Cancer classification is one major application of microarray data analysis. Due to the ultra high dimension of gene expression data, efficient feature selection methods are in great needs for selecting a small number of informative genes. In this paper, we propose a novel feature selection method MIRS based on mutual information and rough set. First, we select some top-ranked features which have higher mutual information with the target class to predict. Then rough set theory is applied to remove the redundancy among these selected genes. Binary particle swarm optimization (BPSO) is first proposed for attribute reduction in rough set. Finally, the effectiveness of the proposed method is evaluated by the classification accuracy of SVM classifier. Experiment results show that MIRS is superior to some other classical feature selection methods and can get higher prediction accuracy with small number of features. Generally, the results are highly promising.

Wengang Zhou, Chunguang Zhou, Hong Zhu, Guixia Liu, Xiaoyu Chang
Feature Subset Selection for Protein Subcellular Localization Prediction

Most of the existing methods for protein subcellular localization prediction are based on a large number of features that are considered to be potentially useful for determining protein subcellular localizations. However, predictors with large numbers of input variables usually suffer from the curse of dimensionality as well as the risk of overfitting. Using only those features that are relevant for protein subcellular localization might improve the prediction performance and might also provide us with some biologically useful knowledge. In this paper, we present a feature ranking based feature subset selection approach for subcellular localization prediction of proteins in the context of support vector machines (SVMs). Experimental results show that this method improves the prediction performance with selected subsets of features. It is anticipated that the proposed method will be a powerful tool for large-scale annotation of biological data.

Qing-Bin Gao, Zheng-Zhi Wang
Fuzzy k-Nearest Neighbor Method for Protein Secondary Structure Prediction and Its Parallel Implementation

Fuzzy

k

-nearest neighbor method is a generalization of nearest neighbor method, the simplest algorithm for pattern classification. One of the important areas for application of the pattern classification is the protein secondary structure prediction, an important topic in the field of bioinformatics. In this work, we develop a parallel algorithm for protein secondary structure prediction, based on the fuzzy

k

-nearest neighbor method, that uses evolutionary profile obtained from PSI-BLAST (Position Specific Iterative Basic Local Sequence Alignment Tool) as the feature vectors.

Seung-Yeon Kim, Jaehyun Sim, Julian Lee
Gene Selection Based on Mutual Information for the Classification of Multi-class Cancer

With the development of mirocarray technology, microarray data are widely used in the diagnoses of cancer subtypes. However, people are still facing the complicated problem of accurate diagnosis of cancer subtypes. Building classifiers based on the selected key genes from microarray data is a promising approach for the development of microarray technology; yet the selection of non-redundant but relevant genes is complicated. The selected genes should be small enough to allow diagnosis even in regular laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Instead of the traditional gene selection methods used for the classification of two categories of cancers, in the present paper, a novel gene selection algorithm based on mutual information is proposed for the classification of multi-class cancer using microarray data, and the selected key genes are fed into the classifier to classify the cancer subtypes. In our algorithm, mutual information is employed to select key genes related with class distinction. The application on the breast cancer data suggests that the present algorithm can identify the key genes to the BRCA1 mutations/BRCA2 mutations/the sporadic mutations class distinction since the result of our proposed algorithm is promising, because our method can perform the classification of the three types of breast cancer effectively and efficiently. And two more microarray datasets, leukemia and ovarian cancer data, are also employed to validate the performance of our method. The performances of these applications demonstrate the high quality of our method. Based on the present work, our method can be widely used to discriminate different cancer subtypes, which will contribute to the development of technology for the recovery of the cancer.

Sheng-Bo Guo, Michael R. Lyu, Tat-Ming Lok
Gene Selection by Cooperative Competition Clustering

Clustering analysis of data from DNA microarray hybridization studies is an essential task for identifying biologically relevant groups of genes. Attribute cluster algorithm (ACA) has provided an attractive way to group and select meaningful genes. However, ACA needs much prior knowledge about the genes to set the number of clusters. In practical applications, if the number of clusters is misspecified, the performance of the ACA will deteriorate rapidly. In fact, it is a very demanding to do that because of our little knowledge.We propose the Cooperative Competition Cluster Algorithm (CCCA) in this paper. In the algorithm, we assume that both cooperation and competition exist simultaneously between clusters in the process of clustering. By using this principle of Cooperative Competition, the number of clusters can be found in the process of clustering. Experimental results on a synthetic and gene expression data are demonstrated. The results show that CCCA can choose the number of clusters automatically and get excellent performance with respect to other competing methods.

Shun Pei, De-Shuang Huang, Kang Li, George W. Irwin
Genetic Algorithm and Neural Network Based Classification in Microarray Data Analysis with Biological Validity Assessment

Microarrays allow biologists to better understand the interactions between diverse pathologic states at the gene level. However, the amount of data generated by these tools becomes problematic. New techniques are then needed in order to extract valuable information about gene activity in sensitive processes like tumor cells proliferation and metastasis activity. Recent tools that analyze microarray expression data have exploited correlation-based approach such as clustering analysis. Here we describe a novel GA/ANN based method for assessing the importance of genes for sample classification based on expression data. Several different approaches have been exploited and a com-parison has been given. The developed system has been employed in the classification of ER+/- metastasis recurrence of breast cancer tumours and results were validated using a real life database. Further validation has been carried out using Gene Ontology based tools. Results proved the valuable potentialities and robustness of similar systems.

Vitoantonio Bevilacqua, Giuseppe Mastronardi, Filippo Menolascina
Inferring Species Phylogenies: A Microarray Approach

The incongruence between the gene trees and species remains a challenge in molecular phylogenetics. In this work, we propose a novel microarray approach to resolve this problem based on our previously proposed phylogenomic mining method. In our microarray approach, we first selected 28 genes from a set of statistically significant housekeeping genes from the

S. cerevisiae

cell cycle time series microarray data. Then we employed the BLAST and synteny criteria to identify homologs and orthologys of the selected genes among the genomes of other species. Finally, we applied the phylogenomic mining method for the aligned genes to infer the species phylogeny. The phylogenetic mining method used the self-organizing map mining, hierarchical clustering and entropy measure to concatenate the phylogenomically informative genes to infer species phylogenies. Compared with the original gene concatenation approach, our method not only overcome the ad-hoc mechanism and prohibitive phylogenetic computing problem of the species inference for the large number of taxa but also first integrated the microarray techniques in the species phylogeny inference.

Xiaoxu Han
Penalized Independent Component Discriminant Method for Tumor Classification

This paper proposes a new method for tumor classification using gene expression data. In this method, we first employ independent component analysis (ICA) to model the gene expression data, then apply optimal scoring algorithm to classify them. To show the validity of the proposed method, we apply it to classify two DNA microarray data sets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.

Chun-Hou Zheng, Li Shang, Yan Chen, Zhi-Kai Huang
Practical Linear Space Algorithms for Computing String-Edit Distances

String-edit operations consist of insertion of a symbol, deletion of a symbol, and substituting one symbol with another. String-edit distances have been applied in problems of error correction and pattern recognition. In this paper, two practical algorithms for computing the edit distance between two strings are presented. The space complexity for the first is

m

+

n

+

O

(1), where

m

and

n

are the lengths of the input strings. The second requires only min(

m

,

n

)+

O

(1).

Tony Y. T. Chan
Prediction of Protein Complexes Based on Protein Interaction Data and Functional Annotation Data Using Kernel Methods

Prediction of protein complexes is a crucial problem in computational biology. The increasing amount of available genomic data can enhance the identification of protein complexes. Here we describe an approach for predicting protein complexes based on integration of protein-protein interaction (PPI) data and protein functional annotation data. The basic idea is that proteins in protein complexes often interact with each other and protein complexes exhibit high functional consistency/even multiple functional consistency. We create a protein-protein relationship network (PPRN) via a kernel-based integration of these two genomic data. Then we apply the MCODE algorithm on PPRN to detect network clusters as numerically determined protein complexes. We present the results of the approach to yeast

Sacchromyces cerevisiae

. Comparison with well-known experimentally derived complexes and results of other methods verifies the effectiveness of our approach.

Shi-Hua Zhang, Xue-Mei Ning, Hong-Wei Liu, Xiang-Sun Zhang
Prediction of Transmembrane Proteins from Their Primary Sequence by Support Vector Machine Approach

Prediction of transmembrane (TM) proteins from their sequence facilitates functional study of genomes and the search of potential membrane-associated therapeutic targets. Computational methods for predicting TM sequences have been developed. These methods achieve high prediction accuracy for many TM proteins but some of these methods are less effective for specific class of TM proteins. Moreover, their performance has been tested by using a relatively small set of TM and non-membrane (NM) proteins. Thus it is useful to evaluate TM protein prediction methods by using a more diverse set of proteins and by testing their performance on specific classes of TM proteins. This work extensively evaluated the capability of support vector machine (SVM) classification systems for the prediction of TM proteins and those of several TM classes. These SVM systems were trained and tested by using 14962 TM and 12168 NM proteins from Pfam protein families. An independent set of 3389 TM and 6063 NM proteins from curated Pfam families were used to further evaluate the performance of these SVM systems. 90.1% and 86.7% of TM and NM proteins were correctly predicted respectively, which are comparable to those from other studies. The prediction accuracies for proteins of specific TM classes are 95.6%, 90.0%, 92.7% and 73.9% for G-protein coupled receptors, envelope proteins, outer membrane proteins, and transporters/channels respectively; and 98.1%, 99.5%, 86.4%, and 98.6% for non-G-protein coupled receptors, non-envelope proteins, non-outer membrane proteins, and non-transporters/non-channels respectively. Tested by using a significantly larger number and more diverse range of proteins than in previous studies, SVM systems appear to be capable of prediction of TM proteins and proteins of specific TM classes at accuracies comparable to those from previous studies. Our SVM systems – SVMProt, can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.

C. Z. Cai, Q. F. Yuan, H. G. Xiao, X. H. Liu, L. Y. Han, Y. Z. Chen
Protein Subcellular Location Prediction Based on Pseudo Amino Acid Composition and Immune Genetic Algorithm

Protein subcellular location prediction with computational method is still a hot spot in bioinformatics. In this paper, we present a new method to predict protein subcellular location, which based on pseudo amino acid composition and immune genetic algorithm. Hydrophobic patterns of amino acid couples and approximate entropy are introduced to construct pseudo amino acid composition. Immune Genetic algorithm (IGA) is applied to find the fittest weight factors for pseudo amino acid composition, which are crucial in this method. As such, high success rates are obtained by both self-consistency test and jackknife test. More than 80% predictive accuracy is achieved in independent dataset test. The result demonstrates that this new method is practical. And, the method illuminates that the hydrophobic patterns of protein sequence influence its subcellular location.

Tongliang Zhang, Yongsheng Ding, Shihuang Shao
SPDBS: An SBML-Based Biochemical Pathway Database System

Biochemical pathways such as metabolic, regulatory, or signal transduction pathways can be viewed as a complicated network of interactions among molecular species in the cell. As the amount of pathway information for various organisms is increasing very rapidly, performing various analyses on the full network of pathways for even multiple organisms can be possible and therefore developing an integrated database for storing and analyzing pathway information is becoming a critical issue. However, analyzing these networks is not easy because of the nature of the existing pathway databases, which are often heterogeneous, incomplete, and/or inconsistent. To solve this problem, SBML(Systems Biology Markup Language) – a computer-readable format for representing various biochemical pathways – has been adopted by the most of the SW packages in systems biology. We developed an SBML-based Biochemical Pathway Database System (SPDBS) that supports (1) efficient integration of the heterogeneous and distributed pathway databases, (2) prediction of the metabolic pathways for a given (non-annotated) genome sequence, (3) dynamic visualization/ simulation of the pathways, (4) starting from the detailed pathway graph, build networks at different levels of representation, (5) imports/exports of SBML documents for the simulation and/or exchange of the biochemical pathways in various applications. To evaluate the system, we applied the system to the construction of pathways from its genome sequences. For the E. coli genome sequence, SPDBS estimates the same metabolic pathways as the original well-known E. coli pathway. We are applying our system to the pathway prediction of S. chungbukensis DJ77 and Vibrio vulnificus CMCP6.

Tae-Sung Jung, Kyoung-Ran Kim, Seung-Hyun Jung, Tae-Kyung Kim, Myung-Sang Ahn, Jong-Hak Lee, Wan-Sup Cho
Supervised Inference of Gene Regulatory Networks by Linear Programming

The development of algorithms for reverse-engineering gene regulatory networks is boosted by microarray technologies, which enable the simultaneous measurement of all RNA transcripts in a cell. Meanwhile the curated repository of regulatory associations between transcription factors (TF) and target genes is available based on bibliographic references. In this paper we propose a novel method to combine time-course microarray dataset and documented or potential known transcription regulators for inferring gene regulatory networks. The gene network reconstruction algorithm is based on linear programming and performed in the supervised learning framework. We have tested the new method using both simulated data and experimental data. The result demonstrates the effectiveness of our method which significantly alleviates the problem of data scarcity and remarkably improves the reliability.

Yong Wang, Trupti Joshi, Dong Xu, Xiang-Sun Zhang, Luonan Chen

Intelligent Computing in Computational Biology and Drug Design

Double Optimization for Design of Protein Energy Function

We propose an automated protocol for designing the energy landscape suitable for the description of a given set of protein sequences with known structures, by optimizing the parameters of the energy function. The parameters are optimized so that not only the global minimum-energy conformation becomes native-like, but also the conformations distinct from the native structure have higher energies than those close to the native one, for each protein sequence in the set. In order to achieve this goal, one has to sample protein conformations that are local minima of the energy function for given parameters. Then the parameters are optimized using linear approximation, and then local minimum conformations are searched with the new energy parameters. We develop an algorithm that repeats this process of parameter optimization based on linear approximation, and conformational optimization for the current parameters, ultimately leading to the optimization of the energy parameters. We test the feasibility of this algorithm by optimizing a coarse grained energy function, called the UNRES energy function, for a set of ten proteins.

Seung-Yeon Kim, Julian Lee
Efficient Solution of Bidomain Equations in Simulation of Cardiac Excitation Anisotropic Propagation

Bidomain equations are used to characterize myocardial physiological activation and propagation. Their numerical solution is costly computation because of the higher temporal and spatial discretization requirements, especially in three dimensions. In most previous studies, the heart was supposed to be homogeneous isotropic medium, and thus can use the mondomain equation in stead of the bidomain equations to simulate the heart excitation propagation. Simulation of heart excitation anisotropic propagation in three-dimensional large tissue size by solving bidomain equations has not been implemented yet. In this paper, we present an efficient solution of bidomain equations in simulation of heart excitation anisotropic propagation by combining some numerical techniques such as non-standard finite difference (NSFD), domain decomposition and multigrid methods. The results show that the proposed method can successfully be used to simulate heart excitation anisotropic propagation in three-dimensional large tissue size, and it suggests that such method may provide a good basis for heart simulation research in a more physiologically way.

Yu Zhang, Ling Xia, Guanghuan Hou
Analysis of Numerical Solutions to Stochastic Age-Dependent Population Equations

In this paper, stochastic age-dependent population equations, one of the important classes of hybrid systems, are studied. In general, most of stochastic age-dependent population equations do not have explicit solutions. Thus numerical approximation schemes are invaluable tools for exploring their properties. The main purpose of this paper is to develop a numerical scheme and show the convergence of the numerical approximation solution to the true solution.

Qimin Zhang, Xining Li

Computational Genomics and Proteomics

Classifying G-Protein Coupled Receptors with Hydropathy Blocks and Support Vector Machines

This paper developes a new method for recognizing G-Protein Coupled Receptors (GPCRs) based on features generated from the hydropathy properties of the amino acid sequences. Using the hydropathy characteristics, namely hydropathy blocks, the protein sequences are converted into fixed-dimensional feature vectors. Subsequently, the Support Vector Machine (SVM) classifier is utilized to identify the GPCR proteins belonging to the same families or subfamilies. The experimental results on GPCR datasets show that the proteins belonging to the same family or subfamily can be identified using features generated based on the hydropathy blocks.

Xing-Ming Zhao, De-Shuang Huang, Shiwu Zhang, Yiu-ming Cheung
HSPPIP: An Online Tool for Prediction of Protein–Protein Interactions in Humans

Recently, protein-protein interaction prediction (PPIP) has been emerging as an appealing question. Although several in silico approaches have been developed to delineate the potential protein-protein interaction (PPI), there are few online tools of human PPIP for further experimental design. Here we present an online service, hsPPIP (Protein-Protein Interaction Predicting of Homo Sapiens), to predict or evaluate the potential PPIs in human. The annotations of functional domain (Interpro) and GO (Gene Ontology) for proteins are employed as two informative features, and are integrated by the naïve Bayesian approach. The prediction accuracy is comparable to the existing tools. Based on the hypothesis that the features correlated with PPIs are conserved in different organisms, the web server hsPPIP is established and could predict the PPIs of human dynamically. hsPPIP is implemented in PHP+MySQL and can be freely accessed at: http://973-proteinweb.ustc.edu.cn/ hsppip/.

Yu Xue, Changjiang Jin, Xuebiao Yao
Prediction of Ribosomal -1 Frameshifts in the Escherichia coli K12 Genome

Ribosomal frameshifting at a particular site can yield two protein products from one coding sequence or one protein product from two overlapping open reading frames. Many organisms are known to utilize ribosomal frameshifting to express a minority of genes. However, finding ribosomal frameshift sites by a computational method is difficult because frameshift signals are diverse and dependent on the organisms and environments. There are few computer programs available for public use to identify frameshift sites from genomic sequences. We have developed a web-based application program called FSFinder2 for predicting frameshift sites of general type. We tested FSFinder2 on the

Escherichia coli

K12 genome to detect potential -1 frameshifting genes. From the genome sequence, we identified 18,401 frameshift sites following the X XXY YYZ motif. 11,530 frameshift sites out of the 18,401 sites include secondary structures. Comparison with the GenBank annotation produced 11 potential frameshift sites, including 3 known frameshift sites. The program is useful for analyzing frameshifts of various types and for discovering new genes expressed by frameshifts.

Sanghoon Moon, Yanga Byun, Kyungsook Han
Using a Stochastic AdaBoost Algorithm to Discover Interactome Motif Pairs from Sequences

Protein interactome is an important research focus in the post-genomic era. The identification of interacting motif pairs is essential for exploring the mechanism of protein interactions. We describe a stochastic AdaBoost approach for discovering motif pairs from known interactions and pairs of proteins that are putatively not to interact. Our interacting motif pairs are validated by multiple-chain PDB structures and show more significant than those selected by traditional statistical method. Furthermore, in a cross-validated comparison, our model can be used to predict interactions between proteins with higher sensitivity (66.42%) and specificity (87.38%) comparing with the Naive Bayes model and the dominating model.

Huan Yu, Minping Qian, Minghua Deng
Web Service for Predicting Interacting Proteins and Application to Human and HIV-1 Proteins

Finding all the proteins that potentially interact with a query protein is useful when studying a biological mechanism, but it is much more difficult than finding interactions between a set of given proteins. This is because it involves intensive computation to search the relevant data in databases, and different databases have different accession numbers and names for the same protein. Recently a few servers have been developed for finding interactions between user-specified proteins, but none of these can find all the potentially interacting proteins for a given protein, including the former version of our prediction server, HPID version 1.0. This paper describes a new online prediction system, HPID 2.0 (http://www.hpid.org), for finding partners interacting with a query protein as well as for finding interactions between query proteins. We applied the new system to predicting the interactions between the entire human proteins, and to identifying human proteins interacting with HIV-1 proteins (http://hiv1.hpid.org). We believe that this is the first online server for predicting proteins interacting with a given protein, and that it will be a useful resource for studying protein-protein interactions.

Byungkyu Park, Kyungsook Han

Artificial Life and Artificial Immune Systems in Intelligent Computing

An Immunity-Based Dynamic Multilayer Intrusion Detection System

A real computer network produces new network traffic continuously in real time, thus the normal behaviors of network traffic are different in different time, but the self set of current network detection systems based on immunity are static. If the network environments change, the false negative rates and false positive rates will increase rapidly. So the traditional method can not adapt to changing network environments. In order to get over the limitation of the traditional means, an immunity-based dynamic intrusion detection system is proposed in this paper. In the proposed model, a dynamic renewal process of self set is described in detail. In addition, we establish a new set to improve the detection precision and shorten the training phase by adding the characters of the current known attacks to memory cell set. The experimental results show that the new model not only reduces the false negative rates and false positive rates effectively but also has the feature to adapt to continuous changing network environments.

Gang Liang, Tao Li, Jiancheng Ni, Yaping Jiang, Jin Yang, Xun Gong
Immunity and Mobile Agent Based Grid Intrusion Detection

This paper analyzes the unique characteristics of a grid environment and the deficiencies of current grid intrusion detection systems, and proposes a novel immunity and mobile agent based grid intrusion detection (

IMGID

) model. Then, the concepts and formal definitions of

self

,

nonself

,

antibody

,

antigen

and

agent

in the grid security domain are given. Furthermore, the mathematical models of

self

,

mature MoA

(mature monitoring agent) and

dynamic memory MoA

(memory monitoring agent)

survival

are established. Our theoretical analysis and experimental results show that the model is a good solution to grid intrusion detection.

Xun Gong, Tao Li, Gang Liang, Tiefang Wang, Jin Yang, Xiaoqin Hu
Immune-Based Peer-to-Peer Model for Anti-spam

Spam (or junk email) has been a major problem on the Internet. A lot of solutions have been proposed to deal with it. However, with the evolvement of spammers’ techniques and the diversification of email content, the traditional anti-spam approaches alone are no longer efficient. In this paper, a new anti-spam Peer-to-Peer (P2P) model based on immunity was presented.

Self

,

Nonself

, Antibody, Antigen and immune cells in email system were defined. The model architecture, the process of Antigen presenting, clone selection and mutation, immune tolerance, immune response, life cycle of immune cells and some other immune principles were described respectively. The analyses of theory and experiment results demonstrate that this model enjoys better adaptability and provides a new attractive solution to cope with junk emails in P2P environment.

Feng Wang, Zhisheng You, Lichun Man
NASC: A Novel Approach for Spam Classification

The technology of spam filters has recently received considerable attention as a powerful approach to Internet security management. The traditional spam filters almost adopted static measure, and filters need be updated and maintained frequently, so they can not adapt to dynamic spam. In order to get over the limitation of the traditional means, an immunity-based spam classification was proposed in this paper. A brief review about traditional technology of spam filter is given first, particularly, the Bayes probability model. Then the NASC is described in detail by introducing self, non-self, detector and detect algorithm. Finally, a simulation experiment was performed, and the important parameters of the model were analyzed. The experimental result shows that the new model increases the recall of filter greatly in condition that precision also increasing, and demonstrate that the model has the features of self-learning and self-adaptation.

Gang Liang, Tao Li, Xun Gong, Yaping Jiang, Jin Yang, Jiancheng Ni
Prediction Algorithms in Large Scale VOD Network Collaborations

VOD (Video on Demand) is one of significant services for next generation networks. VOD networks mean local networks to provide VOD services to communities in a scale about 500 to 1000 users simultaneously. Many VOD networks could cooperate through external networks link internet. The cooperation requires not only huge local bandwidth but great backbone bandwidth. This paper presents prediction algorithms to reduce the cost in external communications which is very important for VOD operators. Basic algorithms can reduce overall costs about 30% and well trained ANN can provide extra 10% performance.

Bo Li, Hualin Wan, Depei Qian

Special Session on Bio-oriented and Bio-inspired Information Systems

A Computational Korean Lexical Access Model Using Artificial Neural Network

In this paper, we propose a computational Korean lexical access model based on connectionist approach. The model is designed to simulate the behaviors observed in human lexical decision task. The proposed model adopts a simple recurrent neural network architecture which takes a Korean string of 2-syllable length as an input and makes an output as a semantic vector representing semantic of the input. As experimental results, the model shows similar behaviors of human lexical decision task such as frequency effect, lexical status effect, word similarity effect, semantic priming effect, and visual degradation effect.

Heui Seok Lim, Kichun Nam, Kinam Park, Sungho Cho
An Artificial Retina Chip Using Switch-Selective Resistive Network for Intelligent Sensor Systems

We designed and fabricated a bio-inspired CMOS vision chip for edge detection using a switch-selective resistive network. A CMOS buffer circuit, which is embodied in the structure of a common drain amplifier, is commonly used for both raw and smoothed images by using additional switches. By using the switch-selective resistive network, the total number of MOSFETs in the unit pixel and fixed-pattern noise (FPN) was reduced. In addition, by applying the saturating resistive network, the chip outputs a uniform edge signal under various lighting conditions. From the experimental results of a fabricated one-dimensional array with 20 pixels, we could confirm the operation of the vision chip.

Jae-Sung Kong, Sang-Heon Kim, Jang-Kyoo Shin, Minho Lee
An Efficient Feedback Canceler for Hearing Aids Based on Approximated Affine Projection

In the feedback cancelation systems in hearing aids, signal cancelation and coloration artifacts can occur for a narrow-band input. In this paper, we propose a new adaptive feedback cancelation algorithm that can achieve fast convergence by approximating the affine projection (AP) algorithm and prevent signal cancelation by controlling the step-size. A Gram-Schmidt prediction error filter (GS-PEF) is used for a stable approximation of the AP algorithm, and the step-size is varied using the prediction factor of the input signal. Simulations results are presented to verify efficiency of the proposed algorithm.

Sangmin Lee, Inyoung Kim, Youngcheol Park
Robust Real-Time Face Detection Using Hybrid Neural Networks

In this paper, a multi-stage face detection method using hybrid neural networks is presented. The method consists of three stages: preprocessing, feature extraction and pattern classification. We introduce an adaptive filtering technique which is based on a skin-color analysis using fuzzy min-max(FMM) neural networks. A modified convolutional neural network(CNN) is used to extract translation invariant feature maps for face detection. We present an extended version of fuzzy min-max (FMM) neural network which can be used not only for feature analysis but also for pattern classification. Two kinds of relevance factors between features and pattern classes are defined to analyze the saliency of features. These measures can be utilized to select more relevant features for the skin-color filtering process as well as the face detection process.

Ho-Joon Kim, Juho Lee, Hyun-Seung Yang
The Novel Feature Selection Method Based on Emotion Recognition System

This paper presents an original feature selection method for Emotion Recognition which includes many original elements. Feature selection has some merit regarding pattern recognition performance. Thus, we developed a method called an ‘Interactive Feature Selection’ and the results (selected features) of the IFS were applied to an emotion recognition system (ERS), which was also implemented in this research. Our innovative feature selection method was based on a Reinforcement Learning Algorithm and since it required responses from human users, it was denoted an ‘Interactive Feature Selection’. By performing an IFS, we were able to obtain three top features and apply them to the ERS. Comparing those results from a random selection and Sequential Forward Selection(SFS) and Genetic Algorithm Feature Selection(GAFS), we verified that the top three features were better than the randomly selected feature set.

Chang-Hyun Park, Kwee-Bo Sim
Unsupervised Feature Extraction for the Representation and Recognition of Lip Motion Video

The lip-reading recognition is reported with lip-motion features extracted from multiple video frames by three unsupervised learning algorithms, i.e., Principle Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since the human perception of facial motion goes through two different pathways, i.e., the lateral fusifom gyrus for the invariant aspects and the superior temporal sulcus for the changeable aspects of faces, we extracted the dynamic video features from multiple consecutive frames for the latter. The multiple-frame features require less number of coefficients for the same frame length than the single-frame static features. The ICA-based features are most sparse, while the corresponding coefficients for the video representation are the least sparse. PCA-based features have the opposite characteristics, while the characteristics of the NMF-based features are in the middle. Also the ICA-based features result in much better recognition performance than the others.

Michelle Jeungeun Lee, Kyungsuk David Lee, Soo-Young Lee

Special Session on Novel Applications of Knowledge Discovery on Bioinformatics

A Novel Method for Expanding Current Annotations in Gene Ontology

Since the gap between the amount of protein sequence data and the reliable function annotations in public databases is growing, characterizing protein functions becomes a major task in the post genomic era. Some current ways to predict functions of a protein are based on the relationships between the protein and other proteins in databases. As a large fraction of annotated proteins are not fully characterized, annotating novel proteins is limited. Therefore, it is of high demand to develop efficient computation methods to push the current broad function annotations of the partially known proteins toward more detailed and specific knowledge. In this study, we explore the capability of a rule-based method for expanding the current annotations per some function categorization system such as Gene Ontology. Applications of the proposed method to predict human and yeast protein functions demonstrate its efficiency in expanding the knowledge space of the partially known proteins.

Dapeng Hao, Xia Li, Lei Du, Liangde Xu, Jiankai Xu, Shaoqi Rao
Identifying the Modular Structures in Protein Interaction Networks

Identifying the modular structures in Protein Interaction Networks (PINs) is crucial to the understanding of the organization and function of biological systems. We propose a new module definition taking into account the degree of a subgraph instead of a vertice, and design a corresponding agglomerative algorithm, which is implemented into a computational tool called ModuleSpider, to recognize modules within a network. The application of ModuleSpider to the yeast core protein interaction network identifies 97 simple modules which are biologically meaningful according to the GO annotation of proteins in the modules. In addition, the results of comparison analysis demonstrate that ModuleSpider modules show stronger biological relevance. Further, the ModuleSpider is able to construct the interaction web of modules, which provides insights to the high level relationships of different functional modules. ModuleSpider is freely available upon request to the authors.

Yanen Li, Feng Lu, Yanhong Zhou
An Analysis of Gene Expression Relationships Between Periodically Expressed Genes in the Hela Cells

The availability of increasingly accumulated time-series expression profiles has provided the data structures to study the temporal relationship between genes. The aims of the study were to explore the modes of relationship between the genes periodically expressed in human HeLa cell cycle, and to reversely engineer the dynamic networks of gene transcription. We also studied the phase-specific properties of the genetic relationships by decomposing the whole network into the sub-networks according to the cell cycling phases. The results demonstrated that the gene-gene relationships within a same phase or between different phases followed different modes.

Yun Xiao, Xia Li, Shaoqi Rao, Juan Wang, Yun Zhang, Lei Du
Analysis of Sib-Pair IBD Profiles Using Ensemble Decision Tree Approach: Application to Alcoholism

There is currently a great interest in using single-nucleotide polymorphisms (SNPs) in genetic linkage and association studies because of the abundance of SNPs as well as the availability of high-throughput genotyping technologies. Here, we apply a novel ensemble decision approach to extract the relevant SNPs for alcoholism using sib-pair IBD profiles of pedigrees. The results indicate that ensemble decision tree is a promising algorithm for mining genetic markers for complex genetic diseases.

Yang Jiang, Qingpu Zhang, Xia Li, Lei Du, Wei Jiang, Ruijie Zhang, Jing Li, Shaoqi Rao
Association Research on Potassium Channel Subtypes and Functional Sites

Potassium channels, a goup of membrane proteins and found in virtually all cells, are of crucial physiological importance for understanding cell biology and channelopathy. Studying the association of channel subtypes with their functional sites, and mining the functional sites mostly relevant to channel subtypes, can not only enhance molecular classification of potassium channels but only provide some clues for the functional targets. Here, we proposed to use functional sites profiles as features, where the functional sites are often used to characterize ion channels in biomedical research domains. We employed a novel integrated decision method to recognize the functional sites subset that is mostly relevant to the differentiation of potassium channel subtypes.

Peng Wu, Xia Li, Shaoqi Rao, Wei Jiang, Chuanxing Li, Jie Zhang
Nonequilibrium Model for Yeast Cell Cycle

In the living cells, molecules including proteins, DNAs, RNAs and so on, with interactions between them cooperate as networks that govern various cellular functions. In this paper, a stochastic model with trigger mechanism is proposed based on what are known about the genes and proteins controlling the cell cycle of budding yeast. With respect to the biological observations, it looks more natural and understandable than deterministic dynamical model and our former stochastic model. Our model vividly describes that the protein interaction network goes through the biological pathway and forms an endless loop.

Yuping Zhang, Huan Yu, Minghua Deng, Minping Qian
Tissue Classification Using Gene Expression Data and Artificial Neural Network Ensembles

An important challenge in the use of large-scale gene expression data for biological classification occurs when the number of genes far exceeds the number of samples. This situation will make the classification results are unstable. Thus, a tissue classification method using artificial neural network ensembles was proposed. In this method, a feature preselection method is presented to identify significant genes highly correlated with tissue types. Then pseudo data sets for training the component neural network of ensembles were generated by bagging. The predictions of those individual networks were combined by simple averaging method. Some data experiments have shown that this classification method yields competitive results on several publicly available datasets.

Huijuan Lu, Jinxiang Zhang, Lei Zhang
Backmatter
Metadaten
Titel
Computational Intelligence and Bioinformatics
herausgegeben von
De-Shuang Huang
Kang Li
George William Irwin
Copyright-Jahr
2006
Verlag
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-37282-0
Print ISBN
978-3-540-37277-6
DOI
https://doi.org/10.1007/11816102

Premium Partner