
2004 | Book

Biological and Medical Data Analysis

5th International Symposium, ISBMDA 2004, Barcelona, Spain, November 18-19, 2004. Proceedings

Edited by: José María Barreiro, Fernando Martín-Sánchez, Víctor Maojo, Ferran Sanz

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


Table of Contents

Frontmatter

Data Analysis for Image Processing

RF Inhomogeneity Correction Algorithm in Magnetic Resonance Imaging

MR images usually present grey-level inhomogeneities, which are a significant problem. Eliminating these inhomogeneities is not easy and has been studied and discussed in several previous publications. Most of those approaches are based on segmentation processes. The algorithm presented in this paper has the advantage that it does not involve any segmentation step. Instead, an interpolating polynomial model based on a Gabor transform is used to construct a filter that corrects these inhomogeneities. The results obtained are very good and show that grey-level inhomogeneities can be corrected without segmentation.

Juan A. Hernández, Martha L. Mora, Emanuele Schiavi, Pablo Toharia
Fully 3D Wavelets MRI Compression

This paper deals with the implementation of 3D wavelet techniques for the compression of medical images, specifically in the MRI modality. We show that, at the same compression rate, a lower loss of information can be obtained by using fully 3D wavelets as compared to iterated 2D methods. An adaptive thresholding step is performed in each sub-band, and a simple Run-Length Encoding (RLE) method is finally used for compression. A reconstruction algorithm is then implemented and a comparison study of the reconstructed images is proposed, showing the high performance of the proposed algorithm.

Emanuele Schiavi, C. Hernández, Juan A. Hernández
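
The abstract outlines a three-stage pipeline: fully 3D wavelet decomposition, per-sub-band thresholding, and run-length encoding. The sketch below illustrates that pipeline with PyWavelets; the wavelet family, decomposition level, and quantile-based threshold rule are illustrative assumptions, not the paper's settings.

```python
# Sketch of a 3D-wavelet compression pipeline (assumed parameters throughout).
import numpy as np
import pywt

def compress_3d(volume, wavelet="db2", level=2, keep_fraction=0.05):
    # Fully 3D wavelet decomposition of the MRI volume.
    coeffs = pywt.wavedecn(volume, wavelet, level=level)
    # Adaptive threshold per sub-band: keep only the largest coefficients.
    thresholded = [coeffs[0]]
    for band in coeffs[1:]:
        new_band = {}
        for key, arr in band.items():
            t = np.quantile(np.abs(arr), 1.0 - keep_fraction)
            new_band[key] = np.where(np.abs(arr) >= t, arr, 0.0)
        thresholded.append(new_band)
    return thresholded

def run_length_encode(flat):
    # Simple RLE over the zero runs of a flattened coefficient array.
    runs, i = [], 0
    while i < len(flat):
        if flat[i] == 0.0:
            j = i
            while j < len(flat) and flat[j] == 0.0:
                j += 1
            runs.append(("Z", j - i))    # a run of zeros
            i = j
        else:
            runs.append(("V", flat[i]))  # a literal value
            i += 1
    return runs

volume = np.random.rand(32, 32, 32)          # stand-in for an MRI volume
coeffs = compress_3d(volume)
detail = next(iter(coeffs[1].values()))      # one thresholded detail sub-band
rle = run_length_encode(detail.ravel())
reconstructed = pywt.waverecn(coeffs, "db2")
```
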
A New Approach to Automatic Segmentation of Bone in Medical Magnetic Resonance Imaging

This paper presents the modelling and segmentation, with correction of inhomogeneity, of magnetic resonance images of the shoulder. For that purpose, a new heuristic is proposed using a morphological method and a pyramidal Gaussian decomposition (Discrete Gabor Transform). After the application of these filters, automatic segmentation of a bone is possible, in contrast to the semiautomatic methods present in the literature.

Gabriela Pérez, Raquel Montes Diez, Juan A. Hernández, José San Martín
An Accurate and Parallelizable Geometric Projector/Backprojector for 3D PET Image Reconstruction

C code for an accurate projector/backprojector for Positron Emission Tomography (PET) image reconstruction has been developed. Iterative PET image reconstruction methods rely on a projection/backprojection step built into any such algorithm. It is not surprising that a more precise model of the forward measurement process will yield better results when solving the inverse problem of image reconstruction. Among the factors that can be included in this forward model are γ-ray scatter contributions, attenuation, positron range, photon non-collinearity, and crystal characteristics. Currently, we only include the geometric tube of response (TOR) modeling for a generic multi-ring scanner device. The elements of the transition matrix are calculated by a high-statistics Monte Carlo simulation, taking advantage of the inherent symmetries; the nonzero elements are then stored in an incremental run-length-encoded fashion. The resulting projector/backprojector is a voxel-driven implementation. We show some preliminary results for 3D ML-EM reconstructions on synthetic phantom simulated data.

Roberto de la Prieta
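
The ML-EM reconstruction the abstract mentions reduces, once the system matrix is available, to a simple multiplicative update. A minimal sketch follows, assuming the geometric TOR weights have already been computed and stored as a SciPy sparse matrix; the Monte Carlo matrix generation itself is omitted, and the toy matrix below merely stands in for the real scanner geometry.

```python
# Sketch of the ML-EM update x <- x * A^T(y / Ax) / A^T 1 with a sparse A.
import numpy as np
from scipy import sparse

def ml_em(A, y, n_iter=50, eps=1e-12):
    """A: (n_tors, n_voxels) sparse system matrix; y: measured counts per TOR."""
    x = np.ones(A.shape[1])                          # uniform initial image
    sens = np.asarray(A.sum(axis=0)).ravel() + eps   # sensitivity image A^T 1
    for _ in range(n_iter):
        proj = A @ x                    # forward projection
        ratio = y / (proj + eps)        # measured / estimated counts
        x *= (A.T @ ratio) / sens       # backprojection, multiplicative update
    return x

# Toy usage with a random sparse matrix standing in for the real geometry.
A = sparse.random(200, 64, density=0.1, format="csr")
truth = np.random.rand(64)
y = np.random.poisson(A @ truth * 100)
estimate = ml_em(A, y.astype(float))
```
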

Data Visualization

EEG Data and Data Analysis Visualization

Healthcare technology today produces large sets of data every second. These enormous data volumes, not manageable by physicians, e.g. in intensive care, result in information overload. Data visualization tools aim at reducing this overload by intelligent abstraction and visualization of the features of interest in the current situation. Newly developed software tools for visualization should support fast comprehension of complex, large, and dynamically growing datasets in all fields of medicine. One such field is the analysis and evaluation of long-term EEG recordings. One problem connected with the evaluation of EEG signals is that it requires visual checking of the recording by a physician. When the physician has to check and evaluate long-term EEG recordings, computer-aided data analysis and visualization can be of great help. Software tools for the visualization of EEG data and data analysis are presented in the paper.

Josef Rieger, Karel Kosar, Lenka Lhotska, Vladimir Krajca
A Web Information System for Medical Image Management

In recent years, the use of digital images for medical diagnosis and research has increased considerably. For this reason, it is necessary to develop new and better applications for effectively managing large amounts of medical information. DICOM is the standard for Digital Imaging and COmmunications in Medicine. However, DICOM information is difficult to interchange and integrate outside the scope of specialized medical equipment. This drawback makes its use and integration difficult in a wider context such as the Web. XML is the standard for information exchange and data transport between multiple applications. As XML databases are emerging as the best alternative for storing and managing XML documents, in this work we present a Web Information System to store, in an integrated way, DICOM and Analyze 7.5 files in an XML database. For its development, the XML schemas for both the DICOM and Analyze 7.5 formats have been obtained and the architecture for the integration of XML documents in the XML DB has been defined.

César J. Acuña, Esperanza Marcos, Valeria de Castro, Juan A. Hernández
Reliable Space Leaping Using Distance Template

Several optimization techniques for direct volume rendering have been proposed because its rendering speed is too slow. An acceleration method using a min-max map requires little preprocessing and few additional data structures while preserving image quality. However, to skip over empty space we have to calculate the accurate distance from the current point to the boundary of a min-max block, and evaluating this distance is expensive. In this paper, we propose a reliable space-leaping method that jumps to the boundary of the current block using a pre-calculated distance template. The template can be reused for the entire volume since it is independent of the viewing conditions. Our algorithm reduces rendering time in comparison to conventional min-max-map-based volume ray casting.
Paper domain: Biological and medical data visualization

Sukhyun Lim, Byeong-Seok Shin

Decision Support Systems

A Rule-Based Knowledge System for Diagnosis of Mental Retardation

We present in this paper a Rule-Based Knowledge System that both verifies the consistency of the knowledge explicitly provided by experts in mental retardation and automatically extracts consequences (in this case: diagnoses) from that knowledge. Expert knowledge is translated into symbolic expressions, which are written in CoCoA (a Computer Algebra language). The program, using a theory developed by this research team, outputs diagnoses from the different inputs that describe each patient.

R. Sánchez-Morgado, Luis M. Laita, Eugenio Roanes-Lozano, Luis de Ledesma, L. Laita
Case-Based Diagnosis of Dysmorphic Syndromes

Since the diagnosis of dysmorphic syndromes is a domain with incomplete knowledge, where even experts have seen only a few syndromes themselves during their lifetime, documentation of cases and the use of case-oriented techniques are popular. In dysmorphic systems, diagnosis is usually performed as a classification task, where a prototypicality measure is applied to determine the most probable syndrome. Our approach additionally applies adaptation rules. These rules consider not only single symptoms but also combinations of them, which indicate high or low probabilities of specific syndromes.
Paper domain: Decision support systems

Tina Waligora, Rainer Schmidt
Bayesian Prediction of Down Syndrome Based on Maternal Age and Four Serum Markers

Screening tests have been designed to identify women at increased risk of having a Down syndrome pregnancy. These tests carry no risk of miscarriage, but they are not able to determine with certainty whether a fetus is affected. Diagnostic tests such as amniocentesis, on the other hand, are extremely accurate at identifying abnormalities in the fetus, but carry some risk of miscarriage, making it inappropriate to examine every pregnancy in this way. Muller et al. (1999) compare six software packages that calculate DS risk, concluding that substantial variations are observed among them. In this paper, we provide a Bayesian reanalysis of the current quadruple screening test, based on maternal age and four serum markers (AFP, uE3, hCG and DIA), which suggests the need to reevaluate current recommendations more carefully.

Raquel Montes Diez, Juan M. Marin, David Rios Insua
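
The standard calculation that such screening software performs updates an age-based prior risk with a likelihood ratio for the four markers. The sketch below shows that Bayesian update in log-MoM space; all distribution parameters are illustrative placeholders, not the published values and not the paper's reanalysis.

```python
# Sketch of quadruple-test risk calculation via Bayes' rule (toy parameters).
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical log10-MoM means and shared covariance for (AFP, uE3, hCG, DIA).
mu_affected   = np.array([-0.15, -0.12, 0.25, 0.20])
mu_unaffected = np.zeros(4)
cov = np.diag([0.05, 0.04, 0.07, 0.06])

def posterior_risk(prior_risk, log_mom):
    """Update an age-based prior risk with the serum-marker likelihood ratio."""
    lr = (multivariate_normal.pdf(log_mom, mu_affected, cov)
          / multivariate_normal.pdf(log_mom, mu_unaffected, cov))
    prior_odds = prior_risk / (1.0 - prior_risk)
    post_odds = prior_odds * lr
    return post_odds / (1.0 + post_odds)

# E.g. a prior risk of ~1/270 combined with mildly abnormal marker levels:
print(posterior_risk(1 / 270, np.log10([0.8, 0.85, 1.6, 1.5])))
```
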
SOC: A Distributed Decision Support Architecture for Clinical Diagnosis

In this paper we introduce SOC (Sistema de Orientación Clínica, Clinic Orientation System), a novel distributed decision support system for clinical diagnosis. The decision support systems are based on pattern recognition engines which solve different, specific classification problems. SOC is based on a distributed architecture with three specialized nodes: 1) the Information System, where the remote data is stored; 2) the Decision Support Web services, which contain the developed pattern recognition engines; and 3) the Visual Interface, the clinicians' point of access to local and remote data, statistical analysis tools and distributed information. A location-independent and multi-platform system has been developed to bring together hospitals and institutions to research useful tools in clinical and laboratory environments. Node maintenance and upgrades are automatically controlled by the architecture. Two examples of the application of SOC are presented. The first example is Soft Tissue Tumor (STT) diagnosis, where the pattern recognition engines classify between benign/malignant character and histological groups with good estimated efficiency. In the second example we present clinical support for Microcytic Anemia (MA) diagnosis; here the decision support systems are likewise based on pattern recognition engines, classifying between normal, ferropenic anemia and thalassemia. This tool will be useful for several purposes: to assist the radiologist/hematologist in deciding a new case, and to help educate new radiologists/hematologists without expertise in STT or MA diagnosis.

Javier Vicente, Juan M. Garcia-Gomez, César Vidal, Luís Marti-Bonmati, Aurora del Arco, Montserrat Robles
Decision Support Server Architecture for Mobile Medical Applications

In this paper, the architecture for mobile medical decision support and its exemplary application to medicine prescribing is presented. The aim of the architecture is to provide the decision algorithms and database support for thin-client applications running on low-cost palmtop computers. We consider the wide class of medical applications where decision making is equivalent to a typical pattern recognition problem. The decision support consists of ordering the decisions by their degree of belief in the context in which the decision is being made, and presenting them in that order to the user. The role of the palmtop computer is to organize the dialog with the user, while the role of the decision support server is to execute the decision algorithms and deliver the results to the mobile application. Providing the ordered decision list to the palmtop application not only aids the user in making the right decision, but also significantly simplifies the user's interaction with the keyboard-less palmtop device. The relation between these two topics is shown and a method of dynamic user interface configuration is proposed.

Marek Kurzynski, Jerzy Sas
Ordered Time-Independent CIG Learning

Clinical practice guidelines are medical and surgical statements that assist practitioners in the therapy procedure. Recently, the concept of the computer-interpretable guideline (CIG) has been introduced to describe formal descriptions of clinical practice guidelines. Ordered time-independent one-visit CIGs are a sort of CIG which is able to cope with the description and use of real therapies. Here, this representation model and a machine learning algorithm to construct such CIGs from hospital databases or from predefined CIGs are introduced and tested in the domain of atrial fibrillation.

David Riaño
SINCO: Intelligent System in Disease Prevention and Control. An Architectural Approach

SINCO is a research effort to develop a software environment that contributes to the prevention and control of infectious diseases. This paper describes the already implemented system architecture, in which four important elements interact: (a) an expert system, (b) a geographical information system, (c) a simulation component, and (d) a training component. This architecture is itself a scalable, interoperable and modular approach. The system is currently being used in several health establishments as part of its validation process.

Carolina González, Juan C. Burguillo, Juan C. Vidal, Martin Llamas
Could a Computer Based System for Evaluating Patients with Suspected Myocardial Infarction Improve Ambulance Allocation?

The very early handling of patients with suspected acute myocardial infarction (AMI) is crucial for the outcome. In Gothenburg, approximately two-thirds of all patients with AMI dial the emergency number for ambulance transport (reference).

Martin Gellerstedt, Angela Bång, Johan Herlitz
On the Robustness of Feature Selection with Absent and Non-observed Features

To improve upon early detection of Classical Swine Fever, we are learning selective Naive Bayesian classifiers from data that were collected during an outbreak of the disease in the Netherlands. The available dataset exhibits a lack of distinction between absence of a clinical symptom and the symptom not having been addressed or observed. Such a lack of distinction is not uncommonly found in biomedical datasets. In this paper, we study the effect that not distinguishing between absent and non-observed features may have on the subset of features that is selected upon learning a selective classifier. We show that while the results from the filter approach to feature selection are quite robust, the results from the wrapper approach are not.

Petra Geenen, Linda C. van der Gaag, Willie Loeffen, Armin Elbers
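
The filter-versus-wrapper contrast the paper studies can be made concrete with scikit-learn. The sketch below is not the authors' protocol (their selective naive Bayes on the swine-fever data is not reproduced): it only shows the two selection styles side by side on synthetic data, so the robustness question becomes "how much does each selected subset change when absent and non-observed entries are recoded?"

```python
# Sketch: filter (mutual information) vs. wrapper (greedy CV search) selection.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import (SelectKBest, SequentialFeatureSelector,
                                       mutual_info_classif)
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

# Filter approach: rank features by mutual information with the class.
flt = SelectKBest(mutual_info_classif, k=5).fit(X, y)

# Wrapper approach: greedily add features that improve the classifier's
# cross-validated accuracy.
wrp = SequentialFeatureSelector(GaussianNB(), n_features_to_select=5,
                                cv=5).fit(X, y)

print("filter selects :", np.flatnonzero(flt.get_support()))
print("wrapper selects:", np.flatnonzero(wrp.get_support()))

# Robustness check (in the spirit of the paper): re-run both selections after
# recoding 'non-observed' entries and compare the selected subsets.
```
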
Design of a Neural Network Model as a Decision Making Aid in Renal Transplant

This paper presents the application of neural networks, a new data processing tool, to the problem that arises when a renal transplant is indicated for a paediatric patient. Its aim is the development and validation of a neural-network-based model which can predict the success of the transplant over the short, medium and long term, using pre-operative characteristics of the patient (recipient) and the implanted organ (donor). When compared to the results of logistic regression, the proposed model showed better performance. Once obtained, the model will be converted into a tool for predicting the efficiency of the transplant protocol in order to optimise the donor-recipient pairing and maximize the success of the transplant. The first real use of this application will be as a decision aid for helping physicians and surgeons when preparing to perform a transplant.

Rafael Magdalena, Antonio J. Serrano, Agustin Serrano, Jorge Muñoz, Joan Vila, E. Soria
Learning the Dose Adjustment for the Oral Anticoagulation Treatment

Several recent attempts have been made to define Oral Anticoagulant (OA) guidelines. These guidelines include indications for oral anticoagulation and suggested arrangements for the management of an oral anticoagulant service. They aim to address the current practical difficulties involved in the safe monitoring of the rapidly expanding number of patients on long-term anticoagulant therapy. Nowadays, a number of computer-based systems exist for supporting hematologists in oral anticoagulation therapy. Such computer-based support improves the quality of the Oral Anticoagulant Therapy (OAT) and possibly also reduces the number of scheduled laboratory controls. In this paper, we discuss an approach based on statistical methods for learning both the optimal dose adjustment for OA and the date of the next laboratory control. This approach has been integrated in DNTAO-SE, an expert system for supporting hematologists in the definition of OAT prescriptions. Besides discussing the approach, we also present experimental results obtained by running DNTAO-SE on a database containing more than 4500 OAT prescriptions, collected from a hematological laboratory over the period December 2003 – February 2004.

Giacomo Gamberoni, Evelina Lamma, Paola Mello, Piercamillo Pavesi, Sergio Storari, Giuseppe Trocino

Information Retrieval

Thermal Medical Image Retrieval by Moment Invariants

Thermal medical imaging provides a valuable method for detecting various diseases such as breast cancer or Raynaud's syndrome. While previous efforts on the automated processing of thermal infrared images were designed for, and hence constrained to, a certain type of disease, we apply the concept of content-based image retrieval (CBIR) as a more generic approach to the problem. CBIR allows the retrieval of similar images based on features extracted directly from image data. Image retrieval for a thermal image that shows symptoms of a certain disease will provide visually similar cases which usually also represent similarities in medical terms. The image features we investigate in this study are a set of combinations of geometric image moments which are invariant to translation, scale, rotation and contrast.

Shao Ying Zhu, Gerald Schaefer
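
Moment invariants of the kind the abstract describes are easy to compute directly. The sketch below derives translation- and scale-invariant normalized moments and the first two Hu invariants (which add rotation invariance); the paper's exact combinations and its contrast normalization are not reproduced here.

```python
# Sketch: central, normalized, and (first two) Hu moments of a grey image.
import numpy as np

def central_moment(img, p, q):
    y, x = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    m00 = img.sum()
    xbar = (x * img).sum() / m00
    ybar = (y * img).sum() / m00
    return ((x - xbar) ** p * (y - ybar) ** q * img).sum()

def normalized_moment(img, p, q):
    # Dividing by mu00^(1+(p+q)/2) adds scale invariance.
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** (1 + (p + q) / 2)

def hu_first_two(img):
    # The first two Hu invariants are additionally rotation invariant.
    n20, n02, n11 = (normalized_moment(img, *pq) for pq in [(2, 0), (0, 2), (1, 1)])
    return n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2

thermal = np.random.rand(64, 64)   # stand-in for a thermal image
print(hu_first_two(thermal))
# Retrieval then ranks database images by distance between feature vectors.
```
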

Knowledge Discovery and Data Mining

Employing Maximum Mutual Information for Bayesian Classification

In order to employ machine learning in realistic clinical settings, we need algorithms which show robust performance and produce results that are intelligible to the physician. In this article, we present a new Bayesian-network learning algorithm which can be deployed as a tool for learning Bayesian networks, aimed at supporting the processes of prognosis or diagnosis. It is based on a maximum (conditional) mutual information criterion. The algorithm is evaluated using a high-quality clinical dataset concerning disorders of the liver and biliary tract, showing a performance which exceeds that of state-of-the-art Bayesian classifiers. Furthermore, the algorithm places fewer restrictions on classifying Bayesian network structures and therefore allows easier clinical interpretation.

Marcel van Gerven, Peter Lucas
Model Selection for Support Vector Classifiers via Genetic Algorithms. An Application to Medical Decision Support

This paper addresses the problem of tuning hyperparameters in support vector machine modeling. A Genetic Algorithm-based wrapper, which seeks to evolve hyperparameter values using an empirical error estimate as a fitness function, is proposed and experimentally evaluated on a medical dataset. Model selection is thus fully automated. Unlike other hyperparameter tuning techniques, genetic algorithms do not require supplementary information, making them well suited for practical purposes. This approach was motivated by an application where the number of parameters to adjust is greater than one. The method produces satisfactory results.

Gilles Cohen, Mélanie Hilario, Antoine Geissbuhler
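
A GA wrapper of this kind is compact to express. The sketch below evolves (log C, log gamma) for an RBF-kernel SVM using cross-validated accuracy as fitness; population size, mutation scale, selection scheme, and the stand-in dataset are illustrative choices, not the paper's.

```python
# Sketch: GA wrapper for SVM hyperparameter search with CV-accuracy fitness.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

def fitness(ind):
    # Individuals encode (log10 C, log10 gamma).
    clf = SVC(C=10.0 ** ind[0], gamma=10.0 ** ind[1])
    return cross_val_score(clf, X, y, cv=3).mean()

pop = rng.uniform(-3, 3, size=(12, 2))              # initial population
for generation in range(15):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-6:]]          # truncation selection
    children = parents[rng.integers(0, 6, size=6)] + rng.normal(0, 0.3, (6, 2))
    pop = np.vstack([parents, children])            # elitism + mutation

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("best log10(C), log10(gamma):", best)
```
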
Selective Classifiers Can Be Too Restrictive: A Case-Study in Oesophageal Cancer

Real-life datasets in biomedicine often include missing values. When learning a Bayesian network classifier from such a dataset, the missing values are typically filled in by means of an imputation method, and the completed dataset is then used for the classifier's construction. When learning a selective classifier, the selection of appropriate features is also based on the completed data. The resulting classifier, however, is likely to be used in the original real-life setting, where it is again confronted with missing values. By means of a real-life dataset in the field of oesophageal cancer that includes a relatively large number of missing values, we argue that especially the wrapper approach to feature selection may result in classifiers that are too selective for such a setting and that, in fact, some redundancy is required to arrive at a reasonable classification accuracy in practice.

Rosa Blanco, Linda C. van der Gaag, Iñaki Inza, Pedro Larrañaga
A Performance Comparative Analysis Between Rule-Induction Algorithms and Clustering-Based Constructive Rule-Induction Algorithms. Application to Rheumatoid Arthritis

We present a performance comparative analysis between traditional rule-induction algorithms and clustering-based constructive rule-induction algorithms. The main idea behind these methods is to find dependency relations among primitive variables and use them to generate new features. These dependencies, corresponding to regions in the space, can be represented as clusters of examples. Unsupervised clustering methods are proposed for searching for these dependencies. As a benchmark, a database of rheumatoid arthritis (RA) patients has been used. A set of clinical prediction rules for prognosis in RA was obtained by applying the most successful methods, selected according to the study outcomes. We suggest that it is possible to relate predictive features and long-term outcomes in RA.

J. A. Sanandrés-Ledesma, Victor Maojo, Jose Crespo, M. García-Remesal, A. Gómez de la Cámara
Domain-Specific Particularities of Data Mining: Lessons Learned

Numerous data mining methods and tools have been developed and applied during the last two decades. Researchers have usually focused on extracting new knowledge from raw data, using a large number of methods and algorithms. In areas such as medicine, few of these DM systems have been widely accepted and adopted. In contrast, DM has achieved considerable success in recent genomic research, contributing to the huge data analysis tasks linked to the human genome project and related research. This paper presents a study of relevant past research in biomedical DM. It is proposed that traditional approaches to medical DM should apply some of the lessons learned over decades of research in disciplines such as epidemiology and medical statistics. In this context, novel methodologies will be needed for data analysis in the areas related to genomic medicine, where genomic and clinical data will be jointly collected and studied. Some ideas are proposed for new research designs, considering the lessons learned during the last decades.

Victor Maojo

Statistical Methods and Tools for Biological and Medical Data Analysis

A Structural Hierarchical Approach to Longitudinal Modeling of Effects of Air Pollution on Health Outcomes

In this paper we present a methodology for the construction and interpretation of models for the study of air pollution effects on health outcomes, and its applications. According to the main assumption of the model, every health outcome is an element of a multivariate hierarchical system and depends on the system's meteorological, pollution, geophysical, socio-cultural and other factors. The model is built on a systems approach using the GEE technique and time-series analysis. It is tested on data collected from lung function measurements of a group of 48 adults with vulnerable respiratory systems in Leipzig, Germany, over the period from October 1990 to April 1991 (a total of 10,080 individual daily records). The meteorological variables comprise temperature and humidity, while the pollution variables are the airborne concentrations of Total Suspended Particulate matter and Sulfur Dioxide. Results of the models, constructed separately for morning, noon, and evening, demonstrate the direct and indirect influence of air pollution on lung function under particular meteorological and individual factors and seasonal changes.

Michael Friger, Arkady Bolotin, Ulrich Ranft
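
A GEE model of the kind described, with repeated lung-function measurements nested within subjects, can be sketched with statsmodels. Variable names, the working correlation structure, and the synthetic data below are illustrative assumptions, not the paper's model specification.

```python
# Sketch: GEE regression of lung function on pollution and weather covariates.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_subj, n_days = 48, 60
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), n_days),
    "tsp": rng.normal(80, 20, n_subj * n_days),    # suspended particulates
    "so2": rng.normal(40, 10, n_subj * n_days),    # sulfur dioxide
    "temp": rng.normal(5, 6, n_subj * n_days),
    "humidity": rng.normal(75, 10, n_subj * n_days),
})
df["lung_function"] = (400 - 0.3 * df["tsp"] - 0.2 * df["so2"]
                       + rng.normal(0, 20, len(df)))

model = sm.GEE.from_formula(
    "lung_function ~ tsp + so2 + temp + humidity",
    groups="subject", data=df,
    family=sm.families.Gaussian(),
    cov_struct=sm.cov_struct.Exchangeable(),  # within-subject correlation
)
print(model.fit().summary())
```
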
Replacing Indicator Variables by Fuzzy Membership Functions in Statistical Regression Models: Examples of Epidemiological Studies

The aim of this paper is to demonstrate the possibility of replacing indicator variables in statistical regression models used in epidemiology with corresponding fuzzy membership functions. Both methodological and practical aspects of such substitution are considered through three examples. The first example considers the connection between women's quality of life and categories of Body Mass Index. In the second example we examine death incidence among Bedouin children of different age categories. The third example considers the factors that can affect high hemoglobin HbA(1c) levels in diabetic patients.

Arkady Bolotin
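
The substitution is simple to illustrate: a crisp 0/1 category indicator becomes a graded membership value, so subjects near a category boundary contribute partially rather than jumping between categories. In the sketch below the BMI cut-offs and the trapezoidal shape are illustrative assumptions, not the paper's functions.

```python
# Sketch: replacing a crisp BMI-category indicator with a fuzzy membership.
import numpy as np

def indicator_overweight(bmi):
    # Classical dummy coding: 1 if 25 <= BMI < 30, else 0.
    return ((bmi >= 25) & (bmi < 30)).astype(float)

def fuzzy_overweight(bmi, soft=1.5):
    # Trapezoidal membership: ramps up over [25-soft, 25+soft] and
    # down over [30-soft, 30+soft] instead of jumping at the cut-offs.
    up = np.clip((bmi - (25 - soft)) / (2 * soft), 0, 1)
    down = np.clip(((30 + soft) - bmi) / (2 * soft), 0, 1)
    return np.minimum(up, down)

bmi = np.array([24.0, 25.2, 27.5, 29.8, 31.0])
print(indicator_overweight(bmi))   # [0. 1. 1. 1. 0.]
print(fuzzy_overweight(bmi))       # graded values near the boundaries
# The membership column is then used in the regression design matrix
# exactly where the 0/1 indicator column would have been.
```
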
PCA Representation of ECG Signal as a Useful Tool for Detection of Premature Ventricular Beats in 3-Channel Holter Recording by Neural Network and Support Vector Machine Classifier

In this paper, a classification method for compressed ECG signals is presented. Classification of single heartbeats is performed by neural networks and a support vector machine. Parameterization of the ECG signal is realized by principal component analysis (PCA); only two descriptors are used for every heartbeat. The results on real Holter signals are presented in tables and as plots in plane spherical coordinates. The classification efficiency is nearly 99%.

Stanisław Jankowski, Jacek J. Dusza, Mariusz Wierzbowski, Artur Oręziak
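
The core of the pipeline, reducing each heartbeat to two PCA descriptors and classifying them, fits in a few lines. The sketch below uses synthetic beat templates as stand-ins for segmented Holter heartbeats; the paper's preprocessing and network architectures are not reproduced.

```python
# Sketch: two PCA descriptors per heartbeat, then SVM classification.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 128)
normal = np.exp(-((t - 0.5) / 0.05) ** 2)        # narrow QRS-like template
pvc = 0.8 * np.exp(-((t - 0.55) / 0.12) ** 2)    # wide, shifted PVC template
beats = np.vstack([normal + rng.normal(0, 0.05, (200, 128)),
                   pvc + rng.normal(0, 0.05, (200, 128))])
labels = np.repeat([0, 1], 200)

# Two principal components per heartbeat, as in the paper.
features = PCA(n_components=2).fit_transform(beats)
Xtr, Xte, ytr, yte = train_test_split(features, labels, random_state=0)
print("test accuracy:", SVC().fit(Xtr, ytr).score(Xte, yte))
```
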
Finding Relations in Medical Diagnoses and Procedures

A graph-based probability model has been defined that represents the relations between primary and secondary diagnoses and procedures in a file of episodes. This model has been used as the basis of an application that permits the graphical visualization of the relations between diagnoses and procedures. The application can be used by physicians to detect diagnosis or procedure structures that are hidden in the data of patients assisted in a health care center, and to perform probabilistic data analysis. The application has been tested with data from the Hospital Joan XXIII in Tarragona (Spain).

David Riaño, Ioannis Aslanidis
An Automatic Filtering Procedure for Processing Biomechanical Kinematic Signals

In biomechanics studies it is necessary to obtain the acceleration of certain parts of the body in order to perform dynamical analysis. Motion capture systems introduce systematic measurement errors that appear in the form of high-frequency noise in the recorded displacement signals. The noise is dramatically amplified when differentiating displacements to obtain velocities and accelerations. To avoid this phenomenon it is necessary to smooth the displacement signal prior to differentiation. The use of Singular Spectrum Analysis (SSA) is presented in this paper as an alternative to traditional digital filtering methods. SSA decomposes the original time series into a number of additive time series, each of which can easily be identified as being part of the modulated signal or part of the random noise. An automatic filtering procedure based on SSA is presented in this work. The procedure is applied to two signals to demonstrate its performance.

Francisco Javier Alonso, José María Del Castillo, Publio Pintado
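
The SSA smoothing step itself is short: embed the signal in a trajectory (Hankel) matrix, keep the leading SVD components as "signal", and reconstruct by anti-diagonal averaging. In this sketch the window length and number of retained components are fixed by hand; the paper's contribution is precisely choosing them automatically.

```python
# Sketch: SSA smoothing of a displacement signal before differentiation.
import numpy as np

def ssa_smooth(x, window=20, n_components=2):
    n = len(x)
    k = n - window + 1
    traj = np.column_stack([x[i:i + window] for i in range(k)])  # Hankel matrix
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    approx = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    # Anti-diagonal (Hankel) averaging back to a 1-D series.
    out = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        out[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return out / counts

t = np.linspace(0, 2, 400)
displacement = np.sin(2 * np.pi * t) + np.random.normal(0, 0.05, t.size)
smooth = ssa_smooth(displacement)
velocity = np.gradient(smooth, t)   # differentiation after smoothing
```
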
Analysis of Cornea Transplant Tissue Rejection Delay in Mice Subjects

In this paper we present a statistical analysis of cornea transplant tissue rejection delay in mice subjects resulting from using one of five types of immunosuppressive agents or placebo. The therapy included FK506 (tacrolimus), MMF (mycophenolate mofetil), AMG (aminoguanidine) and the combinations FK506+AMG and FK506+MMF. Subjects were randomized to one of the post-transplant regimens and were followed for up to two months until either rejection of the transplant tissue or censoring occurred. Due to complexity and personnel limitations, the trial was performed in four stages using groups of either high-risk or regular subjects. We used a covariate-adjusted Gray's time-varying coefficients model for analyzing the time to transplant tissue rejection. On several occasions the type of the outcome (failure or censoring) could not be unambiguously determined. Analyses resulting from the two extreme interpretations of the data are therefore presented, leading to consistent conclusions regarding the efficacy of the treatments.

Zdenek Valenta, P. Svozilkova, M. Filipec, J. Zvarova, H. Farghali
Toward a Model of Clinical Trials

In this paper some results of modelling the clinical trial (CT) process are presented. CT research is a complex process which includes protocol editing, its use and implementation in CT experimentation, and the evaluation of the results. To improve medical research, it is necessary to consider the CT research process as a whole. We structured the CT research process into three subprocesses: a) clinical trial management; b) management of statistical units; and c) the patient health care delivery process. Each process has different objectives and is enacted in different environments, carried out by its own agents and resources, and influenced by specific rules characterising each process. The model is supported by three perspectives on the CT process: functional, structural, and behavioural views.

Laura Collada Ali, Paola Fazi, Daniela Luzi, Fabrizio L. Ricci, Luca Dan Serbanati, Marco Vignetti

Time Series Analysis

Predicting Missing Parts in Time Series Using Uncertainty Theory

As extremely large time series datasets grow more prevalent in a wide variety of applications, including biomedical data analysis, diagnosis and monitoring systems, and exploratory data analysis in scientific and business time series, the need for efficient analysis methods is high. However, essential preprocessing algorithms are required in order to obtain positive results. The goal of this paper is to propose a novel algorithm that is appropriate for filling in missing parts of time series. This algorithm, named FiTS (Filling Time Series), was evaluated over the ECGs (electrocardiograms) of 11 congestive heart failure patients. These patients used electronic microdevices to record their ECGs and send them via telephone to a home care monitoring system over a period of 8 to 16 months. Missing parts were introduced at random into each initial ECG. As a result, FiTS achieved 100% successful completion with high reconstructed-signal accuracy.

Sokratis Konias, Nicos Maglaveras, Ioannis Vlahavas
Classification of Long-Term EEG Recordings

Computer-assisted processing of long-term EEG recordings is gaining growing importance. To simplify the work of the physician, who must visually evaluate long recordings, we present a method for automatic processing of EEG based on a learning classifier. This method supports the automatic search of a long-term EEG recording and the detection of graphoelements – signal parts with characteristic shape and defined diagnostic value. Traditional detection methods show a large error rate caused by the great variability of the non-stationary EEG. The idea of this method is to break down the signal into stationary sections called segments using adaptive segmentation, and to create a set of normalized discriminative features representing the segments. Groups of similar graphoelement patterns form classes used for training the classifier. Weighted features are used for classification, performed by a modified learning classifier, fuzzy k-Nearest Neighbours. The classification results describe the classes of unknown segments. The implementation of this method was experimentally verified on a real EEG with a diagnosis of epilepsy.

Karel Kosar, Lenka Lhotska, Vladimir Krajca
Application of Quantitative Methods of Signal Processing to Automatic Classification of Long-Term EEG Records

The aim of the work described in the paper has been to develop a system for processing long-term EEG recordings, especially comatose-state EEG. With respect to the signal character, however, the developed approach is suitable for the analysis of sleep and newborn EEG too. The EEG signal can be analysed both in the time and frequency domains. In the time domain the basic descriptive quantities are general and central moments of lower orders; in the frequency domain the most frequently used method is the Fourier transform. For segmentation, a combination of non-adaptive and adaptive segmentation has been used. The approach has been tested on a real sleep EEG recording for which the classification was known. The core of the developed system is the training set, on which the quality of classification practically depends. A training set containing 319 segments classified into 10 classes has been used for the classification of a 2-hour sleep EEG recording. For classification, the nearest-neighbour algorithm has been used. In the paper, the issues of developing the training set and the experimental results are discussed.

Josef Rieger, Lenka Lhotska, Vladimir Krajca, Milos Matousek
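
The per-segment descriptors the abstract names, lower-order statistical moments in the time domain and Fourier-based quantities in the frequency domain, can be sketched as follows. The sampling rate and band edges are illustrative assumptions, not the paper's settings.

```python
# Sketch: time-domain moments and spectral band powers for one EEG segment.
import numpy as np
from scipy import signal
from scipy.stats import kurtosis, skew

FS = 250  # sampling frequency in Hz (assumed)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def segment_features(seg):
    feats = {
        "mean": np.mean(seg), "var": np.var(seg),   # general/central moments
        "skew": skew(seg), "kurtosis": kurtosis(seg),
    }
    freqs, psd = signal.welch(seg, fs=FS, nperseg=min(256, len(seg)))
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        feats[f"{name}_power"] = np.trapz(psd[mask], freqs[mask])
    return feats

eeg_segment = np.random.randn(2 * FS)   # stand-in for one stationary segment
print(segment_features(eeg_segment))
# Segments described this way are then matched against the labelled training
# set with a nearest-neighbour rule.
```
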
Semantic Reference Model in Medical Time Series

The analysis of time series databases is very important in the area of medicine. Most of the approaches that address this problem are based on numerical algorithms that calculate distances, clusters, index trees, etc. However, a domain-dependent analysis sometimes needs to be conducted to search for the symbolic rather than numerical characteristics of the time series. This paper focuses on our work on the discovery of reference models in time series of isokinetics data and on a technique that transforms the numerical time series into symbolic series. We briefly describe the algorithm used to create reference models for population groups and its application in the real world. Then, we describe a method for extracting semantic information from a numerical series. This symbolic information helps users to efficiently analyze and compare time series in the same or a similar way as a domain expert would.
Paper domain: Time series analysis

Fernando Alonso, Loïc Martínez, César Montes, Aurora Pérez, Agustín Santamaría, Juan Pedro Valente
Control of Artificial Hand via Recognition of EMG Signals

The paper presents a concept of bioprosthesis control via recognition of the user's intent on the basis of myopotentials acquired from the user's body. The characteristics of EMG signals and the problems of their measurement are discussed. Contextual recognition is considered, and three description methods for such an approach (respecting 1st- and 2nd-order context), using Markov chains, fuzzy rules, and neural networks, as well as the involved decision algorithms, are described. The algorithms have been experimentally tested with respect to decision quality.

Andrzej Wolczowski, Marek Kurzynski

Bioinformatics: Data Management and Analysis in Bioinformatics

SEQPACKER: A Biologist-Friendly User Interface to Manipulate Nucleotide Sequences in Genomic Epidemiology

The aim of this paper is to present a new integrated bioinformatics tool for manipulating nucleotide sequences with a user-friendly graphical interface. This tool is named "SeqPacker" because it uses DNA/RNA sequences. In addition, SeqPacker can be seen as a kind of nucleotide chain editor built on standardized technologies, nucleotide representation standards, and high platform portability, in support of research in Genomic Epidemiology. SeqPacker is written in Java as free, stand-alone software for several computer platforms.

Oscar Coltell, Miguel Arregui, Larry Parnell, Dolores Corella, Ricardo Chalmeta, Jose M. Ordovas
Performing Ontology-Driven Gene Prediction Queries in a Multi-agent Environment

Gene prediction is one of the most challenging problems in Computational Biology. Motivated by the strengths and limitations of the currently available Web-based gene predictors, a Knowledge Base was constructed that conceptualizes the functionalities and requirements of each tool, following an ontology-based approach. Based on this classification, a Multi-Agent System was developed that exploits the potential of the underlying semantic representation in order to provide transparent and efficient query services based on user-implied criteria. Given a query, a broker agent searches for matches in the Knowledge Base and coordinates the submission/retrieval tasks accordingly via a set of wrapper agents. This approach is intended to enable efficient query processing in a resource-sharing environment by embodying a meta-search mechanism that maps queries to the appropriate gene prediction tools and obtains the overall prediction outcome.

Vassilis Koutkias, Andigoni Malousi, Nicos Maglaveras
Protein Folding in 2-Dimensional Lattices with Estimation of Distribution Algorithms

This paper introduces a new type of evolutionary computation algorithm, based on probability distributions, for the solution of two simplified protein folding models. The relationship of the introduced algorithm to previous evolutionary methods used for protein folding is discussed. A number of experiments on difficult instances of the models under analysis are presented. For the instances considered, the algorithm is shown to outperform previous evolutionary optimization methods.

Roberto Santana, Pedro Larrañaga, José A. Lozano
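
Estimation-of-distribution algorithms of this family replace crossover and mutation with re-estimation of a probability model over good solutions. The sketch below shows the simplest variant (UMDA, with independent per-position marginals) on a 2D HP lattice model; the HP sequence, penalties, and EDA parameters are illustrative and do not reproduce the paper's algorithm or instances.

```python
# Sketch: UMDA on the 2-D HP lattice model (folds encoded as N/E/S/W steps).
import numpy as np

HP = "HPHPPHHPHPPHPHHPPHPH"   # a small HP-style sequence (illustrative)
MOVES = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}  # N/E/S/W lattice steps

def energy(moves):
    # Walk the lattice; self-intersecting folds get a large penalty.
    pos, coords = (0, 0), [(0, 0)]
    for m in moves:
        pos = (pos[0] + MOVES[m][0], pos[1] + MOVES[m][1])
        if pos in coords:
            return 50                  # collision penalty
        coords.append(pos)
    # Count non-consecutive H-H lattice contacts (each lowers the energy).
    e = 0
    for i, a in enumerate(coords):
        for j in range(i + 2, len(coords)):
            if HP[i] == HP[j] == "H" and \
               abs(a[0] - coords[j][0]) + abs(a[1] - coords[j][1]) == 1:
                e -= 1
    return e

rng = np.random.default_rng(3)
n_pos, pop_size, n_best = len(HP) - 1, 200, 40
prob = np.full((n_pos, 4), 0.25)       # independent move distributions (UMDA)
for gen in range(60):
    pop = np.array([[rng.choice(4, p=p) for p in prob] for _ in range(pop_size)])
    energies = np.array([energy(ind) for ind in pop])
    best = pop[np.argsort(energies)[:n_best]]
    for k in range(n_pos):             # re-estimate each marginal, with smoothing
        counts = np.bincount(best[:, k], minlength=4) + 1.0
        prob[k] = counts / counts.sum()
print("best energy found:", energies.min())
```
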

Bioinformatics: Integration of Biological and Medical Data

Quantitative Evaluation of Established Clustering Methods for Gene Expression Data

Analysis of gene expression data generated by microarray techniques often includes clustering. Although more reliable methods are available, hierarchical algorithms are still frequently employed. We clustered several data sets and quantitatively compared the performance of an agglomerative hierarchical approach using the average-linkage method with two partitioning procedures, k-means and fuzzy c-means. Investigation of the results revealed the superiority of the partitioning algorithms: the compactness of the clusters was markedly increased and the arrangement of the profiles into clusters more closely resembled biological categories. Therefore, we encourage analysts to critically scrutinize the results obtained by clustering.

Dörte Radke, Ulrich Möller
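
The comparison the abstract reports is easy to reproduce in outline: cluster the same profiles with average-linkage hierarchical clustering and with k-means, then score compactness. The sketch below uses silhouette scores on synthetic data; the paper's own data sets and quality indices may differ.

```python
# Sketch: average-linkage hierarchical clustering vs. k-means, with a
# compactness score for each partition.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 'expression profiles': 300 genes measured under 10 conditions.
X, _ = make_blobs(n_samples=300, n_features=10, centers=4, random_state=0)

hier = AgglomerativeClustering(n_clusters=4, linkage="average").fit_predict(X)
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

print("average-linkage silhouette:", silhouette_score(X, hier))
print("k-means silhouette        :", silhouette_score(X, km))
# Fuzzy c-means is not in scikit-learn; packages such as scikit-fuzzy
# provide it in the same spirit.
```
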
DiseaseCard: A Web-Based Tool for the Collaborative Integration of Genetic and Medical Information

Recent advances in genomics and proteomics research have brought a significant growth in the information that is publicly available. However, navigating through genetic and bioinformatics databases can be too complex and unproductive a task for a primary care physician. Moreover, in the field of rare genetic diseases, the knowledge about a specific disease is commonly spread over a small group of experts. The capture, maintenance and sharing of this knowledge through user-friendly interfaces will introduce new insights into the understanding of some rare genetic diseases. In this paper we present DiseaseCard, a collaborative web service that aims to integrate and disseminate genetic and medical information on rare genetic diseases.
Paper domain: Methods and systems for database integration; Biological and medical data visualization; Health bioinformatics and genomics

José Luís Oliveira, Gaspar Dias, Ilídio Oliveira, Patrícia Rocha, Isabel Hermosilla, Javier Vicente, Inmaculada Spiteri, Fernando Martin-Sánchez, António Sousa Pereira
Biomedical Informatics: From Past Experiences to the Infobiomed Network of Excellence

Medical Informatics (MI) and Bioinformatics (BI) are now facing, after several decades of ongoing research and activities, a crucial time where both disciplines could merge, increase collaboration, or follow separate roads. In this paper, we provide a view of past experiences in both areas, pointing out significant achievements in each field. Then, scientific and technological aspects are considered, following an ACM report on computing. Under this approach, both MI and BI are analyzed from three perspectives (design, abstraction, and theories), showing the differences between them. An overview of training experiences in Biomedical Informatics is also included, showing current trends. In this regard, we present the INFOBIOMED network of excellence, funded by the European Commission, as an example of a systematic effort to support a synergy between both disciplines in the new area of Biomedical Informatics.

Victor Maojo, Fernando Martin-Sánchez, José María Barreiro, Carlos Diaz, Ferran Sanz

Bioinformatics: Metabolic Data and Pathways

Network Analysis of the Kinetics of Amino Acid Metabolism in a Liver Cell Bioreactor

The correlation of the kinetics of 18 amino acids, ammonia and urea in 18 liver cell bioreactor runs was analyzed and described by network structures. Three kinds of networks were investigated: i) correlation networks, ii) Bayesian networks, and iii) dynamic networks that obtain their structure from systems of differential equations. Three groups of liver cell bioreactor runs with low, medium and high performance, respectively, were investigated. The aim of this study was to identify patterns and structures of the amino acid metabolism that can characterize different performance levels of the bioreactor.

Wolfgang Schmidt-Heck, Katrin Zeilinger, Michael Pfaff, Susanne Toepfer, Dominik Driesch, Gesine Pless, Peter Neuhaus, Joerg Gerlach, Reinhard Guthke
Model Selection and Adaptation for Biochemical Pathways

In bioinformatics, biochemical signal pathways can be modeled by many differential equations. It is still an open problem how to fit the huge number of parameters of the equations to the available data. Here, an approach for systematically obtaining the most appropriate model and learning its parameters is extremely interesting. One of the most often used approaches for model selection is to choose the least complex model which "fits the needs". For noisy measurements, the model with the smallest mean squared error on the observed data fits the data too accurately: it is overfitting. Such a model will perform well on the training data, but worse on unknown data. This paper proposes as model selection criterion the least complex description of the observed data by the model, the minimum description length (MDL). The performance of the approach is evaluated on the small but important example of inflammation modeling.

Rüdiger W. Brause
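
The criterion can be illustrated with its common two-part form for Gaussian noise, MDL = (n/2) log(RSS/n) + (k/2) log n, where k counts free parameters. In the sketch below, polynomial models stand in for pathway models; the specific form of the description length is an illustrative choice, not necessarily the paper's.

```python
# Sketch: MDL model selection over polynomial fits of increasing complexity.
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 40)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(0, 0.05, x.size)  # true degree: 2

def mdl_score(degree):
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    k = degree + 1   # number of free parameters
    n = x.size
    return 0.5 * n * np.log(rss / n) + 0.5 * k * np.log(n)

for d in range(1, 8):
    print(f"degree {d}: MDL = {mdl_score(d):7.2f}")
# Pure mean-squared error keeps falling as the degree grows (overfitting);
# the MDL score turns back up once extra parameters stop paying for themselves.
```
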
NeoScreen: A Software Application for MS/MS Newborn Screening Analysis

The introduction of tandem mass spectrometry (MS/MS) in neonatal screening laboratories has opened the door to innovative newborn screening analysis. With this technology, the number of metabolic disorders that can be detected from dried blood-spot specimens increases significantly. However, the amount of information obtained with this technique and the pressure for quick and accurate diagnostics raise serious difficulties in the daily data analysis. To face this challenge we developed a software system, NeoScreen, which simplifies and speeds up newborn screening diagnostics.
Paper domain: Health bioinformatics and genomics; Statistical methods and tools for biological and medical data analysis

Miguel Pinheiro, José Luís Oliveira, Manuel A. S. Santos, Hugo Rocha, M. Luis Cardoso, Laura Vilarinho

Bioinformatics: Microarray Data Analysis and Visualization

Technological Platform to Aid the Exchange of Information and Applications Using Web Services

Motivation: This article describes a technological platform that allows sharing not only access to information sources, but also the use of data processing algorithms, through Web Services.
Results: This architecture has been applied in the Thematic Network of Cooperative Investigation named INBIOMED. In this network, several biomedical research groups can leverage their results by accessing the information of other groups and, in addition, analyze this information using processes developed by groups specialized in the development of analysis algorithms.
Additional information: http://www.inbiomed.uji.es
Contact: estruch@sg.uji.es

Antonio Estruch, José Antonio Heredia
Visualization of Biological Information with Circular Drawings

Being able to clearly visualize clusters of genes derived from gene expression over a set of samples is of high importance. It gives researchers the ability to create an image of the structure and connections of the information examined. In this paper we present a tool for visualizing such information with respect to clusters, trying to produce an aesthetic, symmetric image, which allows the user to interact with the application and query for specific information.

Alkiviadis Symeonidis, Ioannis G. Tollis
Gene Selection Using Genetic Algorithms

Microarrays are emerging technologies that allow biologists to better understand the interactions between disease and normal states at the gene level. However, the amount of data generated by these tools becomes problematic when the data are to be automatically analyzed (e.g., for diagnostic purposes). In this work, the authors present a novel gene selection method based on Genetic Algorithms (GAs). The proposed method uses GAs to search for subsets of genes that optimize two measures of quality for the clusters present in the domain. Thus, the data are better represented and the classification of unknown samples may become easier. In order to demonstrate the strength of the proposed approach, experiments using four publicly available microarray datasets were carried out.

Bruno Feres de Souza, André C. P. L. F. de Carvalho
Knowledgeable Clustering of Microarray Data

A novel graph-theoretic clustering (GTC) method is presented. The method relies on a weighted graph arrangement of the genes and the iterative partitioning of the respective minimum spanning tree of the graph. The final result is a hierarchical clustering of the genes. GTC utilizes information about the functional classification of genes to knowledgeably guide the clustering process and achieve more informative clustering results. The method was applied and tested on an indicative real-world domain, producing satisfactory and biologically valid results. Future R&D directions are also outlined.

George Potamias
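
The minimum-spanning-tree partitioning at the heart of such methods can be sketched briefly: build a gene-similarity graph, take its MST, and cut the heaviest edges to split it into clusters. The knowledge-guided weighting that GTC adds on top is not reproduced in this sketch, and the data are synthetic.

```python
# Sketch: MST-based partitioning of a gene-similarity graph.
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(5)
# Synthetic expression matrix: 60 genes in three latent groups.
genes = np.vstack([rng.normal(m, 0.3, (20, 8)) for m in (0.0, 2.0, 4.0)])

dist = squareform(pdist(genes))            # pairwise distances as edge weights
mst = minimum_spanning_tree(dist).toarray()

# Cut the k-1 heaviest MST edges to produce k clusters.
k = 3
cut = mst.copy()
for _ in range(k - 1):
    i, j = np.unravel_index(np.argmax(cut), cut.shape)
    cut[i, j] = 0.0
n_clusters, labels = connected_components(cut, directed=False)
print(n_clusters, labels)
# Repeating the cut recursively inside each component yields the hierarchy.
```
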
Correlation of Expression Between Different IMAGE Clones from the Same UniGene Cluster

Through the use of DNA microarrays it is now possible to obtain quantitative measurements of the expression of thousands of genes present in a biological sample. DNA arrays yield a global view of gene expression and can be used in a number of interesting ways. In this paper we investigate an approach for studying the correlations between different clones from the same UniGene cluster. We explore all possible pairs of clones, evaluating the linear relations between the expression of these sequences. In this way, we can obtain several results: for example, we can estimate measurement errors, or we can highlight genetic mutations. The experiments were done using a real dataset built from 161 microarray experiments on hepatocellular carcinoma.

Giacomo Gamberoni, Evelina Lamma, Sergio Storari, Diego Arcelli, Francesca Francioso, Stefano Volinia
Backmatter
Metadata
Title
Biological and Medical Data Analysis
Edited by
José María Barreiro
Fernando Martín-Sánchez
Víctor Maojo
Ferran Sanz
Copyright year
2004
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-30547-7
Print ISBN
978-3-540-23964-2
DOI
https://doi.org/10.1007/b104033