
2003 | Book

Perspectives of System Informatics

5th International Andrei Ershov Memorial Conference, PSI 2003, Akademgorodok, Novosibirsk, Russia, July 9-12, 2003. Revised Papers

Edited by: Manfred Broy, Alexandre V. Zamulin

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


Table of Contents

Frontmatter

Programming Issues

The Verifying Compiler: A Grand Challenge for Computing Research

I propose a set of criteria which distinguish a grand challenge in science or engineering from the many other kinds of short-term or long-term research problems that engage the interest of scientists and engineers. As an example drawn from Computer Science, I revive an old challenge: the construction and application of a verifying compiler that guarantees correctness of a program before running it.

Tony Hoare
Linear Types for Cashflow Reengineering

Some time ago, a major Danish bank approached the programming language group at DIKU for help in designing a language for modelling cashflow reengineering: the process of issuing customised bonds based on income from existing bonds. The idea was to have a simple language that allows non-programmers to describe such reengineering and run statistical simulations of the structures. We describe the problem and present the design of a cashflow-reengineering language based on the dataflow paradigm and linear types. This language has formed the basis of further development by the bank in question, and a variant of it is now in use there.

Torben Æ. Mogensen
Storing Properties in Grouped Tagged Tuples

We present a technique for storing groups of properties in C++ such that individual properties within a group can later be accessed by name. Our approach refines previous work in this area and is an example of the application of template metaprogramming [1]. Typical usage examples of the introduced class templates are internal representations of serialized data, well suited for semi-automatic as well as manual generation of the corresponding class types.

Roland Weiss, Volker Simonis
A Polymorphic Radix-n Framework for Fast Fourier Transforms

We provide a polymorphic framework for radix-n Fast Fourier Transforms (FFTs) from which all known kinds of monomorphic radix-n algorithms can be obtained by specialization. The framework is mathematically based on the Cooley-Tukey mapping and implemented as a C++ template metaprogram. To avoid run-time overhead, all specializations are performed statically.

Marcin Zalewski, Sibylle Schupp
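The Cooley-Tukey mapping on which the framework rests can be illustrated, outside C++ template metaprogramming, by a plain recursive radix-2 decimation-in-time FFT. This sketch is illustrative only; the function name and structure are not part of the paper's framework.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    # Cooley-Tukey mapping: split into even- and odd-indexed halves.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor exp(-2*pi*i*k/n) combines the two half-size DFTs.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

A radix-n version generalizes the split to n interleaved subsequences; in the framework described above, the recursion is instead unrolled at compile time by template specialization.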
Intersecting Classes and Prototypes

The object-oriented programming language design space consists of class-based and prototype-based languages. Both families have been shown to possess their own advantages and disadvantages. Hybrid languages featuring both prototype-based and class-based mechanisms have been proposed as a solution. Unfortunately, these languages unify not only the advantages but also the disadvantages of both families. We take a more intersectional point of view and propose a language that inherits the advantages but shuns the disadvantages of both families.

Wolfgang De Meuter, Theo D’Hondt, Jessie Dedecker

Software Engineering

Bending without Breaking: Making Software More Flexible
Extended Abstract

In this talk we discuss the problem of simultaneously refining mutually interdependent classes and object types. We discuss possible solutions using existing static type systems that include parametric polymorphism. A statically type-safe solution is presented that involves the introduction of type groups, a construct that can be understood as a generalization of the MyType construct introduced in a statically type-safe way in languages like PolyTOIL [3] and LOOM [2].

Kim B. Bruce
Program Construction in the Context of Evolutionary Computation

Many optimization algorithms that imitate certain principles of nature have proven useful in various application domains. This paper shows how Evolutionary Algorithms (EAs) can be applied to model (program) construction for solving the discrete-time system identification problem. Non-linear system identification is used as an example problem domain for studying the potential of EAs to discover the relationship between parameters in response to a given set of inputs.

Jelena Sanko, Jaan Penjam
A Layered Architecture Sustaining Model-Driven and Event-Driven Software Development

This paper presents a layered software architecture reconciling model-driven, event-driven, and object-oriented software development. In its simplest form, the architecture consists of two layers: an enterprise layer consisting of a relatively stable business model, and an information system layer containing the more volatile user functionality. The paper explains how the concept of events is used in the enterprise layer as a means to make business objects more independent of each other. This results in an event-handling sublayer that allows defining groups of events and handles consistency and transaction management aspects. This type of architecture yields information systems with a high-level modular structure, where changes are easier to perform because higher layers do not influence the inherently more stable lower layers.

Cindy Michiels, Monique Snoeck, Wilfried Lemahieu, Frank Goethals, Guido Dedene

Software Education

The Outside-In Method of Teaching Introductory Programming

The new design for the introductory programming course at ETH relies on object technology, Eiffel, extensive reuse, a graphics-rich library (TRAFFIC) built specifically for the course, a textbook (“Touch of Class”) and an Outside-In approach based on “inverted curriculum” ideas. This article presents the key aspects of the approach. Note: readers interested in following the development of our course, the “Touch of Class” textbook and the supporting TRAFFIC software project may look up the page se.inf.ethz.ch/touch, where they can also subscribe to mailing lists connected with the approach.

Bertrand Meyer

Program Synthesis, Transformation, and Semantics

Numeric Types in Formal Synthesis

The Formal Synthesis methodology can be considered as the application of the transformational approach to circuit synthesis, with logical transformations performed in a theorem prover. In addition to the implementation of the circuit, a proof that the result is a correct implementation of a given specification is obtained automatically. In this paper, a higher-order formalisation for the arithmetic of bounded numeric data types is given. We provide correct transformations, implemented in the theorem prover HOL [4], to register-transfer-level descriptions of arithmetic operations. For a restricted class of specifications, a correct transformation is described which eliminates the type num and replaces arithmetic operations by their bounded variants.

Viktor Sabelfeld, Kai Kapp
On the Possibility of Provably Secure Obfuscating Programs

By obfuscation we mean any efficient semantics-preserving transformation of computer programs aimed at bringing a program into such a form as impedes the understanding of its algorithm and data structures or prevents the extraction of some valuable information from the plain text of the program. The main difficulty in designing an effective program obfuscator is to guarantee security, i.e. to prove that no algorithm can break the software protection in reasonable time. All obfuscation techniques and tools developed so far rely on an informal concept of security and therefore cannot be regarded as provably secure. In this paper we (1) introduce for the first time a formal information-theoretic definition of obfuscation security, (2) present a new obfuscation technique which takes advantage of cryptographic primitives (one-way functions, hard-core predicates), and (3) demonstrate, taking a conventional password identification scheme as a case study, how to prove the security of obfuscating transformations.

Nikolay P. Varnovsky, Vladimir A. Zakharov
Verification-Oriented Language C-Light and Its Structural Operational Semantics

The paper presents the language C-light, a representative verification-oriented subset of standard C. This language allows deterministic expressions and a limited use of the statements switch and goto. C-light includes the C++ operators new and delete to manage dynamic memory instead of the standard C library functions. The structural operational semantics of C-light in the Plotkin style is outlined.

Valery A. Nepomniaschy, Igor S. Anureev, Alexey V. Promsky
Proofs-as-Imperative-Programs: Application to Synthesis of Contracts

Proofs-as-programs is an approach to program synthesis involving the transformation of intuitionistic proofs of specification requirements into functional programs (see, e.g., [1, 2, 12]). Various authors have adapted proofs-as-programs to other logics and programming paradigms. This paper presents a novel approach to adapting proofs-as-programs for the synthesis of imperative SML programs with side-effect-free return values, from proofs in a constructive version of Hoare logic. We demonstrate the utility of this approach by sketching how our work can be used to synthesize assertion contracts, aiding software development according to the principles of design-by-contract [8].

Iman Poernomo

Graphical Interfaces

On the Visualization and Aesthetics of Large Graphs
Short Abstract

The talk will survey 14 years of our group’s work on graph drawing. It will start with a simulated annealing algorithm we developed in 1989, which works well for graphs with 20–40 nodes but has severe problems for larger graphs. It will culminate with extremely powerful multi-scale and algebraic approaches developed in the last few years that produce beautiful renditions of million-node graphs in very reasonable time. The work was done with Ron Davidson, Gregory Yashchin, Ronny Hadany, Liran Carmel and Yehuda Koren.

David Harel
Data Mappings in the Model-View-Controller Pattern

The model-view-controller pattern is used to keep a data model and its views consistent. Usually there is a one-to-one correspondence between the data in the model and its representation in the views, which is sometimes too inflexible. We propose to add so-called data mappings between the model and its views. Data mappings are components that can be plugged together hierarchically. They can perform arbitrary transformations on the data as well as on notifications, thus allowing a more flexible collaboration between a model and its views. GUI builders can be extended so that a user can add data mappings to a graphical user interface interactively, i.e. without programming.

Martin Rammerstorfer, Hanspeter Mössenböck
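The plug-together, notification-forwarding behaviour of data mappings can be sketched as follows. All class and function names here are hypothetical; the sketch only illustrates the idea, not the paper's design.

```python
class Model:
    """Holds a value and notifies registered listeners on change."""
    def __init__(self, value):
        self._value = value
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def set(self, value):
        self._value = value
        for notify in self._listeners:
            notify(value)

class Mapping:
    """A pluggable component between model and view: transforms the
    data on its way to the view and forwards change notifications."""
    def __init__(self, source, transform):
        self._transform = transform
        self._listeners = []
        source.subscribe(self._on_change)

    def subscribe(self, listener):
        self._listeners.append(listener)

    def _on_change(self, value):
        mapped = self._transform(value)
        for notify in self._listeners:
            notify(mapped)

# Usage: two mappings plugged together hierarchically.
model = Model(0.0)
to_fahrenheit = Mapping(model, lambda c: c * 9 / 5 + 32)
formatted = Mapping(to_fahrenheit, lambda f: f"{f:.1f} F")
view_text = []
formatted.subscribe(view_text.append)
model.set(100.0)   # the view receives "212.0 F"
```

Because mappings expose the same subscribe interface as the model, a GUI builder could insert or re-order them without either endpoint noticing.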

Partial Evaluation and Supercompilation

The Translation Power of the Futamura Projections

Despite practical successes with the Futamura projections, it has been an open question whether target programs produced by specializing interpreters can always be as efficient as those produced by a translator. We show that, given a Jones-optimal program specializer with static expression reduction, there exists for every translator an interpreter which, when specialized, can produce target programs that are at least as fast as those produced by the translator. This is not the case if the specializer is not Jones-optimal. We also examine Ershov’s generating extensions, give a parameterized notion of Jones optimality, and show that there is a class of specializers that can always produce residual programs that match the size and time complexity of programs generated by an arbitrary generating extension. This is the class of generation universal specializers. We study these questions on an abstract level, independently of any particular specialization method.

Robert Glück
A Compiler Generator for Constraint Logic Programs

The cogen approach to program specialisation, writing a compiler generator instead of a specialiser, has been used with considerable success. This paper demonstrates that the cogen approach is also applicable to the specialisation of constraint logic programs and leads to effective specialisers. We present the basic specialisation technique for CLP(Q) programs and show how we can handle non-declarative features as well. We present an implemented system along with experimental results.

Stephen-John Craig, Michael Leuschel
The Supercompiler SCP4: General Structure

Supercompilation is a program transformation technique introduced in the 1970s by V. Turchin [13,14,16]. His ideas have been studied by a number of authors for a long time. We constructed an experimental supercompiler for the functional language Refal-5 [15]. The SCP4 project was discussed with V. Turchin, who initiated and supported our work. SCP4 has itself been implemented in Refal-5. The sources of SCP4 and the full version of this paper are available for immediate download [8,9,17]. A user manual for SCP4 and reports on several interesting experiments can be found in [4,5,6]. SCP4 is the first experimental supercompiler for the real functional language Refal-5. The principal new tool in SCP4 is an online analysis of global properties of the folded components of the meta-tree MTr of all potential computations. Let a program P and a parameterized input of P be given. Such a pair defines a partial mapping; a supercompiler is a transformer of such pairs that must preserve the map values on the map domain. SCP4 unfolds a potentially infinite tree of all possible computations, reducing in the process the redundancy that could be present in the original program. It folds the tree into a finite graph of states and transitions between possible configurations of the computing system, analyses global properties of the graph, and specializes the graph with respect to these properties. The resulting definition is constructed solely by meta-interpretation of the source program rather than by a step-by-step transformation of the program. The SCP4 system comprises about 19,500 lines of commented source code (800 KB).

Andrei P. Nemytykh
Partial Evaluation for Common Intermediate Language

Partial evaluation is a well-established method for the specialization of programs in functional languages. However, real-world applications demand specialization of object-oriented languages. With the advent of the Microsoft .NET platform and its Common Intermediate Language (CIL), to which multiple languages are compiled, various program analysis and transformation techniques, including partial evaluation, can be developed once for a low-level language rather than over and over again for various high-level languages. The CILPE project aims at developing a practical specializer for CIL programs based on partial evaluation with a preliminary analysis referred to as Binding Time Analysis (BTA). The CILPE specializer partially evaluates operations on values and objects determined by BTA as static. Static objects are created, processed and destroyed at specialization time; they may be mutable and can have some fields unknown (dynamic). Arbitrary verifiable CIL code is accepted as input.

Andrei M. Chepovsky, Andrei V. Klimov, Arkady V. Klimov, Yuri A. Klimov, Andrei S. Mishchenko, Sergei A. Romanenko, Sergei Yu. Skorobogatov
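The static/dynamic split performed by a binding-time analysis can be illustrated, independently of CIL and CILPE, with the classic power example: with the exponent known (static), a toy specializer unfolds the recursion at specialization time and residualizes only the multiplications on the unknown (dynamic) base. The names below are illustrative only.

```python
def power(x, n):
    """General program: x dynamic, n static."""
    return 1 if n == 0 else x * power(x, n - 1)

def specialize_power(n):
    """A toy specializer: with n static, unfold the recursion and
    emit a residual program that only multiplies the dynamic x."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    src = f"def power_{n}(x):\n    return {body}\n"
    namespace = {}
    exec(src, namespace)   # 'compile' the residual program
    return namespace[f"power_{n}"], src

power_3, residual_source = specialize_power(3)
# residual_source is "def power_3(x):\n    return x * x * x\n"
```

A real specializer such as the one the abstract describes works on whole programs and object graphs, but the division of work is the same: static parts are computed away, dynamic parts survive into the residual code.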

Verification

Timed Verification with μCRL

μCRL is a process-algebraic language for the specification and verification of distributed systems. μCRL can describe temporal properties of distributed systems but has no explicit reference to time. In this work we propose a way of introducing discrete time without extending the language. The semantics of discrete time we use makes it possible to reduce the time-progress problem to the diagnosis of “no action is enabled” situations. The synchronous nature of the language facilitates the task. We show some experimental verification results obtained on a timed communication protocol.

Stefan Blom, Natalia Ioustinova, Natalia Sidorova
Verification of Distributed Dataspace Architectures

The space calculus is introduced as a language to model distributed dataspace systems, i.e. distributed applications that use a shared (but possibly distributed) dataspace to coordinate. The publish-subscribe and the global dataspace are particular instances of our model. We give the syntax and operational semantics of this language and provide tool support for functional and performance analysis of its expressions. Functional behaviour can be checked by an automatic translation to μCRL and the use of a model checker. Performance analysis can be done using an automatically generated distributed C prototype.

Simona Orzan, Jaco van de Pol
Using SPIN and STeP to Verify Business Processes Specifications

Business transactions are prone to failure and must deal with unexpected situations. Some business process specification languages, e.g. StAC, introduce notions like compensation handling. Given the need for verification of correctness in business-related software, it is important to fill the gap between business process specification languages like StAC and the verification software already available. We report on two of our previous attempts to develop a tool that allows verification of StAC specifications by using already existing systems, SPIN and STeP. We highlight some of the problems we faced during these attempts, as they can prevent successful and widespread use of verification tools. Our experience can be used to make the available tools more versatile and hence useful to a wider range of applications.

Juan C. Augusto, Michael Butler, Carla Ferreira, Stephen-John Craig
Integrating Tools for Automatic Program Verification

In this paper we describe our findings after integrating several tools based upon the Java Modeling Language (JML) [1], a specification language used to annotate Java programs. The tools we consider are Daikon [2], ESC/Java [3], the JML runtime assertion checker [1], and the Loop/PVS tool [4]. The first one generates specifications; the others are used to verify them. We find that combining the first three is worthwhile, because it is relatively easy and improves the specifications. Combining Daikon and the Loop/PVS tool works directly in theory, but in practice it only works if the test suite is very good, and hence it is not advisable.

Engelbert Hubbers

Logic and Types

A Logical Reconstruction of Reachability

In this paper we discuss reachability analysis for infinite-state systems. Infinite-state systems are formalized using transition systems over a first-order structure. We establish a common ground relating a large class of algorithms by analyzing the connections between the symbolic representation of transition systems and formulas used in various reachability algorithms. Our main results are related to the so-called guarded assignment systems.

Tatiana Rybina, Andrei Voronkov
Recent Advances in Σ-Definability over Continuous Data Types

The purpose of this paper is to survey our recent research in computability and definability over continuous data types such as the real numbers, real-valued functions and functionals. We investigate the expressive power and algorithmic properties of the language of Σ-formulas intended to represent computability on continuous data types. In the case of the real numbers we illustrate how computability can be expressed in the language of Σ-formulas.

Margarita Korovina

Concurrent and Distributed Systems

Open Maps and Trace Semantics for Timed Partial Order Models

The intention of the paper is to show the applicability of the general categorical framework of open maps to the setting of partial order models with a dense time domain. In particular, we define a category of timed event structures, where the morphisms are to be thought of as simulations, and an accompanying (sub)category of timed pomsets which provides us with a notion of open maps. Then, we show that the abstract equivalence obtained from the framework of open maps coincides with a timed extension of the well-known trace equivalence based on Pratt’s partial orders.

Irina B. Virbitskaite, Nataly S. Gribovskaja
Confidentiality for Multithreaded Programs via Bisimulation

Bisimulation has been a popular foundation for characterizing the confidentiality properties of concurrent programs. However, because a variety of bisimulation definitions are available in the literature, it is often difficult to pin down the “right” definition for modeling a particular attacker. Focusing on timing- and probability-sensitive confidentiality for shared-memory multithreaded programs, we clarify the relation between different kinds of bisimulation by proving inclusion results. As a consequence, we derive the relationship between scheduler-specific, scheduler-independent, and strong confidentiality definitions. A key result justifying strong confidentiality is that it is the most accurate (largest) compositional indistinguishability-based confidentiality property that implies scheduler-independent confidentiality.

Andrei Sabelfeld
Dynamic Modification of System Structures Using LLPNs

In this paper we aim to set up a framework for object Petri net semantics, allowing the modification of object net structures at run-time. The approach uses linear logic Petri nets (LLPNs) and performs the structure modification on a linear logic encoding of the object net. In addition, Valk’s self-modifying Petri nets are shown to be subsumed by LLPNs. We expand on the existing theory of Farwer’s LLPNs, which are Petri nets with linear logic formulae as tokens. This work in progress uses intuitionistic linear logic as the basis of a method for ensuring desirable properties — such as termination or non-termination — of P/T nets, coloured Petri nets and LLPNs.

Berndt Farwer, Kundan Misra

Concurrent and Reactive Systems

Principles for Entity Authentication

We study the roles of message components in authentication protocols. In particular, we investigate how a given component contributes to the task of achieving entity authentication. To this end, we isolate a core set of roles that enables us to extract general principles that should be followed to avoid attacks. We then formalize these principles in terms of rules for protocol parties and prove that protocols designed according to these rules achieve entity authentication.

Michele Bugliesi, Riccardo Focardi, Matteo Maffei
Causality and Replication in Concurrent Processes

The replication operator was introduced by Milner to obtain a simplified description of recursive processes. The standard interleaving semantics treats the replication of a process P, written !P, as a shorthand for its unbounded parallel composition, operationally equivalent to the process P | P | ..., with P repeated as many times as needed. Although the replication mechanism has become increasingly popular, investigations of its causal semantics have been scarce. In our work we consider the interleaving semantics for the operator proposed by Sangiorgi and Walker, and we show how to refine it in order to capture causality. Furthermore, we prove that a basic property usually associated with these semantics, the so-called concurrency diamond, does hold in our framework, and we sketch a correspondence between our proposal and the standard causal semantics for recursive processes studied in the literature, for processes defined through constant invocations.

Pierpaolo Degano, Fabio Gadducci, Corrado Priami
Event-Driven Traversal of Logic Circuits for Re-evaluation of Boolean Functions in Reactive Systems

This paper presents an efficient algorithm for the re-evaluation of a Boolean function represented as a logic circuit. The algorithm consists of a pre-computation part and a re-evaluation part. For a given logic circuit and initial input bits, the pre-computation constructs the data structure used by the re-evaluation. The re-evaluation accepts a list of changed input bits and updates the output of the circuit. The pre-computation runs in time linear in the size of the circuit, and the re-evaluation runs in time linear in the number of triggered input bits.

Valeriy Vyatkin
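The two-phase scheme described above can be sketched as follows. The data structures and names are illustrative, not the paper's; and for brevity this sketch scans the gates in topological order during an update rather than using an event queue, so only the recomputation work (not the scan) is proportional to the changed signals.

```python
from collections import defaultdict

# Gate operations over booleans.
OPS = {"and": lambda a, b: a and b,
       "or":  lambda a, b: a or b,
       "not": lambda a: not a}

class Circuit:
    def __init__(self, gates, output):
        # `gates`: dict gate_name -> (op, input names), in topological order.
        self.gates = gates
        self.output = output
        # Pre-computation: fan-out lists, linear in circuit size.
        self.fanout = defaultdict(list)
        for g, (_, ins) in gates.items():
            for i in ins:
                self.fanout[i].append(g)

    def evaluate(self, inputs):
        """Full initial evaluation; stores all signal values."""
        self.values = dict(inputs)
        for g, (op, ins) in self.gates.items():
            self.values[g] = OPS[op](*(self.values[i] for i in ins))
        return self.values[self.output]

    def update(self, changed):
        """Re-evaluation: propagate only from changed input bits."""
        self.values.update(changed)
        dirty = set(changed)
        for g, (op, ins) in self.gates.items():   # topological order
            if dirty & set(ins):
                new = OPS[op](*(self.values[i] for i in ins))
                if new != self.values[g]:
                    self.values[g] = new
                    dirty.add(g)
        return self.values[self.output]

c = Circuit({"g1": ("and", ["a", "b"]),
             "g2": ("or",  ["g1", "c"])}, output="g2")
```

An update that changes only `b` touches `g1` and `g2` but leaves unrelated parts of a larger circuit untouched, which is the point of the event-driven traversal.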
Teams of Pushdown Automata

We introduce team pushdown automata as a theoretical framework capable of modelling various communication and cooperation strategies in complex, distributed systems. Team pushdown automata are obtained by augmenting distributed pushdown automata with the notion of team cooperation or — alternatively — by augmenting team automata with pushdown memory. Here we study their accepting capacity.

Maurice H. ter Beek, Erzsébet Csuhaj-Varjú, Victor Mitrana

Program Specification

Algebraic State Machines: Concepts and Applications to Security

The concept of an algebraic state machine was introduced in [3] as a state transition system whose states are each defined as an algebra and which communicates through channels. To make efficient use of this concept, one needs a formal semantics, as well as notions of composition and refinement, which are provided in the present work. To demonstrate their usefulness for an application area of major interest, we show how to extend algebraic state machines with data types modelling cryptographic operations and with an adversary model to reason about security-critical systems. As an example we consider a cryptographic protocol proposed in the literature.

Jan Jürjens
Combining Aspects of Reactive Systems

For reactive systems, a large collection of formal models has been developed. While the formal relationship between those models is often carefully analyzed, the methodical implications for selecting or constructing appropriate models for specific application domains are rarely addressed. We classify and compare different specification methods for distributed systems concerning communication, behavior, and causality. We discuss the implications of these dimensions, especially concerning the combination of their properties.

Leonid Kof, Bernhard Schätz
OCL Extended with Temporal Logic

UML class diagrams have become a standard for modeling the static structure of object-oriented software systems. OCL can be used to formulate additional constraints that cannot be expressed with the diagrams. In this paper, we extend OCL with temporal operators to formulate temporal constraints.

Paul Ziemann, Martin Gogolla
The Definition of Transitive Closure with OCL – Limitations and Applications –

The Object Constraint Language (OCL) is based on first-order logic and set theory. As its most well-known application, OCL is used to formulate well-formedness rules in the UML metamodel. There, the transitive closure of a relationship is defined in terms of an OCL invariant, which seems to contradict classical results on the expressive power of first-order logic. In this paper, we give sufficient justification for the correctness of the definition of transitive closure. Our investigation reinforces some decisions made in the semantics of UML and OCL. Currently, there is a lively debate on the same issues in the semantics of the upcoming UML 2.0.

Thomas Baar
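Operationally, the transitive closure at issue can be computed as a least fixpoint: keep adding composed pairs until nothing new appears. The sketch below is a generic illustration of that fixpoint, not OCL and not the paper's formalization.

```python
def transitive_closure(relation):
    """Least fixpoint: repeatedly add composed pairs until stable.
    `relation` is a set of (source, target) pairs."""
    closure = set(relation)
    while True:
        new_pairs = {(a, d) for (a, b) in closure
                            for (c, d) in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

# Usage: an 'ancestor' relation from a 'parent' relation.
ancestor = transitive_closure({("A", "B"), ("B", "C")})
```

A recursively defined OCL invariant achieves a comparable effect declaratively, which is exactly what appears to clash with the first-order inexpressibility of transitive closure.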

Verification and Model Checking

Improving the Consistency Checking Process by Reusing Formal Verification Knowledge

At an iterative and incremental requirements capture stage, each iteration implies identifying new requirements, checking their consistency with the current functional specification of the system, and, in many cases, modifying this global specification to satisfy them. In a totally formalized environment, these verifications are usually done by applying a model checking algorithm, which entails well-known efficiency problems when checking medium and large systems. In order to reduce the number of verifications and, consequently, these efficiency problems, we propose reusing previously obtained formal verification results. The ARIFS methodology (Approximate Retrieval of Incomplete and Formal Specifications) supports the classification and efficient retrieval (without formal proofs) of the verification information stored in a repository; in this paper, we show how to carry out this proposal using an application example.

Rebeca P. Díaz Redondo, José J. Pazos Arias, Ana Fernández Vilas, Jorge García Duque, Alberto Gil Solla
Complexity of Model Checking by Iterative Improvement: The Pseudo-Boolean Framework

We present several new algorithms as well as new lower and upper bounds for optimizing functions underlying infinite games pertinent to computer-aided verification.

Henrik Björklund, Sven Sandberg, Sergei Vorobyov
Polynomial Approximations for Model Checking

The μ-calculus of D. Kozen (1983) is a very powerful propositional program logic with fixpoints, widely used for specification and verification. Model checking is a popular automatic approach to the verification of specifications of finite state systems. The most efficient algorithms developed so far for model checking the μ-calculus in finite state systems have exponential upper bounds. A. Emerson, C. Jutla, and P. Sistla (1993) studied the first fragment of the μ-calculus that permits arbitrary nesting and alternation of fixpoints while admitting polynomial model checking in finite state systems. In contrast, we study lower and upper approximations for model checking that are computable in polynomial time and that can give correct semantics in finite models for formulae with arbitrary nesting and alternation. A. Emerson, C. Jutla, and P. Sistla also proved that the model checking problem for the μ-calculus in finite state systems is in NP ∩ co-NP. As a by-product of our study, we develop another proof of this result (which we believe to be new).

Nikolai V. Shilov, Natalya O. Garanina

Constraint Programming

Separating Search and Strategy in Solver Cooperations

In the constraint programming community, solver cooperation is now an established paradigm that aims at using several solvers to overcome the limitations or inefficiency imposed by the use of a single solver; solver cooperation applications range over several fields such as heterogeneous domain solving, heterogeneous constraint forms over the same domain, and distributed problem solving. Meanwhile, search languages have emphasised the need to clearly separate the different steps of a constraint problem resolution. In a similar direction, this paper presents a paradigm that enables the user to properly separate computation strategies from the search phases in solver cooperations.

Brice Pajot, Eric Monfroy
Industrial Application of External Black-Box Functions in Constraint Programming Solver

We examine the problem of interaction between an interval constraint programming solver and external computation algorithms that the solver treats as black boxes. Several techniques are used to resolve the contradiction between the interval values of the constraint solver and the exact values of the black boxes. The proposed approach was developed and implemented in the CATIA CAD/CAM system.

Vladimir Sidorov, Vitaly Telerman
LGS: Geometric Constraint Solver

In this paper we present LGS, a geometric constraint solver developed at LEDAS Ltd. We review different approaches to geometric constraint solving, present our own, and describe in detail the LGS architecture and the ideas behind it. The main idea of LGS is to decompose the initial problem into a set of simpler ones, to map each instance to a class of problems, and to apply a specialized algorithm to each class. We emphasize the key differences of our approach: an extendible hierarchy of problem classes, new decomposition techniques, and a broad range of numerical algorithms.

Alexey Ershov, Ilia Ivanov, Serge Preis, Eugene Rukoleev, Dmitry Ushakov
On Strategies of the Narrowing Operator Selection in the Constraint Propagation Method

This paper presents an approach that increases the efficiency of the constraint propagation method. The approach consists in scheduling calls of narrowing operators during computation on the basis of a dynamic system of priorities. Several strategies are described, and the results of numerical experiments are discussed.

Yuri G. Dolgov
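The dynamic-priority idea behind this abstract can be illustrated with a toy interval solver (a minimal sketch under our own assumptions, not the paper's algorithm): each narrowing operator contracts variable domains, and an operator's scheduling priority is updated from the amount of contraction it achieved on its last run, so strong reducers are tried first.

```python
import heapq

def propagate(domains, operators):
    """Toy fixpoint loop: narrowing operators are picked from a heap whose
    keys are adapted dynamically (smaller key = higher priority; an operator
    that contracted more on its last run is re-tried earlier)."""
    prio = {i: 0.0 for i in range(len(operators))}
    pending = set(prio)
    heap = [(0.0, i) for i in prio]
    heapq.heapify(heap)
    while heap:
        _, i = heapq.heappop(heap)
        if i not in pending:
            continue  # stale heap entry, already re-queued or done
        pending.discard(i)
        contracted = 0.0
        for v, (lo, hi) in operators[i](domains).items():
            olo, ohi = domains[v]
            lo, hi = max(lo, olo), min(hi, ohi)  # never widen a domain
            if lo > hi:
                return None  # empty domain: inconsistency detected
            contracted += (ohi - olo) - (hi - lo)
            domains[v] = (lo, hi)
        prio[i] = -contracted  # adapt this operator's priority
        if contracted > 1e-12:  # domains changed: re-awaken the others
            for j in prio:
                if j != i and j not in pending:
                    pending.add(j)
                    heapq.heappush(heap, (prio[j], j))
    return domains

# Example: x = y and x <= 3 over boxes [0, 10] x [0, 10].
def op_eq(d):  # narrowing operator for x = y
    return {'x': d['y'], 'y': d['x']}

def op_le(d):  # narrowing operator for x <= 3
    return {'x': (d['x'][0], 3.0)}

result = propagate({'x': (0.0, 10.0), 'y': (0.0, 10.0)}, [op_eq, op_le])
print(result)  # {'x': (0.0, 3.0), 'y': (0.0, 3.0)}
```

The heap plus `pending` set implements lazy deletion, so re-prioritising an operator never requires removing its stale heap entry.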

Documentation and Testing

ProgDoc — A New Program Documentation System

Though programming languages and programming styles evolve with remarkable speed today, there is no such evolution in the field of program documentation. And although there exist some popular approaches like Knuth’s literate programming system WEB [26], and nowadays JavaDoc [15] or Doxygen [16], tools for managing software development and documentation are not as widespread as desirable. This paper analyses a wide range of literate programming tools available during the past two decades and introduces ProgDoc, a new software documentation system. It is simple, language independent, and it keeps documentation and the documented software consistent. It uses LaTeX for typesetting purposes, supports syntax highlighting for various languages, and produces output in PostScript, PDF or HTML format.

Volker Simonis, Roland Weiss
Integration of Functional and Timed Testing of Real-Time and Concurrent Systems

The article presents an approach to model-based testing of complex systems based on a generalization of finite state machines (FSMs) and input/output state machines (IOSMs). The approach is used in the context of the UniTesK specification-based test development method, and the results of its practical application are discussed. Practical experience demonstrates the applicability of the approach to model-based testing of protocol implementations, distributed and concurrent systems, and real-time systems. This work stems from ISPRAS results of academic research and industrial application of formal techniques in verification and testing [1].

Victor V. Kuliamin, Alexander K. Petrenko, Nick V. Pakoulin, Alexander S. Kossatchev, Igor B. Bourdonov
Test Case Generation for UML Statecharts

We describe an approach to automatically generating test cases from object-oriented statecharts as they are used in the UML and supported by development tools such as I-Logix Rhapsody. This work contributes in three respects to using statecharts for specifying and verifying systems. First, it concretizes previously proposed semantics of statecharts by instantiating the abstract data type for event management and analyzes the resulting specific properties. Second, building on a previously defined conformance relation, it discusses two interpretations of stuttering. Third, it introduces a compact data structure for representing the statechart semantics that allows for efficient generation of test cases and easily supports both interpretations of stuttering.

Dirk Seifert, Steffen Helke, Thomas Santen

Databases

Conceptual Content Modeling and Management

Focused views on entities of interest, concrete or abstract ones, are often represented by texts, images, speech or other media. Such media views are communicated through visual or audio channels and stored persistently in appropriate containers. In this paper we extend a computational content-container model into a closely coupled content-concept model intended to capture more of the meaning, and improve the value, of content. Integrated content-concept views on entities are modeled by the notion of assets, and our asset language aims at two goals derived from extensive experience with entity modeling:

1. Expressiveness: according to Peirce [29] and others, entity modeling, and therefore also asset modeling, has to cover three different perspectives: an entity’s inherent characteristics (firstness categories); its relationships to other entities (secondness categories); and the systematics behind the first two perspectives (thirdness categories).

2. Responsiveness: according to Cassirer [8, 47] and others, entity modeling processes, in order to be successful, have to be open, i.e., users of an asset language must be able to adapt their asset models according to the requirements of the entity at hand, and dynamic in the sense that all aspects of an asset model must be subject to inspection and adaptation at any time.

Our current experiments with asset languages are motivated by the need for a better understanding and integration of content and concepts about application entities. We conclude by outlining a component-based implementation technology for open and dynamic asset systems.

Joachim W. Schmidt, Hans-Werner Sehring
A Relational Algebra for Functional Logic Deductive Databases

In this paper, we study the integration of functional logic programming and databases by presenting a data model and a query and data definition language. The data model is adopted from functional logic programming and allows complex values. The query and data definition language is based on algebra expressions built from a set of operators of an extended relational algebra. In addition, algebra expressions can be used for defining functions, as is typical in a functional logic program.

Jesús Manuel Almendros-Jiménez, Antonio Becerra-Terón
Implication of Functional Dependencies for Recursive Queries

After two decades of research in deductive databases, SQL99 [12] brings them to the foreground again, since SQL99 includes queries with linear recursion. Therefore some of the problems solved for the relational model demand our attention again. In this paper, we tackle the implication of functional dependencies (also known as the FD-FD implication problem) in the deductive model framework. The problem is as follows: given P, F, and f, where P is a Datalog program, F is a set of functional dependencies (fds) defined on the predicates of P, and f is an fd defined over the predicates of P, is it true that for every database d defined exclusively on the extensional predicates of P, d satisfies F implies that P(d), the output database, satisfies f? Unlike the implication problem of functional dependencies in the relational data model, this problem is undecidable for general Datalog programs. In this paper, we provide two methods to check whether a given set of fds will be satisfied by the output database (without computing that database) for a class of Datalog programs.

José R. Paramá, Nieves R. Brisaboa, Miguel R. Penabad, Ángeles S. Places
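The contrast with the relational case can be made concrete: there, FD implication reduces to the classical attribute-closure test, which is decidable and efficient (a standard textbook algorithm, not taken from this paper):

```python
def closure(attrs, fds):
    """Compute the attribute closure X+ of attrs under fds, where fds is a
    list of (lhs, rhs) pairs of attribute sets. Iterates to a fixpoint:
    whenever a rule's left-hand side is covered, absorb its right-hand side."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return closed

def fd_implied(fds, lhs, rhs):
    """F implies the fd lhs -> rhs iff rhs is contained in the closure of
    lhs. In the Datalog setting studied in the paper no such general test
    can exist, since the problem is undecidable there."""
    return set(rhs) <= closure(lhs, fds)

# Example: {A -> B, B -> C} implies A -> C (transitivity) but not C -> A.
fds = [({'A'}, {'B'}), ({'B'}, {'C'})]
print(fd_implied(fds, {'A'}, {'C'}))  # True
print(fd_implied(fds, {'C'}, {'A'}))  # False
```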
TeXOR: Temporal XML Database on an Object-Relational Database System

Storage costs are rapidly decreasing, making it feasible to store larger amounts of data in databases. This also makes it possible to keep previous versions of data in the databases, instead of only the last version. Recently, the amount of data available in XML has also been rapidly increasing. In this paper, we describe TeXOR, a temporal XML database system built on top of an object-relational database system. We describe the TXSQL query language used in TeXOR for querying temporal XML documents stored in the system, discuss storage alternatives for XML documents in such a system, and give some details about the implementation of the current TeXOR prototype.

Kjetil Nørvåg, Marit Limstrand, Lene Myklebust
Functional Dependencies, from Relational to XML

The flexibility of XML allows the same data to be represented in many different ways. Some representations may be better than others in that they require less storage or have less redundancy. In this paper we define functional dependencies in XML (XFDs) and investigate their effect on the design of XML documents. We then define two subtypes of XFDs, namely partial and transitive XFDs, which cause the same problems in XML document design as the corresponding types of FDs in relations. We further show that the removal of such types of XFDs can lead to a better document design. On the basis of this, we define the concept of upward XFDs and analyze its use in maximizing the nesting levels in XML documents without introducing redundancy. We further propose guidelines to nesting elements in XML documents.

Jixue Liu, Millist Vincent, Chengfei Liu
Data-Object Replication, Distribution, and Mobility in Network Environments

In this paper we address the problem of replication, allocation and mobility of large data-objects in network environments that may be exposed to significant changes in users’ location, usage and access patterns. In these circumstances, if the design is not adapted to the new changes, the system can undergo a severe degradation in data access costs and response time. In order to solve this problem, we propose a formal model to generate a new data-object allocation and replication. The model uses current state information of the system and usage statistical data collected during a given period, and adapts the system to the new users’ location and usage patterns so as to minimize communication costs. Implicitly the mathematical model handles mobility and temporality. Model tests have been conducted with satisfactory and promising results. The principles used in the model can be applied to other models for the optimization of resources in network environments like the Web.

Joaquín Pérez O., Rodolfo A. Pazos, René Santaolaya, Juan Frausto S., Guillermo Rodríguez O., Laura Cruz R., Maricela Bravo C.

Natural Language Processing

Multi-classification of Patent Applications with Winnow

The Winnow family of learning algorithms can cope well with large numbers of features and is tolerant to variations in document length, which makes it suitable for classifying large collections of large documents, like patent applications. Both the large size of the documents and the large number of available training documents for each class make this classification task qualitatively different from the classification of short documents (newspaper articles or medical abstracts) with few training examples, as exemplified by the TREC evaluations. This note describes recent experiments with Winnow on two large corpora of patent applications, supplied by the European Patent Office (EPO). We find that the multi-classification of patent applications is much less accurate than the mono-classification of similar documents. We describe a potential pitfall in multi-classification and show ways to improve the accuracy. We argue that the inherently greater noisiness of multi-class labeling is the reason that multi-classification is harder than mono-classification.

Cornelis H. A. Koster, Marc Seutter, Jean Beney
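The basic mono-class Winnow learner that this note builds on can be sketched in a few lines (our own illustrative version with binary presence features; the EPO experiments use a more refined variant):

```python
def winnow_train(docs, labels, n_features, epochs=5, alpha=2.0):
    """Plain positive Winnow: one weight per feature, multiplicative
    promotion/demotion on mistakes only. docs are sets of active feature
    indices (e.g. word presence); labels are 0/1; threshold = n_features/2."""
    theta = n_features / 2.0
    w = [1.0] * n_features
    for _ in range(epochs):
        for x, y in zip(docs, labels):
            pred = 1 if sum(w[i] for i in x) >= theta else 0
            if pred != y:
                # Promote active weights on a miss, demote on a false alarm.
                factor = alpha if y == 1 else 1.0 / alpha
                for i in x:
                    w[i] *= factor
    return w, theta

def winnow_predict(w, theta, x):
    return 1 if sum(w[i] for i in x) >= theta else 0

# Tiny example: a document is positive iff feature 0 is present.
docs = [{0, 1}, {0, 2}, {1, 2}, {3}]
labels = [1, 1, 0, 0]
w, theta = winnow_train(docs, labels, n_features=4)
print([winnow_predict(w, theta, x) for x in docs])  # [1, 1, 0, 0]
```

The multiplicative updates are what give Winnow its tolerance to many irrelevant features: on separable data its mistake bound grows only logarithmically with the feature count, which is why it suits high-dimensional text such as patents.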
Automatic Evaluation of Quality of an Explanatory Dictionary by Comparison of Word Senses

Words in an explanatory dictionary have different meanings (senses), described using natural language definitions. If the definitions of two senses of the same word are too similar, it is difficult to grasp the difference and thus difficult to judge which of the two senses is intended in a particular context, especially when such a decision is to be made automatically, as in the task of automatic word sense disambiguation. We suggest a method for formal evaluation of this aspect of the quality of an explanatory dictionary by calculating the similarity of different senses of the same word. We calculate the similarity between two given senses as the relative number of equal or synonymous words in their definitions. In addition to the general assessment of the dictionary, the individual suspicious definitions are reported for possible improvement. In our experiments we used the Anaya explanatory dictionary of Spanish. They show that about 10% of the definitions in this dictionary are substantially similar, which indicates rather low quality.

Alexander Gelbukh, Grigori Sidorov, SangYong Han, Liliana Chanona-Hernandez
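The similarity measure described in the abstract can be sketched as a word-overlap score (our own reconstruction; the normalisation and the synonymy lookup are assumptions, not the authors' exact formula):

```python
def sense_similarity(def1, def2, synonyms=()):
    """Share of words of one definition that are equal or synonymous to a
    word of the other, normalised by the shorter definition's length.
    synonyms is an iterable of word pairs treated as interchangeable."""
    w1, w2 = set(def1.lower().split()), set(def2.lower().split())
    syn = {frozenset(p) for p in synonyms}
    hits = sum(1 for a in w1
               if a in w2 or any(frozenset((a, b)) in syn for b in w2))
    return hits / min(len(w1), len(w2))

# Two suspiciously similar definitions of the same headword:
s = sense_similarity("large strong domestic animal",
                     "big strong farm animal",
                     synonyms=[("large", "big")])
print(s)  # 0.75: three of four words match, so this pair would be flagged
```

A threshold on this score then selects the "suspicious" sense pairs reported for lexicographic revision.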
An Approach to Automatic Construction of a Hierarchical Subject Domain for Question Answering Systems

We propose a new statistical algorithm for the automatic construction of subject domains that can be used in e-mail or Web question answering systems and in ontology generation. The domain hierarchy is extracted from electronic texts written in a natural language, e.g., English. During text processing, the quality and quantity of the information presented in the texts are evaluated, and hierarchical relationships between the pieces of text are then established depending on the derived data. Using this approach, we have created a question answering system which navigates the hierarchy based on a query analysis that includes evaluating the user’s conversance with the subject domain. In combination, these steps result in comprehensive and non-redundant answers.

Anna V. Zhdanova, Pavel V. Mankevich
Backmatter
Metadata
Title
Perspectives of System Informatics
edited by
Manfred Broy
Alexandre V. Zamulin
Copyright year
2003
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-39866-0
Print ISBN
978-3-540-20813-6
DOI
https://doi.org/10.1007/b94823