
2005 | Book

Research and Advanced Technology for Digital Libraries

9th European Conference, ECDL 2005, Vienna, Austria, September 18-23, 2005. Proceedings

Edited by: Andreas Rauber, Stavros Christodoulakis, A Min Tjoa

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

Since its inception in 1997, the European Conference on Research and Advanced Technology for Digital Libraries (ECDL) has come a long way, creating a strong interdisciplinary community of researchers and practitioners in the field of digital libraries. We are proud to present the proceedings of ECDL 2005, the ninth conference in this series, which, following Pisa (1997), Heraklion (1998), Paris (1999), Lisbon (2000), Darmstadt (2001), Rome (2002), Trondheim (2003), and Bath (2004), took place on September 18–23, 2005 in Vienna, Austria. ECDL 2005 featured separate calls for paper and poster submissions, resulting in 130 full papers and 32 posters being submitted to the conference. All papers were subject to a thorough peer-review process, with an 87-person-strong Program Committee and a further 68 additional reviewers from 35 countries from basically all continents sharing the tremendous review load, producing between three and four detailed reviews per paper. Based on these, as well as on the discussion that took place during a one-week on-line PC discussion phase, 41 papers were finally selected for inclusion in the conference program during a 1.5-day PC meeting, resulting in an acceptance rate of only 32%. Furthermore, 17 paper submissions were accepted for poster presentation, with an additional 13 posters being accepted from the poster submission track based on a simplified review process of 2–3 reviews per poster. Both the full papers and extended abstracts of the posters presented at ECDL 2005 are provided in these proceedings.

Table of Contents

Frontmatter

Digital Library Models and Architectures

Requirements Gathering and Modeling of Domain-Specific Digital Libraries with the 5S Framework: An Archaeological Case Study with ETANA

Requirements gathering and conceptual modeling are essential for the customization of digital libraries (DLs), to help address the needs of target communities. In this paper, we show how to apply the 5S (Streams, Structures, Spaces, Scenarios, and Societies) formal framework to support both tasks. The intuitive nature of the framework allows for easy and systematic requirements analysis, while its formal nature ensures the precision and correctness required for semi-automatic DL generation. Further, we show how 5S can help us define a domain-specific DL metamodel in the field of archaeology. Finally, an archaeological DL case study (from the ETANA project) yields informal and formal descriptions of two DL models (instances of the metamodel).

Rao Shen, Marcos André Gonçalves, Weiguo Fan, Edward Fox
On the Effective Manipulation of Digital Objects: A Prototype-Based Instantiation Approach

This paper elaborates on the design and development of an effective digital object manipulation mechanism that facilitates the generation of configurable Digital Library application logic, as expressed by collection management, cataloguing, and browsing modules. Our work aims to resolve the issue that digital object typing information can currently be utilized only by humans as a guide, and not by programs as a digital object type conformance mechanism. Drawing on the notions of the Object-Oriented Model, we propose a “type checking” mechanism, named digital object prototypes, that automates the conformance of digital objects to their type definitions. We pinpoint the practical benefits gained by our approach in the development of the University of Athens Digital Library, in terms of code reuse and configuration capabilities.

Kostas Saidis, George Pyrounakis, Mara Nikolaidou
LibraRing: An Architecture for Distributed Digital Libraries Based on DHTs

We present a digital library architecture based on distributed hash tables. We discuss the main components of this architecture and the protocols for offering information retrieval and information filtering functionality. We present an experimental evaluation of our proposals.

Christos Tryfonopoulos, Stratos Idreos, Manolis Koubarakis

Multimedia and Hypermedia Digital Libraries

Hierarchical Organization and Description of Music Collections at the Artist Level

As digital music collections grow, so does the need to organize them automatically. In this paper we present an approach to hierarchically organize music collections at the artist level. Artists are grouped according to similarity, which is computed using a web search engine and standard text retrieval techniques. The groups are described by words found on the webpages using term selection techniques and domain knowledge. We compare different term selection techniques, present a simple demonstration, and discuss our findings.

Elias Pampalk, Arthur Flexer, Gerhard Widmer
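As a rough illustration of the term selection step described in the abstract above (not the authors' actual technique; all names and data are invented), a tf-idf-style scorer can pick words that are frequent in the web pages associated with one artist group but rare in the collection overall:

```python
from collections import Counter
from math import log

def select_terms(group_docs, all_docs, k=5):
    """Rank candidate description terms for one artist group by a
    tf-idf-style score: frequent within the group, rare elsewhere.
    `group_docs` and `all_docs` are lists of token lists."""
    group_tf = Counter(t for doc in group_docs for t in doc)
    n_docs = len(all_docs)
    df = Counter()                      # document frequency per term
    for doc in all_docs:
        df.update(set(doc))
    scored = {t: tf * log(n_docs / df[t]) for t, tf in group_tf.items()}
    return [t for t, _ in sorted(scored.items(), key=lambda x: -x[1])[:k]]
```

Domain knowledge, as mentioned in the abstract, could then be applied as a filter over the ranked terms.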
A Comparison of Melodic Segmentation Techniques for Music Information Retrieval

Scientific research on accessing and retrieving music documents is becoming increasingly active, including the analysis of suitable features for content description and the development of algorithms to match relevant documents with queries. One of the challenges in this area is the possibility of extending textual retrieval techniques to the language of music. Music lacks explicit separators between its lexical units, so they have to be extracted automatically. This paper presents an overview of different approaches to melody segmentation aimed at extracting music lexical units. A comparison of the approaches is presented, showing their impact on index size and on retrieval effectiveness.

Giovanna Neve, Nicola Orio
The Effect of Collection Fusion Strategies on Information Seeking Performance in Distributed Hypermedia Digital Libraries

This paper reports the results of a user-centered experiment which examined the effect of parallel multi-database searching using automated collection fusion strategies on information seeking performance. Three conditions were tested in the experiment. Subjects in the first condition performed search tasks in a WWW-based distributed hypermedia digital library which did not support parallel, concurrent searching of multiple collections and did not offer any automated mechanism for source selection. Subjects in the second and third conditions performed parallel multi-database search tasks in the same library with the support of two automated collection fusion strategies (uniform and link-based), each solving the collection fusion problem using a different approach. The results show that information-seeking performance was positively affected when the link-based method was used. On the other hand, the uniform collection fusion method, which treats all the sub-collections in the same manner, does not present any benefit compared to information seeking environments in which users must manually select sources and parallel multi-database searching is not provided.

Michail Salampasis, John Tait

XML

A Native XML Database Supporting Approximate Match Search

XML is becoming the standard representation format for metadata. Metadata for multimedia documents, as for instance MPEG-7, require approximate match search functionalities to be supported in addition to exact match search. As an example, consider image search performed by using MPEG-7 visual descriptors. It does not make sense to search for images that are exactly equal to a query image. Rather, images similar to a query image are more likely to be searched. We present the architecture of an XML search engine where special techniques are used to integrate approximate and exact match search functionalities.

Giuseppe Amato, Franca Debole
XMLibrary Search: An XML Search Engine Oriented to Digital Libraries

The increase in the amount of data available in digital libraries calls for the development of search engines that allow users to find quickly and effectively what they are looking for. XML tagging makes it possible to add structural information to digitized content. These metadata offer new opportunities for a wide variety of new services. This paper describes the requirements that a search engine inside a digital library should fulfill, and presents a specific XML search engine architecture. This architecture is designed to index a large amount of text with structural tagging and to be web-available. The architecture has been developed and successfully tested at the Miguel de Cervantes Digital Library.

Enrique Sánchez-Villamil, Carlos González Muñoz, Rafael C. Carrasco
From Legacy Documents to XML: A Conversion Framework

We present an integrated framework for document conversion from legacy formats to XML. We describe the LegDoC project, aimed at automating the conversion of layout-oriented formats such as PDF, PS, and HTML into semantically-oriented XML annotations. A toolkit of components covers complementary techniques for logical document analysis and semantic annotation using machine learning methods. We use a real conversion project as a driving example to illustrate the different techniques implemented in the project.

Jean-Pierre Chanod, Boris Chidlovskii, Hervé Dejean, Olivier Fambon, Jérôme Fuselier, Thierry Jacquin, Jean-Luc Meunier
SCOPE – A Generic Framework for XML Based Publishing Processes

One of the objectives of the Open Access movement is to establish institutional repositories at universities and other research institutions in order to support self-archiving. Although many software solutions have been presented in recent years, they lack a seamless integration of authoring tools, support for authors, and other technical publication tools. This paper presents a formal approach to describing the software components applied in publishing processes. Additionally, it shows how this formal description leads to the technological basis for SCOPE (Service Core for Open Publishing Environments), a publishing platform for XML based publishing models. SCOPE is a framework intended for the integration of different publication components into a single platform.

Uwe Müller, Manuel Klatt

Building Digital Libraries

DAR: A Digital Assets Repository for Library Collections

The Digital Assets Repository (DAR) is a system developed at the Bibliotheca Alexandrina, the Library of Alexandria, to create and maintain the library's digital collections. The system introduces a data model capable of associating the metadata of different types of resources with the content, such that searching and retrieval can be done efficiently. The system automates the digitization process of library collections as well as the preservation and archiving of the digitized output, and provides public access to the collection through browsing and searching capabilities. The goal of this project is to build a digital resources repository by supporting the creation, use, and preservation of a variety of digital resources, as well as the development of management tools. These tools help the library preserve, manage, and share digital assets. The system is based on evolving standards for easy integration with web-based interoperable digital libraries.

Iman Saleh, Noha Adly, Magdy Nagi
Webservices Infrastructure for the Registration of Scientific Primary Data

Registration of scientific primary data, to make these data citable as a unique piece of work and not only part of a publication, has always been an important issue. In the context of the project “Publication and Citation of Scientific Primary Data”, funded by the German Research Foundation (DFG), the German National Library of Science and Technology (TIB) has become the first registration agency worldwide for scientific primary data. Registration has started in the field of earth science, but will be widened to other subjects in the future. This paper gives an overview of the technical realization of this important application area for a digital library.

Uwe Schindler, Jan Brase, Michael Diepenbroek
Incremental, Semi-automatic, Mapping-Based Integration of Heterogeneous Collections into Archaeological Digital Libraries: Megiddo Case Study

Automation is an important issue when integrating heterogeneous collections into archaeological digital libraries. We propose an incremental approach through intermediary- and mapping-based techniques. A visual schema mapping tool within the 5S [1] framework allows semi-automatic mapping and incremental global schema enrichment. 5S also helped speed up development of a new multi-dimension browsing service. Our approach helps integrate the Megiddo [2] excavation data into a growing union archaeological DL, ETANA-DL [3].

Ananth Raghavan, Naga Srinivas Vemuri, Rao Shen, Marcos A. Goncalves, Weiguo Fan, Edward A. Fox
Integrating Diverse Research in a Digital Library Focused on a Single Author

The works of a significant author are accompanied by a variety of artifacts ranging from the scholarly to the popular. In order to better support the needs of the scholarly community, digital libraries focused on the life and works of a particular author must be designed to assemble, integrate, and present the full scope of these artifacts. Drawing from our experiences with the Cervantes Project, we describe five intersecting domains that are common to similarly focused humanities research projects. Integrating the tools needed and the artifacts produced by each of these domains enables digital libraries to provide unique connections between diverse research communities.

Neal Audenaert, Richard Furuta, Eduardo Urbina, Jie Deng, Carlos Monroy, Rosy Sáenz, Doris Careaga

User Studies

A Fluid Interface for Personal Digital Libraries

An advanced interface is presented for fluid interaction in a personal digital library system. The system employs a zoomable planar representation of a collection using hybrid continuous/quantum treemap visualizations to facilitate navigation while minimizing cognitive load. The system is particularly well suited to user tasks which, in the physical world, are normally carried out by laying out a set of related documents on a physical desk — namely, those tasks that require frequent and rapid transfer of attention from one document in the collection to another. Discussed are the design and implementation of the system as well as its relationship to previous work.

Lance E. Good, Ashok C. Popat, William C. Janssen, Eric A. Bier
MedioVis – A User-Centred Library Metadata Browser

MedioVis is a visual information seeking system which was designed especially for library data. The objective was to create a system which simplifies and optimizes the user’s information seeking process and thus further motivates the user to browse the library stock. To enhance motivation, special attention was given to joy-of-use aspects during the design of the user interface. The primary user interface design is based on multiple coordinated views to offer a great variety of exploration possibilities in a direct-manipulative manner. To achieve self-explanatory usability for non-expert users, the development was accompanied by continuous user tests with casual and regular library users. At the end of the development process a comprehensive summative evaluation was conducted, comparing the efficiency and joy of use of MedioVis with KOALA, the existing web-based catalogue system of the library of the University of Konstanz. The results of this comparative evaluation show a significant improvement in the efficiency of the information seeking process with the help of MedioVis. The users also rated MedioVis significantly better than KOALA in all dimensions of its hedonic quality and appeal.

Christian Grün, Jens Gerken, Hans-Christian Jetter, Werner König, Harald Reiterer
Effectiveness of Implicit Rating Data on Characterizing Users in Complex Information Systems

Most user focused data mining techniques involve purchase pattern analysis, targeted at strictly-formatted database-like transaction records. Most personalization systems employ explicitly provided user preferences rather than implicit rating data obtained automatically by collecting users’ interactions. In this paper, we show that in complex information systems such as digital libraries, implicit rating data can help to characterize users’ research and learning interests, and can be used to cluster users into meaningful groups. Thus, in our personalized recommender system based on collaborative filtering, we employ a user tracking system and a user modeling technique to capture and store users’ implicit ratings. Also, we describe the effects (on community finding) of using four different types of implicit rating data.

Seonho Kim, Uma Murthy, Kapil Ahuja, Sandi Vasile, Edward A. Fox
Managing Personal Documents with a Digital Library

This paper presents a desktop system for managing personal documents. The documents can be of many types—text, spreadsheets, images, multimedia—and are organized in a personal “digital library”. The interface supports browsing over a wide variety of document metadata, as well as full-text searching. This extensive browsing facility addresses a significant flaw in digital library and file management software, both of which typically provide less support for browsing than for searching, and support relatively inflexible browsing methods. Three separate usability studies of a prototype—an expert evaluation, a learnability evaluation, and a diary study—were conducted to suggest design refinements, which were then incorporated into the final system.

Imene Jaballah, Sally Jo Cunningham, Ian H. Witten
The Influence of the Scatter of Literature on the Use of Electronic Resources Across Disciplines: A Case Study of FinELib

This paper reports on how disciplinary variation in the scatter of literature affects the searching and use of electronic information services (EIS) by university faculty. The data consist of a nationwide web-survey of the end-users of FinELib, The Finnish National Electronic Library. The results show that discipline and scatter of literature are significantly associated with the number and types of electronic databases used. The scatter of literature across several fields activates researchers to more frequently search for and use various types of EIS. Especially the results concerning search methods challenge previous hypotheses and suggest important changes brought by the digital environment.

Pertti Vakkari, Sanna Talja
Information Seeking by Humanities Scholars

This paper investigates the information seeking of humanities academics and scholars using digital libraries. It furthers existing work, most of which predates the wide availability of the Internet, by updating our knowledge of the information seeking techniques used by humanities scholars. We also report some of the patterns observed in query and term usage by humanities scholars, and relate these to the patterns they report in their own information seeking and the problems that they encounter. This insight is used to reveal the current gap between the skills of information seekers and the technologies that they use. Searches for ‘discipline terms’ prove to be particularly problematic.

George Buchanan, Sally Jo Cunningham, Ann Blandford, Jon Rimmer, Claire Warwick
ReadUp: A Widget for Reading

User interfaces for digital library systems must support a wide range of user activities. These include search, browsing, and curation, but perhaps the most important is the actual reading of the items in the library. Support for reading, however, is usually relegated to applications that are only loosely integrated with the digital library system. One reason for this is the absence of toolkit widget support for the activity of reading. Most user interface toolkits instead provide support for either text editing or text presentation, making it difficult to write applications that support reading well. In this paper we describe the origins, design, and implementation of a new Java Swing toolkit widget called ReadUp, which provides support for reading page images in a digital library application, and discuss briefly how it is being used.

William C. Janssen

Digital Preservation

The DSpace Open Source Digital Asset Management System: Challenges and Opportunities

Last year at the ECDL 2004 conference, we reported some initial progress and experiences developing DSpace as an open source community-driven project [8], particularly as seen from an institutional manager’s viewpoint. We also described some challenges and issues. This paper describes the progress in addressing some of those issues, and developments in the DSpace open source community. We go into detail about the processes and infrastructure we have developed around the DSpace code base, in the hope that this will be useful to other projects and organisations exploring the possibilities of becoming involved in or transitioning to open source development of digital library software. Some new challenges the DSpace community faces, particularly in the area of addressing required system architecture changes, are introduced. We also describe some exciting new possibilities that open source development brings to our community.

Robert Tansley, MacKenzie Smith, Julie Harford Walker
File-Based Storage of Digital Objects and Constituent Datastreams: XMLtapes and Internet Archive ARC Files

This paper introduces the write-once/read-many XMLtape/ARC storage approach for Digital Objects and their constituent datastreams. The approach combines two interconnected file-based storage mechanisms that are made accessible in a protocol-based manner. First, XML-based representations of multiple Digital Objects are concatenated into a single file named an XMLtape. An XMLtape is a valid XML file; its format definition is independent of the choice of the XML-based complex object format by which Digital Objects are represented. The creation of indexes for both the identifier and the creation datetime of the XML-based representation of the Digital Objects facilitates OAI-PMH-based access to Digital Objects stored in an XMLtape. Second, ARC files, as introduced by the Internet Archive, are used to contain the constituent datastreams of the Digital Objects in a concatenated manner. An index for the identifier of the datastream facilitates OpenURL-based access to an ARC file. The interconnection between XMLtapes and ARC files is provided by conveying the identifiers of ARC files associated with an XMLtape as administrative information in the XMLtape, and by including OpenURL references to constituent datastreams of a Digital Object in the XML-based representation of that Digital Object.

Xiaoming Liu, Lyudmila Balakireva, Patrick Hochstenbach, Herbert Van de Sompel
A No-Compromises Architecture for Digital Document Preservation

The Multivalent Document Model offers a practical, proven, no-compromises architecture for preserving digital documents of potentially any data format. We have implemented from scratch such complex and currently important formats as PDF and HTML, as well as older formats including scanned paper, UNIX manual pages, TeX DVI, and Apple II AppleWorks word processing. The architecture, stable since its definition in 1997, extends easily to additional document formats, defines a cross-format document tree data structure that fully captures semantics and layout, supports full expression of a format’s often idiosyncratic concepts and behavior, enables sharing of functionality across formats thus reducing implementation effort, can introduce new functionality such as hyperlinks and annotation to older formats that cannot express them, and provides a single interface (API) across all formats. Multivalent contrasts sharply with emulation and conversion, and advances Lorie’s Universal Virtual Computer with high-level architecture and extensive implementation.

Thomas A. Phelps, P. B. Watry
A Study into the Effect of Digitisation Projects on the Management and Stability of Historic Photograph Collections

The results of an ongoing interview study with custodians of historic photograph collections are reported. In particular the success or otherwise of recent digitisation projects is addressed, as well as the extent to which these projects have affected the long term management of the collections. We examine the effects of digitisation on the primary sources, their digitised surrogates and the relationship between the two in terms of selection, authenticity and representation. In most cases we have observed that the emphasis placed by the funding bodies on ‘accessibility’ of tangible numbers of resources is detrimental to these three other issues. However, we report in detail on one case study of a local history library where its digitisation work is embedded in core library activity and seen as successful and positive. We conclude by suggesting that their deliberate eschewing of short term project funding is a determining factor in their success.

Veronica Davis-Perkins, Richard Butterworth, Paul Curzon, Bob Fields

Metadata

Strategies for Reprocessing Aggregated Metadata

The OAI protocol facilitates the aggregation of large numbers of heterogeneous metadata records. In order to make harvested records useable in the context of an OAI service provider, the records typically must be filtered, analyzed and transformed. The CIC metadata portal harvests 450,000 records from 18 repositories at 9 U.S. Midwestern universities. The process implemented for transforming metadata records for this project supports multiple workflows and end-user interfaces. The design of the metadata transformation process required trade-offs between aggregation homogeneity and utility for purpose and pragmatic constraints such as feasibility, human resources, and processing time.

Muriel Foulonneau, Timothy W. Cole
A Hybrid Declarative/Procedural Metadata Mapping Language Based on Python

The Alexandria Digital Library (ADL) project has been working on automating the processes of building ADL collections and gathering the collection statistics on which ADL’s discovery system is based. As part of this effort, we have created a language and supporting programmatic framework for expressing mappings from XML metadata schemas to the required ADL metadata views. This language, based on the Python scripting language, is largely declarative in nature, corresponding to the fact that mappings can be largely—though not entirely—specified by crosswalk-type specifications. At the same time, the language allows mappings to be specified procedurally, which we argue is necessary to deal effectively with the realities of poor quality, highly variable, and incomplete metadata. An additional key feature of the language is the ability to derive new mappings from existing mappings, thereby making it easy to adapt generic mappings to the idiosyncrasies of particular metadata providers. We evaluate this language on three metadata standards (ADN, FGDC, and MARC) and three corresponding collections of metadata. We also note limitations, future research directions, and generalizations of this work.

Greg Janée, James Frew
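The declarative/procedural split described in the abstract above can be illustrated with a small hypothetical sketch (not the actual ADL language; all names are invented). Declarative entries are plain crosswalk pairs, procedural entries are Python callables, and a derived mapping inherits from a generic one while overriding individual rules:

```python
def apply_mapping(mapping, record):
    """Apply a crosswalk-style mapping to a source metadata record.
    Declarative entries name a source field; procedural entries are
    callables that compute the target value from the whole record."""
    view = {}
    for target, rule in mapping.items():
        view[target] = rule(record) if callable(rule) else record.get(rule)
    return view

# A generic (declarative) mapping ...
base = {"title": "dc_title", "date": "dc_date"}

# ... derived for one provider: inherit the base mapping, then
# procedurally override the date rule to normalize a provider quirk
# (here: keep only the year).
provider = dict(base, date=lambda r: r["dc_date"][:4])
```

Deriving `provider` from `base` mirrors the paper's point that generic mappings can be adapted to the idiosyncrasies of particular metadata providers without rewriting them.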
Using a Metadata Schema Registry in the National Digital Data Archive of Hungary

The National Digital Data Archive (NDDA) is an ongoing initiative of the Hungarian government that makes Hungary’s national cultural assets available in digital form. The NDDA features a decentralized OAI-based network of archives and service providers facilitating discovery and access to digitized objects. Authors’ participation in the project is described including the implementation of an NDDA service provider. This service provider is connected with an RDF-based metadata schema registry enabling the service to automatically adapt to the metadata schemas defined within the NDDA.

Csaba Fülöp, Gergő Kiss, László Kovács, András Micsik

Digital Libraries and e-Learning

Finding Appropriate Learning Objects: An Empirical Evaluation

The challenge of finding appropriate learning objects is one of the bottlenecks for end users in Learning Object Repositories (LORs). This paper investigates usability problems of search tools for learning objects. We present findings and recommendations of an iterative usability study conducted to examine the usability of a search tool used to find learning objects in ARIADNE Knowledge Pool System [1]. Findings and recommendations of this study are generalized to other similar search tools.

Jehad Najjar, Joris Klerkx, Riina Vuorikari, Erik Duval
Managing Geography Learning Objects Using Personalized Project Spaces in G-Portal

The personalized project space is an important feature in G-Portal that supports individual and group learning activities. Within such a space, its owner can create, delete, and organize metadata referencing learning objects on the Web. Browsing and querying are among the functions provided to access the metadata. In addition, new schemas can be added to accommodate metadata of diverse attribute sets. Users can also easily share metadata across different projects using a “copy-and-paste” approach. Finally, a viewer to support offline viewing of personalized project content is also provided.

Dion Hoe-Lian Goh, Aixin Sun, Wenbo Zong, Dan Wu, Ee-Peng Lim, Yin-Leng Theng, John Hedberg, Chew Hung Chang
Evaluation of the NSDL and Google for Obtaining Pedagogical Resources

We describe an experiment that measures the pedagogical usefulness of the results returned by the National Science Digital Library (NSDL) and Google. Eleven public school teachers from the state of Virginia (USA) evaluated a set of 38 search terms and the corresponding search results, based on the Standards of Learning (SOL) for Virginia Public Schools. Evaluations of search results were obtained for the NSDL (572 evaluations) and Google (650 evaluations). In our experiments, teachers ranked the links returned by Google as more relevant to the SOL than the links returned by the NSDL. Furthermore, Google’s ranking of educational material also showed some correlation with expert judgments.

Frank McCown, Johan Bollen, Michael L. Nelson
Policy Model for University Digital Collections

The access and reproduction policies of the digital collections of ten leading university digital libraries worldwide are classified according to factors such as the creation type of the material, acquisition method, copyright ownership etc. The relationship of these factors is analyzed, showing how acquisition methods and copyright ownership affect the access and reproduction policies of digital collections. We conclude with rules about which factors lead to specific policies. For example, when the library has the copyright of the material, the reproduction for private use is usually provided free with a credit to the source or otherwise mostly under fair use provisions, but the commercial reproduction needs written permission and fees are charged. The extracted rules, which show the common practice on access and reproduction policies, constitute the policy model. Finally, conventional policies are mapped onto digital policies.

Alexandros Koulouris, Sarantos Kapidakis

Text Classification in Digital Libraries

Importance of HTML Structural Elements and Metadata in Automated Subject Classification

The aim of the study was to determine how significance indicators assigned to different Web page elements (internal metadata, title, headings, and main text) influence automated classification. The data collection comprised 1000 Web pages in engineering, to which Engineering Information classes had been manually assigned. The significance indicators were derived using several different methods: (total and partial) precision and recall, semantic distance, and multiple regression. It was shown that for best results all the elements have to be included in the classification process. The exact way of combining the significance indicators turned out not to be overly important: using the F1 measure, the best combination of significance indicators yielded no more than 3% higher performance than the baseline.

Koraljka Golub, Anders Ardö
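The element-weighting idea described above can be sketched minimally as a weighted sum of per-element match scores. The element weights below are purely hypothetical placeholders, not the values the study derives via precision/recall, semantic distance, or regression:

```python
# Hypothetical per-element significance indicators (weights sum to 1.0).
ELEMENT_WEIGHTS = {"metadata": 0.3, "title": 0.25, "headings": 0.2, "text": 0.25}

def combined_score(element_scores: dict) -> float:
    """Weighted sum of per-element class-match scores, each in [0, 1].

    Elements missing from `element_scores` contribute nothing, so the
    combination degrades gracefully when a page lacks, say, headings.
    """
    return sum(ELEMENT_WEIGHTS[e] * element_scores.get(e, 0.0)
               for e in ELEMENT_WEIGHTS)
```

A page matching a class equally well in every element scores 1.0; a page matching only in the title scores just that element's weight.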
DL Meets P2P – Distributed Document Retrieval Based on Classification and Content

Peer-to-peer architectures are a potentially powerful paradigm for retrieving documents over networks of digital libraries, avoiding single points of failure through massive federation of (independent) information sources. Sharing files over P2P infrastructures is already immensely successful today, but is restricted to simple metadata matching. When it comes to retrieving complex documents, however, the capabilities provided by digital libraries are needed. Digital libraries have to cope with compound documents. Though some document parts (like embedded images) can be retrieved efficiently using metadata matching, text-based information needs different methods, such as full-text search. For effective querying of texts, however, information such as inverted document frequencies is also essential. Due to the distributed nature of P2P networks, such ‘collection-wide’ information poses severe problems, e.g. central updates whenever any document collection changes use up valuable bandwidth. We present a novel indexing technique that allows querying with collection-wide information with respect to different classifications, and show the effectiveness of our scheme for practical applications. We discuss our findings in detail and present simulations of the scheme’s efficiency and scalability.

Wolf-Tilo Balke, Wolfgang Nejdl, Wolf Siberski, Uwe Thaden
Automatic Detection of Survey Articles

We propose a method for detecting survey articles in a multilingual database. A survey article generally cites many important papers in a research domain, and this feature makes it possible to detect survey articles. We applied HITS, which was devised to retrieve Web pages using the notions of authority and hub. Important papers and survey articles can be regarded as authorities and hubs, respectively. It is therefore possible to detect survey articles by applying HITS to databases and selecting papers with outstanding hub scores. However, HITS does not take the contents of each paper into account, so the algorithm may mistake a paper that merely cites many principal papers for a survey article. We therefore improve HITS by analysing the contents of each paper. We conducted an experiment and found that HITS was useful for detecting survey articles, and that our method improved on HITS.

Hidetsugu Nanba, Manabu Okumura
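The hub/authority computation mentioned above can be sketched as a plain power iteration over a citation graph. This is a generic HITS implementation, not the authors' content-aware refinement; the toy graph in the usage below is invented for illustration:

```python
def hits(citations: dict, iters: int = 50):
    """Compute HITS hub and authority scores.

    `citations` maps each paper to the list of papers it cites.
    Returns (hub, authority) score dictionaries, L2-normalized.
    """
    papers = set(citations) | {c for cs in citations.values() for c in cs}
    hub = {p: 1.0 for p in papers}
    auth = {p: 1.0 for p in papers}
    for _ in range(iters):
        # Authority of p: sum of hub scores of papers citing p.
        auth = {p: sum(hub[q] for q, cs in citations.items() if p in cs)
                for p in papers}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub of p: sum of authority scores of papers p cites.
        hub = {p: sum(auth[c] for c in citations.get(p, [])) for p in papers}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth
```

A survey-like paper citing several authoritative papers ends up with an outstanding hub score, which is exactly the signal the detection method selects on.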

Searching

Focused Crawling Using Latent Semantic Indexing – An Application for Vertical Search Engines

Vertical search engines and web portals are gaining ground over general-purpose engines due to their limited size and their high precision for the domain they cover. The number of vertical portals has increased rapidly in recent years, making the importance of topic-driven (focused) crawlers evident. In this paper, we develop a latent semantic indexing classifier that combines link analysis with text content in order to retrieve and index domain-specific web documents. We compare its efficiency with other well-known web information retrieval techniques. Our implementation presents a different approach to focused crawling and aims to overcome the size limitations of the initial training data while maintaining a high recall/precision ratio.

George Almpanidis, Constantine Kotropoulos, Ioannis Pitas
Active Support for Query Formulation in Virtual Digital Libraries: A Case Study with DAFFODIL

Daffodil is a front-end to federated, heterogeneous digital libraries targeting strategic support of users during the information seeking process. This is done by offering a variety of functions for searching, exploring and managing digital library objects. However, the distributed search increases response time, and the conceptual model of the underlying search processes is inherently weaker. This makes query formulation harder, and the resulting waiting times can be frustrating. In this paper, we investigate the concept of proactive support during the user’s query formulation. To improve user efficiency and satisfaction, we implemented annotations, proactive support and error markers on the query form itself. These functions decrease the probability of syntactic or semantic errors in queries. Furthermore, the user is able to make better tactical decisions and feels more confident that the system handles the query properly. Evaluations with 30 subjects showed that user satisfaction improved, whereas no conclusive results were obtained for efficiency.

André Schaefer, Matthias Jordan, Claus-Peter Klas, Norbert Fuhr
Expression of Z39.50 Supported Search Capabilities by Applying Formal Descriptions

The wide adoption of the Z39.50 protocol by libraries exposes their ability to participate in a distributed environment. Although the protocol specifies a unified global access mechanism, query failures and/or inconsistent answers remain pending issues when searching many sources, due to variant or poor implementations. Eliminating these issues depends heavily on the ability of the client to make decisions prior to initiating search requests, utilizing knowledge of the supported search capabilities of each source. To effectively reformulate such requests, we propose a Datalog-based description for capturing knowledge about the supported search capabilities of a Z39.50 source. We assume that the accessible sources can answer some, but possibly not all, queries over their data, and we describe a model for their supported search capabilities using a set of parameterized queries, according to the Relational Query Description Language (RQDL) specification.

Michalis Sfakakis, Sarantos Kapidakis

Text Digital Libraries

A Comparison of On-Line Computer Science Citation Databases

This paper examines the differences and similarities between the two on-line computer science citation databases DBLP and CiteSeer. Entries in DBLP are inserted manually, while CiteSeer entries are obtained autonomously via a crawl of the Web and automatic processing of user submissions. CiteSeer’s autonomous citation database can be considered a form of self-selected on-line survey. It is important to understand the limitations of such databases, particularly when citation information is used to assess the performance of authors, institutions and funding bodies.

We show that the CiteSeer database contains considerably fewer single author papers. This bias can be modeled by an exponential process with intuitive explanation. The model permits us to predict that the DBLP database covers approximately 24% of the entire literature of Computer Science. CiteSeer is also biased against low-cited papers.

Despite their differences, the two databases exhibit citation distributions that are similar to each other but significantly different from those found in previous analyses of the Physics community. In both databases, we also observe that the number of authors per paper has been increasing over time.

Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G. Councill, C. Lee Giles
A Multi-document Summarization System for Sociology Dissertation Abstracts: Design, Implementation and Evaluation

The design, implementation and evaluation of a multi-document summarization system for sociology dissertation abstracts are described. The system focuses on extracting variables and their relationships from different documents, integrating the extracted information, and presenting the integrated information using a variable-based framework. Two important summarization steps – information extraction and information integration were evaluated by comparing system-generated output against human-generated output. Results indicate that the system-generated output achieves good precision and recall while extracting important concepts from each document, as well as good clusters of similar concepts from the set of documents.

Shiyan Ou, Christopher S. G. Khoo, Dion H. Goh
Compressing Dynamic Text Collections via Phrase-Based Coding

We present a new statistical compression method, which we call Phrase Based Dense Code (PBDC), aimed at compressing large digital libraries. PBDC compresses the text collection to 30–32% of its original size, permits keeping the text compressed at all times, and offers efficient on-line information retrieval services. The novelty of PBDC is that it supports continuous growth of the compressed text collection by automatically adapting the vocabulary both to new words and to changes in the word frequency distribution, without degrading the compression ratio. Text compressed with PBDC can be searched directly without decompression, using fast Boyer-Moore algorithms. It is also possible to decompress arbitrary portions of the collection. Alternative compression methods oriented to information retrieval focus on static collections and are thus less well suited to digital libraries.

Nieves R. Brisaboa, Antonio Fariña, Gonzalo Navarro, José R. Paramá

Panels

Does eScience Need Digital Libraries?

eScience has emerged as an important framework for dramatically rethinking the conduct of scientific research using information technology. There is an unparalleled opportunity for the international eScience and digital library communities to create shared infrastructure to support the conduct of science from end-to-end; i.e., from hypothesis generation, to collecting and analyzing scientific data, to the reporting of research outcomes, and the inclusion of scientific data and models in teaching and learning processes. For this vision to be realized, the two communities must establish a shared vision and research agenda encompassing several critical dimensions, including differences in theoretical and methodological approaches, and collaboration goals. Additionally, for the benefits of eScience and digital libraries to be fully realized, it is vital to establish a shared vision of the broader impact of this work for educators, learners, and the general public.

Tamara Sumner, Rachel Heery, Jane Hunter, Norbert Lossau, Michael Wright
Digital Libraries over the Grid: Heaven or Hell?
(Panel Description)

The last decade has seen unprecedented advances in network and distributed-system technologies, which have opened up the way for the construction of global-scale systems based on completely new conceptions of computation and sharing of resources. The dream of integrating unlimited levels of processing power, unlimited amounts of information, and an unlimited variety of services, and offering the entire package in a reliable and seamless fashion to widely distributed users, is quickly becoming reality. As Digital Libraries move towards more user-centric, proactive, collaborative functionality and application diversity, they should be among the first to take advantage of such environments. The long-term vision of the field for creating Dynamic Universal Knowledge Environments calls for intensive computation and processing of very large amounts of information; hence, the needs for the appropriate distributed architecture are pressing.

Donatella Castelli, Yannis Ioannidis

Posters

Management and Sharing of Bibliographies

Managing bibliographic data is a requirement for many researchers. The ShaRef system has been designed to fill the gap between public libraries and personal bibliographies, and provides an open platform for sharing bibliographic data among user groups.

Erik Wilde, Sai Anand, Petra Zimmermann
Legislative Digital Library: Online and Off-line Database of Laws

The paper presents the main issues that typically arise in the development of a legislative digital library. The great number of legislative documents that accumulate over time raises the need for electronic management of this content and the meta-information associated with it. The preparation, management and distribution to end users are explained in detail in this paper, which at the same time offers an architectural solution for the development of a similar library. Particular emphasis is put on automatic reference-linking mechanisms for legislative documents.

Viorel Dumitru, Adrian Colomitchi, Eduard Budulea, Stefan Diaconescu
DIRECT: A System for Evaluating Information Access Components of Digital Libraries

Digital Library Management Systems (DLMSs) generally manage collections of multimedia digitized data and include components that perform the storage, access, retrieval, and analysis of the collections of data. Recently, the trend in DLMS applications has been towards a components/services technology which is becoming more and more standardized [1,2]. The results of this new orientation are ad hoc solutions for the different components and services of a DLMS: the data repository, the data manager, the search and retrieval components, etc. We are particularly interested in the evaluation aspects, which range from measuring and quantifying the performance of the information access and extraction components of a DLMS to designing and developing an architecture for a system capable of supporting this kind of evaluation in the context of DLMSs [3,4].

Giorgio Maria Di Nunzio, Nicola Ferro
Modular Emulation as a Viable Preservation Strategy

Emulation is the only strategy to ensure long-term access to digital objects in their original environment. The National Library of the Netherlands (KB) and the Nationaal Archief of the Netherlands believe that emulation-based preservation is worth developing and has to be tested. This short paper proposes a new model for emulation called modular emulation that will allow us to develop a working prototype for the rendering of digital objects in the future.

Jeffrey van der Hoeven, Hilde van Wijngaarden
Retrieving Amateur Video from a Small Collection

Research on digital video libraries has been carried out in extensive and expensive projects (e.g. the Open Video Project [1], Físchlár [2], Informedia [3]). Small video collections have small budgets and cannot afford sophisticated techniques to put their material on-line. Yet even very basic digital video library features can be good enough to enlarge access to rarely seen material, e.g. folklore films from the 1920s to the 1990s owned by the National Centre for English Cultural Tradition (NATCECT). This material is unique but rarely used, as the archive is open only a few hours a week: digital access would make it widely available to scholars, students, and enthusiasts.

Daniela Petrelli, Dan Auld, Cathal Gurrin, Alan Smeaton
A Flexible Framework for Content-Based Access Management for Federated Digital Libraries

Recent advances in digital library technologies are making it possible to build federated discovery services which aggregate metadata from different digital libraries (data providers) and provide a unified search interface to users. In this work we develop a framework that enables data providers to control access to their content in the federation. We have built and tested such a framework based on XACML and Shibboleth.

K. Bhoopalam, K. Maly, F. McCown, R. Mukkamala, M. Zubair
The OAI Data-Provider Registration and Validation Service

I present a summary of recent use of the Open Archives Initiative (OAI) registration and validation services for data-providers. The registration service has seen a steady stream of registrations since its launch in 2002, and there are now over 220 registered repositories. I examine the validation logs to produce a breakdown of reasons why repositories fail validation. This breakdown highlights some common problems and will be used to guide work to improve the validation service.

Simeon Warner
An Effective Access Mechanism to Digital Interview Archives

The skill and knowledge of master workmen and artists are important information for digital libraries. Usually, disciples acquire this skill and knowledge through conversation with masters and by watching the masters’ works. They can therefore be conveyed to only a limited number of disciples, and are sometimes lost when masters and artists pass away. A digital library for skill and knowledge plays an important role in preserving them and conveying them to a large number of people. Since skill and knowledge are inherent in masters and artists, we first need to externalize them and represent them in an appropriate form.

Interviews with masters and artists are an effective way to record their skill and knowledge. They can capture various kinds of information, such as emotional behavior and the procedure of creative activity, as well as the verbal information in the conversation. Furthermore, interviews enable us to obtain this information from masters and artists without a heavy mental load. This characteristic is effective not only for gathering information on the skill and knowledge of masters and artists, but also for externalizing the knowledge of human beings in many fields.

Atsuhiro Takasu, Kenro Aihara
A Semantic Structure for Digital Theses Collection Based on Domain Annotations

Search performance can be greatly improved by describing data using Natural Language Processing (NLP) tools to create new metadata for digital libraries. In this paper, a methodology is presented that uses specific domain knowledge to improve user requests. This domain knowledge is based on concepts, extracted from the documents themselves, used as “semantic metadata tags” to annotate XML documents. We present the process followed to define and add new XML semantic metadata to a digital library of scientific theses. Using these new metadata, an ontology is also built to complete the annotation process. Effective information retrieval is achieved by an intelligent system based on our XML semantic metadata and a domain ontology.

Rocío Abascal, Béatrice Rumpler, Suela Berisha-Bohé, Jean Marie Pinon
Towards Evaluating the Impact of Ontologies on the Quality of a Digital Library Alerting System

Advanced personalization techniques are required to cope with the novel challenges posed by attribute-rich digital libraries. At the heart of our deeply personalized alerting system is a single extensible preference model that serves all purposes, in conjunction with our search technology Preference XPath and XML-based semantic annotations of digital library objects. In this paper we focus on the impact of automatic query expansion by ontologies. First results indicate that the use of ontologies improves the quality of the result set and generates further results of higher quality.

Alfons Huhn, Peter Höfner, Werner Kießling
Building Semantic Digital Libraries: Automated Ontology Linking by Associative Naïve Bayes Classifier

In this paper, we present a new classification method, called Associative Naïve Bayes (ANB), to associate MEDLINE citations with Gene Ontology (GO) terms. We define the concept of class-support to find frequent itemsets and the concept of class-all-confidence to find interesting itemsets. Empirical test results on three MEDLINE datasets show that ANB is superior to the naïve Bayes classifier. The results also show that ANB outperforms the state-of-the-art Large Bayes classifier.

Hyunki Kim, Myung-Gil Jang, Su-Shing Chen
Evaluation of a Collaborative Querying System

We report evaluation results for a collaborative querying environment. Our results show that compared with traditional information retrieval systems, collaborative querying can lead to faster information seeking when users perform unspecified tasks.

Lin Fu, Dion Hoe-Lian Goh, Schubert Shou-Boon Foo
Aiding Comprehension in Electronic Books Using Contextual Information

A person reading a book needs to gain insights based on the text. In most books, stories, themes, and references are organized structurally and purposefully. In previous work, we presented the design of an e-Book user interface that reveals this multi-structural information to support reading for comprehension [1]. In this paper, we describe techniques for discovering and representing the narrative structure of e-Books, and describe the user interface components for revealing this narrative structure to readers. We chose the e-Bible as our corpus and named our user interface "iSee", meaning "I see what I read".

Yixing Sun, David J. Harper, Stuart N. K. Watt
An Information Foraging Tool

Electronic document repositories continue to expand rapidly; public collections, for instance the Google index, contain up to 8 billion individual items. Private electronic archives, maintained by companies, governments and other bodies grow at similar rates. While search techniques have scaled to manage these vast collections, most interfaces between search engines and searchers, usually based on a ranked list, are increasingly insufficient. This paper explains how Information Foraging Theory was applied to create visualisations of query resultsets which, when embedded in an application that contained tools to manipulate the visualisation, helped alleviate the deficiencies of the ranked list.

Cathal Hoare, Humphrey Sorensen
mod_oai: An Apache Module for Metadata Harvesting

We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH is the de facto standard for metadata exchange in digital libraries and allows repositories to expose their contents in a structured, application-neutral format with semantics optimized for accurate incremental harvesting. mod_oai differs from other OAI-PMH implementations in that it optimizes harvesting web content by building OAI-PMH capability into the Apache server.

Michael L. Nelson, Herbert Van de Sompel, Xiaoming Liu, Terry L. Harrison, Nathan McFarland
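The harvesting side of the OAI-PMH exchange that mod_oai serves can be sketched with a minimal client: build a protocol request URL and pull record identifiers out of the XML response. The base URL and sample response below are hypothetical, and only the ListIdentifiers verb is shown:

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_identifiers_url(base_url: str, metadata_prefix: str = "oai_dc",
                         resumption: str = None) -> str:
    """Build an OAI-PMH ListIdentifiers request URL.

    A resumptionToken, when present, continues a prior partial harvest
    and must be sent without the other request arguments.
    """
    params = {"verb": "ListIdentifiers"}
    if resumption:
        params["resumptionToken"] = resumption
    else:
        params["metadataPrefix"] = metadata_prefix
    return base_url + "?" + urlencode(params)

def parse_identifiers(xml_text: str) -> list:
    """Extract record identifiers from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    return [h.findtext(OAI_NS + "identifier")
            for h in root.iter(OAI_NS + "header")]
```

An incremental harvester would additionally send `from`/`until` datestamps and loop on the resumption token until the server omits it.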
Using a Path-Based Thesaurus Model to Enhance a Domain-Specific Digital Library

Our research focuses on providing easy access to interdisciplinary information in the natural resource management domain [3] so users can more readily benefit from previous scientific findings, assessments, and decisions. Because of the widespread use of specialized terminology, our work focuses on extending a traditional thesaurus model [1] to properly represent and exploit the broad range of terms in a digital library designed for natural resource management.

Mathew J. Weaver, Lois Delcambre, Timothy Tolle, Marianne Lykke Nielsen
Generating and Evaluating Automatic Metadata for Educational Resources

Metadata provides a higher-level description of digital library resources and serves as a searchable record for browsing and accessing digital library content. However, manually assigning metadata is a resource-consuming task for which Natural Language Processing (NLP) can provide a solution. This poster coalesces the findings from research and development accomplished across two multi-year digital library metadata generation and evaluation projects and suggests how the lessons learned might benefit digital libraries with the need for high-quality, but efficient metadata assignment for their resources.

Elizabeth D. Liddy, Jiangping Chen, Christina M. Finneran, Anne R. Diekema, Sarah C. Harwell, Ozgur Yilmazel
Web Service Providers: A New Role in the Open Archives Initiative?
Extended Abstract

Service Oriented Computing [1] is consolidating as the dominant paradigm for software development in this decade. The support it has received from researchers, practitioners and, most importantly, the software industry demonstrates the suitability of the approach. This means that current software systems must evolve in this direction in order to keep aligned with technology, providing a set of services that can be invoked by programs instead of end users.

Manuel Llavador, José H. Canós, Marcos R. S. Borges
DiCoMo: An Algorithm Based Method to Estimate Digitization Costs in Digital Libraries

Estimating web-content production costs is a very difficult task. It is hard to make exact predictions due to the great number of unknown factors. However, digitization projects need a precise idea of the economic costs and times involved in developing their contents. As with software development projects, incorrect estimates give way to delays and cost overruns. Based on methods used in Software Engineering for predicting software development costs, like COCOMO [1] and Function Points [2], and using historical data gathered during five years of work at the Miguel de Cervantes Digital Library, where more than 12,000 books were digitized, we have refined an equation for digitization cost estimates named DiCoMo (Digitization Cost Model). This method can be adapted to different production processes, such as the production of digital XML or HTML texts using scanning plus OCR and human proofreading, or the production of digital facsimiles (scanned images without OCR). The a priori estimates are improved as the project evolves by means of adjustments based on real data obtained from previous stages of the production process. Each estimate is thus a refinement obtained from the work done so far.

Alejandro Bia, Jaime Gómez
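The abstract above does not reproduce the DiCoMo equation itself, but COCOMO-style models typically take a nonlinear size-times-multipliers form. The following sketch illustrates that general shape only; the function name and all constants are hypothetical, not DiCoMo's calibrated values:

```python
def digitization_cost(pages: int, a: float = 5.0, b: float = 1.05,
                      difficulty: float = 1.0) -> float:
    """COCOMO-style effort estimate: cost = a * size**b * multiplier.

    With b slightly above 1, cost grows a little faster than linearly
    with collection size; `difficulty` models the production process
    (e.g. OCR plus proofreading vs. facsimile scanning).
    """
    return a * pages ** b * difficulty
```

In a DiCoMo-like workflow, the constants would be re-fitted from the actual costs observed in completed stages, refining later estimates as the project evolves.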
Adapting Kepler Framework for Enriching Institutional Repositories: An Experimental Study

There is a growing trend among academic and research organizations towards establishing OAI-compliant institutional repositories. ePrints@IISc (http://eprints.iisc.ernet.in/) is the institutional repository of the Indian Institute of Science (IISc), Bangalore. Though the repository is growing steadily, mediated submission by the ePrints@IISc staff is the predominant mode of enriching it. We have been exploring viable means of getting our researchers to contribute more actively to the repository. Observations have recently been made as to why researchers might be reluctant to contribute to central repositories [1]. It has been suggested that it might be useful to provide researchers with tools to easily create and share Personal Digital Repositories (PDRs), designed to organize and facilitate their research and learning agendas. The collection in a PDR is built and managed by the scholar based on individual needs. A network of such PDRs could form the basis for a bottom-up, organic approach to enriching centralized institutional repositories.

A. Ramnishath, Francis Jayakanth, Filbert Minj, T. B. Rajashekar
The Construction of a Chinese Rubbings Digital Library: An Attempt in Preserving and Utilizing Chinese Cultural Heritage Materials

China is a country with an ancient civilization going back 5,000 years. Keeping records on inscriptions is an important method of preserving the memory of Chinese history and culture. Rubbings are important components of ancient Chinese books, and are the main source for people to learn, study, and research history. The construction of a Chinese rubbings digital library is an attempt to solve the problems of preserving and utilizing cultural heritage materials. This poster will discuss the following topics: (1) technical process of constructing a Chinese rubbings digital library; (2) formulating principles and designing metadata standards for the Chinese rubbings digital library; (3) introduction of four prototype databases; and (4) analysis of existing problems in building a rubbings digital library such as data capacity, system functions, metadata standards, and international cooperation.

Guohui Li, Michael Bailou Huang
Policy Model for National and Academic Digital Collections

The access and reproduction policies of the digital collections of fifteen leading academic and national digital libraries worldwide are classified according to factors such as the creation type of the material, acquisition method and copyright ownership. The relationship between these factors and policies is analyzed and quantitative remarks are extracted. We propose a policy model for the digital content of national and academic libraries. The model consists of rules, supplemented by their exceptions, about which factors lead to specific policies. We derive new policy rules on access and reproduction when different copyright terms apply. We conclude with findings on policies. Finally, we compare national and academic library policies, highlighting interesting similarities and differences.

Alexandros Koulouris, Sarantos Kapidakis
A Framework for Supporting Common Search Strategies in DAFFODIL

Daffodil is a front-end to federated, heterogeneous digital libraries targeting strategic support of users during the information seeking process by offering a variety of functions for searching, exploring and managing digital library objects. In the process of searching for information, common strategies and tactics emerge that can be reused in different searches and different contexts. This poster presents the framework that will be used to build a search support system providing the means to define and recognize such common strategies and tactics, to save and reuse them, to build larger search plans from these parts, and to support automatic execution of partial or complete search strategies.

Sascha Kriewel, Claus-Peter Klas, Sven Frankmölle, Norbert Fuhr
Searching Cross-Language Metadata with Automatically Structured Queries

When searching metadata, it can be useful to detect expressions in the query that should be searched for in specific fields (for instance, person names might correspond to an “author” field). In [1], it was shown that automatically structured queries (matching title, abstract, author and publication fields) improved effectiveness when searching the ACM, CITIDEL and NDLTD Computing Digital Libraries.

Víctor Peinado, Fernando López-Ostenero, Julio Gonzalo, Felisa Verdejo
Similarity and Duplicate Detection System for an OAI Compliant Federated Digital Library

The Open Archives Initiative (OAI) is making it feasible to build high-level services, such as a federated search service that harvests metadata from different data providers using the OAI Protocol for Metadata Harvesting (OAI-PMH) and provides a unified search interface. There are numerous challenges in building and maintaining a federation service, one of which is managing duplicates. Detecting exact duplicates, where two records have identical sets of metadata fields, is straightforward. The problem arises when two or more records differ slightly, due to data entry errors, for example. Many duplicate detection algorithms exist, but they are computationally intensive for a large federated digital library. In this paper, we propose an efficient duplicate detection algorithm for a large federated digital library like Arc.

Haseebulla M. Khan, Kurt Maly, Mohammad Zubair
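A near-duplicate check of the kind motivated above might, for instance, compare character-shingle sets of normalized titles. This naive all-pairs sketch (with invented sample records) illustrates the problem, not the efficient algorithm the paper proposes:

```python
import re

def shingles(text: str, k: int = 3) -> set:
    """Character k-gram set over a lowercased, punctuation-free string."""
    s = re.sub(r"\W+", " ", text.lower()).strip()
    return {s[i:i + k] for i in range(max(len(s) - k + 1, 1))}

def jaccard(a: set, b: set) -> float:
    """Set overlap in [0, 1]; 1.0 means identical shingle sets."""
    return len(a & b) / (len(a | b) or 1)

def near_duplicates(records: list, threshold: float = 0.8) -> list:
    """Index pairs of records whose titles are near-identical.

    O(n^2) comparison; a real federation like Arc needs blocking or
    hashing to avoid comparing every pair.
    """
    sigs = [shingles(r["title"]) for r in records]
    return [(i, j)
            for i in range(len(sigs))
            for j in range(i + 1, len(sigs))
            if jaccard(sigs[i], sigs[j]) >= threshold]
```

Normalization absorbs trivial entry differences (case, punctuation), while the threshold tolerates small typos without merging genuinely distinct records.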
Sharing Academic Integrity Guidance: Working Towards a Digital Library Infrastructure

This poster draws on the ‘Digital Libraries in Support of Innovative Approaches to Learning and Teaching in Geography’ (http://www.dialogplus.org) project under which geography teachers in two UK and two US universities are collaborating in the creation and sharing of reusable online learning activities. A specific aim of the project has been to explore the use of digital library (DL) infrastructures to enable the sharing of learning objects between the participating institutions.

This poster presents a brief overview of the broader project, but focuses on one case study drawn from our programme of work, whereby a learning activity or ‘nugget’ concerned with academic integrity, originally developed at Pennsylvania State University (PSU) in the USA for use by distance learning masters students, has subsequently been repurposed for campus based students at the Universities of Southampton and Leeds in the UK.

Samuel Leung, Karen Fill, David DiBiase, Andy Nelson
Supporting ECDL’05 Using TCeReview

Conference management constitutes a field in digital libraries that includes tasks such as paper-to-reviewer assignment and session compilation. These tasks depend on paper-to-topic assignment. TCeReview addresses the automatic organization of text documents and enhances conventional conference management applications by incorporating a text classification module. This paper presents the results obtained during the empirical evaluation of TCeReview as applied at ECDL’05.

Andreas Pesenhofer, Helmut Berger, Andreas Rauber
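Paper-to-topic assignment of the kind the abstract describes can be sketched with a simple bag-of-words classifier. The topic names, descriptions, and nearest-centroid approach are illustrative assumptions, not TCeReview's actual module:

```python
import math
from collections import Counter

def tokens(text: str) -> list:
    return text.lower().split()

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical conference topics with short keyword descriptions
topics = {
    "metadata": "metadata harvesting oai schema registry records",
    "user studies": "user evaluation interface study usability",
}
profiles = {name: Counter(tokens(desc)) for name, desc in topics.items()}

def assign_topic(abstract: str) -> str:
    """Assign the topic whose keyword profile is most similar to the abstract."""
    vec = Counter(tokens(abstract))
    return max(profiles, key=lambda t: cosine(vec, profiles[t]))

print(assign_topic("harvesting metadata records with the oai protocol"))
# "metadata"
```

A production system would train on labeled abstracts rather than hand-written keyword lists, but the pipeline shape (vectorize, compare, pick the best topic) is the same.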
ContentE: Flexible Publication of Digitised Works with METS

This poster addresses the problem of the publication of digitized works. It presents the solution developed at the National Library of Portugal, where nearly one million images were created in the last year from a wide range of original genres (manuscripts, maps, posters, books, newspapers, etc.). The solution is based on a tool named ContentE, which supports the creation, import and editing of structural descriptions of works, making it possible to record them in XML using the METS schema. The tool also manages collections of style sheets, making it possible to create multiple publication copies, such as XHTML sites. The solution can be used as a standalone tool with a graphical user interface, or embedded in a web service for automatic publishing.

José Borbinha, Gilberto Pedrosa, João Penas
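A structural description of a digitized work in METS, as the abstract describes, pairs a file section (the page images) with a structMap (their logical order). A minimal sketch of building such a document, with hypothetical labels and file names, could look like this; it is not ContentE's actual output:

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

def build_mets(work_label: str, image_files: list) -> ET.Element:
    """Build a minimal METS document: a file section listing the page
    images and a physical structMap ordering them."""
    mets = ET.Element(f"{{{METS_NS}}}mets", {"LABEL": work_label})
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "image"})
    struct = ET.SubElement(mets, f"{{{METS_NS}}}structMap", {"TYPE": "PHYSICAL"})
    book = ET.SubElement(struct, f"{{{METS_NS}}}div",
                         {"TYPE": "book", "LABEL": work_label})
    for i, path in enumerate(image_files, start=1):
        fid = f"F{i:04d}"
        f = ET.SubElement(grp, f"{{{METS_NS}}}file", {"ID": fid})
        ET.SubElement(f, f"{{{METS_NS}}}FLocat",
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": path})
        page = ET.SubElement(book, f"{{{METS_NS}}}div",
                             {"TYPE": "page", "ORDER": str(i)})
        ET.SubElement(page, f"{{{METS_NS}}}fptr", {"FILEID": fid})
    return mets

doc = build_mets("Example Newspaper", ["page1.jpg", "page2.jpg"])
```

Separating structure (structMap) from content files is what lets style sheets render the same METS record as multiple publication copies.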
The UNIMARC Metadata Registry

This poster describes the first steps in creating a metadata registry for the UNIMARC formats. The registry aims to hold formal descriptions of the structure of the formats, keeping track of their versions, as well as registering textual descriptions in multiple languages. These structural representations are recorded in XML. The poster focuses in particular on the results already available for the bibliographic format: its online textual publication and the automatic validation of bibliographic records.

José Borbinha, Hugo Manguinhas
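Automatic validation of bibliographic records against a structural description, as mentioned in the abstract, can be sketched as a check of field tags and subfield codes. The simplified structure table and error messages below are illustrative assumptions, not the registry's actual schema:

```python
# Hypothetical structural description: field tag -> allowed subfield codes
structure = {
    "200": {"a", "f"},   # title and statement of responsibility
    "700": {"a", "b"},   # personal name, primary responsibility
}

def validate_record(record: dict) -> list:
    """Return a list of violations of the structural description."""
    errors = []
    for tag, subfields in record.items():
        if tag not in structure:
            errors.append(f"unknown field {tag}")
            continue
        for code in subfields:
            if code not in structure[tag]:
                errors.append(f"field {tag}: unknown subfield ${code}")
    return errors

print(validate_record({"200": ["a"], "999": ["x"]}))
# ["unknown field 999"]
```

Because the registry records the structure in XML, such a table could be generated from the registry itself, so validation stays in sync with each format version.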
Developing a Computational Model of “Quality” for Educational Digital Libraries

This poster will present the results of a pilot study that examined the efficacy of a computational model to support users in determining quality in educational digital libraries. The subsequent research design of a larger follow-on project will also be presented. It is anticipated that the conceptual and computational models that will be created for scaffolding quality judgments about library resources can be empirically validated, and ultimately integrated into digital library tools and services.

Tamara Sumner, Mary Marlino, Myra Custard
Backmatter
Metadata
Title
Research and Advanced Technology for Digital Libraries
Edited by
Andreas Rauber
Stavros Christodoulakis
A Min Tjoa
Copyright year
2005
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-31931-3
Print ISBN
978-3-540-28767-4
DOI
https://doi.org/10.1007/11551362