2011 | Book

Web Information Systems and Technologies

6th International Conference, WEBIST 2010, Valencia, Spain, April 7-10, 2010, Revised Selected Papers

About this book

This book contains the thoroughly refereed and revised best papers from the 6th International Conference on Web Information Systems and Technologies, WEBIST 2010, held in Valencia, Spain, in April 2010, organized by the Institute for Systems and Technologies of Information, Control and Communication (INSTICC), in collaboration with ACM SIGMIS and co-sponsored by the Workflow Management Coalition (WFMC).

The 21 papers presented in this book were carefully reviewed and selected from 205 submissions. The papers are grouped into four parts on Internet Technology; Web Interfaces and Applications; Society, e-Business, and e-Government; and Web Intelligence.

Table of contents

Frontmatter

Internet Technology

How People and Stakeholders Shape Standards: The Case of IEEE 802.11
Abstract
Following the ‘social shaping of technology’ approach, this paper provides a brief discussion of the relations that exist between different stakeholders in ICT standardisation. It then discusses the impact exerted by the individuals who populate a standards body’s working group, and how this body’s voting rules impact its final standards. The paper primarily draws upon a qualitative empirical study. In particular, it will use the IEEE 802.11 committee as a real-world sample group to further highlight the issues discussed more theoretically above.
Kai Jakobs
Semantic Building Information Model and Multimedia for Facility Management
Abstract
In the field of civil engineering, the proliferation of stakeholders and the heterogeneity of modeling tools detract from the quality of the design process, construction and building maintenance. In this paper, we present a Web-based platform that lets geographically dispersed project participants, from facility managers and architects to electricians and plumbers, directly use and exchange project documents in a centralized virtual environment using a simple Web browser. A 3D visualization lets participants move around in the building being designed and obtain information about the objects that compose it. This approach is based both on a semantic architecture called CDMF and on IFC 2x3. Our framework, based on Building Information Modeling features, facilitates data maintenance (data migration, model evolution) during the building lifecycle and reduces the volume of data.
Christophe Nicolle, Christophe Cruz
Service Reliability Assessment in a Composition Environment
Abstract
A service composition environment is one where a service interacts with other services to form a composite application. In such an environment, the reliability of an individual service is influenced by the reliability figures of the services it interacts with. Reliability measuring procedures in the past tend to ignore this influence and calculate the individual reliabilities of services in isolation. In this work, we present a technique which allows us to calculate the reliability of a service incorporating the influence of the interacting services. This is done by representing the entire service domain as a continuous time Markov chain and finding the ‘failure distance’ of each service. The failure distance of a service is an expression of its reliability. The proposed technique is validated by running experiments in a simulated environment wherein services are randomly made to fail and service compositions are formed based on reliability figures of individual services. It is shown that the compositions formed utilizing the proposed technique are much more reliable than those formed with service reliability being calculated in isolation.
Abhishek Srivastava, Paul G. Sorenson
Automatic Web Service Tagging Using Machine Learning and WordNet Synsets
Abstract
The importance of Web services comes from the fact that they are an important means to realize SOA applications. Their increasing popularity has led to the emergence of a huge number of services. Therefore, finding a particular service in this large service space can be a hard task. User tags have proven to be a useful technique to smooth the browsing experience in large document collections. Some service search engines offer service tagging. It is usually done manually by the providers and the users of the services, which can be a fairly tedious and error-prone task. In this paper we propose an approach for tagging Web services automatically. It adapts techniques from text mining and machine learning to extract tags from WSDL descriptions. Then it enriches these tags by extracting relevant synonyms using WordNet. We validated our approach on a corpus of 146 services extracted from Seekda.
Zeina Azmeh, Jean-Rémy Falleri, Marianne Huchard, Chouki Tibermacine
Minimal-Footprint Middleware to Leverage Qualified Electronic Signatures
Abstract
Qualified electronic signatures are recognized as being equivalent to handwritten signatures and are supported by EU legislation. They require a secure signature creation device (SSCD) such as a smart card. This paper presents a novel approach for the integration of smart cards in web applications without the requirement to install dedicated software on the user’s computer. The signature creation process is split into two parts: One part is performed on the server side and the other part (requiring access to functions of the secure signature creation device) is deployed and executed as a lightweight component in the user’s browser on demand. This significantly facilitates the usage of smart cards for the creation of qualified electronic signatures and therefore counteracts their low market penetration all over Europe. The approach has meanwhile attracted attention in various Member States and proved ideal for the quick integration and deployment of a large number of diverse and rapidly evolving SSCDs.
Clemens Orthacker, Martin Centner
Structured Collaborative Tagging: Is It Practical for Web Service Discovery?
Abstract
One of the key requirements for the success of Service Oriented Architecture is the discoverability of Web services. However, public services suffer from a lack of metadata. Current methods to provide such metadata are impractical for the volume of services published on the Web: they are too expensive to be implemented by a service broker, and too difficult to be used for retrieval. We introduce structured collaborative tagging to address these issues. Here, a user tags not only the aspects relevant to her but also suggested ones (input, output and behavior). The cost, performance and usability of the proposed technique, obtained during the Semantic Service Selection 2009 contest, are reported. The results suggest that there is no “free lunch.” While the method is user-friendly and supports effective retrieval, it still involves the cost of attracting a community, and is practical only as a complementary method. The analysis shows this is due to users’ autonomy as to what, when and how to tag.
Maciej Gawinecki, Giacomo Cabri, Marcin Paprzycki, Maria Ganzha

Web Interfaces and Applications

In-Context Annotations for Refinding and Sharing
Abstract
Annotations support understanding, interpretation, sensemaking and scannability. As valuable as in paper-based contexts, digital online annotations provide several benefits for annotators and collaborators. To further explore the real benefits of online annotations, we implemented a simple Web Annotation tool, SpreadCrumbs, to support our studies. The tool provides a simple annotation mechanism, simulating real-world paper-based annotations. In addition, the tool supports search, sharing capabilities and social navigation. We conducted a series of user studies that empirically demonstrates the benefits of “in-context” annotations for refinding and sharing.
Ricardo Kawase, Eelco Herder, George Papadakis, Wolfgang Nejdl
Concept Signatures and Semantic Drift
Abstract
Ontology evolution is the process of incrementally and consistently adapting an existing ontology to changes in the relevant domain. Semantic drift refers to how ontology concepts’ intentions gradually change as the domain evolves. Normally, a semantic drift captures small domain changes that are hard to detect with traditional ontology management tools or ontology learning methods, but may be important to the maintenance of the ontology. This paper discusses a new approach to detecting semantic drift that makes use of concept signatures reflecting the textual references to concepts over time. Comparing how signatures change over time, we see how concepts’ semantic content evolves and how their relationships to other concepts gradually reflect these changes. An experiment with the DNV’s business sector ontology from 2004 and 2008 demonstrates the value of this approach to ontology evolution.
Jon Atle Gulla, Geir Solskinnsbakk, Per Myrseth, Veronika Haderlein, Olga Cerrato
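The idea of comparing concept signatures over time can be illustrated with a minimal sketch. It assumes that a signature is a bag of normalized term frequencies built from texts that mention the concept, and that drift is measured as one minus the cosine similarity of two signatures. Both choices, and the function names `signature`, `cosine` and `drift`, are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def signature(texts: Iterable[str]) -> Dict[str, float]:
    """Build a toy concept signature: relative term frequencies over the texts."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def drift(old: Dict[str, float], new: Dict[str, float]) -> float:
    """Assumed drift score: 0.0 means unchanged usage, 1.0 means no shared terms."""
    return 1.0 - cosine(old, new)
```

Under these assumptions, a concept whose signature from the earlier snapshot shares few terms with its later signature gets a drift score close to 1 and would be a candidate for manual review.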
TagMe!: Enhancing Social Tagging with Spatial Context
Abstract
TagMe! is a tagging and exploration front-end for Flickr images, which enables users to annotate specific areas of an image, i.e. users can attach tag assignments to a specific area within an image and further categorize the tag assignments. Additionally, TagMe! automatically maps tags and categories to DBpedia URIs to clearly define the meaning. In this work we discuss the differences between tags and categories and show how both facets can be applied to learn semantic relations between concepts referenced by tags and categories. We also expose the benefits of the visual (spatial) context of the tag assignments, with respect to ranking algorithms for search and retrieval of relevant items. We do so by analyzing metrics of size and position of the annotated areas. Finally, in our experiments we compare different strategies to realize semantic mappings and show that already lightweight approaches map tags and categories with high precisions (86.85% and 93.77% respectively). The TagMe! system is currently available at http://tagme.groupme.org .
Fabian Abel, Nicola Henze, Ricardo Kawase, Daniel Krause, Patrick Siehndel
Personalized Information Search and Retrieval through a Desktop Application
Abstract
An important aspect of current Search Engines is that they answer queries crudely rather than learning the long-term requirements specific to a given user or, more precisely, to a specific information seeking task. If the same query is submitted by different users to a typical search engine, it will probably return the same result, regardless of who submitted the query. In our opinion, smart searching is definitely the next level of search technology. In this paper we present a Personalized Information Search Assistant, P\(\mathcal{I}\)SA, an environment where the user will not only be able to search/retrieve/be informed about documents relevant to her interests, but she will also be provided with highly personalized tools for organizing documents and information into a personal workspace. The major novelty of P\(\mathcal{I}\)SA is that it combines all the characteristics of an on-line metasearch system with working space organization features in a desktop application, providing the user with a single user point of view personalized search environment.
M. Elena Renda
Web Information Systems Portfolios: A Contribution to Pragmatics
Abstract
On a high level of abstraction the storyboard of a Web Information System (WIS) specifies who will be using the system, in which way and for which goals. Storyboard pragmatics deals with the question what the storyboard means for its users. One part of pragmatics is concerned with usage analysis by means of life cases, user models and contexts. In this paper we address another part of pragmatics that complements usage analysis by WIS portfolios. These comprise two parts: the information portfolio and the utilisation portfolio. The former one is concerned with information consumed and produced by the WIS users, which leads to content chunks. The latter one captures functionality requirements, which depend on the specific category the WIS belongs to. Here we concentrate on information services and community WISs.
Klaus-Dieter Schewe, Bernhard Thalheim
Efficient Literature Research Based on Semantic Tagnets: Implemented and Evaluated for a German Text-Corpus
Abstract
In this paper we present an approach that is capable of automatically generating semantic tagnets for given sets of German tags (keywords) and an arbitrary text corpus using three different analysis methods. The resulting tagnets are used to estimate similarities between texts that are manually tagged with the keywords from the given tagset. Basically, this approach can be used in digital libraries to provide an efficient and intuitive interface for literature research. Although it is mainly optimized for the German language, the proposed methods can easily be extended to generate tagnets for a given set of English keywords.
Uta Christoph, Daniel Götten, Karl-Heinz Krempels, Christoph Terwelp

Society, e-Business and e-Government

Personalized e-Government Services: Tourism Recommender System Framework
Abstract
Most governments around the globe use the internet and information technologies to deliver information and services for citizens and businesses. One of the main directions in the current e-government (e-Gov) development strategy is to provide better online services to citizens such that the required information can be located by citizens with less time and effort. Tourism is one of the main focus areas of the e-Gov development strategy because it is one of the major profitable industries. Significant efforts have been devoted by governments to improve tourism services. However, the current e-Gov tourism services are limited to simple online presentation; intelligent e-Gov tourism services are highly desirable. Personalization techniques, particularly recommendation systems, are the most promising techniques to deliver personalized e-Gov (Pe-Gov) tourism services. This study proposes an ontology-based personalized e-Gov tourism recommender system framework, which would enable tourism information seekers to locate the most interesting destinations and find the most preferable attractions and activities with less time and effort. The main components of the proposed framework and some outstanding features are presented along with a detailed description of a scenario.
Malak Al-hassan, Haiyan Lu, Jie Lu
Building a Community Information System for Supporting Disabled Bus Riders and Local Businesses
Abstract
This paper discusses the implementation of one type of information system for the New York City bus transit service, as a case study to provide value-added transportation services for people with impaired mobility. Information technology is a key tool for finding flexible transportation services, especially for disabled people. Useful information supplies psychological reassurance to these vulnerable people and makes them feel safer and more secure. Residents in metropolitan areas increasingly rely on the convenience of public transportation, and they are becoming used to exchanging information relevant to their regional community in on-line settings. Improving transit accessibility requires the same kind of cooperation between transportation companies, local businesses, and residents. The widespread use of mobile wheelchairs has a socioeconomic impact. The significance of this research for the longer-term goals lies in its implications for adaptation of this kind of intelligent model into future welfare or assistive activities.
Akira Kawaguchi, Andrew Nagel, Chiu Chan, Neville A. Parker

Web Intelligence

An Empirical Study on Machine Learning-Based Sentiment Classification Using Polarity Clues
Abstract
In recent years a variety of approaches to classifying the sentiment polarity of texts have been proposed. While in the majority of approaches the determination of subjectivity or polarity-related term features is at the center, the number of publicly available dictionaries is rather limited. In this paper, we investigate the performance of combining lexical resources with machine-learning-based classifiers for the task of sentiment classification. We systematically analyze four different English and three different German polarity dictionaries as resources for sentiment-based feature selection. The evaluation results show that smaller but more controlled dictionaries used for feature selection perform equally well within an SVM-based classification setup compared to the biggest available resources.
Ulli Waltinger
The Generalist Recommender System GRSK and Its Extension to Groups
Abstract
This paper presents a Generalist Recommender System Kernel (GRSK) and describes the differences of the recommendation process when it is applied to groups. The GRSK is able to work with any domain as long as the domain description is represented within an ontology. Several basic techniques like demographics, content-based or collaborative are used to elicit the recommendations, as well as other hybrid techniques. The GRSK provides a configuration process through which to select the techniques and parameters that best suit the particular application domain. The experiments will show the success of the GRSK in different domains. We also outline the changes and new techniques required by the GRSK when it is used in a group recommendation.
Inma Garcia, Laura Sebastia, Sergio Pajares, Eva Onaindia
Extending User Profiles in Collaborative Filtering Algorithms to Alleviate the Sparsity Problem
Abstract
The overabundance of information and the related difficulty of discovering interesting content has complicated the selection process for end users. Recommender systems try to assist in this content-selection process by using intelligent personalisation techniques which filter the information. Most commonly-used recommendation algorithms are based on Collaborative Filtering (CF). However, present-day CF techniques are optimized for suggesting provider-generated content and partially lose their effectiveness when recommending user-generated content. Therefore, we propose an advanced CF algorithm which considers the specific characteristics of user-generated content (like the sparsity of the data matrix). To alleviate this sparsity problem, profiles are extended with probable future consumptions. These extended profiles increase the profile overlap probability, thereby increasing the number of neighbours used for calculating the recommendations. This way, the recommendations become more precise and diverse compared to traditional CF recommendations. This paper explains the proposed algorithm in detail and demonstrates the improvements on standard CF.
Toon De Pessemier, Kris Vanhecke, Simon Dooms, Tom Deryckere, Luc Martens
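The effect described in the abstract above, where extended profiles overlap more often and therefore yield more neighbours, can be shown with a small sketch. The abstract does not say how probable future consumptions are predicted, so the sketch substitutes a simple item co-occurrence count as the predictor. That predictor, the set-based profile representation, and the names `co_counts`, `extend_profile` and `neighbours` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple


def co_counts(profiles: Dict[str, Set[str]]) -> Counter:
    """Count how often each ordered pair of items appears in the same profile."""
    co: Counter = Counter()
    for items in profiles.values():
        for a, b in combinations(sorted(items), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return co


def extend_profile(items: Set[str], co: Counter, k: int = 1) -> Set[str]:
    """Add the k unseen items that co-occur most with the profile's items.

    Stand-in for 'probable future consumptions'; ties are broken alphabetically.
    """
    scores: Counter = Counter()
    for (a, b), n in co.items():
        if a in items and b not in items:
            scores[b] += n
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return set(items) | {item for item, _ in ranked[:k]}


def neighbours(user: str, profiles: Dict[str, Set[str]]) -> Set[str]:
    """Users whose profile shares at least one item with the given user's."""
    mine = profiles[user]
    return {u for u, items in profiles.items() if u != user and items & mine}
```

With the toy profiles `{"ann": {"a", "b"}, "bob": {"b", "c"}, "cat": {"c"}}`, `ann` initially overlaps only with `bob`. After each profile is extended by one predicted item, `ann` also overlaps with `cat`, which is the sparsity-reducing effect the abstract describes.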
Automatic Short Text Annotation for Question Answering System
Abstract
Semantic annotation for text is a well-studied topic. However, little work has been devoted to the application of short text annotation. In this article, an automatic annotation approach is proposed for this purpose, which annotates short text with semantic labels for question answering systems. In the first step, keywords are extracted from a question and then a semantic label selection module is used to select semantic labels to tag keywords. If there is no appropriate label, WordNet is employed to obtain candidate labels to annotate those keywords by calculating the similarity between each keyword in the question and the concept list in our predefined Tagger Ontology. To improve the accuracy of annotation, we also design a naïve Bayes-based method to distinguish multiple senses and assign the best semantic labels by referring to historically annotated questions. Preliminary experiments on 6 categories show our approach achieves a precision of 76% on average.
Gang Liu, Zhi Lu, Tianyong Hao, Wenyin Liu
Ad-Hoc Georeferencing of Web-Pages Using Street-Name Prefix Trees
Abstract
A bottleneck in constructing location-based web searches is that most web-pages do not contain any explicit geocoding such as geotags. An alternative solution can be based on ad-hoc georeferencing which relies on street addresses, but the problem is how to extract and validate the address strings from free-form text. We propose a rule-based pattern matching solution that detects address-based locations using a gazetteer and street-name prefix trees created from the gazetteer. We compare this approach against a method that doesn’t require a gazetteer (a heuristic method that assumes that a street name has a certain structure) and a method that also uses data structures created from the gazetteer in the form of street-name arrays. Experiments using our location-based search engine prototype (MOPSI) for Finland and Singapore show that the proposed prefix-tree solution is twice as fast and 10% more accurate than its rule-based alternative and 10 times faster if an array structure is used when accessing the gazetteer.
Andrei Tabarcea, Ville Hautamäki, Pasi Fränti
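A street-name prefix tree of the kind mentioned in the abstract above can be sketched as follows. This is a generic character-level trie built from a list of street names and scanned against free-form text. The class `StreetTrie`, the function `find_streets`, the word-boundary rule and the sample street names are hypothetical and do not reproduce the MOPSI implementation.

```python
from typing import Dict, List, Optional, Tuple


class StreetTrie:
    """Character-level prefix tree over lowercased street names (illustrative)."""

    def __init__(self) -> None:
        self.children: Dict[str, "StreetTrie"] = {}
        self.street: Optional[str] = None  # set on nodes that end a street name

    def insert(self, name: str) -> None:
        node = self
        for ch in name.lower():
            node = node.children.setdefault(ch, StreetTrie())
        node.street = name

    def longest_match(self, text: str, start: int) -> Optional[str]:
        """Return the longest stored street name that begins at text[start]."""
        node: Optional[StreetTrie] = self
        best = None
        for ch in text[start:].lower():
            node = node.children.get(ch)
            if node is None:
                break
            if node.street is not None:
                best = node.street
        return best


def find_streets(text: str, trie: StreetTrie) -> List[Tuple[int, str]]:
    """Report (offset, street name) for each gazetteer name found on word boundaries.

    Simplification: only the longest candidate at each position is checked.
    """
    hits: List[Tuple[int, str]] = []
    i = 0
    while i < len(text):
        at_word_start = i == 0 or not text[i - 1].isalnum()
        match = trie.longest_match(text, i) if at_word_start else None
        if match:
            end = i + len(match)
            if end == len(text) or not text[end].isalnum():
                hits.append((i, match))
                i = end
                continue
        i += 1
    return hits
```

Each lookup walks at most as many trie nodes as the longest street name has characters, independent of the gazetteer size, which is the usual reason to prefer a prefix tree over scanning an array of names.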
Web Page Classification Using Image Analysis Features
Abstract
Classification of web pages is usually done by extracting the textual content of the page and/or by extracting structural features from the HTML. In this work, we present a different approach, where we use the visual appearance of web pages for their classification. We extract generic, low-level visual features directly from the page as it is rendered by a web browser. The visual features used in this document are simple color and edge histograms, Gabor and texture features. These were extracted using an off-the-shelf visual feature extraction method. In three experiments, we classify web pages based on their aesthetic value, their recency and the type of website. Results show that these simple, global visual features already produce good classification results. We also introduce an online tool that uses the trained classifiers to assess new web pages.
Viktor de Boer, Maarten W. van Someren, Tiberiu Lupascu
Towards a Better Semantic Matching for Indexation Improvement of Error-Prone (Semi-)Structured XML Documents
Abstract
Documents containing errors in their textual content (which we will call noisy documents) are difficult for Information Retrieval systems to handle. The same holds for the (semi-)structured IR systems this paper deals with. However, the problem is even bigger when those systems rely on semantics. To do so, they need an additional external semantic resource related to the document collection. Ranking is then made possible by concept comparisons based on similarity measures. Similarity measures assume that concepts related to the words have been identified without ambiguity. Nevertheless, this assumption can’t be made in the presence of noisy documents where words are potentially misspelled, resulting in a word having a different meaning or at least in a non-word. Semantic-aware (semi-)structured IR systems rely on basic concept identification but do not account for spelling uncertainties. As this can degrade system results, we suggest a way to detect and correct misspelled terms which can be used in the document pre-processing of IR systems. First results on small datasets seem promising.
Arnaud Renard, Sylvie Calabretto, Béatrice Rumpler
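The abstract above does not say how misspelled terms are detected and corrected. As a generic illustration of such a pre-processing step, the sketch below uses Levenshtein edit distance against a known vocabulary, which is a common baseline and not necessarily the authors' method. The function names `edit_distance` and `correct`, and the `max_distance` threshold, are assumptions.

```python
from typing import Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def correct(word: str, vocabulary: Set[str], max_distance: int = 2) -> str:
    """Replace a non-word with its closest vocabulary entry, if close enough.

    Words already in the vocabulary, or with no candidate within
    max_distance edits, are returned unchanged.
    """
    if not vocabulary or word in vocabulary:
        return word
    best = min(vocabulary, key=lambda v: (edit_distance(word, v), v))
    return best if edit_distance(word, best) <= max_distance else word
```

A step like this only handles non-words. A misspelling that produces another valid word, which the abstract also mentions, would need context to detect and is outside this sketch.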
Backmatter
Metadata
Title
Web Information Systems and Technologies
edited by
Joaquim Filipe
José Cordeiro
Copyright year
2011
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-22810-0
Print ISBN
978-3-642-22809-4
DOI
https://doi.org/10.1007/978-3-642-22810-0