About this book

This book constitutes the proceedings of the 35th European Conference on IR Research, ECIR 2013, held in Moscow, Russia, in March 2013. The 55 full papers, 38 poster papers and 10 demonstrations presented in this volume were carefully reviewed and selected from 287 submissions. The papers are organized in the following topical sections: user aspects; multimedia and cross-media IR; data mining; IR theory and formal models; IR system architectures; classification; Web; event detection; temporal IR, and microblog search. Also included are 4 tutorial and 2 workshop presentations.



User Aspects

Using Intent Information to Model User Behavior in Diversified Search

A result page of a modern commercial search engine often contains documents of different types targeted to satisfy different user intents (news, blogs, multimedia). When evaluating system performance and making design decisions we need to better understand user behavior on such result pages. To address this problem various click models have previously been proposed. In this paper we focus on result pages containing fresh results and propose a way to model user intent distribution and bias due to different document presentation types. To the best of our knowledge this is the first work that successfully uses intent and layout information to improve existing click models.

Aleksandr Chuklin, Pavel Serdyukov, Maarten de Rijke

Understanding Relevance: An fMRI Study

Relevance is one of the key concepts in Information Retrieval (IR). A huge body of research exists that attempts to understand this concept so as to operationalize it for IR systems. Despite advances in the past few decades, answering the question “How does relevance happen?” is still a big challenge. In this paper, we investigate the connection between relevance and brain activity. Using functional Magnetic Resonance Imaging (fMRI), we measured the brain activity of eighteen participants while they performed four topical relevance assessment tasks on relevant and non-relevant images. The results of this experiment revealed three brain regions in the frontal, parietal and temporal cortex where brain activity differed between processing relevant and non-relevant documents. This is an important step in unravelling the nature of relevance and therefore better utilising it for effective retrieval.

Yashar Moshfeghi, Luisa R. Pinto, Frank E. Pollick, Joemon M. Jose

An Exploratory Study of Sensemaking in Collaborative Information Seeking

With the ubiquity of current information retrieval systems, users move beyond individual searching to performing complex information seeking tasks together with collaborators for social, leisure or professional purposes. In this paper, we investigate the sensemaking behaviour of online searchers in terms of sensemaking strategies, sharing of information, construction of a shared representation and sharing of task progress and status. We also looked into the support provided to them by search systems in the collaborative information seeking process. We report the results of an observational user study where 24 participants, in groups of 3, completed a travel planning task. Our results show that current tools do not sufficiently support searchers in most aspects of the collaborative sensemaking process. Our findings have implications for the design of collaborative information seeking systems.

Yihan Tao, Anastasios Tombros

Multimedia and Cross-Media IR

Exploiting User Comments for Audio-Visual Content Indexing and Retrieval

State-of-the-art content sharing platforms often require users to assign tags to pieces of media in order to make them easily retrievable. Since this task is sometimes perceived as tedious or boring, annotations can be sparse. Commenting on the other hand is a frequently used means of expressing user opinion towards shared media items. This work makes use of time series analyses in order to infer potential tags and indexing terms for audio-visual content from user comments. In this way, we mitigate the vocabulary gap between queries and document descriptors. Additionally, we show how large-scale encyclopaedias such as Wikipedia can aid the task of tag prediction by serving as surrogates for high-coverage natural language vocabulary lists. Our evaluation is conducted on a corpus of several million real-world user comments from the popular video sharing platform YouTube, and demonstrates significant improvements in retrieval performance.

Carsten Eickhoff, Wen Li, Arjen P. de Vries

An Evaluation of Labelling-Game Data for Video Retrieval

Games with a purpose (GWAPs) are increasingly used in audio-visual collections as a mechanism for annotating videos through tagging. This trend is driven by the assumption that user tags will improve video search. In this paper we study whether this is indeed the case. To this end, we create an evaluation dataset that consists of: (i) a set of videos tagged by users via video labelling game, (ii) a set of queries derived from real-life query logs, and (iii) relevance judgements. Besides user tags from the labelling game, we exploit the existing metadata associated with the videos (textual descriptions and curated in-house tags) and closed captions. Our findings show that search based on user tags alone outperforms search based on all other metadata types. Combining user tags with the other types of metadata yields an increase in search performance of 33%. We also find that the search performance of user tags steadily increases as more tags are collected.

Riste Gligorov, Michiel Hildebrand, Jacco van Ossenbruggen, Lora Aroyo, Guus Schreiber

Multimodal Re-ranking of Product Image Search Results

In this article we address the problem of searching for products using an image as query, instead of the more popular approach of searching by textual keywords. With the fast development of the Internet, the popularization of mobile devices and e-commerce systems, searching specific products by image has become an interesting research topic. In this context, Content-Based Image Retrieval (CBIR) techniques have been used to support and enhance the customer shopping experience. We propose an image re-ranking strategy based on multimedia information available on product databases. Our re-ranking strategy relies on category and textual information associated to the top-k images of an initial ranking computed purely with CBIR techniques. Experiments were carried out with users’ relevance judgment on two image datasets collected from e-commerce Web sites. Our results show that our re-ranking strategy outperforms the baselines when using only CBIR techniques.

Joyce M. dos Santos, João M. B. Cavalcanti, Patricia C. Saraiva, Edleno S. de Moura

Data Mining

Predicting Information Diffusion in Social Networks Using Content and User’s Profiles

Predicting the diffusion of information on social networks is a key problem for applications like Opinion Leader Detection, Buzz Detection or Viral Marketing. Many recent diffusion models are direct extensions of models initially proposed for epidemiology and social studies. In such models, the diffusion process is based on the dynamics of interactions between neighbor nodes in the network (the social pressure), and largely ignores important dimensions such as the content of the piece of information diffused. We propose here a new family of probabilistic models that aims at predicting how content diffuses in a network by making use of additional dimensions: the content of the piece of information diffused, the user’s profile and willingness to diffuse. These models are illustrated and compared with other approaches on two blog datasets. The experimental results obtained on these datasets show that taking into account the content of the piece of information diffused is important to accurately model the diffusion process.

Cédric Lagnier, Ludovic Denoyer, Eric Gaussier, Patrick Gallinari

How Tagging Pragmatics Influence Tag Sense Discovery in Social Annotation Systems

The presence of emergent semantics in social annotation systems has been reported in numerous studies. Two important problems in this context are the induction of semantic relations among tags and the discovery of different senses of a given tag. While a number of approaches for discovering tag senses exist, little is known about which factors influence the discovery process. In this paper, we analyze the influence of user pragmatic factors. We divide taggers into different pragmatic distinctions. Based on these distinctions, we identify subsets of users whose annotations allow for a more precise and complete discovery of tag senses. Our results provide evidence for a link between tagging pragmatics and semantics and provide another argument for including pragmatic factors in semantic extraction methods. Our work is relevant for improving search, retrieval and browsing in social annotation systems, as well as for optimizing ontology learning algorithms based on tagging data.

Thomas Niebler, Philipp Singer, Dominik Benz, Christian Körner, Markus Strohmaier, Andreas Hotho

A Unified Framework for Monolingual and Cross-Lingual Relevance Modeling Based on Probabilistic Topic Models

We explore the potential of probabilistic topic modeling within the relevance modeling framework for both monolingual and cross-lingual ad-hoc retrieval. Multilingual topic models provide a way to represent documents in a structured and coherent way, regardless of their actual language, by means of language-independent concepts, that is, cross-lingual topics. We show how to integrate the topical knowledge into a unified relevance modeling framework in order to build quality retrieval models in monolingual and cross-lingual contexts. The proposed modeling framework processes all documents uniformly and does not make any conceptual distinction between monolingual and cross-lingual modeling. Our results obtained from the experiments conducted on the standard CLEF test collections reveal that fusing the topical knowledge and relevance modeling leads to building monolingual and cross-lingual retrieval models that outperform several strong baselines. We show that the topical knowledge coming from a general Web-generated corpus boosts retrieval scores. Additionally, we show that within this framework the estimation of cross-lingual relevance models may be performed by exploiting only a general non-parallel corpus.

Ivan Vulić, Marie-Francine Moens

IR Theory and Formal Models

Semantic Search Log k-Anonymization with Generalized k-Cores of Query Concept Graph

Search log k-anonymization is based on the elimination of infrequent queries under exact (or nearly exact) matching conditions, which usually results in a big data loss and impaired utility. We present a more flexible, semantic approach to k-anonymity that consists of three steps: query concept mining, automatic query expansion, and affinity assessment of expanded queries. Based on the observation that many infrequent queries can be seen as refinements of a more general frequent query, we first model query concepts as probabilistically weighted n-grams and extract them from the search log data. Then, after expanding the original log queries with their weighted concepts, we find all the k-affine expanded queries under a given affinity threshold Θ, modeled as a generalized k-core of the graph of Θ-affine queries. Experimenting with the AOL data set, we show that this approach achieves levels of privacy comparable to those of plain k-anonymity while at the same time reducing the data losses to a great extent.
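As a point of reference, the plain k-anonymity baseline mentioned in the abstract can be sketched in a few lines. This is only an illustration, not the authors' method: the function name `k_anonymize`, the `(user_id, query)` log format, and the choice to count distinct users per exact query string are all assumptions made here.

```python
from collections import defaultdict


def k_anonymize(log, k):
    """Keep only entries whose exact query text was issued by at least k distinct users.

    `log` is a list of (user_id, query) pairs. Counting distinct users per
    exact query string is an assumption of this sketch.
    """
    users_per_query = defaultdict(set)
    for user_id, query in log:
        users_per_query[query].add(user_id)
    # Drop every entry whose query is shared by fewer than k users.
    return [(user_id, query) for user_id, query in log
            if len(users_per_query[query]) >= k]


# Hypothetical toy log: the refinement issued by u3 is unique and gets removed.
toy_log = [
    ("u1", "ecir 2013"),
    ("u2", "ecir 2013"),
    ("u3", "ecir 2013 moscow hotels"),
    ("u1", "weather"),
]
anonymized = k_anonymize(toy_log, k=2)
```

In the toy log, the query issued by `u3` refines a frequent query yet is dropped under exact matching, which is the kind of data loss the semantic approach aims to reduce.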

Claudio Carpineto, Giovanni Romano

A Joint Classification Method to Integrate Scientific and Social Networks

In this paper, we address the problem of scientific-social network integration to find a matching relationship between members of these networks. Utilizing several name similarity patterns and contextual properties of these networks, we design a focused crawler to find highly probable matching pairs; the problem of name disambiguation is then reduced to predicting the label of each candidate pair as either a true or a false match. By defining a matching dependency graph, we propose a joint label prediction model to determine the labels of all candidate pairs simultaneously. An extensive set of experiments has been conducted on six test collections obtained from the DBLP and the Twitter networks to show the effectiveness of the proposed joint label prediction model.

Mahmood Neshati, Ehsaneddin Asgari, Djoerd Hiemstra, Hamid Beigy

Using Document-Quality Measures to Predict Web-Search Effectiveness

The query-performance prediction task is estimating retrieval effectiveness in the absence of relevance judgments. The task becomes highly challenging over the Web due to, among other reasons, the effect of low quality (e.g., spam) documents on retrieval performance. To address this challenge, we present a novel prediction approach that utilizes query-independent document-quality measures. While using these measures was shown to improve Web-retrieval effectiveness, this is the first study demonstrating the clear merits of using them for query-performance prediction. Evaluation performed with large scale Web collections shows that our methods post prediction quality that often surpasses that of state-of-the-art predictors, including those devised specifically for Web retrieval.

Fiana Raiber, Oren Kurland

IR System Architectures

Training Efficient Tree-Based Models for Document Ranking

Gradient-boosted regression trees (GBRTs) have proven to be an effective solution to the learning-to-rank problem. This work proposes and evaluates techniques for training GBRTs that have efficient runtime characteristics. Our approach is based on the simple idea that compact, shallow, and balanced trees yield faster predictions: thus, it makes sense to incorporate some notion of execution cost during training to “encourage” trees with these topological characteristics. We propose two strategies for accomplishing this: the first, by directly modifying the node splitting criterion during tree induction, and the second, by stagewise tree pruning. Experiments on a standard learning-to-rank dataset show that the pruning approach is superior; one balanced setting yields an approximately 40% decrease in prediction latency with minimal reduction in output quality as measured by NDCG.

Nima Asadi, Jimmy Lin

DTD Based Costs for Tree-Edit Distance in Structured Information Retrieval

In this paper we present a Structured Information Retrieval (SIR) model based on graph matching. Our approach combines content propagation, which handles sibling relationships, with a document-query structure matching process. The latter is based on Tree-Edit Distance (TED) which is the minimum set of insert, delete, and replace operations to turn one tree to another. To our knowledge this algorithm has never been used in ad-hoc SIR. As the effectiveness of TED relies both on the input tree and the edit costs, we first present a focused subtree extraction technique which selects the most representative elements of the document w.r.t the query. We then describe our TED costs setting based on the Document Type Definition (DTD). Finally we discuss our results according to the type of the collection (data-oriented or text-oriented). Experiments are conducted on two INEX test sets: the 2010 Datacentric collection and the 2005 Ad-hoc one.

Cyril Laitang, Karen Pinel-Sauvagnat, Mohand Boughanem

Ranked Accuracy and Unstructured Distributed Search

Non-uniformly distributing documents in an unstructured peer-to-peer (P2P) network has been shown to improve both the expected search length and search accuracy, where accuracy is defined as the size of the intersection of the documents retrieved by a constrained, probabilistic search and the documents that would have been retrieved by an exhaustive search, normalized by the size of the latter. However, neither metric considers the relative ranking of the documents in the retrieved sets. We therefore introduce a new performance metric, rank-accuracy, that is a rank-weighted score of the top-ranked documents retrieved. By replicating documents across nodes based on their retrieval rate (a function of query frequency) and rank, we show that average rank-accuracy can be improved. The practical performance of rank-aware search is demonstrated using a simulated network of 10,000 nodes and queries drawn from a Yahoo! web search log.
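The accuracy definition in the abstract translates directly into code, and a rank-sensitive variant can be sketched next to it. The abstract does not give the formula for rank-accuracy, so the reciprocal-rank weighting below is purely an assumption for illustration, as are the function names `search_accuracy` and `rank_weighted_accuracy`.

```python
def search_accuracy(retrieved, exhaustive):
    """Size of the overlap between the constrained and the exhaustive result
    sets, normalized by the size of the exhaustive set."""
    exhaustive_set = set(exhaustive)
    if not exhaustive_set:
        raise ValueError("exhaustive result set must be non-empty")
    return len(set(retrieved) & exhaustive_set) / len(exhaustive_set)


def rank_weighted_accuracy(retrieved, exhaustive_ranked):
    """Hypothetical rank-sensitive variant (NOT the paper's formula):
    a document at exhaustive rank r contributes weight 1/r when retrieved."""
    if not exhaustive_ranked:
        raise ValueError("exhaustive ranking must be non-empty")
    retrieved_set = set(retrieved)
    weights = [1.0 / rank for rank in range(1, len(exhaustive_ranked) + 1)]
    gained = sum(weight for doc, weight in zip(exhaustive_ranked, weights)
                 if doc in retrieved_set)
    return gained / sum(weights)


# Hypothetical example: the constrained search finds two of four documents.
exhaustive = ["d1", "d2", "d3", "d4"]  # ordered best-first
found = ["d3", "d1"]
plain = search_accuracy(found, exhaustive)
weighted = rank_weighted_accuracy(found, exhaustive)
```

With these inputs the plain score treats all four documents alike, while the weighted variant gives more credit for recovering the top-ranked `d1`, which is the distinction the abstract draws between the two kinds of metric.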

Sami Richardson, Ingemar J. Cox

Classification

Learning to Rank from Structures in Hierarchical Text Classification

In this paper, we model learning to rank algorithms based on structural dependencies in hierarchical multi-label text categorization (TC). Our method uses the classification probability of the binary classifiers of a standard top-down approach to generate k-best hypotheses. The latter are generated according to their global probability while at the same time satisfying the structural constraints between father and children nodes. The rank is then refined using Support Vector Machines and tree kernels applied to a structural representation of hypotheses, i.e., a hierarchy tree in which the outcome of binary one-vs-all classifiers is directly marked in its nodes. Our extensive experiments on the whole Reuters Corpus Volume 1 show that our models significantly improve over the state of the art in TC, thanks to the use of structural dependencies.

Qi Ju, Alessandro Moschitti, Richard Johansson

Folktale Classification Using Learning to Rank

We present a learning to rank approach to classify folktales, such as fairy tales and urban legends, according to their story type, a concept that is widely used by folktale researchers to organize and classify folktales. A story type represents a collection of similar stories often with recurring plot and themes. Our work is guided by two frequently used story type classification schemes. Contrary to most information retrieval problems, the text similarity in this problem goes beyond topical similarity. We experiment with approaches inspired by distributed information retrieval and features that compare subject-verb-object triplets. Our system was found to be highly effective compared with a baseline system.

Dong Nguyen, Dolf Trieschnigg, Mariët Theune

Open-Set Classification for Automated Genre Identification

Automated Genre Identification (AGI) of web pages is a problem of increasing importance since web genre (e.g. blog, news, e-shops, etc.) information can enhance modern Information Retrieval (IR) systems. The state of the art in this field treats AGI as a closed-set classification problem, for which a variety of web page representations and machine learning models have been intensively studied. In this paper, we study AGI as an open-set classification problem, which better reflects the real-world conditions of exploiting AGI in practice. Focusing on the use of content information, different text representation methods (words and character n-grams) are tested. Moreover, two classification methods are examined: one-class SVM learners, used as a baseline, and an ensemble of classifiers based on random feature subspacing, originally proposed for author identification. It is demonstrated that very high precision can be achieved in open-set AGI while recall remains relatively high.

Dimitrios A. Pritsos, Efstathios Stamatatos

Web

Semantic Tagging of Places Based on User Interest Profiles from Online Social Networks

In recent years, location based services (LBS) have become very popular. The performance of LBS depends on a number of factors, including how well the places are described. Though LBS enable users to tag places, users rarely do so. On the other hand, users express their interests via online social networks. The common interests of a group of people that has visited a particular place can potentially provide further description for that place. In this work we present an approach that automatically assigns tags to places, based on interest profiles and visits or check-ins of users at places. We have evaluated our approach with real world datasets from popular social network services against a set of manually assigned tags. Experimental results show that we are able to derive meaningful tags for different places and that sets of tags assigned to places are expected to stabilise as more unique users visit places.

Vinod Hegde, Josiane Xavier Parreira, Manfred Hauswirth

Sponsored Search Ad Selection by Keyword Structure Analysis

In sponsored search, the ad selection algorithm is used to pick out the best candidate ads for ranking, the bid keywords of which are best matched to the user queries. Existing ad selection methods mainly focus on the relevance between the user query and the selected ads, and consequently the monetization ability of the results is not necessarily maximized. To this end, instead of making the selection based on keywords as a whole, our work takes advantage of the different impacts, as revealed in our data study, of different components inside the keywords on both relevance and monetization ability. In particular, we select keyword components and then maximize the relevance and revenue on the component level. Finally, we combine the selected components to generate the bid keywords. The experiments reveal that our method can significantly outperform two baseline algorithms on metrics including recall, precision and monetization ability.

Kai Hui, Bin Gao, Ben He, Tie-jian Luo

Intent-Based Browse Activity Segmentation

Users’ search and browse activity mined with special toolbars is known to provide diverse valuable information for the search engine. In particular, it helps to understand the information need of a searcher, her personal preferences, and the context of the topic she is currently interested in. Most of the previous studies on the topic either considered the whole user activity for a fixed period of time or divided it relying on some predefined inactivity time-out. It helps to identify groups of web sites visited with the same information need. This paper addresses the problem of automatic segmentation of users’ browsing logs into logical segments. We propose a method for automatic division of their daily activity into intent-related parts. This segmentation advances the commonly used approaches. We propose several methods for browsing log partitioning and provide a detailed study of their performance. We evaluate all algorithms and analyse the contributions of various types of features.

Yury Ustinovskiy, Anna Mazur, Pavel Serdyukov

Event Detection

Extracting Event-Related Information from Article Updates in Wikipedia

Wikipedia is widely considered the largest and most up-to-date online encyclopedia, with its content being continuously maintained by a supporting community. In many cases, real-life events like new scientific findings, resignations, deaths, or catastrophes serve as triggers for collaborative editing of articles about affected entities such as persons or countries. In this paper, we conduct an in-depth analysis of event-related updates in Wikipedia by examining different indicators for events including language, meta annotations, and update bursts. We then study how these indicators can be employed for automatically detecting event-related updates. Our experiments on event extraction, clustering, and summarization show promising results towards generating entity-specific news tickers and timelines.

Mihai Georgescu, Nattiya Kanhabua, Daniel Krause, Wolfgang Nejdl, Stefan Siersdorfer

Using WordNet Hypernyms and Dependency Features for Phrasal-Level Event Recognition and Type Classification

The goal of this research is to devise a method for recognizing and classifying TimeML events in a more effective way. TimeML is the most recent annotation scheme for processing event and temporal expressions in natural language processing. In this paper, we argue and demonstrate that unit feature dependency information and deep-level WordNet hypernyms are useful for event recognition and type classification. The proposed method utilizes various features including lexical semantic and dependency-based combined features. The experimental results show that our proposed method outperforms a state-of-the-art approach, mainly due to the new strategies. In particular, performance on noun and adjective events, which are significant yet have been largely ignored, is significantly improved.

Yoonjae Jeong, Sung-Hyon Myaeng

Aggregating Evidence from Hospital Departments to Improve Medical Records Search

Searching medical records is challenging due to their inherent implicit knowledge – such knowledge may be known by medical practitioners, but it is hidden from an information retrieval (IR) system. For example, it is intuitive for a medical practitioner to assert that patients with heart disease are likely to have records from the hospital’s cardiology department. Hence, we hypothesise that this implicit knowledge can be used to enhance a medical records search system that ranks patients based on the relevance of their medical records to a query. In this paper, we propose to group aggregates of medical records from individual hospital departments, which we refer to as department-level evidence, to capture some of the implicit knowledge. In particular, each department-level aggregate consists of all of the medical records created by a particular hospital department, which is then exploited to enhance retrieval effectiveness. Specifically, we propose two approaches to build the department-level evidence based on a federated search and a voting paradigm, respectively. In addition, we introduce an extended voting technique that could leverage this department-level evidence while ranking. We evaluate the retrieval effectiveness of our approaches in the context of the TREC 2011 Medical Records track. Our results show that modelling department-level evidence of records in medical records search improves retrieval effectiveness. In particular, our proposed approach to leverage department-level evidence built using a voting technique obtains results comparable to the best submitted TREC 2011 Medical Records track systems without requiring any external resources that are exploited in those systems.

Nut Limsopatham, Craig Macdonald, Iadh Ounis

Temporal IR

An N-Gram Topic Model for Time-Stamped Documents

This paper presents a topic model that captures the temporal dynamics in the text data along with topical phrases. Previous approaches have relied upon the bag-of-words assumption to model such property in a corpus. This has resulted in an inferior performance with less interpretable topics. Our topic model not only captures the way a topic structure changes over time but also maintains important contextual information in the text data. Finding topical n-grams, when possible based on context, instead of always presenting unigrams in topics does away with many ambiguities that individual words may carry. We derive a collapsed Gibbs sampler for posterior inference. Our experimental results show an improvement over the current state-of-the-art topics over time model.

Shoaib Jameel, Wai Lam

Influence of Timeline and Named-Entity Components on User Engagement

Nowadays, successful applications are those which contain features that captivate and engage users. Using an interactive news retrieval system as a use case, in this paper we study the effect of timeline and named-entity components on user engagement. This is in contrast with previous studies where the importance of these components was studied from a retrieval effectiveness point of view. Our experimental results show significant improvements in user engagement when named-entity and timeline components were installed. Further, we investigate if we can predict user-centred metrics through the user’s interaction with the system. Results show that we can successfully learn a model that predicts all dimensions of user engagement and whether users will like the system or not. These findings might steer systems that apply a more personalised user experience, tailored to the user’s preferences.

Yashar Moshfeghi, Michael Matthews, Roi Blanco, Joemon M. Jose

Cognitive Temporal Document Priors

Temporal information retrieval exploits temporal features of document collections and queries. Temporal document priors are used to adjust the score of a document based on its publication time. We consider a class of temporal document priors that is inspired by retention functions considered in cognitive psychology that are used to model the decay of memory. Many such functions used as a temporal document prior have a positive effect on overall retrieval performance. We examine the stability of this effect across news and microblog collections and discover interesting differences between retention functions. We also study the problem of optimizing parameters of the retention functions as temporal document priors; some retention functions display consistent good performance across large regions of the parameter space. A retention function based on a Weibull distribution is the preferred choice for a temporal document prior.
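To make the idea of a retention-based temporal prior concrete, here is a minimal sketch. It uses the standard Weibull survival function as the retention function and multiplies it into the retrieval score; the exact parameterization, the multiplicative combination, the parameter values, and the names `weibull_retention` and `rerank_with_prior` are assumptions of this sketch rather than details taken from the paper.

```python
import math


def weibull_retention(age_days, scale=30.0, shape=0.5):
    """Weibull survival function exp(-(age/scale)**shape).

    Returns 1.0 for a document published at query time and decays towards 0
    as the document ages. `scale` and `shape` are illustrative values only.
    """
    if age_days < 0:
        raise ValueError("document age must be non-negative")
    return math.exp(-((age_days / scale) ** shape))


def rerank_with_prior(scored_docs, query_time, scale=30.0, shape=0.5):
    """Multiply each retrieval score by the temporal prior and re-sort.

    `scored_docs` holds (doc_id, retrieval_score, publication_time) tuples,
    with times measured in days on a shared clock (an assumed format).
    """
    adjusted = [
        (doc_id, score * weibull_retention(query_time - published, scale, shape))
        for doc_id, score, published in scored_docs
    ]
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


# Hypothetical example: an older document with a higher raw score is
# overtaken by a fresher one once the prior is applied.
docs = [("old", 1.0, 0.0), ("new", 0.8, 99.0)]
reranked = rerank_with_prior(docs, query_time=100.0)
```

The `scale` and `shape` parameters correspond to the kind of parameter space whose stability the abstract discusses; other retention functions could be swapped in by replacing `weibull_retention`.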

Maria-Hendrike Peetz, Maarten de Rijke

Microblog Search

Combining Recency and Topic-Dependent Temporal Variation for Microblog Search

The appearance of microblogging services has led to many short documents being issued by crowds of people. To retrieve useful information from among such a huge quantity of messages, query expansion (QE) is usually used to enrich a user query. Some QE methods for microblog search utilize temporal properties (e.g., recency and temporal variation) derived from the real-time characteristic that many messages are posted by users when an interesting event has recently occurred. Our approach leverages temporal properties for QE and combines them according to the temporal variation of a given topic. Experimental results show that this QE method using automatically combined temporal properties is effective at improving retrieval performance.

Taiki Miyanishi, Kazuhiro Seki, Kuniaki Uehara

Subjectivity Annotation of the Microblog 2011 Realtime Adhoc Relevance Judgments

In this work, we extend the Microblog dataset with subjectivity annotations. Our aim is twofold; first, we want to provide a high-quality, multiply-annotated gold standard of subjectivity annotations for the relevance assessments of the real-time adhoc task. Second, we randomly sample the rest of the dataset and annotate it for subjectivity once, in order to create a complementary annotated dataset that is at least an order of magnitude larger than the gold standard. As a result we have 2,389 tweets that have been annotated by multiple humans and 75,761 tweets that have been annotated by one annotator. We discuss issues like inter-annotator agreement, the time that it took annotators to classify tweets in correlation to their subjective content and lastly, the distribution of subjective tweets in relation to topic categorization. The annotated datasets and all relevant anonymised information are freely available for research purposes.

Georgios Paltoglou, Kevan Buckley

Geo-spatial Event Detection in the Twitter Stream

The rise of Social Media services in the last years has created huge streams of information that can be very valuable in a variety of scenarios. What precisely these scenarios are and how the data streams can efficiently be analyzed for each scenario is still largely unclear at this point in time and has therefore created significant interest in industry and academia. In this paper, we describe a novel algorithm for geo-spatial event detection on Social Media streams. We monitor all posts on Twitter issued in a given geographic region and identify places that show a high amount of activity. In a second processing step, we analyze the resulting spatio-temporal clusters of posts with a Machine Learning component in order to detect whether they constitute real-world events or not. We show that this can be done with high precision and recall. The detected events are finally displayed to a user on a map, at the location where they happen and while they happen.

Maximilian Walther, Michael Kaisser

Full Papers with Short Presentation

A Versatile Tool for Privacy-Enhanced Web Search

We consider the problem of privacy leaks suffered by Internet users when they perform web searches, and propose a framework to mitigate them. Our approach, which builds upon and improves recent work on search privacy, approximates the target search results by replacing the private user query with a set of blurred or scrambled queries. The results of the scrambled queries are then used to cover the original user interest. We model the problem theoretically, define a set of privacy objectives with respect to web search and investigate the effectiveness of the proposed solution with a set of real queries on a large web collection. Experiments show great improvements in retrieval effectiveness over a previously reported baseline in the literature. Furthermore, the methods are more versatile, predictably-behaved, applicable to a wider range of information needs, and the privacy they provide is more comprehensible to the end-user.

Avi Arampatzis, George Drosatos, Pavlos S. Efraimidis

Exploiting Novelty and Diversity in Tag Recommendation

The design and evaluation of tag recommendation methods have focused only on relevance. However, other aspects such as novelty and diversity may be as important to evaluate the usefulness of the recommendations. In this work, we define these two aspects in the context of tag recommendation and propose a novel recommendation strategy that considers them jointly with relevance. This strategy extends a state-of-the-art method based on Genetic Programming to include novelty and diversity metrics both as attributes and as part of the objective function. We evaluate the proposed strategy using data collected from 3 popular Web 2.0 applications: LastFM, YouTube and YahooVideo. Our experiments show that our strategy outperforms the state-of-the-art alternative in terms of novelty and diversity, without harming relevance.

Fabiano Belém, Eder Martins, Jussara Almeida, Marcos Gonçalves

Example Based Entity Search in the Web of Data

The scale of today’s Web of Data motivates the use of keyword search-based approaches to entity-oriented search tasks in addition to traditional structure-based approaches, which require users to have knowledge of the underlying schema. We propose an alternative structure-based approach that makes use of example entities and compare its effectiveness with a text-based approach in the context of an entity list completion task. We find that both the text and structure-based approaches are effective in retrieving relevant entities, but that they find different sets of entities. Additionally, we find that the performance of the structure-based approach is dependent on the quality and number of example entities given. We experiment with a number of hybrid techniques that balance between the two approaches and find that a method that uses the example entities to determine the weights of approaches in the combination on a per query basis is most effective.

Marc Bron, Krisztian Balog, Maarten de Rijke

A Fast Generative Spell Corrector Based on Edit Distance

One of the main challenges in the implementation of web-scale online search systems is the disambiguation of the user input when portions of the input queries are possibly misspelt. Spell correctors that must be integrated with such systems have very stringent restrictions imposed on them; primarily, they must be able to handle large volumes of concurrent queries and generate relevant spelling suggestions at very high speed. Often, these systems consist of high-end server machines with lots of memory and processing power, and the requirement for such spell correctors is to keep the latency of generating suggestions to a bare minimum.

In this paper, we present a spell corrector that we developed to cater to high volume incoming queries for an online search service. It consists of a fast, per-token candidate generator which generates spell suggestions within a distance of two edit operations of an input token. We compare its performance against an n-gram based spell corrector and show that the presented spell candidate generation approach has lower response times.
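As an illustration of per-token candidate generation within two edit operations, the following is a minimal sketch in the style of a well-known brute-force generator. It is not the authors' implementation: the lowercase ASCII alphabet, the toy `VOCABULARY` and the function names are assumptions made for this example only.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: tokens are lowercase ASCII


def edits1(token):
    """Return the strings reachable from token by a single edit operation.

    The operations are deletion, adjacent transposition, substitution and
    insertion. The token itself can also appear in the result, because a
    letter may be substituted by itself.
    """
    splits = [(token[:i], token[i:]) for i in range(len(token) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + ch + right[1:]
                for left, right in splits if right for ch in ALPHABET]
    inserts = [left + ch + right for left, right in splits for ch in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def candidates(token, vocabulary):
    """Return the vocabulary words within two edit operations of token."""
    one_away = edits1(token)
    two_away = {e2 for e1 in one_away for e2 in edits1(e1)}
    return ({token} | one_away | two_away) & vocabulary


VOCABULARY = {"search", "engine", "retrieval", "query"}  # hypothetical dictionary
print(candidates("serach", VOCABULARY))     # {'search'}
print(candidates("retreival", VOCABULARY))  # {'retrieval'}
```

Enumerating every string two edits away is simple but produces a large number of strings per token, which is why a production system with strict latency limits would likely need a more economical generator than this sketch.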

Ishan Chattopadhyaya, Kannappan Sirchabesan, Krishanu Seal

Being Confident about the Quality of the Predictions in Recommender Systems

Recommender systems suggest new items to users to try or buy based on their previous preferences or behavior. Many times the information used to recommend these items is limited. An explanation such as “I believe you will like this item, but I do not have enough information to be fully confident about it.” may mitigate the issue, but can also damage user trust because it alerts users to the fact that the system might be wrong. The findings in this paper suggest that there is a way of modelling recommendation confidence that is related to accuracy (MAE, RMSE and NDCG) and user rating behaviour (rated vs unrated items). In particular, it was found that unrated items have lower confidence compared to the entire item set - highlighting the importance of explanations for novel but risky recommendations.

Sergio Cleger-Tamayo, Juan M. Fernández-Luna, Juan F. Huete, Nava Tintarev

Two-Stage Learning to Rank for Information Retrieval

Current learning to rank approaches commonly focus on learning the best possible ranking function given a small fixed set of documents. This document set is often retrieved from the collection using a simple unsupervised bag-of-words method, e.g. BM25. This can potentially lead to learning a sub-optimal ranking, since many relevant documents may be excluded from the initially retrieved set. In this paper we propose a novel two-stage learning framework to address this problem. We first learn a ranking function over the entire retrieval collection using a limited set of textual features including weighted phrases, proximities and expansion terms. This function is then used to retrieve the best possible subset of documents over which the final model is trained using a larger set of query- and document-dependent features. Empirical evaluation using two web collections unequivocally demonstrates that our proposed two-stage framework, being able to learn its model from more relevant documents, outperforms current learning to rank approaches.

Van Dang, Michael Bendersky, W. Bruce Croft

Hybrid Query Scheduling for a Replicated Search Engine

Search engines use replication and distribution of large indices across many query servers to achieve efficient retrieval. Under high query load, queries can be scheduled to replicas that are expected to be idle soonest, facilitated by the use of predicted query response times. However, the overhead of making response time predictions can hinder the usefulness of query scheduling under low query load. In this paper, we propose a hybrid scheduling approach that combines the scheduling methods appropriate for both low and high load conditions, and can adapt in response to changing conditions. We deploy a simulation framework, which is prepared with actual and predicted response times for real Web search queries for one full day. Our experiments using different numbers of shards and replicas of the 50 million document ClueWeb09 corpus show that hybrid scheduling can reduce the average waiting times of one day of queries by 68% under high load conditions and by 7% under low load conditions w.r.t. traditional scheduling methods.
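The high-load policy mentioned above, sending each query to the replica expected to be idle soonest, can be sketched as follows. This is an illustrative simplification rather than the paper's hybrid scheduler: it assumes that all queries arrive in one burst, that a predicted response time is already available for every query, and the function names are hypothetical.

```python
def pick_replica(queued_predictions_ms):
    """Return the index of the replica expected to be idle soonest.

    queued_predictions_ms[r] lists the predicted response times (in ms) of
    the queries already waiting on replica r, so its sum estimates when
    that replica becomes idle.
    """
    expected_idle = [sum(queue) for queue in queued_predictions_ms]
    return min(range(len(expected_idle)), key=expected_idle.__getitem__)


def schedule(predicted_times_ms, num_replicas):
    """Assign a burst of queries to replicas, one query at a time."""
    queues = [[] for _ in range(num_replicas)]
    assignment = []
    for predicted in predicted_times_ms:
        replica = pick_replica(queues)
        queues[replica].append(predicted)
        assignment.append(replica)
    return assignment


print(schedule([50, 10, 10, 40, 5], num_replicas=2))  # [0, 1, 1, 1, 0]
```

The long first query keeps replica 0 busy, so the next three queries go to replica 1 until its predicted backlog (60 ms) exceeds that of replica 0 (50 ms).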

Ana Freire, Craig Macdonald, Nicola Tonellotto, Iadh Ounis, Fidel Cacheda

Latent Factor BlockModel for Modelling Relational Data

In this paper we address the problem of modelling relational data, which has appeared in many applications such as social network analysis, recommender systems and bioinformatics. Previous studies either consider latent feature based models for link prediction in the relational data but disregard local structure in the network, or focus exclusively on capturing the network structure of objects based on latent blockmodels without coupling it with latent characteristics of objects to avoid redundant information. To combine the benefits of the previous work, we model the relational data as a function of both latent feature factors and latent cluster memberships of objects via our proposed Latent Factor BlockModel (LFBM) to collectively discover globally predictive intrinsic properties of objects and capture the latent block structure. We also develop an optimization transfer algorithm to learn the latent factors. Extensive experiments on synthetic data and several real world datasets suggest that our proposed LFBM model outperforms the state-of-the-art approaches for modelling relational data.

Sheng Gao, Ludovic Denoyer, Patrick Gallinari, Jun Guo

Estimation of the Collection Parameter of Information Models for IR

In this paper we explore various methods to estimate the collection parameter of the information based models for ad hoc information retrieval. In previous studies, this parameter was set to the average number of documents where the word under consideration appears. We introduce here a fully formalized estimation method for both the log-logistic and the smoothed power law models that leads to improved versions of these models in IR. Furthermore, we show that the previous setting of the collection parameter of the log-logistic model is a special case of the estimated value proposed here.

Parantapa Goswami, Eric Gaussier

Increasing Stability of Result Organization for Session Search

Search result clustering (SRC) organizes search results into labeled hierarchical structures as an “information lay-of-land”, providing users with an overview and helping them quickly locate relevant information in piles of search results. Hierarchies built by this process are usually sensitive to query changes. For search sessions with multiple queries, this can be undesirable since it may leave users with a seemingly random overview and partly diminish the benefits that SRC intends to offer. We propose to integrate external knowledge from Wikipedia when building concept hierarchies to boost their stability for session queries. Our evaluations on both the TREC 2010 and 2011 Session tracks demonstrate that the proposed approaches outperform the state-of-the-art hierarchy construction algorithms in stability of search result organization.

Dongyi Guan, Hui Yang

Updating Users about Time Critical Events

During unexpected events such as natural disasters, individuals rely on the information generated by news outlets to form their understanding of these events. This information, while often voluminous, is frequently degraded by the inclusion of unimportant, duplicate, or wrong information. It is important to be able to present users with only the novel, important information about these events as they develop. We present the problem of updating users about time critical news events, and focus on the task of deciding which information to select for updating users as an event develops. We propose a solution to this problem which incorporates techniques from information retrieval and multi-document summarization and evaluate this approach on a set of historic events using a large stream of news documents. We also introduce an evaluation method which is significantly less expensive than traditional approaches to temporal summarization.

Qi Guo, Fernando Diaz, Elad Yom-Tov

Comparing Crowd-Based, Game-Based, and Machine-Based Approaches in Initial Query and Query Refinement Tasks

Human computation techniques have demonstrated their ability to accomplish portions of tasks that machine-based techniques find difficult. Query refinement is a task that may benefit from human involvement. We conduct an experiment that evaluates the contributions of two user types: student participants and crowdworkers hired from an online labor market. Human participants are assigned to use one of two query interfaces: a traditional web-based interface or a game-based interface. We ask each group to manually construct queries to respond to TREC information needs and calculate their resulting recall and precision. Traditional web interface users are provided with feedback on their initial queries and asked to use this information to reformulate their original queries. Game interface users are provided with instant scoring and asked to refine their queries based on their scores. We measure the resulting feedback-based improvement for each group and compare the results from human computation techniques to machine-based algorithms.

Christopher G. Harris, Padmini Srinivasan

Reducing the Uncertainty in Resource Selection

The distributed retrieval process is plagued by uncertainty. Sampling, selection, merging and ranking are all based on very limited information compared to centralized retrieval. In this paper, we focus our attention on reducing the uncertainty within the resource selection phase by obtaining a number of estimates, rather than relying upon only one point estimate. We propose three methods for reducing uncertainty, which are compared against state-of-the-art baselines across three distributed retrieval testbeds. Our results show that the proposed methods significantly improve on the baselines, reduce the uncertainty and improve the robustness of resource selection.

Ilya Markov, Leif Azzopardi, Fabio Crestani

Exploiting Time in Automatic Image Tagging

Existing automatic image annotation (AIA) models that depend solely on low-level image features often produce poor results, particularly when annotating real-life collections. Tag co-occurrence has been shown to improve image annotation by identifying additional keywords associated with user-provided keywords. However, existing approaches have treated tag co-occurrence as a static measure over time, thereby ignoring the temporal trends of many tags. The temporal distribution of tags, which is shaped by events, seasons, memes, etc., provides a strong source of evidence beyond keywords for AIA. In this paper we propose a temporal tag co-occurrence approach to improve upon the current state-of-the-art automatic image annotation model. By replacing the annotated tags with more temporally significant tags, we achieve statistically significant increases in annotation accuracy on a real-life timestamped image collection from Flickr.

Philip J. McParlane, Joemon M. Jose

Using Text-Based Web Image Search Results Clustering to Minimize Mobile Devices Wasted Space-Interface

The recent shift in human-computer interaction from desktop to mobile computing fosters the need for new interfaces for exploring web image search results. In order to leverage users’ efforts, we present a set of state-of-the-art ephemeral clustering algorithms, which summarize web image search results into meaningful clusters. This way of presenting visual information on mobile devices is exhaustively evaluated based on two main criteria: clustering accuracy, which must be maximized, and wasted space-interface, which must be minimized. For the first case, we use a broad set of metrics to evaluate ephemeral clustering over a public gold standard data set of web images. For the second case, we propose a new metric to evaluate the mismatch of the used space-interface between the ground truth and the cluster distribution obtained by ephemeral clustering. The results show that there are large divergences between clustering accuracy and used space maximization. As a consequence, the trade-off of cluster-based exploration of web image search results on mobile devices is difficult to define, although our study provides some clear positive results.

Jose G. Moreno, Gaël Dias

Discovery and Analysis of Evolving Topical Social Discussions on Unstructured Microblogs

Social networks have emerged as hubs of user generated content. Online social conversations can be used to retrieve users’ interests towards given topics and trends. Microblogging platforms like Twitter are primary examples of social networks with significant volumes of topical message exchanges between users. However, unlike traditional online discussion forums, blogs and social networking sites, explicit discussion threads are absent from microblogging networks like Twitter. This inherent absence of any conversation framework makes it challenging to distinguish conversations from mere topical interests. In this work, we explore semantic, social and temporal relationships of topical clusters formed in Twitter to identify conversations. We devise an algorithm comprising a sequence of steps such as text clustering, topical similarity detection using TF-IDF and Wordnet, and intersecting social, semantic and temporal graphs to discover social conversations around topics. We further qualitatively show the presence of social localization of discussion threads. Our results suggest that discussion threads evolve significantly over social networks on Twitter. Our algorithm to find social discussion threads can be used for settings such as social information spreading applications and information diffusion analyses on microblog networks.

Kanika Narang, Seema Nagar, Sameep Mehta, L. V. Subramaniam, Kuntal Dey

Web Credibility: Features Exploration and Credibility Prediction

The open nature of the World Wide Web makes evaluating webpage credibility challenging for users. In this paper, we aim to automatically assess web credibility by investigating various characteristics of webpages. Specifically, we first identify features from textual content, link structure, webpage design, as well as their social popularity learned from popular social media sites (e.g., Facebook, Twitter). A set of statistical analysis methods is applied to select the most informative features, which are then used to infer webpage credibility by employing supervised learning algorithms. Real dataset-based experiments under two application settings show that we attain an accuracy of 75% for classification, and an improvement of 53% in the mean absolute error (MAE), with respect to the random baseline approach, for regression.

Alexandra Olteanu, Stanislav Peshterliev, Xin Liu, Karl Aberer

Query Suggestions for Textual Problem Solution Repositories

Textual problem-solution repositories are available today in various forms, most commonly as problem-solution pairs from community question answering systems. Modern search engines that operate on the web can suggest possible completions in real-time for users as they type in queries. We study the problem of generating intelligent query suggestions for users of customized search systems that enable querying over problem-solution repositories. Due to the small scale and specialized nature of such systems, we often do not have the luxury of depending on query logs for finding query suggestions. We propose a retrieval model for generating query suggestions for search on a set of problem solution pairs. We harness the problem solution partition inherent in such repositories to improve upon traditional query suggestion mechanisms designed for systems that search over general textual corpora. We evaluate our technique over real problem-solution datasets and illustrate that our technique provides large and statistically significant improvements over the state-of-the-art technique in query suggestion.

Deepak P., Sutanu Chakraborti, Deepak Khemani

Improving ESA with Document Similarity

Explicit semantic analysis (ESA) is a technique for computing semantic relatedness between natural language texts. It is a document-based distributional model similar to latent semantic analysis (LSA), which is often built on the Wikipedia database when it is required for general English usage. Unlike LSA, however, ESA does not use dimensionality reduction, and therefore it is sometimes unable to account for similarity between words that do not co-occur with the same concepts, even if their concepts themselves cover similar subjects. In the Wikipedia implementation, ESA concepts are Wikipedia articles, and the Wikilinks between the articles are used to overcome the concept-similarity problem. In this paper, we provide two general solutions for integration of concept-concept similarities into the ESA model, ones that do not rely on a particular corpus structure and do not alter the explicit concept-mapping properties that distinguish ESA from models like LSA and latent Dirichlet allocation (LDA).
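The concept-similarity problem can be made concrete with a small sketch. It uses the generic soft cosine formulation, in which a concept-concept similarity matrix is inserted into the cosine computation; this is an assumption chosen for illustration and not necessarily either of the two solutions proposed in the paper. The concept vectors and the similarity value 0.8 are invented.

```python
import math


def soft_cosine(a, b, sim):
    """Cosine-style relatedness of concept vectors a and b.

    sim[i][j] is the similarity between concepts i and j. With the identity
    matrix this reduces to the ordinary cosine used by plain ESA.
    """
    def form(x, y):
        return sum(x[i] * sim[i][j] * y[j]
                   for i in range(len(x)) for j in range(len(y)))

    denominator = math.sqrt(form(a, a) * form(b, b))
    return form(a, b) / denominator if denominator else 0.0


# Two words mapped to different, but related, concepts (toy vectors).
word_a = [1.0, 0.0, 0.0]
word_b = [0.0, 1.0, 0.0]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
related = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(soft_cosine(word_a, word_b, identity))  # 0.0
print(soft_cosine(word_a, word_b, related))   # 0.8
```

With the identity matrix the two words appear unrelated because they share no concept, whereas the similarity matrix lets the related concepts contribute to the score.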

Tamara Polajnar, Nitish Aggarwal, Kartik Asooja, Paul Buitelaar

Ontology-Based Word Sense Disambiguation for Scientific Literature

Scientific documents often adopt a well-defined vocabulary and avoid the use of ambiguous terms. However, as soon as documents from different research sub-communities are considered in combination, many scientific terms become ambiguous as the same term can refer to different concepts from different sub-communities. The ability to correctly identify the right sense of a given term can considerably improve the effectiveness of retrieval models, and can also support additional features such as search diversification. This is even more critical when applied to explorative search systems within the scientific domain.

In this paper, we propose novel semi-supervised methods for term disambiguation that leverage the structure of a community-based ontology of scientific concepts. Our approach exploits the graph structure that connects different terms and their definitions to automatically identify the correct sense that was originally picked by the authors of a scientific publication. Experimental evidence over two different test collections from the physics and biomedical domains shows that the proposed method is effective and outperforms state-of-the-art approaches based on feature vectors constructed out of term co-occurrences as well as standard supervised approaches.

Roman Prokofyev, Gianluca Demartini, Alexey Boyarsky, Oleg Ruchayskiy, Philippe Cudré-Mauroux

A Language Modeling Approach for Extracting Translation Knowledge from Comparable Corpora

A main challenge in cross-language information retrieval (CLIR) is to estimate a translation language model, as its quality directly affects the retrieval performance. The translation language model is built using translation resources such as bilingual dictionaries, parallel corpora, or comparable corpora. In general, high quality resources may not be available for scarce-resource languages. For these languages, efficient exploitation of commonly available resources such as comparable corpora is considered more crucial. In this paper, we focus on using only comparable corpora to extract translation information more efficiently. We propose a language modeling approach for estimating the translation language model. The proposed method is based on probability distribution estimation, and can be tuned more easily than heuristically adjusted previous work. Experimental results show a significant improvement in the translation quality and CLIR performance compared to the previous approaches.

Razieh Rahimi, Azadeh Shakery

Content-Based Re-ranking of Text-Based Image Search Results

This article presents a method for re-ranking images retrieved by a classical search engine that uses keywords for entering queries. This method uses the visual content of the images and is based on the idea that relevant images should be similar to each other, while non-relevant images should be different from each other and from relevant images. This idea has been implemented by ranking the images according to their average distances to their nearest neighbors. This query-dependent re-ranking is complemented by a query-independent re-ranking that takes into account the fact that some types of images are non-relevant for almost all queries. This second idea is implemented by training a classifier on results from all queries in the training set. The re-ranking is successfully evaluated on classical datasets built with Google Images search engines.
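The query-dependent step, ranking images by their average distance to their nearest neighbors, can be sketched as below. This is a minimal illustration under stated assumptions: images are represented by small hypothetical feature vectors, Euclidean distance stands in for whatever visual distance the authors use, and the neighborhood size `k` is arbitrary.

```python
import math


def rerank_by_neighbor_distance(features, k=2):
    """Return image indices ordered by mean distance to their k nearest neighbours.

    Images that sit in a dense group (small mean distance) come first, on the
    assumption that relevant images resemble each other.
    """
    scores = []
    for i, feature in enumerate(features):
        distances = sorted(math.dist(feature, other)
                           for j, other in enumerate(features) if j != i)
        nearest = distances[:k]
        scores.append(sum(nearest) / len(nearest) if nearest else 0.0)
    return sorted(range(len(features)), key=scores.__getitem__)


# Three mutually similar images followed by two outliers (toy 2-D features).
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (-4.0, 6.0)]
ranking = rerank_by_neighbor_distance(features, k=2)
print(ranking)
```

In this toy input the three clustered images are ranked ahead of the two isolated ones, which matches the intuition that non-relevant images differ from each other and from the relevant ones.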

Franck Thollard, Georges Quénot

Encoding Local Binary Descriptors by Bag-of-Features with Hamming Distance for Visual Object Categorization

This paper presents a novel method for encoding local binary descriptors for Visual Object Categorization (VOC). Nowadays, local binary descriptors, e.g. LBP and BRIEF, have become very popular in image matching tasks because of their fast computation and matching using binary bitstrings. However, the bottleneck of applying them in the domain of VOC lies in the high dimensional histograms produced by encoding these binary bitstrings into decimal codes. To solve this problem, we propose to encode local binary bitstrings directly by the Bag-of-Features (BoF) model with Hamming distance. The advantages of this approach are two-fold: (1) It solves the high dimensionality issue of the traditional binary bitstring encoding methods, making local binary descriptors more feasible for the task of VOC, especially when more bits are considered; (2) It is computationally efficient because the Hamming distance, which is very suitable for comparing bitstrings, is based on bitwise XOR operations that can be computed quickly on modern CPUs. The proposed method is validated by applying it to the LBP feature for the purpose of VOC. The experimental results on the PASCAL VOC 2007 benchmark show that our approach effectively improves the recognition accuracy compared to the traditional LBP feature.
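To illustrate how bitstrings can be encoded by a Bag-of-Features histogram with Hamming distance, here is a minimal sketch. It assumes that descriptors are stored as Python integers and that a codebook of binary codewords is already given; how the codebook is learned is not shown, and the 8-bit values are invented for the example.

```python
def hamming(a, b):
    """Hamming distance between two bitstrings stored as integers.

    XOR leaves a 1 in every position where the bitstrings differ, and
    counting those 1 bits gives the distance.
    """
    return bin(a ^ b).count("1")


def bof_histogram(descriptors, codebook):
    """Count how many descriptors fall nearest to each codeword."""
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda c: hamming(descriptor, codebook[c]))
        histogram[nearest] += 1
    return histogram


codebook = [0b00000000, 0b11111111, 0b11110000]  # hypothetical codewords
descriptors = [0b00000001, 0b11111110, 0b11100000, 0b00010000]

print(hamming(0b1011, 0b0010))               # 2
print(bof_histogram(descriptors, codebook))  # [2, 1, 1]
```

The histogram length equals the codebook size rather than the number of possible bit patterns, which is the dimensionality benefit the abstract describes.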

Yu Zhang, Chao Zhu, Stephane Bres, Liming Chen

Recommending High Utility Query via Session-Flow Graph

Query recommendation is an integral part of modern search engines that helps users satisfy their information needs. Traditional query recommendation methods usually focus on recommending relevant queries, that is, alternative queries whose search intent is close to that of the original query. However, the ultimate goal of query recommendation is to help users accomplish their search task successfully, not just to find relevant queries, even though these can sometimes return useful search results. To better achieve this goal, a more reasonable approach is to recommend high utility queries, i.e., queries that can return more useful information. In this paper, we propose a novel utility query recommendation approach based on an absorbing random walk on the session-flow graph, which can learn queries’ utility by simultaneously modeling both users’ reformulation behaviors and click behaviors. Extensive experiments were conducted on real query logs, and the results show that our method significantly outperforms the state-of-the-art methods under the evaluation metrics QRR and MRD.

Xiaofei Zhu, Jiafeng Guo, Xueqi Cheng, Yanyan Lan, Wolfgang Nejdl

URL Redirection Accounting for Improving Link-Based Ranking Methods

Traditional link-based web ranking algorithms are applied to web snapshots in the form of webgraphs consisting of pages as vertices and links as edges. When constructing a webgraph, researchers do not pay attention to the particular method by which links are taken into account, although certain details may significantly affect the contribution of link-based factors to ranking. Furthermore, researchers use small subgraphs of the webgraph for more efficient evaluation of new algorithms. They usually consider a graph induced by pages, for example, of a certain first level domain. In this paper we reveal a significant dependence of PageRank on the method of accounting for redirects while constructing the webgraph. We evaluate several natural ways of redirect accounting on a large-scale domain and find an optimal case, which turns out to be non-trivial. Moreover, we experimentally compare different ways of extracting a small subgraph for multiple evaluations and reveal some essential shortcomings of traditional approaches.
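One simple way of accounting for redirects, shown here purely as an illustration, is to rewrite every link so that it points at the final target of its redirect chain before PageRank is computed. The paper evaluates several schemes and reports that the best one is non-trivial, so this sketch should not be read as the recommended method; the page names, the damping factor of 0.85 and the iteration count are assumptions.

```python
def resolve(url, redirects):
    """Follow redirects from url to its final target, stopping on a loop."""
    seen = set()
    while url in redirects and url not in seen:
        seen.add(url)
        url = redirects[url]
    return url


def collapse_redirects(links, redirects):
    """Rewrite (source, target) links so both ends are final targets."""
    collapsed = set()
    for source, target in links:
        s, t = resolve(source, redirects), resolve(target, redirects)
        if s != t:  # drop self-links created by collapsing
            collapsed.add((s, t))
    return collapsed


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling mass is spread uniformly."""
    nodes = sorted({node for edge in links for node in edge})
    out = {n: [t for s, t in links if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for n in nodes:
            for target in out[n]:
                new_rank[target] += damping * rank[n] / len(out[n])
        rank = new_rank
    return rank


links = {("a", "old"), ("b", "new"), ("new", "a")}  # toy webgraph
redirects = {"old": "new"}                          # "old" redirects to "new"

naive = pagerank(links)
collapsed = pagerank(collapse_redirects(links, redirects))
print(sorted(naive), sorted(collapsed))
```

In the naive graph the redirecting page `old` is a separate dangling vertex that absorbs the link from `a`, while in the collapsed graph that link is credited to `new`, so the two constructions yield different scores for the same pages.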

Maksim Zhukovskii, Gleb Gusev, Pavel Serdyukov

Short Papers with Poster Presentation

Lo mejor de dos idiomas – Cross-Lingual Linkage of Geotagged Wikipedia Articles

Different language versions of Wikipedia contain articles referencing the same place. However, the existence of an article in one language does not necessarily mean that it is also available in another language, or that the two versions are linked. This paper examines geotagged articles describing places in Honduras in both the Spanish and the English language versions. It demonstrates that a method based on simple features can reliably identify article pairs describing the same semantic place concept and evaluates it against the existing interlinks as well as a manual assessment.

Dirk Ahlers

A Pilot Study on Using Profile-Based Summarisation for Interactive Search Assistance

Text summarisation is the process of distilling the most important information from a source to produce an abridged version for a particular user or task. This poster investigates the use of profile-based summarisation to provide contextualisation and interactive support for enterprise searches. We employ log analysis to acquire continuously updated profiles to provide profile-based summarisations of search results. These profiles could be capturing an individual’s interests or (as discussed here) those of a group of users. Here we report on a first pilot study.

Azhar Alhindi, Udo Kruschwitz, Chris Fox

Exploring Patent Passage Retrieval Using Nouns Phrases

This paper presents experiments which were initially carried out for the Patent Passage Retrieval track of CLEF-IP 2012. The Passage Retrieval module was implemented independently of the Document Retrieval system. In the Passage Retrieval module we make use of Natural Language Processing applications (WordNet and the Stanford Part-of-Speech tagger) for lemmatization and phrase (multi-word unit) retrieval. We show that, by applying simple rule-based modifications and targeting only specific language instances (noun phrases), the use of general NLP tools for phrase retrieval increases the performance of a Patent Passage Information Extraction system.

Linda Andersson, Parvaz Mahdabi, Allan Hanbury, Andreas Rauber

Characterizing Health-Related Community Question Answering

Our ongoing project is aimed at improving information access to narrow-domain collections of questions and answers. This poster demonstrates how out-of-the-box tools and domain dictionaries can be applied to community question answering (CQA) content in the health domain. This approach can be used to improve user interfaces and search over CQA data, as well as to evaluate content quality. The study is a first-time use of a sizable dataset from the Russian CQA site Otvety@Mail.Ru.

Alexander Beloborodov, Artem Kuznetsov, Pavel Braslavski

Topic Models Can Improve Domain Term Extraction

The paper describes the results of an experimental study of topic models applied to the task of single-word term extraction. The experiments encompass several probabilistic and non-probabilistic topic models and demonstrate that topic information improves the quality of term extraction, and that NMF with KL-divergence minimization is the best among the models under study.

Elena Bolshakova, Natalia Loukachevitch, Michael Nokel

A Topic Person Multi-polarization Method Using Friendship Network Analysis

In this paper, we leverage competing viewpoints of the persons mentioned in a set of topic documents. We propose a method to construct a friendship network of the persons and present a graph-partition based multi-polarization algorithm to group the persons into clusters with competing viewpoints.

Zhong-Yong Chen, Chien Chin Chen

Improving Cyberbullying Detection with User Context

The negative consequences of cyberbullying are becoming more alarming every day and technical solutions that allow for taking appropriate action by means of automated detection are still very limited. Up until now, studies on cyberbullying detection have focused on individual comments only, disregarding context such as users’ characteristics and profile information. In this paper we show that taking user context into account improves the detection of cyberbullying.

Maral Dadvar, Dolf Trieschnigg, Roeland Ordelman, Franciska de Jong

Snippet-Based Relevance Predictions for Federated Web Search

How well can the relevance of a page be predicted, purely based on snippets? This would be highly useful in a Federated Web Search setting where caching large amounts of result snippets is more feasible than caching entire pages. The experiments reported in this paper make use of result snippets and pages from a diverse set of actual Web search engines. A linear classifier is trained to predict the snippet-based user estimate of page relevance, but also, to predict the actual page relevance, again based on snippets alone. The presented results confirm the validity of the proposed approach and provide promising insights into future result merging strategies for a Federated Web Search setting.

Thomas Demeester, Dong Nguyen, Dolf Trieschnigg, Chris Develder, Djoerd Hiemstra

Designing Human-Readable User Profiles for Search Evaluation

Forming an accurate mental model of a user is crucial for the qualitative design and evaluation steps of many information-centric applications such as web search, content recommendation, or advertising. This process can often be time-consuming as search and interaction histories become verbose. In this work, we present and analyze the usefulness of concise human-readable user profiles in order to enhance system tuning and evaluation by means of user studies.

Carsten Eickhoff, Kevyn Collins-Thompson, Paul Bennett, Susan Dumais

Sentiment Classification Based on Phonetic Characteristics

The majority of sentiment classifiers are based on dictionaries or require large amounts of training data. Unfortunately, dictionaries contain only limited data, and machine-learning classifiers using word-based features do not consider parts of words, which makes them domain-specific, less effective and not robust to orthographic mistakes. We attempt to overcome these drawbacks by developing a context-independent approach. Our main idea is to determine some phonetic features of words that could affect their sentiment polarity. These features are applicable to all words, which eliminates the need for continuous manual dictionary renewal. Our experiments are based on a sentiment dictionary for the Russian language. We apply phonetic features to predict word sentiment based on machine learning.

Sergei Ermakov, Liana Ermakova

Cross-Language Plagiarism Detection Using a Multilingual Semantic Network

Cross-language plagiarism refers to the type of plagiarism where the source and suspicious documents are in different languages. Plagiarism detection across languages is still in its infancy. In this article, we propose a new graph-based approach that uses a multilingual semantic network to compare document paragraphs in different languages. In order to investigate the proposed approach, we used the German-English and Spanish-English cross-language plagiarism cases of the PAN-PC’11 corpus. We compare the obtained results with two state-of-the-art models. Experimental results indicate that our graph-based approach is a good alternative for cross-language plagiarism detection.

Marc Franco-Salvador, Parth Gupta, Paolo Rosso

Classification of Opinion Questions

With the increasing growth of opinions on news, services and so on, automatic opinion question answering aims at answering questions involving the views of persons, and plays an important role in the fields of sentiment analysis and information recommendation. One challenge is that opinion questions may contain different types of question focus that affect answer extraction, such as holders, comparison and location. In this paper, we build a taxonomy of opinion questions, and propose a hierarchical classification technique to classify opinion questions according to our constructed taxonomy. This technique first uses a Bayesian classifier and then employs an approach leveraging semantic similarities between questions. Experimental results show that our approach significantly improves performance over the baseline and other related work.

Hongping Fu, Zhendong Niu, Chunxia Zhang, Lu Wang, Peng Jiang, Ji Zhang

Tempo of Search Actions to Modeling Successful Sessions

Considering the search process in the evaluation of interactive information retrieval (IIR) is a challenging issue. This paper explores the tempo of search actions (query, click, and judgement) to measure people’s search process and performance. When we analysed how people consume their search resource (i.e., the total number of search actions taken to complete a task) over time, we observed different patterns in successful and unsuccessful sessions. Successful sessions tend to have a regular tempo of search actions, while poor sessions tend to have an uneven distribution of resource usage. The resource consumption graph also allows us to observe where in the search process the experimental conditions had an effect. We therefore suggest that the tempo of search actions can be exploited to model successful search sessions.

Kazuya Fujikawa, Hideo Joho, Shin-ichi Nakayama

Near-Duplicate Detection for Online-Shops Owners: An FCA-Based Approach

We propose a prototype of a near-duplicate detection system for web-shop owners. It is typical for these online businesses to buy descriptions of their goods from so-called copywriters. A copywriter can cheat from time to time and provide the owner with almost identical descriptions for different items. In this paper we demonstrate how FCA can be used for fast clustering and for revealing such duplicates in the datasets of a real online perfume shop.

Dmitry I. Ignatov, Andrey V. Konstantiov, Yana Chubis

Incremental Reranking for Hierarchical Text Classification

The top-down method is efficient and commonly used in hierarchical text classification. Its main drawback is the error propagation from the higher to the lower nodes. To address this issue we propose an efficient incremental reranking model of the top-down classifier decisions. We build a multiclassifier for each hierarchy node, constituted by the latter and its children. Then we generate several classification hypotheses with such classifiers and rerank them to select the best one. Our rerankers exploit category dependencies, which allow them to recover from the multiclassifier errors, whereas their application in top-down fashion results in high efficiency. The experimentation on Reuters Corpus Volume 1 (RCV1) shows that our incremental reranking is as accurate as global rerankers but at least one order of magnitude faster.

Qi Ju, Alessandro Moschitti

Topic Model for User Reviews with Adaptive Windows

We discuss the problem of applying topic models to user reviews. Unlike ordinary documents, reviews in the same category are similar to each other. This makes it difficult to estimate meaningful topics from these reviews. In this paper, we develop a new model for this problem using the distance dependent Chinese restaurant process. It does not require the window size to be fixed in advance and can consider neighboring sentences adaptively. We compare this model to the previously proposed Multi-grain latent Dirichlet allocation, and show that our model achieves better results in terms of perplexity.

Takuya Konishi, Fuminori Kimura, Akira Maeda

Time Based Feedback and Query Expansion for Twitter Search

Twitter is an accepted platform among users for expressing views in a short text called a “Tweet”. Application of search models to platforms like Twitter is still an open-ended question, though the creation of the TREC Microblog track in 2011 aims to help resolve it. In this paper, we propose a modified language search model by extending a traditional query-likelihood language model with time-based feedback and query expansion. The proposed method makes use of two types of feedback: time feedback, by evaluating the time distribution of the top retrieved tweets, and query expansion, by using highly frequent terms in the top tweets as expansion terms. Our results suggest that using both types of feedback gives better results than a standard language model, and that the time-based feedback uniformly improves results whether query expansion is used or not.

Naveen Kumar, Benjamin Carterette

Is Intent-Aware Expected Reciprocal Rank Sufficient to Evaluate Diversity?

In this paper we define two models of users that require diversity in search results; these models are theoretically grounded in the notion of intrinsic and extrinsic diversity. We then examine Intent-Aware Expected Reciprocal Rank (ERR-IA), one of the official measures used to assess diversity in TREC 2011-12, with respect to the proposed user models. By analyzing ranking preferences as expressed by the user models and those estimated by ERR-IA, we investigate whether ERR-IA assesses document rankings according to the requirements of the diversity retrieval task expressed by the two models. Empirical results demonstrate that ERR-IA neglects query-intents coverage by attributing importance to redundant relevant documents. ERR-IA behavior is contrary to the user models that require measures to first assess diversity through the coverage of intents, and then assess the redundancy of relevant intents. Furthermore, diversity should be considered separately from document relevance and the documents’ positions in the ranking.
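For readers unfamiliar with the measure, the following is a minimal sketch of how ERR-IA is commonly computed: a per-intent Expected Reciprocal Rank, averaged with the intent probabilities as weights. The function names, the data layout and the example grades are illustrative assumptions, not taken from the paper.

```python
def err(grades, max_grade):
    """Expected Reciprocal Rank for one ranked list of relevance grades."""
    score = 0.0
    p_continue = 1.0  # probability that the user reaches the current rank
    for rank, grade in enumerate(grades, start=1):
        # Probability that the document at this rank satisfies the user.
        p_stop = (2 ** grade - 1) / (2 ** max_grade)
        score += p_continue * p_stop / rank
        p_continue *= 1.0 - p_stop
    return score


def err_ia(grades_by_intent, intent_probs, max_grade=1):
    """Intent-aware ERR: per-intent ERR weighted by the intent probabilities."""
    return sum(
        intent_probs[intent] * err(grades, max_grade)
        for intent, grades in grades_by_intent.items()
    )


# Hypothetical two-document rankings for a query with intents "a" and "b".
redundant = {"a": [1, 1], "b": [0, 0]}    # both documents cover intent "a"
diversified = {"a": [1, 0], "b": [0, 1]}  # one document per intent
skewed = {"a": 0.9, "b": 0.1}             # assumed intent distribution

print(err_ia(redundant, skewed), err_ia(diversified, skewed))
```

With this assumed skewed intent distribution the redundant ranking scores higher than the diversified one, which is the kind of behavior the abstract questions; with equal intent probabilities the order is reversed.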

Teerapong Leelanupab, Guido Zuccon, Joemon M. Jose

Late Data Fusion for Microblog Search

The character of microblog environments raises challenges for microblog search because relevancy becomes one of the many aspects for ranking documents. We concentrate on merging multiple ranking strategies at post-retrieval time for the TREC Microblog task. We compare several state-of-the-art late data fusion methods, and present a new semi-supervised variant that accounts for microblog characteristics. Our experiments show the utility of late data fusion in microblog search, and that our method helps boost retrieval effectiveness.
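As an illustration of late data fusion in general, the sketch below implements two classic unsupervised methods, CombSUM and CombMNZ. The abstract does not name the methods it compares, so these are stand-ins chosen for illustration, and the run contents are hypothetical; the scores are assumed to be normalized already.

```python
from collections import defaultdict


def comb_fuse(runs, use_mnz=False):
    """Fuse several {doc_id: score} runs with CombSUM, or CombMNZ if use_mnz is set."""
    total = defaultdict(float)
    hits = defaultdict(int)
    for run in runs:
        for doc_id, score in run.items():
            total[doc_id] += score  # CombSUM: add up the scores from all runs
            hits[doc_id] += 1       # number of runs that retrieved the document
    fused = {
        doc_id: total[doc_id] * (hits[doc_id] if use_mnz else 1)
        for doc_id in total
    }
    # Sort by fused score, highest first; ties are broken by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Two hypothetical ranking strategies over the same set of tweets.
runs = [{"t1": 0.9, "t2": 0.4}, {"t2": 0.8, "t3": 0.7}]
print(comb_fuse(runs))
print(comb_fuse(runs, use_mnz=True))
```

CombMNZ additionally rewards documents that several strategies agree on, which is why it is often used as a fusion baseline.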

Shangsong Liang, Maarten de Rijke, Manos Tsagkias

A Task-Specific Query and Document Representation for Medical Records Search

One of the challenges of searching in the medical domain is to deal with the complexity and ambiguity of medical terminology. Concept-based representation approaches using terminology from domain-specific resources have been developed to handle such a challenge. However, it has been shown that these techniques are effective only when combined with a traditional term-based representation approach. In this paper, we propose a novel technique to represent medical records and queries by focusing only on medical concepts essential for the information need of a medical search task. Such a representation could enhance retrieval effectiveness since only the medical concepts crucial to the information need are taken into account. We evaluate the retrieval effectiveness of our proposed approach in the context of the TREC 2011 Medical Records track. The results demonstrate the effectiveness of our approach, as it significantly outperforms a baseline where all concepts are represented, and markedly outperforms a traditional term-based representation baseline. Moreover, when combining the relevance scores obtained from our technique and a term-based representation approach, the achieved performance is comparable to the best TREC 2011 systems.

Nut Limsopatham, Craig Macdonald, Iadh Ounis

On CORI Results Merging

Score normalization and results merging are important components of many IR applications. Recently MinMax—an unsupervised linear score normalization method—was shown to perform quite well across various distributed retrieval testbeds, although based on strong assumptions. The CORI results merging method relaxes these assumptions to some extent and significantly improves the performance of MinMax. We parameterize CORI and evaluate its performance across a range of parameter settings. Experimental results on three distributed retrieval testbeds show that CORI significantly outperforms state-of-the-art results merging and score normalization methods when its parameter goes to infinity.
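The MinMax normalization mentioned above can be sketched in a few lines: each result list is rescaled linearly so that its lowest score maps to 0 and its highest to 1. The function name and the example scores are assumptions for illustration; the parameterized CORI method studied in the paper is not reproduced here.

```python
def min_max_normalize(scores):
    """Map raw scores linearly onto [0, 1]: (s - min) / (max - min)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so the list carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical raw scores returned by one resource.
print(min_max_normalize([2.0, 4.0, 6.0]))
```

After normalization, result lists from different resources share a common scale and can be merged by sorting on the normalized scores.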

Ilya Markov, Avi Arampatzis, Fabio Crestani

Detecting Friday Night Party Photos: Semantics for Tag Recommendation

Annotation is central to the organization and retrieval of multimedia – a task which tag recommendation systems attempt to simplify. We propose a photo tag recommendation system which automatically extracts semantics from visual and meta-data features to complement existing tags. Compared to standard content/tag-based models, these automatic tags provide a richer description of the image and especially improve performance in the case of the “cold start problem”.

Philip J. McParlane, Yelena Mejova, Ingmar Weber

Optimizing nDCG Gains by Minimizing Effect of Label Inconsistency

We focus on the choice of gains in nDCG, and in particular on the fracture between large differences in exponential gains of high relevance labels and the not-so-small confusion, or inconsistency, between these labels in data. We show that better gains can be derived from data by measuring the label inconsistency, to the point that virtually indistinguishable labels correspond to equal gains. Our derived optimal gains make a better nDCG objective for training Learning to Rank algorithms.
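To make the role of the gain function concrete, here is a minimal nDCG sketch with a pluggable gain. The default is the usual exponential gain 2^label - 1; the alternative gain table, which gives labels 2 and 3 the same gain, is a made-up example and not the gains derived in the paper.

```python
import math


def dcg(labels, gain):
    """Discounted cumulative gain of a ranked list of relevance labels."""
    return sum(
        gain(label) / math.log2(rank + 1)
        for rank, label in enumerate(labels, start=1)
    )


def ndcg(labels, gain=lambda label: 2 ** label - 1):
    """nDCG with a configurable gain function (exponential gain by default)."""
    ideal = dcg(sorted(labels, key=gain, reverse=True), gain)
    return dcg(labels, gain) / ideal if ideal > 0 else 0.0


# Hypothetical gain table in which labels 2 and 3 are treated as indistinguishable.
equal_top_gain = {0: 0, 1: 1, 2: 3, 3: 3}.get

print(ndcg([2, 3, 1]))                       # penalized under exponential gains
print(ndcg([2, 3, 1], gain=equal_top_gain))  # not penalized when the gains are equal
```

Swapping two documents whose labels annotators often confuse costs a lot under exponential gains, but nothing once those labels share a gain.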

Pavel Metrikov, Virgil Pavlu, Javed A. Aslam

Least Square Consensus Clustering: Criteria, Methods, Experiments

We further develop a consensus clustering framework proposed three decades ago in Russia and experimentally demonstrate that our least squares consensus clustering algorithm consistently outperforms several recent consensus clustering methods.

Boris G. Mirkin, Andrey Shestakov

Domain Adaptation of Statistical Machine Translation Models with Monolingual Data for Cross Lingual Information Retrieval

Statistical Machine Translation (SMT) is often used as a black box in CLIR tasks. We propose an adaptation method for an SMT model relying on the monolingual statistics that can be extracted from the document collection (both source and target if available). We evaluate our approach on the CLEF Domain Specific task (German-English and English-German) and show that very simple document collection statistics integrated into the SMT translation model yield good gains both in terms of IR metrics (MAP, P10) and MT evaluation metrics (BLEU, TER).

Vassilina Nikoulina, Stéphane Clinchant

Text Summarization while Maximizing Multiple Objectives with Lagrangian Relaxation

We present an extractive text summarization method that solves an optimization problem involving the maximization of multiple objectives. Though we can obtain high quality summaries if we solve the problem exactly with our formulation, it is NP-hard and does not scale to large problem sizes. Our solution is an efficient and high quality approximation method based on Lagrangian relaxation (LR) techniques. In experiments on the DUC’04 dataset, our LR based method matches the performance of state-of-the-art methods.

Masaaki Nishino, Norihito Yasuda, Tsutomu Hirao, Jun Suzuki, Masaaki Nagata

Towards Detection of Child Sexual Abuse Media: Categorization of the Associated Filenames

This paper approaches the problem of automatic pedophile content identification. We present a system for filename categorization, which is trained to identify suspicious files on P2P networks. In our initial experiments, we used regular pornography data as a substitute for child pornography. Our system separates filenames of pornographic media from the others with an accuracy that reaches 91–97%.

Alexander Panchenko, Richard Beaufort, Hubert Naets, Cédrick Fairon

Leveraging Latent Concepts for Retrieving Relevant Ads for Short Text

Microblogging platforms are increasingly becoming a lucrative prospect for advertisers seeking to attract customers. The challenge with advertising on such platforms is that there is very little content from which to retrieve relevant ads. As the microblogging content is short and noisy and the ads are short too, there is a high degree of lexical/vocabulary mismatch between the micropost and the ads. To bridge this vocabulary mismatch, we propose a conceptual approach that transforms the content into a conceptual space that represents the latent concepts of the content. We empirically show that the conceptual model performs better than various state-of-the-art techniques; the performance gains obtained are substantial and significant.

Ankit Patil, Kushal Dave, Vasudeva Varma

Robust PLSA Performs Better Than LDA

In this paper we introduce a generalized learning algorithm for probabilistic topic models (PTM). Many known and new algorithms for PLSA, LDA, and SWB models can be obtained as its special cases by choosing a subset of the following “options”: regularization, sampling, update frequency, sparsing and robustness. We show that a robust topic model, which distinguishes specific, background and topic terms, doesn’t need Dirichlet regularization and provides a controllably sparse solution.

Anna Potapenko, Konstantin Vorontsov

WANTED: Focused Queries for Focused Retrieval

Focused retrieval tasks such as XML or passage retrieval strive to provide direct access to the relevant content of a document. In these scenarios users can pose focused queries, i.e., queries that restrict the type of output the user wants to see. We first analyze several characteristics of this type of request and show that they differ substantially from the unfocused ones. We also show that typical XML retrieval systems tend to perform poorly on focused queries and that the ranking of systems differs considerably when processing each of the types. Finally, we argue that the unbalanced number of focused queries in the INEX benchmark topic set might lead to misleading interpretations of the evaluation results. To get a better insight into the systems’ ability to perform focused search, more focused queries are needed.

Georgina Ramírez

Exploiting Click Logs for Adaptive Intranet Navigation

Web sites and intranets can be difficult to navigate, as they tend to be rather static and a new user might have no idea which documents are most relevant to his or her need. Our aim is to capture the navigational behaviour of existing users (as recorded in the click logs) so that we can assist future users by proposing the most relevant pages as they navigate the site. We do this without changing the actual Web site, and adaptively, so that a continuous learning cycle is employed. In this paper we explore three different algorithms that can be employed to learn such suggestions from navigation logs. We find that users complete the tasks significantly more quickly than with the (purely frequency-based) baseline when the suggestion model is built by applying ant colony optimisation or random walk approaches to the log data.

Sharhida Zawani Saad, Udo Kruschwitz

Leveraging Microblogs for Spatiotemporal Music Information Retrieval

We present results of text data mining experiments for music retrieval, analyzing microblogs gathered from November 2011 to September 2012 to infer music listening patterns all around the world. We assess relationships between particular music preferences and spatial properties, such as month, weekday, and country, and the temporal stability of listening activities. The findings of our study will help improve music retrieval and recommendation systems, in that they allow geospatial and cultural information to be incorporated into models for music retrieval, which has not been looked into before.

Markus Schedl

Topic-Focused Summarization of Chat Conversations

In this paper, we propose a novel approach to address the problem of chat summarization. We summarize real-time chat conversations which contain multiple users with frequent shifts in topic. Our approach consists of two phases. In the first phase, we leverage topic modeling using web documents to find the primary topic of discussion in the chat. Then, in the summary generation phase, we build a semantic word space to score sentences based on their association with the primary topic. Experimental results show that our method significantly outperforms the baseline systems on ROUGE F-scores.

Arpit Sood, Thanvir P Mohamed, Vasudeva Varma

Risk Ranking from Financial Reports

This paper attempts to use soft information in finance to rank the risk levels of a set of companies. Specifically, we deal with a ranking problem over a collection of financial reports, in which each report is associated with a company. By using the text information in the reports, the so-called soft information, we apply learning-to-rank techniques to rank a set of companies in line with their relative risk levels. In our experiments, a collection of financial reports, which are annually published by publicly-traded companies, is employed to evaluate our ranking approach; moreover, a regression-based approach is also carried out for comparison. The experimental results show that our ranking approach not only significantly outperforms the regression-based one, but also identifies some interesting relations between financial terms.

Ming-Feng Tsai, Chuan-Ju Wang

An Initial Investigation on the Relationship between Usage and Findability

Ensuring that information within a website is findable is particularly important, because visitors who cannot find what they are looking for are likely to leave the site or become very frustrated and switch to a competing site. While findability has been touted as important in web design, we wonder to what degree measures of findability are correlated with usage. To this end, we have conducted a preliminary study on three sub-domains across a number of measures of findability.

Colin Wilkie, Leif Azzopardi

Sub-sentence Extraction Based on Combinatorial Optimization

This paper describes the prospect of word extraction for text summarization based on combinatorial optimization. Instead of the commonly used sentence-based approach, word-based approaches are preferable if highly-compressed summarizations are required. However, naively applying conventional methods for word extraction yields excessively fragmented summaries. We avoid this by restricting the number of selected fragments from each sentence to at most one when formulating the maximum coverage problem. Consequently, the method only chooses sub-sentences as fragments. Experiments show that our method matches the ROUGE scores of state-of-the-art systems without requiring any training or special parameters.

Norihito Yasuda, Masaaki Nishino, Tsutomu Hirao, Masaaki Nagata

ADRTrace: Detecting Expected and Unexpected Adverse Drug Reactions from User Reviews on Social Media Sites

We automatically extract adverse drug reactions (ADRs) from consumer reviews provided on various drug social media sites to identify adverse reactions not reported by the United States Food and Drug Administration (FDA) but touted by consumers. We utilize various lexicons, identify patterns, and generate a synonym set that includes variations of medical terms. We identify “expected” and “unexpected” ADRs. Background (drug) language is utilized to evaluate the strength of the detected unexpected ADRs. Evaluation results for our synonym set and ADR extraction are promising.

Andrew Yates, Nazli Goharian

The Impact of Temporal Intent Variability on Diversity Evaluation

To cope with the uncertainty involved with ambiguous or underspecified queries, search engines often diversify results to return documents that cover multiple interpretations, e.g. the car brand, animal or operating system for the query ‘jaguar’. Current diversity evaluation measures take the popularity of the subtopics into account and aim to favour systems that promote the most popular subtopics earliest in the result ranking. However, this subtopic popularity is assumed to be static over time. In this paper, we hypothesise that temporal subtopic popularity change is common for many topics and argue that this characteristic should be considered when evaluating diversity. Firstly, to support our hypothesis we analyse temporal subtopic popularity changes for ambiguous queries through historic Wikipedia article viewing statistics. Further, by simulation, we demonstrate the impact of this temporal intent variability on diversity evaluation.

Ke Zhou, Stewart Whiting, Joemon M. Jose, Mounia Lalmas

Short Papers with Demo Presentation

Re-leashed! The PuppyIR Framework for Developing Information Services for Children, Adults and Dogs

Children are active information seekers, but research has suggested that services, designed with adults in mind, are a poor fit to their needs [1-3]. The goal of the PuppyIR project is to design, develop and deliver an open source framework for building information services specifically for children, which incorporates the current understanding of children’s information seeking needs. This paper describes the framework’s architecture, highlights two of its novel information processing components, and marks the release of the framework to the wider Interactive Information Retrieval community. PuppyIR provides an open and common framework for the rapid prototyping, development and evaluation of information services specifically for children.

Doug Dowie, Leif Azzopardi

A Web Mining Tool for Assistance with Creative Writing

We develop a web mining tool for assistance with creative writing. The relevance of web mining is achieved by computing similarities of parse trees for queries and found snippets. To ensure a plausible flow of the mental states of the agents involved, a multi-agent behavior simulator is included in the content generation algorithm.

Boris A. Galitsky, Sergei O. Kuznetsov

$\mathcal{DS}^4$ : A Distributed Social and Semantic Search System

We present $\mathcal{DS}^4$, a Distributed Social and Semantic Search System that allows users to share content among friends and clusters of users. In $\mathcal{DS}^4$, nodes that are semantically, thematically, or socially similar are automatically discovered and logically organised. Content retrieval is then performed by routing the query towards social friends and clusters of nodes that are likely to answer it. In this way, search receives two facets: the social facet, addressing friends, and the semantic facet, addressing nodes that are semantically close to the query. $\mathcal{DS}^4$ is scalable (requires no centralised component), privacy-aware (users maintain ownership and control over their content), automatic (requires no intervention by the user), general (works for any type of content), and adaptive (adjusts to changes of user content or interests). In this work, we aim to design the next generation of social networks that will offer open and adaptive design, and privacy-aware content management.

Dionisis Kontominas, Paraskevi Raftopoulou, Christos Tryfonopoulos, Euripides G. M. Petrakis

Serelex: Search and Visualization of Semantically Related Words

We present Serelex, a system that provides, given a query in English, a list of semantically related words. The terms are ranked according to an original semantic similarity measure learnt from a huge corpus. The system performs comparably to dictionary-based baselines, but does not require any semantic resource such as WordNet. Our study shows that users are completely satisfied with 70% of the query results.

Alexander Panchenko, Pavel Romanov, Olga Morozova, Hubert Naets, Andrey Philippovich, Alexey Romanov, Cédrick Fairon

SIAM: Social Interaction Analysis for Multimedia

This paper describes the SIAM demonstrator, a system that illustrates the usefulness of indexing multimedia segments by means of associated microblog posts. From socialized multimedia content (i.e. video and associated microblog posts on Twitter), the system applies text mining techniques and derives a topic model to index socialized multimedia segments. That result may then be used inside many multimedia applications, such as in-media social navigation, multimedia summarization or composition, or exploration of multimedia collections according to various socially-based viewpoints.

Jérôme Picault, Myriam Ribière

Exploratory Search on Social Media

The rise of Social Media creates a wealth of information that can be very valuable for private and professional users alike. But many challenges surrounding this relatively new kind of information are yet unsolved. This is true for algorithms that efficiently and intelligently process such data, but also for methods of how users can conveniently access it and how results are displayed. In this paper we present a tool that lets users perform exploratory search on several Social Media sites in parallel. It gives users the opportunity to explore a topic space, and to better understand facets of current discussions.

Aaron Russ, Michael Kaisser

VisNavi: Citation Context Visualization and Navigation

The process of retrieving information for literature review purposes differs from traditional web information retrieval. Literature reviews differentiate between the weightiness of the retrieved data segments. For example, citations and their accompanying information, such as cited author, citation context etc., are a very important consideration when searching for relevant information in literature. However, this information is integrated into a scientific paper, in rich interrelationships, making it very complicated for standard search systems to present and track them efficiently. In this paper, we demonstrate a system, VisNavi, in the form of a visualized star-centered approach that introduces the rich citation interrelationships to the searchers in an effective and navigational appearance.

Farag Saad, Brigitte Mathiak

Face-Based People Searching in Videos

We propose a system for retrieving people according to their faces in unannotated video streams. The system processes input videos to extract key-frames on which faces are detected. The detected faces are automatically grouped together to create clusters containing snapshots of the same person. The system also facilitates annotation and manual manipulation of the created clusters. On the processed videos the system offers three distinct operations for searching for persons, applicable to various scenarios. The system is presented online by indexing five high-quality video streams with a total length of nearly five hours.

Jan Sedmidubsky, Michal Batko, Pavel Zezula

Political Hashtag Trends

Political Hashtag Trends (PHT) is an analysis tool for political left-vs.-right polarization of Twitter hashtags. PHT computes a leaning for trending political hashtags in a given week, giving insights into the polarizing U.S. American issues on Twitter. The leaning of a hashtag is derived in two steps. First, users retweeting a set of “seed users” with a known political leaning, such as Barack Obama or Mitt Romney, are identified and the corresponding leaning is assigned to retweeters. Second, a hashtag is assigned a fractional leaning corresponding to which retweeting users used it. Non-political hashtags are removed by requiring certain hashtag co-occurrence patterns. PHT also offers functionality to put the results into context. For example, it shows example tweets from different leanings, it shows historic information and it links to the New York Times archives to explore a topic in depth. In this paper, we describe the underlying methodology and the functionality of the demo.
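The two-step derivation described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the PHT implementation: the majority rule for users who retweet both sides, the definition of the fractional leaning as the share of left-leaning users, and all user and seed names are hypothetical.

```python
from collections import Counter


def user_leanings(retweets, seed_leaning):
    """Step 1: give each retweeter the leaning of the seed users they retweet most."""
    votes = {}
    for user, seed in retweets:
        if seed in seed_leaning:
            votes.setdefault(user, Counter())[seed_leaning[seed]] += 1
    leanings = {}
    for user, counts in votes.items():
        # Assumption: users who retweet both sides equally stay unassigned.
        if counts["left"] != counts["right"]:
            leanings[user] = "left" if counts["left"] > counts["right"] else "right"
    return leanings


def hashtag_leaning(hashtag_users, leanings):
    """Step 2: fraction of leaning-assigned users of a hashtag who lean left."""
    sides = [leanings[user] for user in hashtag_users if user in leanings]
    if not sides:
        return None  # no user with a known leaning used this hashtag
    return sides.count("left") / len(sides)


# Hypothetical retweet pairs (retweeter, retweeted account) and seed accounts.
seeds = {"seed_left": "left", "seed_right": "right"}
retweets = [("u1", "seed_left"), ("u2", "seed_left"),
            ("u3", "seed_right"), ("u4", "someone_else")]
leanings = user_leanings(retweets, seeds)
print(hashtag_leaning(["u1", "u3", "u4"], leanings))
```

A value near 1 would indicate a hashtag used mostly by left-leaning retweeters and a value near 0 one used mostly by right-leaning retweeters; the co-occurrence filter for non-political hashtags is not covered by this sketch.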

Ingmar Weber, Venkata Rama Kiran Garimella, Asmelash Teka

OPARS: Objective Photo Aesthetics Ranking System

As the perception of beauty is subjective across individuals, evaluating the objective aesthetic value of an image is a challenging task in image retrieval systems. Unlike current online photo sharing services that take the average rating as the aesthetic score, our system integrates various ratings from different users by jointly modeling images and users’ expertise in a regression framework. In the front-end, users are asked to rate images selected by an active learning process. A multi-observer regression model is employed in the back-end to integrate these ratings for predicting the aesthetic value of images. Moreover, the system can be incorporated into current photo sharing services as a complement by providing more accurate ratings.

Huang Xiao, Han Xiao, Claudia Eckert


Distributed Information Retrieval and Applications

Distributed Information Retrieval (DIR) is a generic area of research that brings together techniques, such as resource selection and results aggregation, dealing with data that, for organizational or technical reasons, cannot be managed centrally. Existing and potential applications of DIR methods vary from blog retrieval to aggregated search and from multimedia and multilingual retrieval to distributed Web search. In this tutorial we briefly discuss the main DIR phases, namely resource description, resource selection, results merging and results presentation. The main focus is on applications of DIR techniques: blog, expert and desktop search, aggregated search and personal meta-search, multimedia and multilingual retrieval. We also discuss a number of potential applications of DIR techniques, such as distributed Web search, enterprise search and aggregated mobile search.

Fabio Crestani, Ilya Markov

Searching the Web of Data

Search is currently undergoing a major paradigm shift away from the traditional document-centric “10 blue links” towards more explicit and actionable information. Recent advances in this area are Google’s Knowledge Graph, Virtual Personal Assistants such as Siri and Google Now, as well as the now ubiquitous entity-oriented vertical search results for places, products, etc. Apart from novel query understanding methods, these developments are largely driven by structured data that is blended into the Web Search experience. We discuss efficient indexing and query processing techniques to work with large amounts of structured data. Finally, we present query interpretation and understanding methods to map user queries to these structured data sources.

Gerard de Melo, Katja Hose

Monolingual and Cross-Lingual Probabilistic Topic Models and Their Applications in Information Retrieval

Probabilistic topic models are a group of unsupervised generative machine learning models that can be effectively trained on large text collections. They model document content as a two-step generation process, i.e., documents are observed as mixtures of latent topics, while topics are probability distributions over vocabulary words. Recently, a significant research effort has been invested into transferring the probabilistic topic modeling concept from monolingual to multilingual settings. Novel topic models have been designed to work with parallel and comparable multilingual data (e.g., Wikipedia or news data discussing the same events). Probabilistic topic models offer an elegant way to represent content across different languages. Their probabilistic framework allows for their easy integration into a language modeling framework for monolingual and cross-lingual information retrieval. Moreover, we present how to use the knowledge from the topic models in the tasks of cross-lingual event clustering, cross-lingual document classification and the detection of cross-lingual semantic similarity of words. The tutorial also demonstrates how semantically similar words across languages are integrated as useful additional evidence in cross-lingual information retrieval models.
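The two-step view above (documents as mixtures of topics, topics as distributions over words) implies that the probability of a word in a document is a topic-weighted sum, P(w|d) = Σ_k P(w|z_k) P(z_k|d). The following minimal sketch computes that sum for already-trained distributions. The function name and the toy numbers are assumptions for illustration, and training the model itself is not shown.

```python
def word_given_document(word_given_topic, topic_given_doc):
    """Mix per-topic word distributions into one document-level distribution.

    word_given_topic: list of dicts, one per topic, mapping word -> P(w | z_k).
    topic_given_doc:  list of floats, P(z_k | d), in the same topic order.
    Returns a dict mapping word -> P(w | d) = sum_k P(w | z_k) * P(z_k | d).
    """
    vocabulary = set().union(*word_given_topic)
    return {
        w: sum(p_z * p_w.get(w, 0.0)
               for p_w, p_z in zip(word_given_topic, topic_given_doc))
        for w in vocabulary
    }


# Toy example with two hypothetical topics ("sports" and "politics").
topics = [
    {"goal": 0.5, "match": 0.5},
    {"vote": 0.75, "match": 0.25},
]
doc_mixture = [0.5, 0.5]
p_w_d = word_given_document(topics, doc_mixture)
```

Because each topic distribution and the document mixture sum to one, the resulting document-level distribution also sums to one, which is what makes it usable as a document language model in a retrieval setting.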

Marie-Francine Moens, Ivan Vulić

Practical Online Retrieval Evaluation


Online evaluation allows the assessment of information retrieval (IR) techniques based on how real users respond to them. Because this technique is directly based on observed user behavior, it is a promising alternative to traditional offline evaluation, which is based on manual relevance assessments. In particular, online evaluation can enable comparisons in settings where reliable assessments are difficult to obtain (e.g., personalized search) or expensive (e.g., for search by trained experts in specialized collections).

Despite its advantages, and its successful use in commercial settings, online evaluation is rarely employed outside of large commercial search engines due to a perception that it is impractical at small scales. The goal of this tutorial is to show how online evaluations can be conducted in such settings, demonstrate software to facilitate its use, and promote further research in the area. We will also contrast online evaluation with standard offline evaluation, and provide an overview of online approaches.

Filip Radlinski, Katja Hofmann


Integrating IR Technologies for Professional Search

(Full-Day Workshop)

Professional search in specific domains (e.g. patent, medical, scientific literature, media) usually requires an exploratory type of search. Compared with fact-finding and question-answering web search, it is more often characterized by recall-oriented information needs and by uncertainty about, and evolution of, the information need. Additionally, the tasks performed by professional searchers are complex: they usually include not only retrieval but also information analysis and monitoring. This complexity requires association, pipelining and possibly integration of information, as well as synchronization and coordination of multiple and potentially concurrent search views produced from different datasets, search tools and UIs. Many facets of IR technology (e.g. exploratory search, aggregated search, federated search, task-based search, IR over query sessions, cognitive IR approaches, Human Computer and Information Retrieval) aim to at least partially address these demands. This workshop aims to stimulate exploratory research, bring together various facets of IR research and promote discussion between researchers towards the development of a generalised framework facilitating the integration of IR technologies and search tools into next generation professional search systems. This envisioned framework should be supported by new protocols or extensions of existing ones, and may influence the design of next generation professional search systems.

Michail Salampasis, Norbert Fuhr, Allan Hanbury, Mihai Lupu, Birger Larsen, Henrik Strindberg

From Republicans to Teenagers – Group Membership and Search (GRUMPS)

In the early years of information retrieval, the focus of research was on systems aspects such as crawling, indexing, and relevancy ranking. Over the years, more and more user-related information such as click information or search history has entered the equation, creating more and more personalized search experiences, though still within the scope of the same overall system. Though fully personalized search is probably desirable, this individualistic perspective does not exploit the fact that a lot of a user's behavior can be explained through their group membership. Children, despite individual differences, share many challenges and needs, as do men, Republicans, Chinese or any other user group. This workshop takes a group-centric approach to IR and invites contributions that either (i) propose and evaluate IR systems for a particular user group or (ii) describe how the search behavior of specific groups differs, potentially requiring a different way of addressing their needs.

Ingmar Weber, Djoerd Hiemstra, Pavel Serdyukov

Doctoral Consortium

Doctoral Consortium at ECIR 2013

This is a short description of the Doctoral Consortium at ECIR 2013.

Hideo Joho, Dmitry I. Ignatov

