Skip to main content

Über dieses Buch

This book constitutes the refereed proceedings of the Third International Conference on the Theory of Information Retrieval, ICTIR 2011, held in Bertinoro, Italy, in September 2011. The 25 revised full papers and 13 short papers presented together with the abstracts of two invited talks were carefully reviewed and selected from 65 submissions. The papers cover topics ranging from query expansion, co-occurence analysis, user and interactive modelling, system performance prediction and comparison, and probabilistic approaches for ranking and modelling IR to topics related to interdisciplinary approaches or applications. They are organized into the following topical sections: predicting query performance; latent semantic analysis and word co-occurrence analysis; query expansion and re-ranking; comparison of information retrieval systems and approximate search; probability ranking principle and alternatives; interdisciplinary approaches; user and relevance; result diversification and query disambiguation; and logical operators and descriptive approaches.



Invited Talks

Axiomatic Analysis and Optimization of Information Retrieval Models

Development of optimal retrieval models is an important, yet challenging research problem in information retrieval. Although many effective retrieval models have been proposed, there is still no clear single winner, making it interesting to ask the question whether there exists a single optimal retrieval model that is better than all others. However, this question is theoretically ill defined unless we can formally characterize what properties must be satisfied by an optimal retrieval model. In this talk, I will present a number of formal constraints that an optimal retrieval model are expected to satisfy, and show that these constraints not only provide a formal framework for analytically assessing the optimality of a retrieval model, but also are necessary for diagnosing deficiencies of existing models and improving them. I will use several examples to show that such an axiomatic analysis is required in order to better understand and bridge the gap between theoretically motivated models and empirically effective retrieval functions. Finally, I will discuss some interesting challenges in developing a complete axiomatic analysis framework for seeking an ultimately optimal retrieval model.

ChengXiang Zhai

What Is Quantum Information Retrieval?

I will introduce the theoretical foundations for quantum information retrieval derived from Quantum Theory.

There will be an explanation of how such a theoretical framework could be useful in IR, for example, by showing how logic, probability, and geometry, as exploited in IR, can be represented in a consistent way in the underlying Hilbert Space.

The talk will conclude with some examples of recent concrete applications of the framework in IR.

C. J. Keith van Rijsbergen

Predicting Query Performance

User Perspectives on Query Difficulty

The difficulty of a user query can affect the performance of Information Retrieval (IR) systems. What makes a query difficult and how one may predict this is an active research area, focusing mainly on factors relating to the retrieval algorithm, to the properties of the retrieval data, or to statistical and linguistic features of the queries that may render them difficult. This work addresses query difficulty from a different angle, namely the users’ own perspectives on query difficulty. Two research questions are asked: (1) Are users aware that the query they submit to an IR system may be difficult for the system to address? (2) Are users aware of specific features in their query (e.g., domain-specificity, vagueness) that may render their query difficult for an IR system to address? A study of 420 queries from a Web search engine query log that are pre-categorised as easy, medium, hard by TREC based on system performance, reveals an interesting finding: users do not seem to reliably assess which query might be difficult; however, their assessments of which query features might render queries difficult are notably more accurate. Following this, a formal approach is presented for synthesising the user-assessed causes of query difficulty through opinion fusion into an overall assessment of query difficulty. The resulting assessments of query difficulty are found to agree notably more to the TREC categories than the direct user assessments.

Christina Lioma, Birger Larsen, Hinrich Schutze

A Unified Framework for Post-Retrieval Query-Performance Prediction

The query-performance prediction task is estimating the effectiveness of a search performed in response to a query in lack of relevance judgments. Post-retrieval predictors analyze the

result list

of top-retrieved documents. While many of these previously proposed predictors are supposedly based on different principles, we show that they can actually be derived from a novel unified prediction framework that we propose. The framework is based on using a pseudo effective and/or ineffective ranking as reference comparisons to the ranking at hand, the quality of which we want to predict. Empirical exploration provides support to the underlying principles, and potential merits, of our framework.

Oren Kurland, Anna Shtok, David Carmel, Shay Hummel

Predicting the Performance of Recommender Systems: An Information Theoretic Approach

Performance prediction is an appealing problem in Recommender Systems, as it enables an array of strategies for deciding when to deliver or hold back recommendations based on their foreseen accuracy. The problem, however, has been barely addressed explicitly in the area. In this paper, we propose adaptations of query clarity techniques from ad-hoc Information Retrieval to define performance predictors in the context of Recommender Systems, which we refer to as user clarity. Our experiments show positive results with different user clarity models in terms of the correlation with single recommender’s performance. Empiric results show significant dependency between this correlation and the recommendation method at hand, as well as competitive results in terms of average correlation.

Alejandro Bellogín, Pablo Castells, Iván Cantador

Latent Semantic Analysis and Word Co-occurrence Analysis

Trading Spaces: On the Lore and Limitations of Latent Semantic Analysis

Two decades after its inception,

Latent Semantic Analysis

(LSA) has become part and parcel of every modern introduction to IR. For any tool that matures so quickly, it is important to check its lore and limitations, or else stagnation will set in. We focus here on the three main aspects of LSA that are well accepted, and the gist of which can be summarized as follows: (1) that LSA recovers latent

semantic factors

underlying the document space, (2) that such can be accomplished through lossy compression of the document space by eliminating

lexical noise

, and (3) that the latter can best be achieved by

Singular Value Decomposition


For each aspect we performed experiments analogous to those reported in the LSA literature and compared the evidence brought to bear in each case. On the negative side, we show that the above claims about LSA are much more limited than commonly believed. Even a simple example may show that LSA does not recover the optimal semantic factors as intended in the pedagogical example used in many LSA publications. Additionally, and remarkably deviating from LSA lore, LSA does not scale up well: the larger the document space, the more unlikely that LSA recovers an optimal set of semantic factors. On the positive side, we describe new algorithms to replace LSA (and more recent alternatives as pLSA, LDA, and kernel methods) by trading its



space for an



space, thereby guaranteeing an optimal set of semantic factors. These algorithms seem to salvage the spirit of LSA as we think it was initially conceived.

Eduard Hoenkamp

Quantum Latent Semantic Analysis

The main goal of this paper is to explore latent topic analysis (LTA), in the context of quantum information retrieval. LTA is a valuable technique for document analysis and representation, which has been extensively used in information retrieval and machine learning. Different LTA techniques have been proposed, some based on geometrical modeling (such as latent semantic analysis, LSA) and others based on a strong statistical foundation. However, these two different approaches are not usually mixed. Quantum information retrieval has the remarkable virtue of combining both geometry and probability in a common principled framework. We built on this quantum framework to propose a new LTA method, which has a clear geometrical motivation but also supports a well-founded probabilistic interpretation. An initial exploratory experimentation was performed on three standard data sets. The results show that the proposed method outperforms LSA on two of the three datasets. These results suggests that the quantum-motivated representation is an alternative for geometrical latent topic modeling worthy of further exploration.

Fabio A. González, Juan C. Caicedo

Pure High-Order Word Dependence Mining via Information Geometry

The classical bag-of-word models fail to capture contextual associations between words. We propose to investigate the “high-order pure dependence” among a number of words forming a semantic entity, i.e., the high-order dependence that cannot be reduced to the random coincidence of lower-order dependence. We believe that identifying these high-order pure dependence patterns will lead to a better representation of documents. We first present two formal definitions of pure dependence: Unconditional Pure Dependence (UPD) and Conditional Pure Dependence (CPD). The decision on UPD or CPD, however, is a NP-hard problem. We hence prove a series of sufficient criteria that entail UPD and CPD, within the well-principled Information Geometry (IG) framework, leading to a more feasible UPD/CPD identification procedure. We further develop novel methods to extract word patterns with high-order pure dependence, which can then be used to extend the original unigram document models. Our methods are evaluated in the context of query expansion. Compared with the original unigram model and its extensions with term associations derived from constant n-grams and Apriori association rule mining, our IG-based methods have proved mathematically more rigorous and empirically more effective.

Yuexian Hou, Liang He, Xiaozhao Zhao, Dawei Song

Query Expansion and Re-ranking

Promoting Divergent Terms in the Estimation of Relevance Models

Traditionally the use of pseudo relevance feedback (PRF) techniques for query expansion has been demonstrated very effective. Particularly the use of Relevance Models (RM) in the context of the Language Modelling framework has been established as a high-performance approach to beat. In this paper we present an alternative estimation for the RM promoting terms that being present in the relevance set are also distant from the language model of the collection. We compared this approach with RM3 and with an adaptation to the Language Modelling framework of the Rocchio’s KLD-based term ranking function. The evaluation showed that this alternative estimation of RM reports consistently better results than RM3, showing in average to be the most stable across collections in terms of robustness.

Javier Parapar, Álvaro Barreiro

Is Document Frequency Important for PRF?

We introduce in this paper a new heuristic constraint for PRF models, referred to as the

Document Frequency (DF) constraint

, which is validated through a series of experiments with an oracle. We then analyze, from a theoretical point of view, state-of-the-art PRF models according to their relation with this constraint. This analysis reveals that the standard mixture model for PRF in the language modeling family does not satisfy the DF constraint on the contrary to several recently proposed models. Lastly, we perform tests, which further validate the constraint, with a simple family of


functions based on a parameter controlling the satisfaction of the DF constraint.

Stéphane Clinchant, Eric Gaussier

Comparison of Information Retrieval Systems and Approximate Search

Model-Based Inference about IR Systems

Researchers and developers of IR systems generally want to make inferences about the effectiveness of their systems over a population of user needs, topics, or queries. The most common framework for this is statistical hypothesis testing, which involves computing the probability of measuring the observed effectiveness of two systems over a sample of topics under a null hypothesis that the difference in effectiveness is unremarkable. It is not commonly known that these tests involve


of effectiveness. In this work we first explicitly describe the modeling assumptions of the t-test, then develop a Bayesian modeling approach that makes modeling assumptions explicit and easy to change for specific challenges in IR evaluation.

Ben Carterette

Selecting a Subset of Queries for Acquisition of Further Relevance Judgements

Assessing the relative performance of search systems requires the use of a test collection with a pre-defined set of queries and corresponding relevance assessments. The state-of-the-art process of constructing test collections involves using a large number of queries and selecting a set of documents, submitted by a group of participating systems, to be judged per query. However, the initial set of judgments may be insufficient to reliably evaluate the performance of future as yet unseen systems. In this paper, we propose a method that expands the set of relevance judgments as new systems are being evaluated. We assume that there is a limited budget to build additional relevance judgements. From the documents retrieved by the new systems we create a pool of unjudged documents. Rather than uniformly distributing the budget across all queries, we first select a subset of queries that are effective in evaluating systems and then uniformly allocate the budget only across these queries. Experimental results on TREC 2004 Robust track test collection demonstrate the superiority of this budget allocation strategy.

Mehdi Hosseini, Ingemar J. Cox, Natasa Milic-Frayling, Vishwa Vinay, Trevor Sweeting

On the Feasibility of Unstructured Peer-to-Peer Information Retrieval

We consider the feasibility of web-scale search in an unstructured peer-to-peer network. Since the network is unstructured, any such search is probabilistic in nature. We therefore adopt a probably approximately correct (PAC) search framework. The accuracy of such a search is defined by the overlap between the set of documents retrieved by a PAC search and the set of documents retrieved by an exhaustive (deterministic) search of the network. For an accuracy of 90%, we theoretically determine the number of nodes each query must be sent to for three distributions of documents in the network, namely uniform, proportional and square root. We assume that the query distribution follows a power law and investigate how performance is affected by the scale factor. For various configurations, we estimate the global and local network traffic induced by the search. For a network of 1 million nodes, a query rate of 1000 queries per second, and assuming each node is capable of indexing 0.1% of the collection, our analysis indicates that the network traffic is less that 0.07% of global internet traffic.

H. Asthana, Ruoxun Fu, Ingemar J. Cox

Probability Ranking Principle and Alternatives

Can Information Retrieval Systems Be Improved Using Quantum Probability?

In this paper we reformulate the retrieval decision problem within a quantum probability framework in terms of vector subspaces rather than in terms of subsets as it is customary to state in classical probabilistic Information Retrieval. Hence we show that ranking by quantum probability of relevance in principle yields higher expected recall than ranking by classical probability at every level of expected fallout and when the parameters are estimated as accurately as possible on the basis of the available data.

Massimo Melucci

An Analysis of Ranking Principles and Retrieval Strategies

The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP’s limitations. All alternatives deviate from the PRP by incorporating dependencies. This results in a re-ranking that promotes or demotes documents depending upon their relationship with the documents that have been already ranked. In this paper, we compare and contrast the behaviour of state-of-the-art ranking strategies and principles. To do so, we tease out analytical relationships between the ranking approaches and we investigate the document kinematics to visualise the effects of the different approaches on document ranking.

Guido Zuccon, Leif Azzopardi, C. J. Keith Van Rijsbergen

Towards a Better Understanding of the Relationship between Probabilistic Models in IR

Probability of relevance (PR) models are generally assumed to implement the Probability Ranking Principle (PRP) of IR, and recent publications claim that PR models and language models are similar. However, a careful analysis reveals two gaps in the chain of reasoning behind this statement. First, the PRP considers the relevance of particular documents, whereas PR models consider the relevance of any query-document pair. Second, unlike PR models, language models consider draws of terms and documents. We bridge the first gap by showing how the probability measure of PR models can be used to define the probabilistic model of the PRP. Furthermore, we argue that given the differences between PR models and language models, the second gap cannot be bridged at the probabilistic model level. We instead define a new PR model based on logistic regression, which has a similar score function to the one of the query likelihood model. The performance of both models is strongly correlated, hence providing a bridge for the second gap at the functional and ranking level. Understanding language models in relation with logistic regression models opens ample new research directions which we propose as future work.

Robin Aly, Thomas Demeester

Interdisciplinary Approaches

Cognitive Processes in Query Generation


search is the search for a specific document that is known to exist. This task is particularly important in Personal Information Management (PIM), where it is the most common search activity. A major obstacle to research in search technologies for PIM is the lack of publicly accessible test corpora. As a potential solution, pseudo-desktop corpora and automatic query generation have been proposed. These approaches though do not take the cognitive processes into account that take place when a user formulates a re-finding query. The human memory is not perfect, and many factors influence a user’s ability to recall information. In this work, we propose a model that accounts for these cognitive processes in the automatic query generation setting.

Claudia Hauff, Geert-Jan Houben

Protocol-Driven Searches for Medical and Health-Sciences Systematic Reviews

Systematic reviews are instances of a critically important search task in medicine and health services research. Along with large and well conducted randomised control trials, they provide the highest levels of clinical evidence. We provide a brief overview of the methodologies used to conduct systematic reviews, and report on our recent experience of conducting a


– i.e. a systematic review of reviews – of preoperative assessment. We discuss issues associated with the large manual effort currently necessary to conduct systematic reviews when using available search engines. We then suggest ways in which more dedicated and sophisticated information retrieval tools may enhance the efficiency of systematic searches and increase the recall of results. Finally, we discuss the development of tests collections for systematic reviews, to permit the development of enhanced search engines for this task.

Matt-Mouley Bouamrane, Craig Macdonald, Iadh Ounis, Frances Mair

Enhanced Information Retrieval Using Domain-Specific Recommender Models

The objective of an information retrieval (IR) system is to retrieve relevant items which meet a user information need. There is currently significant interest in personalized IR which seeks to improve IR effectiveness by incorporating a model of the user’s interests. However, in some situations there may be no opportunity to learn about the interests of a specific user on a certain topic. In our work, we propose an IR approach which combines a recommender algorithm with IR methods to improve retrieval for domains where the system has no opportunity to learn prior information about the user’s knowledge of a domain for which they have not previously entered a query. We use search data from other previous users interested in the same topic to build a recommender model for this topic. When a user enters a query on a new topic, an appropriate recommender model is selected and used to predict a ranking which the user may find interesting based on the behaviour of previous users with similar queries. The recommender output is integrated with a standard IR method in a weighted linear combination to provide a final result for the user. Experiments using the INEX 2009 data collection with a simulated recommender training set show that our approach can improve on a baseline IR system.

Wei Li, Debasis Ganguly, Gareth J. F. Jones

User and Relevance

Exploring Ant Colony Optimisation for Adaptive Interactive Search

Search engines have become much more interactive in recent years which has triggered a lot of work in automatically acquiring knowledge structures that can assist a user in navigating through a document collection. Query log analysis has emerged as one of the most promising research areas to automatically derive such structures. We explore a biologically inspired model based on ant colony optimisation applied to query logs as an adaptive learning process that addresses the problem of deriving query suggestions. A user interaction with the search engine is treated as an individual ant’s journey and over time the collective journeys of all ants result in strengthening more popular paths which leads to a corresponding term association graph that is used to provide query modification suggestions. This association graph is being updated in a continuous learning cycle. In this paper we use a novel automatic evaluation framework based on actual query logs to explore the effect of different parameters in the ant colony optimisation algorithm on the performance of the resulting adaptive query suggestion model. We also use the framework to compare the ant colony approach against a state-of-the-art baseline. The experiments were conducted with query logs collected on a university search engine over a period of several years.

M-Dyaa Albakour, Udo Kruschwitz, Nikolaos Nanas, Dawei Song, Maria Fasli, Anne De Roeck

Applying the User-over-Ranking Hypothesis to Query Formulation

The User-over-Ranking hypothesis states that the best retrieval performance can be achieved with queries returning about as many results as can be considered at user site[21]. We apply this hypothesis to Lee et al.’s problem of automatically formulating a single promising query from a given set of keywords[16]. Lee et al.’s original approach requires unrestricted access to the retrieval system’s index and manual parameter tuning for each keyword set. Their approach is not applicable on larger scale, not to mention web search scenarios. By applying the User-over-Ranking hypothesis we overcome this restriction and present a fully automatic user-site heuristic for web query formulation from given keywords. Substantial performance gains of up to 60% runtime improvement over previous approaches for similar problems underpin the value of our approach.

Matthias Hagen, Benno Stein

How to Count Thumb-Ups and Thumb-Downs: User-Rating Based Ranking of Items from an Axiomatic Perspective

It is a common practice among Web 2.0 services to allow users to rate items on their sites. In this paper, we first point out the flaws of the popular methods for user-rating based ranking of items, and then argue that two well-known Information Retrieval (IR) techniques, namely the Probability Ranking Principle and Statistical Language Modelling, provide simple but effective solutions to this problem. Furthermore, we examine the existing and proposed methods in an axiomatic framework, and prove that only the score functions given by the Dirichlet Prior smoothing method as well as its special cases can satisfy both of the two axioms borrowed from economics.

Dell Zhang, Robert Mao, Haitao Li, Joanne Mao

Result Diversification and Query Disambiguation

Aggregated Search Result Diversification

Search result diversification has been effectively employed to tackle query ambiguity, particularly in the context of web search. However, ambiguity can manifest differently in different search verticals, with ambiguous queries spanning, e.g., multiple place names, content genres, or time periods. In this paper, we empirically investigate the need for diversity across four different verticals of a commercial search engine, including web, image, news, and product search. As a result, we introduce the problem of aggregated search result diversification as the task of satisfying multiple information needs across multiple search verticals. Moreover, we propose a probabilistic approach to tackle this problem, as a natural extension of state-of-the-art diversification approaches. Finally, we generalise standard diversity metrics, such as ERR-IA and


-nDCG, into a framework for evaluating diversity across multiple search verticals.

Rodrygo L. T. Santos, Craig Macdonald, Iadh Ounis

Topical Categorization of Search Results Based on a Domain Ontology

This paper presents an approach to the categorization of Web search results based on a domain ontology that represents specific long term user’s interests. The idea is to leverage the information in the considered ontology to classify results related to queries formulated in the topical context represented by the ontology. To efficiently manage the knowledge represented in a domain ontology for a categorizaton task, we propose a methodology to convert any domain ontology into its granular (taxonomy like) representation. In this paper, both the categorization process based on granular ontologies, and some evaluations that show the effectiveness the approach are presented.

Silvia Calegari, Fabio Farina, Gabriella Pasi

Towards Semantic Category Verification with Arbitrary Precision

Many tasks related to or supporting information retrieval, such as query expansion, automated question answering, reasoning, or heterogeneous database integration, involve verification of a semantic category (e.g. “coffee” is a drink, “red” is a color, while “steak” is not a drink and “big” is not a color). We present a novel framework to automatically validate a membership in an arbitrary, not a trained a priori semantic category up to a desired level of accuracy. Our approach does not rely on any manually codified knowledge but instead capitalizes on the diversity of topics and word usage in a large corpus (e.g. World Wide Web). Using TREC factoid questions that expect the answer to belong to a specific semantic category, we show that a very high level of accuracy can be reached by automatically identifying more training seeds and more training patterns when needed. We develop a specific quantitative validation model that takes uncertainty and redundancy in the training data into consideration. We empirically confirm the important aspects of our model through ablation studies.

Dmitri Roussinov

Logical Operators and Descriptive Approaches

Negation for Document Re-ranking in Ad-hoc Retrieval

Information about top-ranked documents plays a key role to improve retrieval performance. One of the most common strategies which exploits this kind of information is relevance feedback. Few works have investigated the role of


feedback on retrieval performance. This is probably due to the difficulty of dealing with the concept of non-relevant document. This paper proposes a novel approach to document re-ranking, which relies on the concept of negative feedback represented by non-relevant documents. In our model the concept of non-relevance is defined as a quantum operator in both the classical

Vector Space Model

and a

Semantic Document Space

. The latter is induced from the original document space using a distributional approach based on Random Indexing. The evaluation carried out on a standard document collection shows the effectiveness of the proposed approach and opens new perspectives to address the problem of quantifying the concept of non-relevance.

Pierpaolo Basile, Annalina Caputo, Giovanni Semeraro

A Descriptive Approach to Classification

Nowadays information systems are required to be more adaptable and flexible than before to deal with the rapidly increasing quantity of available data and changing information needs. Text Classification (TC) is a useful task that can help to solve different problems in different fields. This paper investigates the application of descriptive approaches for modelling classification. The main objectives are increasing abstraction and flexibility so that expert users are able to customise specific strategies for their needs.

The contribution of this paper is two-fold. Firstly, it illustrates that the modelling of classifiers in a descriptive approach is possible and it leads to a close definition w.r.t. mathematical formulations. Moreover, the automatic translation from PDatalog to mathematical formulation is discussed. Secondly, quality and efficiency results prove the approach feasibility for real-scale collections.

Miguel Martinez-Alvarez, Thomas Roelleke


Do Subtopic Judgments Reflect Diversity?

Current measures of novelty and diversity in information retrieval evaluation require explicit subtopic judgments, adding complexity to the manual assessment process. In some sense, these subtopic judgments may be viewed as providing a crude indication of document similarity, since we might expect documents relevant to common subtopics to be more similar on average than documents sharing no common subtopic, even when these documents are relevant to the same overall topic. In this paper, we test this hypothesis using documents and judgments drawn from the TREC 2009 Web Track. Our experiments demonstrate that higher subtopic overlap correlates with higher cosine similarity, providing validation for the use of subtopic judgments and pointing to new possibilities for measuring of novelty and diversity.

John A. Akinyemi, Charles L. A. Clarke

On Upper Bounds for Dynamic Pruning

Dynamic pruning strategies enhance the efficiency of search engines, by making use of term upper bounds to decide when a document will not make the final set of


retrieved documents. After discussing different approaches for obtaining term upper bounds, we propose the use of multiple least upper bounds. Experiments are conducted on the TREC ClueWeb09 corpus, to measure the accuracy of different upper bounds.

Craig Macdonald, Nicola Tonellotto, Iadh Ounis

A Comparative Study of Pseudo Relevance Feedback for Ad-hoc Retrieval

This paper presents an initial investigation in the relative effectiveness of different popular pseudo relevance feedback (PRF) methods. The retrieval performance of relevance model, and two KL-divergence-based divergence from randomness (DFR) feedback methods generalized from Rocchio’s algorithm, are compared by extensive experiments on standard TREC test collections. Results show that a KL-divergence based DFR method (denoted as


), combined with the classical Rocchio’s algorithm, has the best retrieval effectiveness out of the three methods studied in this paper.

Kai Hui, Ben He, Tiejian Luo, Bin Wang

A Generic Data Model for Schema-Driven Design in Information Retrieval Applications

Database technology offers design methodologies to rapidly develop and deploy applications that are easy to understand, document and teach. It can be argued that information retrieval (IR) lacks equivalent methodologies. This poster discusses a generic data model, the Probabilistic Object-Oriented Content Model, that facilitates solving complex IR tasks. The model guides how data and queries are represented and how retrieval strategies are built and customised. Application/task-specific schemas can also be derived from the generic model. This eases the process of tailoring search to a specific task by offering a layered architecture and well-defined schema mappings. Different types of knowledge (facts and content) from varying data sources can also be consolidated into the proposed modelling framework. Ultimately, the data model paves the way for discussing IR-tailored design methodologies.

Hany Azzam, Thomas Roelleke

A Query-Basis Approach to Parametrizing Novelty-Biased Cumulative Gain

Novelty-biased cumulative gain (


-NDCG) has become the

de facto

measure within the information retrieval (IR) community for evaluating retrieval systems in the context of sub-topic retrieval. Setting the incorrect value of parameter




-NDCG prevents the measure from behaving as desired in particular circumstances. In fact, when


is set according to common practice (i.e.


 = 0.5), the measure favours systems that promote redundant relevant sub-topics rather than provide novel relevant ones. Recognising this characteristic of the measure is important because it affects the comparison and the ranking of retrieval systems. We propose an approach to overcome this problem by defining a safe threshold for the value of


on a query basis. Moreover, we study its impact on system rankings through a comprehensive simulation.

Teerapong Leelanupab, Guido Zuccon, Joemon M. Jose

Investigating Query-Drift Problem from a Novel Perspective of Photon Polarization

Query expansion, while generally effective in improving retrieval performance, may lead to the query-drift problem. Following the recent development of applying Quantum Mechanics (QM) to IR, we investigate the problem from a novel theoretical perspective inspired by photon polarization (a key QM experiment).

Peng Zhang, Dawei Song, Xiaozhao Zhao, Yuexian Hou

Using Emotion to Diversify Document Rankings

The aim of this paper is to investigate the role of emotion features in diversifying document rankings to improve the effectiveness of Information Retrieval (IR) systems. For this purpose, two approaches are proposed to consider emotion features for diversification, and they are empirically tested on the TREC 678 Interactive Track collection. The results show that emotion features are capable of enhancing retrieval effectiveness.

Yashar Moshfeghi, Guido Zuccon, Joemon M. Jose

Improved Stable Retrieval in Noisy Collections

We consider the problem of retrieval on noisy text collections. This is a paramount problem for retrieval with new social media collections, like Twitter, where typical messages are short, whilst dictionary is very large in size, plenty of variations of emoticons, term shortcuts and any other type of users jargon. In particular, we propose a new methodology which combines different effective techniques, some of them proposed in the OCR information retrieval literature, such as


-grams tokenization, approximate string matching algorithms, that need to be plugged in suitable IR models of retrieval and query reformulation. To evaluate the methodology we use the OCR degraded collections of the Confusion TREC. Unlike the solutions proposed by the TREC participants, tuned for specific collections and thus exhibiting a high variable performance among the different degradation levels, our model is highly stable. In fact, with the same tuned parameters, it reaches the best or nearly best performance simultaneously on all the three Confusion collections (Original, Degrade 5% and Degrade 20%), with a 33% improvement on the average MAP measure. Thus, it is a good candidate as a universal high precision strategy to be used when there isn’t any a priori knowledge of the specific domain. Moreover, our parameters can be specifically tuned in order to obtain the best up to date retrieval performance at all levels of collection degradation, and even on the clean collection, that is the original collection without the OCR errors.

Gianni Amati, Alessandro Celi, Cesidio Di Nicola, Michele Flammini, Daniela Pavone

On the use of Complex Numbers in Quantum Models for Information Retrieval

Quantum-inspired models have recently attracted increasing attention in Information Retrieval. An intriguing characteristic of the mathematical framework of quantum theory is the presence of

complex numbers

. However, it is unclear what such numbers could or would actually represent or mean in Information Retrieval. The goal of this paper is to discuss the role of complex numbers within the context of Information Retrieval. First, we introduce how complex numbers are used in quantum probability theory. Then, we examine van Rijsbergen’s proposal of evoking complex valued representations of informations objects. We empirically show that such a representation is unlikely to be effective in practice (confuting its usefulness in Information Retrieval). We then explore alternative proposals which may be more successful at realising the power of complex numbers.

Guido Zuccon, Benjamin Piwowarski, Leif Azzopardi

A Query Performance Analysis for Result Diversification

Which queries stand to gain or loose from diversifying their results? Some queries are more difficult than others for diversification. Across a number of conceptually different diversification methods, performance on such queries tends to deteriorate after applying these diversification methods, even though their initial performance in terms of relevance or diversity tends to be good.

Jiyin He, Marc Bron, Maarten de Rijke

Rare Disease Diagnosis as an Information Retrieval Task

Increasingly more clinicians use web Information Retrieval (IR) systems to assist them in diagnosing difficult medical cases, for instance rare diseases that they may not be familiar with. However, web IR systems are not necessarily optimised for this task. For instance, clinicians’ queries tend to be long lists of symptoms, often containing phrases, whereas web IR systems typically expect very short keyword-based queries. Motivated by such differences, this work uses a preliminary study of 30 clinical cases to reflect on rare disease retrieval as an IR task. Initial experiments using both Google web search and offline retrieval from a rare disease collection indicate that the retrieval of rare diseases is an open problem with room for improvement.

Radu Dragusin, Paula Petcu, Christina Lioma, Birger Larsen, Henrik Jørgensen, Ole Winther

Distilling Relevant Documents by Means of Dynamic Quantum Clustering

Dynamic Quantum Clustering (DQC) is a recent clustering technique based on physical intuition from quantum mechanics. Clusters are identified as the minima of the potential function of the Schrödinger equation. In this poster, we apply this technique to explore the possibility to select highly relevant documents relative to a query of a user. In particular, we analyze the clusters produced by DQC with a standard test collection.

Emanuele Di Buccio, Giorgio Maria Di Nunzio

Adding Emotions to Pictures

A large number of out-of-copyright children books are available online, but are not very attractive to children due to a lack of illustrations. Automatic text illustration may enhance the reading experience of these books, but inappropriate picture coloring may convey inappropriate emotions. Since already at a very early age, children can map colors to certain emotions, we propose an approach to automatically alter picture colors according to the emotion conveyed in the text.

Claudia Hauff, Dolf Trieschnigg


Weitere Informationen