2016 | Original Paper | Book Chapter

5. Uncertainty Representations for Information Retrieval with Missing Data

Authors: Anne-Laure Jousselme, Patrick Maupin

Published in: Fusion Methodologies in Crisis Management

Publisher: Springer International Publishing

Abstract

Retrieving items such as similar past events, or vessels with a specific characteristic of interest, is a critical task for crisis management support. This paper addresses the problem of information retrieval from incomplete databases. In particular, we assess how the representation of uncertainty about missing data affects the retrieval of the corresponding items. After a brief survey of the missing-data problem, with an emphasis on the information retrieval application, we propose a novel approach for retrieving records with missing data. The general idea of the proposed data-driven approach is to model the uncertainty pertaining to the missing data. We chose the general model of belief functions because it encompasses both classical set and probability models as special cases. Several uncertainty models are then compared based on (1) an expressiveness criterion (non-specificity or randomness) and (2) objective measures of performance typical of the Information Retrieval domain. The results are illustrated on a real dataset with a simulation-controlled missing-data mechanism.
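
To make the expressiveness criterion concrete, the following is a minimal sketch, not the authors' implementation, of how a missing attribute value might be represented by a belief function (a mass function over subsets of the possible values) and scored for non-specificity. It assumes the generalized Hartley form of non-specificity, N(m) = Σ m(A) log2 |A|. The attribute values, the mass assignments, and the function name `nonspecificity` are hypothetical and chosen only for illustration.

```python
from math import log2


def nonspecificity(mass):
    """Generalized Hartley measure: sum of m(A) * log2(|A|) over focal sets A."""
    return sum(m * log2(len(focal)) for focal, m in mass.items() if m > 0)


# Hypothetical set of possible values for a missing "vessel type" attribute.
frame = frozenset({"cargo", "tanker", "fishing", "passenger"})

# Set model: the value is only known to lie in the frame (vacuous belief function).
vacuous = {frame: 1.0}

# Probability model: mass on singletons only (here a uniform distribution).
uniform = {frozenset({v}): 1.0 / len(frame) for v in frame}

# A belief function between the two: partial knowledge plus residual ignorance.
mixed = {frozenset({"cargo", "tanker"}): 0.6, frame: 0.4}

for name, m in [("vacuous", vacuous), ("uniform", uniform), ("mixed", mixed)]:
    # A valid mass function distributes a total mass of one.
    assert abs(sum(m.values()) - 1.0) < 1e-9
    print(name, nonspecificity(m))
```

Under these assumptions the vacuous model is maximally non-specific (log2 of the frame size, here 2 bits), any probability distribution has zero non-specificity, and intermediate belief functions fall between the two, which is what allows the models to be compared on a common scale.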

Footnotes
5 Maritime Mobile Service Identity.
6 International Maritime Organization.
7 Descriptors are also called terms, features, attributes, etc.
8 The typology of uncertainty types referred to here is that of Klir and Yuan (1995), in which fuzziness is omitted. Note that randomness is called discord in Klir and Yuan (1995).
9 We used equal weights here.
12 Among the series of results obtained for different values of τ, we selected these because they were among those with (1) a clear difference between the models and (2) good performance. Further results will be provided in an extended version of our work.

References
Aamodt A, Plaza E (1994) Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Commun 7(1):39–59
Ahlgren P, Grönqvist L (2006) Retrieval evaluation with incomplete relevance data: a comparative study of three measures. In: 15th ACM international conference on information and knowledge management, Arlington
Bach Tobji MA, Ben Yaghlane B, Mellouli K (2008) A new algorithm for mining frequent itemsets from evidential databases. In: Magdalena JVL, Ojeda-Aciego M (ed) Proceedings of IPMU, pp 1535–1542
Brini A, Boughanem M, Dubois D (2005) A model for information retrieval based on possibilistic networks. In: String processing and information retrieval (SPIRE 2005), Buenos Aires. Lecture notes in computer sciences. Springer, New York, pp 271–282
Buckley C, Voorhees EM (2004) Retrieval evaluation with incomplete information. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR ’04), Sheffield, pp 25–32
Burkhard H-D (2004) Case completion and similarity in case-based reasoning. Comput Sci Inf Syst 1(2):27–55
Chen LA (1988) On information retrieval and evidential reasoning. Tech. Rep. UCB/CSD-88-429, EECS Department, University of California, Berkeley
Chen N, Dahanayake A (2007) Role-based situation-aware information seeking and retrieval for crisis response. Int J Intell Control Syst 12:186–197
Chowdhary KR, Bansal VS (2011) Information retrieval using probability and belief theory. In: International conference on emerging trends in networks and computer communications (ETNCC), pp 188–191
Costa PCG, Laskey K, Blasch E, Jousselme A-L (2012) Towards unbiased evaluation of uncertainty reasoning: The URREF Ontology. In: Proceedings of the 15th International Conference on Information Fusion, Singapore
Crestani F, Lalmas M, Van Rijsbergen CJ, Campbell I (1998) Is this document relevant? … probably: a survey of probabilistic models in information retrieval. ACM Comput Surv 30(4):528–552
Dalvi N, Re C, Suciu D (2009) Probabilistic databases: diamonds in the dirt (extended version). Commun ACM 52:86–94
da Silva WT, Milidiú RL (1993) Belief function model for information retrieval. J Am Soc Inf Sci 44(2):10–18
Farhangfar A, Kurgan L, Pedrycz W (2007) A novel framework for imputation of missing values in databases. IEEE Trans Syst Man Cybern - A: Syst and Humans 37(5):692–708
Hewawasam GK, Premaratne K, Subasingha M-L, Shyu SP (2005) Rule mining and classification in imperfect databases. In: Proceedings of the 7th international conference on information fusion
Jousselme A-L, Maupin P (2012) A brief survey of comparative elements for uncertainty calculi and decision procedures assessment. In: Proceedings of the 15th international conference on information fusion, 2012. Panel Uncertainty Evaluation: Current Status and Major Challenges
Jousselme A-L, Maupin P (2013) Comparison of uncertainty representations for missing data in information retrieval. In: Proceedings of the international conference of information fusion, Istanbul
Jousselme A-L, Grenier D, Bossé E (2001) A new distance between two bodies of evidence. Inf Fusion 2:91–101
Kim W, Choi B-J, Hong E-K, Kim S-K, Lee D (2003) A taxonomy of dirty data. Data Min Knowl Discov 7(1):81–99
Klir GJ, Yuan B (1995) Fuzzy sets and fuzzy logic: theory and applications. Prentice Hall International, Upper Saddle River
Lalmas M (1998) Information retrieval and Dempster-Shafer’s theory of evidence. In: Applications of uncertainty formalisms. Lecture notes in computer science, Chap. B. Springer Berlin/Heidelberg, pp 157–176
Lee SK (1992) Imprecise and uncertain information in databases: an evidential approach. In: Proceedings of the 8th international conference data engineering, pp 614–621
McClean S, Scotney B, Shapcott M (2001) Aggregation of imprecise and uncertain information in databases. IEEE Trans Knowl Data Eng 13:902
Schafer JL, John WG (2004) Missing data: our view of the state of the art. Psychol Methods 7(2):147–177
Schmidt R, Vorobieva O (2007) Applying case-based reasoning for missing medical data in ISOR. In: LWA 07, pp 275–280
Telmoudi A, Chakhar S (2004) Data fusion application from evidential databases as a support for decision making. Inf Softw Technol 46:547–555
Wu S, McClean S (2006) Evaluation of system measures for incomplete relevance judgment in IR. In: Flexible query answering systems. Lecture notes in computer sciences, vol 4027. Springer, New York, pp 245–256
Yassir A, Nayak S (2012) Issues in data mining and information retrieval. Int J Comput Sci Commun Netw 2:93–98
Yi X (2011) Discovering and using implicit data for information retrieval. Ph.D. thesis, University of Massachusetts Amherst
Metadata
Title
Uncertainty Representations for Information Retrieval with Missing Data
Authors
Anne-Laure Jousselme
Patrick Maupin
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-22527-2_5