LONLIES: Estimating Property Values for Long Tail Entities

Authors:
Mina Farid, University of Waterloo, Waterloo, ON, Canada
Ihab F. Ilyas, University of Waterloo, Waterloo, ON, Canada
Steven Euijong Whang, Google Research, Mountain View, CA, USA
Cong Yu, Google Research, New York, NY, USA

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, July 2016, Pages 1125–1128
https://doi.org/10.1145/2911451.2911466

Published: 7 July 2016

ABSTRACT

Web search engines often retrieve answers for queries about popular entities from a growing knowledge base that is populated by a continuous information extraction process. However, less popular entities are not frequently mentioned on the web and are generally interesting to fewer users; these entities reside on the long tail of information. Traditional knowledge base construction techniques that rely on the high frequency of entity mentions to extract accurate facts about these mentions have little success with entities that have low textual support. We present Lonlies, a system for estimating property values of long tail entities by leveraging their relationships to head topics and entities. We demonstrate (1) how Lonlies builds communities of entities that are relevant to a long tail entity utilizing a text corpus and a knowledge base; (2) how Lonlies determines which communities to use in the estimation process; (3) how Lonlies aggregates estimates from community entities to produce final estimates; and (4) how users interact with Lonlies to provide feedback to improve the final estimation results.
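
The abstract does not spell out how communities are scored or how their estimates are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each community is given as a list of property values already known for its member entities, keeps only communities whose values are sufficiently homogeneous (measured here with Simpson's concentration index, an illustrative choice), and combines the surviving communities with a weighted vote. The function names, the `min_concentration` threshold, and the music-genre data are all hypothetical.

```python
from collections import Counter, defaultdict


def simpson_concentration(values):
    """Probability that two values drawn with replacement are equal.

    1.0 means every member agrees; values near 0 mean a very mixed community.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def estimate_property(communities, min_concentration=0.5):
    """Return a normalized score per candidate value.

    `communities` maps a community name to the property values known for
    its member entities (a hypothetical input shape chosen for this sketch).
    """
    votes = defaultdict(float)
    for name, values in communities.items():
        if not values:
            continue
        weight = simpson_concentration(values)
        if weight < min_concentration:
            continue  # community is too mixed to be informative
        # Each community distributes its weight across the values it contains.
        for value, count in Counter(values).items():
            votes[value] += weight * count / len(values)
    total = sum(votes.values())
    if total == 0:
        return {}
    return {value: score / total for value, score in votes.items()}


# Hypothetical communities around a little-known musician.
communities = {
    "same_band": ["rock", "rock", "rock", "pop"],
    "same_city": ["rock", "jazz", "pop", "folk"],
    "same_label": ["rock", "rock"],
}
estimates = estimate_property(communities)
best = max(estimates, key=estimates.get)
print(best, round(estimates[best], 3))
```

In this toy input the `same_city` community is discarded because its values are evenly spread, while the two homogeneous communities jointly favor `rock`. A real system would also need the community-building step over a text corpus and knowledge base, and the user feedback loop, neither of which is modeled here.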

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Information extraction
      2. Question answering

Published in
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN:9781450340694
DOI:10.1145/2911451
General Chairs:
Raffaele Perego, ISTI-CNR, Italy
Fabrizio Sebastiani, Qatar Computing Research Institute, HBKU, Qatar
Program Chairs:
Javed Aslam, Northeastern University, US
Ian Ruthven, University of Strathclyde, UK
Justin Zobel, University of Melbourne, Australia
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
entity search
knowledge base construction
question answering
Qualifiers
- research-article
Acceptance Rates
SIGIR '16 paper acceptance rate: 62 of 341 submissions, 18%
Overall acceptance rate: 792 of 3,983 submissions, 20%