ABSTRACT
Temporality is an important characteristic of text documents. While some documents are clearly atemporal, many have a temporal character and can be mapped to certain time periods. In this paper, we introduce the problem of estimating the focus time of documents. Document focus time is defined as the time to which the content of a document refers, and it is considered a complementary dimension to the document's creation time or timestamp. We propose several estimators of focus time that utilize external knowledge bases, such as news article collections, which contain explicit temporal references. We then evaluate the effectiveness of our methods on diverse datasets of documents about historical events in five countries.