Skip to main content
Top

2003 | OriginalPaper | Chapter

Topical Crawling for Business Intelligence

Authors : Gautam Pant, Filippo Menczer

Published in: Research and Advanced Technology for Digital Libraries

Publisher: Springer Berlin Heidelberg

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

The Web provides us with a vast resource for business intelligence. However, the large size of the Web and its dynamic nature make the task of foraging appropriate information challenging. General-purpose search engines and business portals may be used to gather some basic intelligence. Topical crawlers, driven by richer contexts, can then leverage on the basic intelligence to facilitate in-depth and up-to-date research. In this paper we investigate the use of topical crawlers in creating a small document collection that helps locate relevant business entities. The problem of locating business entities is encountered when an organization looks for competitors, partners or acquisitions. We formalize the problem, create a test bed, introduce metrics to measure the performance of crawlers, and compare the results of four different crawlers. Our results underscore the importance of identifying good hubs and exploiting link contexts based on tag trees for accelerating the crawl and improving the overall results.

Metadata
Title
Topical Crawling for Business Intelligence
Authors
Gautam Pant
Filippo Menczer
Copyright Year
2003
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-540-45175-4_22

Premium Partner