ABSTRACT
The Web is becoming an increasingly important channel for conducting business, disseminating information, and communicating with people on a global scale. More and more companies, organizations, and individuals publish their information on the Web. With all this information publicly available, companies and individuals naturally want to find useful information in these Web pages. For example, companies want to know what their competitors are doing and what products and services they offer. With this knowledge, a company can learn from its competitors and/or design countermeasures to improve its own competitiveness. The ability to find such business intelligence effectively is becoming crucial to the survival and growth of any company. Despite its importance, little work has been done in this area. In this paper, we propose a novel visualization technique that helps the user find useful information on a competitor's Web site easily and quickly. With the help of a clustering system, it visualizes a comparison of the user's Web site with the competitor's Web site to reveal the similarities and differences between the two. The visualization is designed so that, at a single glance, the user can see the key similarities and differences of the two sites and then quickly focus on the interesting clusters and pages to browse the details. Experimental results and practical applications show that the technique is effective.
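The comparison idea in the abstract can be illustrated with a minimal sketch. This is not the paper's system: the abstract does not specify the clustering algorithm or the visual design, so the sketch assumes each page has already been assigned to a cluster by some external clustering step. The function names `compare_sites` and `render`, the site labels `"user"` and `"competitor"`, and the sample cluster names are all hypothetical.

```python
from collections import Counter

SITES = ("user", "competitor")  # hypothetical site labels


def compare_sites(assignments):
    """Summarize, per cluster, how many pages each site contributes.

    `assignments` is an iterable of (site, cluster_label) pairs, one per page.
    The cluster labels are assumed to come from a separate clustering step.
    Returns {cluster: (user_pages, competitor_pages, kind)}.
    """
    counts = {}
    for site, cluster in assignments:
        if site not in SITES:
            raise ValueError(f"unknown site label: {site!r}")
        counts.setdefault(cluster, Counter())[site] += 1

    summary = {}
    for cluster, c in counts.items():
        if c["user"] and c["competitor"]:
            kind = "shared"            # a similarity between the sites
        elif c["user"]:
            kind = "user only"         # a difference: only on the user's site
        else:
            kind = "competitor only"   # a difference: only on the competitor's site
        summary[cluster] = (c["user"], c["competitor"], kind)
    return summary


def render(summary):
    """Draw one text row per cluster: a bar of U's and a bar of C's."""
    rows = []
    for cluster in sorted(summary):
        u, c, kind = summary[cluster]
        rows.append(f"{cluster:<12} {'U' * u:<6}|{'C' * c:<6} {kind}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Made-up page-to-cluster assignments, for illustration only.
    pages = [
        ("user", "pricing"), ("competitor", "pricing"),
        ("user", "support"),
        ("competitor", "training"), ("competitor", "training"),
    ]
    print(render(compare_sites(pages)))
```

Clusters that contain pages from both sites mark similarities, while clusters populated by only one site mark differences. Those differences are the places a reader would inspect first.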
Index Terms
- Visualizing web site comparisons