Article

Creating a Web community chart for navigating related communities

Authors:
Masashi Toyoda

Institute of Industrial Science, University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo, JAPAN

Masaru Kitsuregawa

Institute of Industrial Science, University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo, JAPAN


HYPERTEXT '01: Proceedings of the 12th ACM conference on Hypertext and Hypermedia
September 2001
Pages 103–112
https://doi.org/10.1145/504216.504244

Published: 10 September 2001

ABSTRACT

Recent research on link analysis has shown the existence of numerous web communities on the Web. A web community is a collection of web pages created by individuals or associations of any kind that share a common interest in a specific topic. In this paper, we propose a technique for creating a web community chart, which connects related web communities, from thousands of seed pages. The chart allows the user to navigate through related web communities, and can be used for a 'What's Related Community' service that provides not only the web community that includes a given page but also related web communities. Our technique is based on a related page algorithm that returns pages related to a given page using only link analysis. We show that the algorithm can be used to create the chart by applying it to each seed, then using similarities among the results to classify the seeds into clusters and to deduce their relationships. We perform experiments to create a web community chart of companies and organizations from thousands of seed pages. First, we improve the precision of an existing related page algorithm, Companion, and evaluate the improved version, Companion-, in a user study. Then the chart is created using Companion-. The resulting chart consists of web communities that include related pages, and paths between related web communities. From the chart, we can find many web communities of companies classified by their category of business, as well as relationships between the communities.
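
The pipeline outlined in the abstract (run a related page algorithm on each seed, compare the result sets, group similar seeds, and link related groups) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' method: `related_pages` is a toy co-citation ranker standing in for Companion-, the Jaccard overlap and the two thresholds `same` and `related` are hypothetical choices, and the link graph `LINKS` is invented data.

```python
from collections import defaultdict
from itertools import combinations


def related_pages(seed, links, top_n=3):
    """Toy stand-in for a related page algorithm (NOT Companion-):
    rank pages by how often they are co-cited with the seed."""
    counts = defaultdict(int)
    for targets in links.values():
        if seed in targets:
            for page in targets:
                if page != seed:
                    counts[page] += 1
    # Sort by descending co-citation count, then by name for determinism.
    ranked = sorted(counts, key=lambda p: (-counts[p], p))
    return set(ranked[:top_n])


def jaccard(a, b):
    """Overlap of two result sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_chart(seeds, links, same=0.5, related=0.2):
    """Group seeds into communities and connect related communities.
    The thresholds `same` and `related` are hypothetical."""
    results = {s: related_pages(s, links) | {s} for s in seeds}

    # Union-find over seeds: highly similar seeds share a community.
    parent = {s: s for s in seeds}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    weak_pairs = []
    for a, b in combinations(seeds, 2):
        sim = jaccard(results[a], results[b])
        if sim >= same:
            parent[find(a)] = find(b)
        elif sim >= related:
            weak_pairs.append((a, b))

    groups = defaultdict(set)
    for s in seeds:
        groups[find(s)].add(s)
    community_of = {s: frozenset(groups[find(s)]) for s in seeds}

    communities = set(community_of.values())
    # A chart edge joins two distinct communities with a weakly similar pair.
    edges = {
        frozenset((community_of[a], community_of[b]))
        for a, b in weak_pairs
        if community_of[a] != community_of[b]
    }
    return communities, edges


# Invented link graph: hub pages h1..h5 linking to company pages.
LINKS = {
    "h1": {"pc1", "pc2", "pc3"},
    "h2": {"pc1", "pc2", "pc3"},
    "h3": {"pc3", "tv1"},
    "h4": {"tv1", "tv2", "tv3"},
    "h5": {"tv1", "tv2", "tv3"},
}
SEEDS = ["pc1", "pc2", "pc3", "tv1", "tv2"]

communities, edges = build_chart(SEEDS, LINKS)
for community in communities:
    print(sorted(community))
```

On this toy graph the seeds fall into one community of "pc" pages and one of "tv" pages, joined by a single chart edge because the hub `h3` links to both `pc3` and `tv1`. A real system would replace `related_pages` with an actual related page algorithm and tune the similarity measure and thresholds against data.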

References

1. Krishna Bharat, Andrei Broder, Monika Henzinger, Puneet Kumar, and Suresh Venkatasubramanian. The Connectivity Server: fast access to linkage information on the Web. In Proceedings of the 7th International World Wide Web Conference, 1998.
2. Krishna Bharat and Monika Henzinger. Improved Algorithms for Topic Distillation in a Hyperlinked Environment. In Proceedings of ACM SIGIR '98, 1998.
3. D. Boley, M. Gini, R. Gross, S. Han, K. Hastings, G. Karypis, V. Kumar, B. Mobasher, and J. Moore. Partitioning-Based Clustering for Web Document Categorization. Decision Support Systems, 27(3):329-341, 1999.
4. A. Broder, S. Glassman, M. Manasse, and G. Zweig. Syntactic clustering of the web. In Proceedings of the 6th International World Wide Web Conference, 1997.
5. S. Chakrabarti, B. Dom, P. Raghavan, S. Rajagopalan, D. Gibson, and Jon Kleinberg. Automatic resource compilation by analyzing hyperlink structure and associated text. In Proceedings of the 7th International World Wide Web Conference, 1998.
6. Jeffrey Dean and Monika R. Henzinger. Finding related pages in the World Wide Web. In Proceedings of the 8th World-Wide Web Conference, 1999.
7. Gary W. Flake, Steve Lawrence, and C. Lee Giles. Efficient Identification of Web Communities. In Proceedings of KDD 2000, 2000.
8. David Gibson, Jon Kleinberg, and Prabhakar Raghavan. Inferring Web Communities from Link Topology. In Proceedings of HyperText98, 1998.
9. Jon M. Kleinberg. Authoritative Sources in a Hyperlinked Environment. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 1998.
10. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew Tomkins. Extracting large-scale knowledge bases from the web. In Proceedings of the 25th VLDB Conference, 1999.
11. J. Pitkow and P. Pirolli. Life, death, and lawfulness on the electronic frontier. In Proceedings of International Conference on Computer and Human Interaction, 1997.
12. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew Tomkins. Trawling the Web for emerging cyber-communities. In Proceedings of the 8th World-Wide Web Conference, 1999.

Published in
HYPERTEXT '01: Proceedings of the 12th ACM conference on Hypertext and Hypermedia
September 2001
270 pages
ISBN:1581134207
DOI:10.1145/504216
General Chair: Kaj Grønbæk, University of Aarhus, Denmark
Program Chairs: Hugh Davis, Univ. of Southampton, UK; Yellowlees Douglas, Univ. of Florida
Copyright © 2001 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Web community
World Wide Web
link analysis
related web communities
Qualifiers
- Article
Conference

Acceptance Rates
HYPERTEXT '01 Paper Acceptance Rate: 45 of 136 submissions, 33%
Overall Acceptance Rate: 378 of 1,158 submissions, 33%
