DOI: 10.1145/1835804.1835924
research-article

Growing a tree in the forest: constructing folksonomies by integrating structured metadata

Published: 25 July 2010

ABSTRACT

Many social Web sites allow users to annotate content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured metadata provide valuable evidence for learning how a community organizes knowledge. For instance, we can aggregate many personal hierarchies into a common taxonomy, also known as a folksonomy, that will help users visualize and browse social content and organize their own content. However, learning from social metadata presents several challenges, since it is sparse, shallow, ambiguous, noisy, and inconsistent. We describe an approach to folksonomy learning based on relational clustering, which exploits structured metadata contained in personal hierarchies. Our approach clusters similar hierarchies using their structure and tag statistics, then incrementally weaves them into a deeper, bushier tree. We study folksonomy learning using social metadata extracted from the photo-sharing site Flickr, and demonstrate that the proposed approach addresses these challenges. Moreover, compared to previous work, the approach produces larger, more accurate folksonomies and scales better.
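
The idea of weaving shallow personal hierarchies into a deeper tree can be illustrated with a small sketch. This is not the relational clustering method of the paper, whose details are not given in the abstract; it is a simplified greedy stand-in that merges nodes when their names match and their tag sets overlap. The data, the function names (`jaccard`, `weave`), the Jaccard similarity measure, and the threshold value are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weave(saplings, threshold=0.2):
    """Greedily merge shallow hierarchies into a shared forest.

    Each sapling is (root_name, root_tags, {child_name: child_tags}).
    A node is reused when an existing node has the same name and a
    sufficiently similar tag set; otherwise a new node is created.
    This is a hypothetical simplification, not the paper's algorithm.
    """
    nodes = []  # each node: {"name", "tags", "children": [node indices]}

    def get_or_add(name, tags):
        for i, node in enumerate(nodes):
            if node["name"] == name and jaccard(node["tags"], tags) >= threshold:
                node["tags"] |= tags  # pool tag evidence on the merged node
                return i
        nodes.append({"name": name, "tags": set(tags), "children": []})
        return len(nodes) - 1

    for root, root_tags, children in saplings:
        r = get_or_add(root, root_tags)
        for child, child_tags in children.items():
            c = get_or_add(child, child_tags)
            if c != r and c not in nodes[r]["children"]:
                nodes[r]["children"].append(c)
    return nodes


# Made-up personal hierarchies, each only one level deep.
saplings = [
    ("animals", {"animal", "nature"}, {"birds": {"bird", "animal"}}),
    ("birds", {"bird", "wildlife"}, {"owls": {"owl", "bird"}}),
    ("animals", {"animal", "zoo"}, {"cats": {"cat", "animal"}}),
    ("travel", {"travel", "trip"}, {"turkey": {"istanbul", "travel"}}),
    ("birds", {"bird", "farm"}, {"turkey": {"bird", "farm"}}),
]

forest = weave(saplings)
for node in forest:
    kids = [forest[c]["name"] for c in node["children"]]
    print(node["name"], "->", kids)
```

In this toy data, three one-level hierarchies combine into a deeper chain (animals, birds, owls) with an extra branch (cats), while the two "turkey" nodes stay separate because their tag sets do not overlap. That mirrors, in a very reduced form, the ambiguity problem the abstract mentions.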


Supplemental Material

kdd2010_plangprasopchok_gtf_01.mov (MOV, 130 MB)


      • Published in

        KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
        July 2010
        1240 pages
        ISBN: 9781450300551
        DOI: 10.1145/1835804

        Copyright © 2010 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
