ABSTRACT
Many social Web sites allow users to annotate content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured metadata provide valuable evidence for learning how a community organizes knowledge. For instance, we can aggregate many personal hierarchies into a common taxonomy, also known as a folksonomy, that helps users visualize and browse social content and organize their own content. However, learning from social metadata presents several challenges, since it is sparse, shallow, ambiguous, noisy, and inconsistent. We describe an approach to folksonomy learning based on relational clustering, which exploits structured metadata contained in personal hierarchies. Our approach clusters similar hierarchies using their structure and tag statistics, then incrementally weaves them into a deeper, bushier tree. We study folksonomy learning using social metadata extracted from the photo-sharing site Flickr, and demonstrate that the proposed approach addresses these challenges. Moreover, compared to previous work, the approach produces larger, more accurate folksonomies and scales better.
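The cluster-then-weave idea in the abstract can be illustrated with a small sketch. This is not the paper's relational clustering algorithm: it is a simplified, greedy stand-in that assumes each personal hierarchy is a one-level tree with a bag of tags, and that two nodes refer to the same concept when they share a name and their tag sets overlap. The names `Sapling`, `jaccard`, `cluster_by_root`, and `weave`, the 0.2 threshold, and the example data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Sapling:
    """One user's shallow hierarchy: a root concept, its children, and tags."""
    root: str
    children: list[str]
    tags: set[str] = field(default_factory=set)


def jaccard(a: set[str], b: set[str]) -> float:
    """Tag overlap between two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_root(saplings, threshold=0.2):
    """Greedily merge saplings whose roots share a name and whose tags overlap."""
    clusters = []
    for s in saplings:
        for c in clusters:
            if c.root == s.root and jaccard(c.tags, s.tags) >= threshold:
                # Merge: union the children (keeping order) and the tags.
                c.children += [ch for ch in s.children if ch not in c.children]
                c.tags |= s.tags
                break
        else:
            # No similar cluster found: start a new one from a copy.
            clusters.append(Sapling(s.root, list(s.children), set(s.tags)))
    return clusters


def weave(root_name, clusters, threshold=0.2, context=None, seen=None):
    """Grow a deeper tree by attaching clusters whose root matches a leaf.

    `context` holds the parent's tags; a cluster is attached below a leaf
    only if its tags overlap with that context (assumed disambiguation rule).
    """
    seen = (set() if seen is None else seen) | {root_name}
    candidates = [
        c for c in clusters
        if c.root == root_name
        and (context is None or jaccard(c.tags, context) >= threshold)
    ]
    tree = {}
    for c in candidates:
        for child in c.children:
            if child not in seen:  # skip ancestors to avoid cycles
                tree[child] = weave(child, clusters, threshold, c.tags, seen)
    return tree


# Hypothetical personal hierarchies, loosely modeled on photo collections.
saplings = [
    Sapling("travel", ["usa", "italy"], {"travel", "trip", "usa", "italy"}),
    Sapling("travel", ["usa", "japan"], {"travel", "trip", "usa", "japan"}),
    Sapling("usa", ["california"], {"usa", "trip", "california"}),
    Sapling("turkey", ["thanksgiving"], {"food", "dinner"}),
]

clusters = cluster_by_root(saplings)
tree = weave("travel", clusters)
print(tree)
```

In this toy run the two "travel" hierarchies merge into one bushier node, and the "usa" hierarchy is attached beneath it, making the tree deeper. The tag-overlap check is what keeps an unrelated hierarchy with a matching name from being attached, which is one simple way to handle the ambiguity the abstract mentions.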
Index Terms
- Growing a tree in the forest: constructing folksonomies by integrating structured metadata