ABSTRACT
We present a system that attempts to automatically collect and monitor Japanese blog collections, covering not only blogs created with blog software but also those written as ordinary web pages. Our approach is based on extracting date expressions and analyzing HTML documents. The system also extracts and mines useful information from the collected blog pages.
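The abstract names date-expression extraction and HTML analysis as the basis of the approach but gives no details. The following is a minimal sketch of how such a check might look, not the authors' implementation. The date patterns, the function names `extract_dates` and `looks_like_blog`, the `min_dates` threshold, and the "dates appear in chronological order" heuristic are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical patterns: the abstract does not say which date
# expressions the system recognizes. These cover 2003年5月12日
# and 2003/05/12-style dates only.
DATE_RE = re.compile(
    r"(\d{4})年(\d{1,2})月(\d{1,2})日"
    r"|(\d{4})[/.-](\d{1,2})[/.-](\d{1,2})"
)


class TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def extract_dates(html):
    """Return (year, month, day) tuples in document order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "\n".join(parser.chunks)
    dates = []
    for match in DATE_RE.finditer(text):
        # Exactly three groups are non-None for any match.
        year, month, day = (int(g) for g in match.groups() if g is not None)
        if 1 <= month <= 12 and 1 <= day <= 31:
            dates.append((year, month, day))
    return dates


def looks_like_blog(html, min_dates=3):
    """Assumed heuristic: several dates, listed in chronological order."""
    dates = extract_dates(html)
    if len(dates) < min_dates:
        return False
    return dates == sorted(dates) or dates == sorted(dates, reverse=True)
```

For instance, a page whose headings read 2003年5月12日, 2003年5月10日, and 2003/05/08 would pass this check, while a page carrying a single date would not. A real collector would need a broader set of expressions (era names, weekday suffixes) and structural cues from the HTML.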
Index Terms
- Automatically collecting, monitoring, and mining Japanese weblogs