ABSTRACT
HypTrails is a Bayesian approach for comparing different hypotheses about human trails on the web. A standard implementation exists, but it exhibits performance issues when working with large-scale data. In this paper, we propose a distributed implementation of HypTrails based on Apache Spark that takes advantage of several structural properties inherent to HypTrails. The distributed implementation improves performance substantially. Our implementation is publicly available.
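The abstract does not name the structural properties it exploits. As a rough illustration only, the sketch below assumes the usual HypTrails setting of a first-order Markov chain with one Dirichlet prior per source state, where the log marginal likelihood of a hypothesis is a sum of independent per-state terms. That independence is what makes a map step (one term per state) followed by a sum natural. The function names and the input layout are hypothetical, and this is plain Python rather than the authors' Spark code.

```python
from math import lgamma


def row_log_evidence(counts, alphas):
    """Log marginal likelihood of one state's outgoing transition counts
    under a Dirichlet prior with pseudo-counts `alphas` (all > 0)."""
    return (
        lgamma(sum(alphas))
        - sum(lgamma(a) for a in alphas)
        + sum(lgamma(n + a) for n, a in zip(counts, alphas))
        - lgamma(sum(counts) + sum(alphas))
    )


def log_evidence(rows):
    """Sum the per-state terms. `rows` is an iterable of (counts, alphas)
    pairs, one per source state (an assumed layout for this sketch)."""
    # "Map" each state to its term, then "reduce" by summation.
    return sum(row_log_evidence(counts, alphas) for counts, alphas in rows)


if __name__ == "__main__":
    # Two source states, two target states each, uniform prior.
    rows = [([3, 1], [1.0, 1.0]), ([0, 2], [1.0, 1.0])]
    print(log_evidence(rows))
```

Because each per-state term depends only on that state's counts and prior, the terms could in principle be computed on separate workers and combined with a single sum, whichever framework is used.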
Index Terms
- SparkTrails: A MapReduce Implementation of HypTrails for Comparing Hypotheses About Human Trails