skip to main content
10.1145/3219819.3220100acmotherconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article
Open Access

Sequences of Sets

Published:19 July 2018Publication History

ABSTRACT

Sequential behavior such as sending emails, gathering in groups, tagging posts, or authoring academic papers may be characterized by a set of recipients, attendees, tags, or coauthors respectively. Such "sequences of sets" show complex repetition behavior, sometimes repeating prior sets wholesale, and sometimes creating new sets from partial copies or partial merges of earlier sets.

In this paper, we provide a stochastic model to capture these patterns. The model has two classes of parameters. First, a correlation parameter determines how much of an earlier set will contribute to a future set. Second, a vector of recency parameters captures the fact that a set in a sequence is more similar to recent sets than more distant ones. Comparing against a strong baseline, we find that modeling both correlation and recency structures are required for high accuracy. We also find that both parameter classes vary widely across domains, so must be optimized on a per-dataset basis. We present the model in detail, provide a theoretical examination of its asymptotic behavior, and perform a set of detailed experiments on its predictive performance.

Skip Supplemental Material Section

Supplemental Material

benson_sequences_of_sets.mp4

mp4

421.6 MB

References

  1. Eytan Adar, Jaime Teevan, and Susan T. Dumais . 2008. Large scale analysis of Web revisitation patterns Proceeding of the twenty-sixth Annual CHI Conference on Human Factors in Computing Systems. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Eytan Adar, Jaime Teevan, and Susan T. Dumais . 2009. Resonance on the Web: Web dynamics and revisitation patterns Proceedings of the 27th international conference on Human factors in computing systems. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Gediminas Adomavicius and Alexander Tuzhilin . 2005. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering Vol. 17, 6 (2005), 734--749. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Ashton Anderson, Ravi Kumar, Andrew Tomkins, and Sergei Vassilvitskii . 2014. The dynamics of repeat consumption. In Proceedings of the 23rd International Conference on World Wide Web. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Austin R Benson, Rediet Abebe, Michael T Schaub, Ali Jadbabaie, and Jon Kleinberg . 2018 a. Simplicial closure and higher-order link prediction. arXiv:1802.06916 (2018). deftempurl%https://arxiv.org/abs/1802.06916 tempurlGoogle ScholarGoogle Scholar
  6. Austin R. Benson, Ravi Kumar, and Andrew Tomkins . 2016. Modeling User Consumption Sequences. In Proceedings of the 25th International Conference on World Wide Web. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Austin R. Benson, Ravi Kumar, and Andrew Tomkins . 2018 b. A Discrete Choice Model for Subset Selection. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Jun Chen, Chaokun Wang, and Jianmin Wang . 2015. Will You “Reconsume" the Near Past? Fast Prediction on Short-Term Reconsumption Behaviors. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. deftempurl%https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9491 tempurl Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Chao-Min Chiu, Meng-Hsiang Hsu, Hsiangchu Lai, and Chun-Ming Chang . 2012. Re-examining the influence of trust on online repeat purchase intention: The moderating role of habit and its antecedents. Decision Support Systems Vol. 53, 4 (2012), 835--845. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Alan Collins, Chris Hand, and Maggie Linnell . 2008. Analyzing repeat consumption of identical cultural goods: some exploratory evidence from moviegoing. Journal of Cultural Economics Vol. 32, 3 (2008), 187--199.Google ScholarGoogle ScholarCross RefCross Ref
  11. John Duchi, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra . 2008. Efficient projections onto the $ell_1$-ball for learning in high dimensions Proceedings of the 25th International Conference on Machine Learning. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Marion M. Hetherington, Ali Bell, and Barbara J. Rolls . 2000. Effects of repeat consumption on pleasantness, preference and intake. British Food Journal Vol. 102, 7 (2000), 507--521.Google ScholarGoogle ScholarCross RefCross Ref
  13. Pamela J Hinds, Kathleen M Carley, David Krackhardt, and Doug Wholey . 2000. Choosing Work Group Members: Balancing Similarity, Competence, and Familiarity. Organizational Behavior and Human Decision Processes Vol. 81, 2 (2000), 226--251.Google ScholarGoogle ScholarCross RefCross Ref
  14. Jacob Jacoby and David B. Kyner . 1973. Brand Loyalty vs. Repeat Purchasing Behavior. Journal of Marketing Research Vol. 10, 1 (1973).Google ScholarGoogle ScholarCross RefCross Ref
  15. Barbara E. Kahn, Manohar U. Kalwani, and Donald G. Morrison . 1986. Measuring Variety-Seeking and Reinforcement Behaviors Using Panel Data. Journal of Marketing Research Vol. 23, 2 (1986), 89.Google ScholarGoogle ScholarCross RefCross Ref
  16. Komal Kapoor, Mingxuan Sun, Jaideep Srivastava, and Tao Ye . 2014. A hazard based approach to user return time prediction Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Bryan Klimt and Yiming Yang . 2004. The Enron Corpus: A New Dataset for Email Classification Research Machine Learning: ECML 2004. Springer Berlin Heidelberg, 217--226. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Jure Leskovec, Jon Kleinberg, and Christos Faloutsos . 2007. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data Vol. 1, 1 (2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Rossana Mastrandrea, Julie Fournet, and Alain Barrat . 2015. Contact Patterns in a High School: A Comparison between Data Collected Using Wearable Sensors, Contact Diaries and Friendship Surveys. PLOS ONE Vol. 10, 9 (2015), e0136497.Google ScholarGoogle ScholarCross RefCross Ref
  20. Leigh McAlister . 1982. A Dynamic Attribute Satiation Model of Variety-Seeking Behavior. Journal of Consumer Research Vol. 9, 2 (1982), 141.Google ScholarGoogle ScholarCross RefCross Ref
  21. Ben Morris and Yuval Peres . 2003. Evolving sets and mixing. In Proceedings of the thirty-fifth ACM Symposium on Theory of Computing. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Ben Morris and Yuval Peres . 2005. Evolving sets, mixing and heat kernel bounds. Probability Theory and Related Fields Vol. 133, 2 (2005), 245--266.Google ScholarGoogle ScholarCross RefCross Ref
  23. Rebecca K. Ratner, Barbara E. Kahn, and Daniel Kahneman . 1999. Choosing Less-Preferred Experiences For the Sake of Variety. Journal of Consumer Research Vol. 26, 1 (1999), 1--15.Google ScholarGoogle ScholarCross RefCross Ref
  24. Maja Rudolph, Francisco Ruiz, Stephan Mandt, and David Blei . 2016. Exponential family embeddings. In Advances in Neural Information Processing Systems. 478--486. deftempurl%https://papers.nips.cc/paper/6571-exponential-family-embeddings tempurl Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Laurent Saloff-Coste . 2004. Random Walks on Finite Groups. In Probability on Discrete Structures. Springer Berlin Heidelberg, 263--346.Google ScholarGoogle Scholar
  26. Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang . 2015. An Overview of Microsoft Academic Service (MAS) and Applications Proceedings of the 24th International Conference on World Wide Web. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Juliette Stehlé, Nicolas Voirin, Alain Barrat, Ciro Cattuto, Lorenzo Isella, Jean-Franccois Pinton, Marco Quaggiotto, Wouter Van den Broeck, Corinne Régis, Bruno Lina, and Philippe Vanhems . 2011. High-Resolution Measurements of Face-to-Face Contact Patterns in a Primary School. PLOS ONE Vol. 6, 8 (2011), e23176.Google ScholarGoogle ScholarCross RefCross Ref
  28. Jaime Teevan, Eytan Adar, Rosie Jones, and Michael Potts . 2006. History repeats itself: Repeat queries in Yahoo's logs Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Jaime Teevan, Eytan Adar, Rosie Jones, and Michael A. S. Potts . 2007. Information re-retrieval: Repeat queries in Yahoo's logs Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Hao Yin, Austin R. Benson, Jure Leskovec, and David F. Gleich . 2017. Local Higher-Order Graph Clustering. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan R Salakhutdinov, and Alexander J Smola . 2017. Deep sets Advances in Neural Information Processing Systems. 3394--3404. deftempurl%https://papers.nips.cc/paper/6931-deep-sets tempurlGoogle ScholarGoogle Scholar

Index Terms

  1. Sequences of Sets

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Other conferences
            KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
            July 2018
            2925 pages
            ISBN:9781450355520
            DOI:10.1145/3219819

            Copyright © 2018 Owner/Author

            This work is licensed under a Creative Commons Attribution International 4.0 License.

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 19 July 2018

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            KDD '18 Paper Acceptance Rate107of983submissions,11%Overall Acceptance Rate1,133of8,635submissions,13%

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader