Abstract
We develop a methodology to automate the creation of imaginary people, referred to as personas, by processing complex behavioral and demographic data of social media audiences. Using a popular social media account containing more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation, we demonstrate that our methodology has several novel accomplishments, including: (a) identifying distinct user behavioral segments based on user content consumption patterns; (b) identifying impactful demographic groupings; and (c) creating rich persona descriptions by automatically adding pertinent attributes, such as names, photos, and personal characteristics. We validate our approach by implementing the methodology in a working system; we then evaluate it quantitatively by examining the accuracy of predicting the content preferences of personas, the stability of the personas over time, and the generalizability of the method by applying it to two other datasets. Research findings show that the approach can develop rich personas representing the behavior and demographics of real audiences using privacy-preserving aggregated online social media data from major online platforms. The results have implications for media companies and other organizations distributing content via online platforms.
Index Terms
- Imaginary People Representing Real Numbers: Generating Personas from Online Social Media Data