Using latent topics to enhance search and recommendation in Enterprise Social Software
Highlights
► Application of latent topic models for search and recommendation in Enterprise Social Software.
► Generation and refinement of enterprise knowledge structures with latent topic models.
► Item-to-item collaborative and content-based recommendations with Latent Dirichlet Allocation.
Introduction
Knowledge management (KM) has been recognized as a critical factor for obtaining organizational competitive advantage and has been shown to be a powerful ingredient in the success of organizations (see for example Davenport and Prusak, 1998, Desouza and Evaristo, 2003, Holsapple and Jiming, 2008, Kulkarni et al., 2007, Nonaka and Takeuchi, 1995). There have been many studies on the relationship between Information and Communication Technologies (ICT) and KM (Edwards et al., 2005, Holsapple, 2005, Maier, 2007, Tseng, 2008, Tsui, 2005). Most of these studies agree that ICT improve business value by serving as an infrastructure for, and enabler of, KM (see e.g., Alavi and Leidner, 2001, Tanriverdi, 2005). Many organizations have introduced knowledge management systems as a systematic way of applying ICT to KM. ICT have been identified as the third most important critical success factor for the adoption of KM in the SME sector (Wong & Aspinwall, 2005).
New advances in ICT support innovations in KM. Enterprise Social Software, which involves using the World Wide Web as a platform for social interactions (McAfee, 2006a, McAfee, 2006b), is one of the technologies driving a paradigm shift in KM (see e.g., Jung, 2009). Enterprise Social Software provides “an open and flexible environment to stimulate participation through informal interactions and aggregate these interactions into a structure that reflects the collective attitudes, dispositions and knowledge of the participants” (Eid, 2008). Such software provides innovative approaches for discovering, collecting, organizing, managing and distributing information; these approaches employ a number of tools: blogs, wikis, feeds, syndication systems, social bookmarking applications, and discussion forums.
The adoption of Enterprise Social Software leads to the emergence of additional content and information about the organization's knowledge resources. This is a consequence of its key differences from contemporary KM technologies. Contemporary KM technologies typically predefine their employment in specific business situations (e.g., during after-action reviews) and objectify knowledge (i.e., knowledge is seen as an object residing in documents and other artefacts) (Apostolou, Abecker, & Mentzas, 2007). In contrast, Enterprise Social Software provides open, social platforms that reflect the inherently social character of knowledge, as it is constructed through sense-making episodes involving various actors in an organization. Enterprise Social Software tools are abstracted from their practical use and do not define their utilization in a strict and deterministic manner; their deployment can emerge over time according to changing needs, ideas, organizational practices, etc. (Patrick & Dotsika, 2007). While contemporary KM systems formulate routine information in a structured manner with roles specified up front, Enterprise Social Software lets structure emerge rather than imposing it.
A prominent challenge for knowledge management systems is to discover and utilize useful content within the increased ‘knowledge base’ of the organization. To facilitate this task, categorization of resources with knowledge structures is commonly used. Knowledge structures range from the most simple and least expressive, such as folksonomies, to the most complex and precise ones, such as ontologies (McGuiness, 2003, Smith and Welty, 2001, Uschold and Gruninger, 2004). Folksonomies are bottom-up catalogues of tags. Ontologies are machine-readable specifications of domain knowledge using URLs for all data elements, properties and relationship types. Taxonomies, simple hierarchies of terms, are also used extensively by KM systems. Recently a number of techniques have emerged that can provide insight into human knowledge using unsupervised statistical analysis of data (Blei, Ng, & Jordan, 2003). These techniques can support fundamental information management processes such as searching for content and getting recommendations on relevant content.
In this paper we focus on knowledge structures in Enterprise Social Software that support search and recommendation. We aim to substantiate that the use of new advances in information retrieval technologies, combined with knowledge structures, can enhance the effectiveness of search and recommendation, and in turn can facilitate knowledge work and help improve the effectiveness of Enterprise Social Software for KM. In particular, we investigate approaches that use probabilistic topic models in order to uncover hidden topics in the organizational ‘knowledge base’. We also propose a methodology for applying such approaches to Enterprise Social Software to improve search and recommendation.
The remainder of this paper is organized as follows. The following section contains a short introduction to Enterprise Social Software and the challenges it poses for search and recommender systems, as well as an overview of knowledge structures used in search and recommender systems. In Section 3 we describe probabilistic topic models, the family of information processing techniques that we employed for uncovering latent topics in corporate knowledge bases. In Section 4 we provide a detailed description of our proposal that uses latent topics in search and recommendation functionalities. Section 5 contains a description of the case study, a system walkthrough and the results of the evaluation performed. In Section 6 we provide an overview of the related work, while Section 7 is dedicated to a discussion of the implications of this work. Conclusions and further work are provided in Section 8.
Enterprise Social Software
The characteristics of the current Web 2.0 differ from those of the original Web. While the original Web was little more than a collection of semi-static pages, accessed by various users independently of each other, the current Web 2.0 is a highly social and interactive platform. This trend has found its counterpart inside organizations, where it is called Enterprise 2.0 (McAfee, 2006a, McAfee, 2006b). Enterprise 2.0 is based largely on fostering internal and external communities using
Probabilistic topic models
In order to improve the search and recommendation functionalities of Enterprise Social Software, we focus on extending prominent knowledge structuring approaches, such as folksonomies and taxonomies, with the addition of latent topics. We use probabilistic topic models as the technical basis for uncovering these latent topics.
Probabilistic topic models are based upon the idea that documents are mixtures of topics, where a topic is defined as a probability distribution over words. Statistical
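The idea that a document is a mixture of topics, and a topic a probability distribution over words, can be illustrated with a small generative sketch. The two topics, their word probabilities and the Dirichlet parameter `alpha` below are hypothetical values chosen for illustration; they are not taken from the paper or from the OrganiK system.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a probability vector from a Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate a document as a mixture of topics.

    topics: list of dicts mapping word -> probability (one dict per topic).
    Returns the document's topic mixture and its list of words.
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    topic_ids = list(range(len(topics)))
    words = []
    for _ in range(length):
        # Choose a topic for this word position, then a word from that topic.
        z = rng.choices(topic_ids, weights=theta)[0]
        vocab = list(topics[z].keys())
        probs = list(topics[z].values())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words


# Hypothetical topics: one about finance, one about IT operations.
TOPICS = [
    {"budget": 0.5, "invoice": 0.3, "tax": 0.2},
    {"server": 0.4, "deploy": 0.4, "backup": 0.2},
]

rng = random.Random(0)
theta, words = generate_document(TOPICS, alpha=[0.5, 0.5], length=8, rng=rng)
print(theta, words)
```

Fitting a topic model reverses this process: given only the observed words, it estimates the topic mixtures and the word distributions that plausibly produced them.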
Using LDA to enhance search and recommendation
Our approach focuses on combining structured taxonomies and folksonomies with latent topics (Fig. 2). Resources are processed periodically and latent topics are discovered without supervision; the discovered topics then serve as a basis for enhancing search and recommendation functionalities (Christidis & Mentzas, 2010).
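One way discovered topics could support search is query expansion. The following is a minimal sketch under stated assumptions, not the method described in the paper: the function `expand_query`, its parameters and the scoring rule (summing the probability each topic assigns to the query terms) are all hypothetical, as are the example topics.

```python
def expand_query(query_terms, topics, top_topics=1, words_per_topic=2):
    """Append the most probable words of the topics that best match the query.

    topics: list of dicts mapping word -> probability (one dict per topic).
    """
    # Score each topic by the total probability it assigns to the query terms.
    scores = [sum(topic.get(t, 0.0) for t in query_terms) for topic in topics]
    ranked = sorted(range(len(topics)), key=lambda k: scores[k], reverse=True)

    expanded = list(query_terms)
    for k in ranked[:top_topics]:
        if scores[k] == 0.0:
            continue  # none of the query terms occurs in this topic
        added = 0
        # Walk the topic's words from most to least probable.
        for word in sorted(topics[k], key=topics[k].get, reverse=True):
            if added == words_per_topic:
                break
            if word not in expanded:
                expanded.append(word)
                added += 1
    return expanded


# Hypothetical topics, as they might come out of a fitted model.
EXAMPLE_TOPICS = [
    {"budget": 0.5, "invoice": 0.3, "tax": 0.2},
    {"server": 0.4, "deploy": 0.35, "backup": 0.25},
]

print(expand_query(["invoice"], EXAMPLE_TOPICS))
# ['invoice', 'budget', 'tax']
```

A query whose terms appear in no topic is returned unchanged, so in this sketch expansion never replaces the user's original terms.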
System and deployment
We have tested our approach by integrating it within an open source Enterprise Social Software, the OrganiK system. OrganiK provides a collection of social applications for the corporate environment and supports both taxonomies and folksonomies for annotating information resources (Christidis, Mentzas, & Apostolou, 2011), while it builds on and extends the prominent open source content management system Drupal.
OrganiK has been deployed in and used by five small and medium enterprises including
Related work
Search and recommender systems in social environments, both inside and outside the enterprise, have recently been a subject of intensive research; see the recent survey by Park, Kim, Choi, and Kim (2012). Dmitriev, Eiron, Fontoura, and Shekita (2006) have suggested the use of implicit and explicit annotations as user feedback for improving enterprise search, while Amitay et al. (2009) have proposed ways to combine heterogeneous information in order to augment search functionality. (Dugan et
Discussion
Our system utilizes LDA-based document analysis for query expansion and for recommending resources to users. When an item, e.g., a document, is read by the user, the system either retrieves the topic distribution or, in the case of a new item, infers the underlying topic distribution.
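Once each item has a topic distribution, related items can be recommended by comparing distributions. The sketch below assumes the distributions have already been retrieved or inferred and uses the Hellinger distance as the similarity measure; that choice, the function names and the example item identifiers are assumptions made for illustration and are not confirmed by the source.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical)."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2.0
    )


def recommend(current_id, topic_dists, n=2):
    """Return the n items whose topic distributions are closest to the current item.

    topic_dists: dict mapping item id -> list of topic probabilities.
    """
    current = topic_dists[current_id]
    scored = [
        (hellinger(current, dist), item)
        for item, dist in topic_dists.items()
        if item != current_id
    ]
    scored.sort()  # smallest distance first
    return [item for _, item in scored[:n]]


# Hypothetical items with two-topic distributions (finance, IT operations).
ITEMS = {
    "wiki/expenses": [0.9, 0.1],
    "blog/tax-tips": [0.8, 0.2],
    "forum/backup": [0.1, 0.9],
}

print(recommend("wiki/expenses", ITEMS, n=1))
# ['blog/tax-tips']
```

Because the comparison relies only on topic distributions, an item that shares no tags with the one being read can still be recommended if the two are about similar topics.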
Latent topic detection for content recommendation and search is unsupervised and provides a number of benefits compared to supervised and model-based methods. It does not depend on an explicit knowledge
Conclusions and further research
In this work we have demonstrated how probabilistic topic models as an integral part of Enterprise Social Software can enhance recommendation applications and improve the efficiency of search functionality. Our approach addresses problems in query expansion and can recommend relevant resources and tags which in turn can leverage the creation and evolution of knowledge structures including taxonomies and folksonomies. It also provides a sound basis for item-to-item collaborative and
Acknowledgement
Research reported in this paper has been partially financed by the European Commission in the OrganiK project (FP7: Research for the Benefit of SMEs, 222225).
References (63)
et al. Global knowledge management strategies. European Management Journal (2003)
Uniting formal and informal descriptive power: Reconciling ontologies with folksonomies. International Journal of Information Management (2009)
Knowledge distribution via shared context between blog-based knowledge management systems: A case study of collaborative tagging. Expert Systems with Applications (2009)
et al. Collaborative user modeling with user-generated tags for social recommender systems. Expert Systems with Applications (2011)
A tag-topic model for blog mining. Expert Systems with Applications (2011)
The effects of information technology on knowledge management systems. Expert Systems with Applications (2008)
et al. A recommender system based on tag and time information for social tagging systems. Expert Systems with Applications (2011)
et al. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering (2005)
et al. Finding high-quality content in social media
et al. Knowledge management and knowledge management systems: Conceptual foundations and research issues. MIS Quarterly (2001)
Social networking in knowledge management. IT Professional
Harmonising codification and socialisation in knowledge management. Knowledge Management Research & Practice
Latent Dirichlet allocation. The Journal of Machine Learning Research
Using probabilistic topic models in enterprise social software
Supercharging enterprise 2. IEEE IT Professional
Working knowledge: How organizations manage what they know
LDA for on-the-fly auto tagging
Using annotations in enterprise search
The dogear game: A social bookmark recommender system
Knowledge management systems: Finding a way with technology. Journal of Knowledge Management
Corporate taxonomies: report on a survey of current practice. Online Information Review
Do you want to know?: Recommending strangers in the enterprise
Article recommendation based on a topic model for wikipedia selection for schools. Digital Libraries: Universal and Ubiquitous Access to Information