Using latent topics to enhance search and recommendation in Enterprise Social Software

https://doi.org/10.1016/j.eswa.2012.02.073

Abstract

Enterprise Social Software refers to open and flexible organizational systems and tools which utilize Web 2.0 technologies to stimulate participation through informal interactions. A challenge in Enterprise Social Software is to discover and maintain over time the knowledge structure of topics found relevant to the organization. Knowledge structures, ranging in formality from ontologies to folksonomies, support user activity by enabling users to categorize and retrieve information resources. In this paper we enhance the search and recommendation functionalities of Enterprise Social Software by extending their knowledge structures with the addition of underlying hidden topics which we discover using probabilistic topic models. We employ Latent Dirichlet Allocation in order to elicit hidden topics and use the latter to assess similarities in resource and tag recommendation as well as for the expansion of query results. As an application of our approach we have extended the search and recommendation facilities of an open source Enterprise Social Software system which we have deployed and evaluated in five knowledge-intensive small and medium enterprises.

Highlights

► Application of latent topic models for search and recommendation in Enterprise Social Software.
► Generation and refinement of enterprise knowledge structures with latent topic models.
► Item-to-item collaborative and content-based recommendations with Latent Dirichlet Allocation.

Introduction

Knowledge management (KM) has been recognized as a critical factor for obtaining organizational competitive advantage and has been shown to be a powerful ingredient in the success of organizations (see for example Davenport and Prusak, 1998, Desouza and Evaristo, 2003, Holsapple and Jiming, 2008, Kulkarni et al., 2007, Nonaka and Takeuchi, 1995). There have been many studies on the relationship between Information and Communication Technologies (ICT) and KM (Edwards et al., 2005, Holsapple, 2005, Maier, 2007, Tseng, 2008, Tsui, 2005). Most agree that ICT improve business values as an infrastructure and enabler of KM (see e.g., Alavi and Leidner, 2001, Tanriverdi, 2005). Many organizations have introduced knowledge management systems as a systematic way of applying ICT to KM. ICT have been identified as the third most important critical success factor for the adoption of KM in the SME sector (Wong & Aspinwall, 2005).

New advances in ICT support innovations in KM. Enterprise Social Software, which involves using the World Wide Web as a platform for social interactions (McAfee, 2006a, McAfee, 2006b), is one of the technologies driving a paradigm shift in KM (see e.g., Jung, 2009). Enterprise Social Software provides “an open and flexible environment to stimulate participation through informal interactions and aggregate these interactions into a structure that reflects the collective attitudes, dispositions and knowledge of the participants” (Eid, 2008). Such software provides innovative approaches for discovering, collecting, organizing, managing and distributing information; these approaches employ a number of tools: blogs, wikis, feeds, syndication systems, social bookmarking applications, and discussion forums.

The adoption of Enterprise Social Software leads to the emergence of additional content and information about the organization's knowledge resources. This is a consequence of its key differences from contemporary technologies. Contemporary KM technologies typically predefine their employment in specific business situations (e.g., during after-action reviews) and objectify knowledge (i.e., knowledge is seen as an object residing in documents and other artefacts) (Apostolou, Abecker, & Mentzas, 2007). In contrast, Enterprise Social Software provides open, social platforms that reflect the inherently social character of knowledge, as it is constructed through sense-making episodes involving various actors in an organization. Enterprise Social Software tools are abstracted from their practical use and do not define their utilization in a strict and deterministic manner; their deployment can emerge over time in response to changing needs, ideas, organizational practices, etc. (Patrick & Dotsika, 2007). While contemporary KM systems formulate routine information in a structured manner with roles specified up front, Enterprise Social Software lets structure emerge rather than imposing it.

A prominent challenge for knowledge management systems is to discover and utilize useful content within the increased ‘knowledge base’ of the organization. To facilitate this task, categorization of resources with knowledge structures is commonly used. Knowledge structures range from the simplest and least expressive, such as folksonomies, to the most complex and precise ones, such as ontologies (McGuiness, 2003, Smith and Welty, 2001, Uschold and Gruninger, 2004). Folksonomies are bottom-up catalogues of tags. Ontologies are machine-readable specifications of domain knowledge using URLs for all data elements, properties and relationship types. Taxonomies, simple hierarchies of terms, are also used extensively by KM systems. Recently a number of techniques have emerged that can provide insight into human knowledge using unsupervised statistical analysis of data (Blei, Ng, & Jordan, 2003). These techniques can support fundamental information management processes such as searching for content and getting recommendations on relevant content.

In this paper we focus on knowledge structures in Enterprise Social Software that support search and recommendation. We aim to substantiate that the use of new advances in information retrieval technologies, combined with knowledge structures, can enhance the effectiveness of search and recommendation, and in turn can facilitate knowledge work and help improve the effectiveness of Enterprise Social Software for KM. In particular, we investigate approaches that use probabilistic topic models in order to uncover hidden topics in the organizational ‘knowledge base’. We also propose a methodology for applying such approaches to Enterprise Social Software to improve search and recommendation.

The remainder of this paper is organized as follows. The following section contains a short introduction to Enterprise Social Software, how it challenges search and recommender systems as well as an overview of knowledge structures used in search and recommender systems. In Section 3 we describe probabilistic topic models, the family of information processing techniques that we employed for uncovering latent topics in corporate knowledge bases. In Section 4 we provide a detailed description of our proposal that uses latent topics in search and recommendation functionalities. Section 5 contains a description of the case study, a system walkthrough and the results of the evaluation performed. In Section 6 we provide an overview of the related work while Section 7 is dedicated to a discussion on the implications of this work. Conclusions and further work are provided in Section 8.

Section snippets

Enterprise Social Software

The characteristics of the current Web 2.0 differ from those of the original Web. While the original Web was little more than a collection of semi-static pages, accessed by various users independently of each other, the current Web 2.0 is a highly social and interactive platform. This trend has found its counterpart inside organizations, where it is called Enterprise 2.0 (McAfee, 2006a, McAfee, 2006b). Enterprise 2.0 is based largely on fostering internal and external communities using

Probabilistic topic models

In order to improve the search and recommendation functionalities of Enterprise Social Software, we focus on extending prominent knowledge structuring approaches, such as folksonomies and taxonomies with the addition of latent topics. We use probabilistic topic models as a technical background for uncovering these latent topics.

Probabilistic topic models are based upon the idea that documents are mixtures of topics, where a topic is defined as a probability distribution over words. Statistical
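
As a toy illustration of the mixture idea described in this section, the sketch below computes the probability of a word in a document as a weighted sum over topics, P(word | document) = Σ P(word | topic) · P(topic | document). The topic names, vocabulary and probabilities are hypothetical values chosen for illustration; they are not taken from the paper or from its deployment data.

```python
# Hypothetical toy topics: each topic is a probability distribution over words.
topics = {
    "finance": {"budget": 0.5, "invoice": 0.4, "server": 0.1},
    "it": {"budget": 0.1, "invoice": 0.1, "server": 0.8},
}

# Hypothetical document: a mixture of topics whose weights sum to 1.
doc_topic_mix = {"finance": 0.7, "it": 0.3}


def word_probability(word, doc_topic_mix, topics):
    """P(word | doc) = sum over topics of P(word | topic) * P(topic | doc)."""
    return sum(
        weight * topics[topic].get(word, 0.0)
        for topic, weight in doc_topic_mix.items()
    )


# The resulting per-document word distribution is itself a valid distribution.
vocabulary = ["budget", "invoice", "server"]
doc_word_dist = {w: word_probability(w, doc_topic_mix, topics) for w in vocabulary}
```

A topic model such as LDA works in the opposite direction: given only the observed words, it estimates the topic distributions and the per-document mixtures that are hard-coded here.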

Using LDA to enhance search and recommendation

Our approach focuses on combining structured taxonomies and folksonomies with latent topics (Fig. 2). Resources are processed periodically and latent topics are discovered without supervision; the discovered topics are then used as a basis for enhancing search and recommendation functionalities (Christidis & Mentzas, 2010).
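
One way to picture how discovered topics could drive item-to-item recommendation is to compare the topic distributions of resources and rank the closest ones. The sketch below is a minimal illustration under stated assumptions: cosine similarity is an assumed measure (this excerpt does not say which similarity measure the authors use), and the resource identifiers and topic weights are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def recommend_similar(item_id, topic_dists, top_n=2):
    """Rank all other resources by topic similarity to the given resource."""
    target = topic_dists[item_id]
    scored = [
        (other, cosine_similarity(target, dist))
        for other, dist in topic_dists.items()
        if other != item_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical per-resource topic distributions (three latent topics).
topic_dists = {
    "wiki/onboarding": [0.8, 0.1, 0.1],
    "blog/hr-policy": [0.7, 0.2, 0.1],
    "forum/vpn-setup": [0.1, 0.1, 0.8],
}

recommendations = recommend_similar("wiki/onboarding", topic_dists)
```

With these toy values the HR policy post ranks above the VPN thread for the onboarding page, because their topic mixtures are closer. Measures designed for probability distributions, such as Hellinger distance or Jensen-Shannon divergence, could be substituted for the similarity function.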

System and deployment

We have tested our approach by integrating it within an open source Enterprise Social Software, the OrganiK system. OrganiK provides a collection of social applications for the corporate environment and supports both taxonomies and folksonomies for annotating information resources (Christidis, Mentzas, & Apostolou, 2011), while it builds on and extends the prominent open source content management system Drupal.

OrganiK has been deployed in and used by five small and medium enterprises including

Related work

Search and recommender systems in social environments, both inside and outside the enterprise, have recently been a subject of intensive research; see the recent survey by Park, Kim, Choi and Kim (2012). Dmitriev, Eiron, Fontoura, and Shekita (2006) have suggested the use of implicit and explicit annotations as user feedback for improving enterprise search, while Amitay et al. (2009) have proposed ways to combine heterogeneous information in order to augment search functionality. (Dugan et

Discussion

Our system utilizes LDA-based document analysis for query expansion and for recommending resources to users. When an item, e.g., a document, is read by the user, the system either retrieves the topic distribution or, in the case of a new item, infers the underlying topic distribution.
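
The retrieve-or-infer behaviour and the query expansion mentioned above can be sketched as follows. This is an illustrative outline, not the OrganiK implementation: `crude_infer` is a deliberately simple stand-in for real LDA inference, and the function names, the in-memory `store`, and the toy topics are all hypothetical.

```python
# Hypothetical topics learned earlier: each maps words to probabilities.
topics = [
    {"budget": 0.5, "invoice": 0.3, "audit": 0.2},
    {"server": 0.5, "vpn": 0.3, "backup": 0.2},
]


def crude_infer(text, topics):
    """Stand-in for LDA inference (assumption): score each topic by the
    summed probability of the text's words, then normalize."""
    words = text.lower().split()
    scores = [sum(topic.get(w, 0.0) for w in words) for topic in topics]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(topics)] * len(topics)
    return [s / total for s in scores]


def get_topic_distribution(item_id, text, store, infer):
    """Return the stored distribution, inferring and storing it for new items."""
    if item_id not in store:
        store[item_id] = infer(text)
    return store[item_id]


def expand_query(query, topics, infer, extra_terms=2):
    """Append the top words of the query's most probable topic."""
    dist = infer(query)
    best = max(range(len(dist)), key=dist.__getitem__)
    query_words = query.lower().split()
    ranked = sorted(topics[best], key=topics[best].get, reverse=True)
    additions = [w for w in ranked if w not in query_words][:extra_terms]
    return query_words + additions


def infer(text):
    return crude_infer(text, topics)


store = {}
first = get_topic_distribution("doc-1", "vpn backup", store, infer)
cached = get_topic_distribution("doc-1", "ignored on second call", store, infer)
expanded = expand_query("vpn", topics, infer)
```

Storing distributions for known items keeps reads cheap, so inference only runs when a new item appears. In a real deployment the stand-in would be replaced by the inference routine of whichever LDA library is used.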

Latent topic detection for content recommendation and search is unsupervised and provides a number of benefits compared to supervised and model-based methods. It does not depend on an explicit knowledge

Conclusions and further research

In this work we have demonstrated how probabilistic topic models as an integral part of Enterprise Social Software can enhance recommendation applications and improve the efficiency of search functionality. Our approach addresses problems in query expansion and can recommend relevant resources and tags which in turn can leverage the creation and evolution of knowledge structures including taxonomies and folksonomies. It also provides a sound basis for item-to-item collaborative and

Acknowledgement

Research reported in this paper has been partially financed by the European Commission in the OrganiK project (FP7: Research for the Benefit of SMEs, 222225).

References (63)

  • Amitay, E., Carmel, D., Har’El, N., Ofek-Koifman, S., Soffer, A., Yogev, S., et al. (2009). Social search and discovery...
  • S. Anderson et al. (2011). Social networking in knowledge management. IT Professional.
  • D. Apostolou et al. (2007). Harmonising codification and socialisation in knowledge management. Knowledge Management Research & Practice.
  • Blei, D. M., & Lafferty, J. D. (2005). Correlated topic models. In NIPS,...
  • D.M. Blei et al. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research.
  • Blei, D. M., Griffiths, T. L., Jordan, M. I., & Tenenbaum, J. B. (2004). Hierarchical topic models and the nested...
  • Chen, W. Y., et al. (2009). Collaborative filtering for orkut communities: Discovery of user latent behavior. In...
  • K. Christidis et al. Using probabilistic topic models in enterprise social software.
  • K. Christidis et al. (2011). Supercharging enterprise 2. IEEE IT Professional.
  • T.H. Davenport et al. (1998). Working knowledge: How organizations manage what they know.
  • Deerwester, S., et al. (1988). Improving information retrieval with latent semantic indexing. In Proceedings of the...
  • E. Diaz-Aviles et al. LDA for on-the-fly auto tagging.
  • P.A. Dmitriev et al. Using annotations in enterprise search.
  • C. Dugan et al. The dogear game: A social bookmark recommender system.
  • J.S. Edwards et al. (2005). Knowledge management systems: Finding a way with technology. Journal of Knowledge Management.
  • Eid, T. (2008). The Enterprise social software market continues to expand (Gartner:...
  • A. Gilchrist (2001). Corporate taxonomies: report on a survey of current practice. Online Information Review.
  • I. Guy et al. Do you want to know?: Recommending strangers in the enterprise.
  • Guy, I., Zwerdling, N., Ronen, I., Carmel, D., & Uziel, E. (2010). Social media recommendation based on people and...
  • C. Haruechaiyasak et al. (2008). Article recommendation based on a topic model for wikipedia selection for schools. Digital Libraries: Universal and Ubiquitous Access to Information.
  • Hofmann, T. (1999). Probabilistic latent semantic indexing. In Proceedings of the 22nd ACM SIGIR conference on research...
1 Tel.: +30 2107723895; fax: +30 2107724042.

2 Tel.: +30 2104142476; fax: +30 2104142264.
