Micro-blog topic recommendation based on knowledge flow and user selection

https://doi.org/10.1016/j.jocs.2017.10.021

Highlights

  • The proposed method addresses the low efficiency of micro-blog topic recommendation caused by massive micro-blog data.

  • The proposed micro-blog topic recommendation is based on knowledge flow and user selection.

  • Micro-blog users are clustered according to their previous preferences.

  • The micro-blog topics in the knowledge flow of each class are generated according to the user selection of recommended topics.

Abstract

Micro-blog topic recommendation suffers from low efficiency caused by excessive micro-blog data. This paper proposes a micro-blog topic recommendation method based on knowledge flow and user selection to improve the access speed of micro-blog and the efficiency of topic recommendation. Micro-blog topic recommendation has two core tasks. One is analyzing a user’s preference for micro-blog topics based on the user’s historical behavior. The other is recommending topics to other users who have similar historical behavior. First, users are clustered according to their previous preferences for micro-blog topics. After that, the micro-blog topics in the knowledge flow of each class (i.e., belonging to different user groups) are recommended. Finally, the knowledge flow is updated according to the user selection of recommended topics to improve the accuracy of micro-blog topic recommendation. The experimental results show that the proposed algorithm can effectively improve the accuracy and efficiency of micro-blog topic recommendation.

Introduction

In recent years, emerging data platforms have grown in importance, and more and more people follow current social hot topics online. Micro-blog, as one of these emerging data platforms, has been used by more and more people since it was launched. At the same time, the number of micro-blog topics is growing exponentially. These topics vary along time, hotspot, classification and many other dimensions. In the era of information explosion, the different dimensions of user interest and micro-blog topics have become important reference indexes for micro-blog topic recommendation. Therefore, many researchers have analyzed these dimensions for micro-blog topic recommendation.

However, how to effectively recommend micro-blog topics to clustered users has become one focus of current research. Micro-blog topic recommendation draws so many scholars' attention mainly because it can reorganize massive information: fragments of micro-blog topics can be integrated into a new type of knowledge flow. The structure of this knowledge flow is continuously updated with the passage of time and with users' selections, and users' selections differ because their interests in micro-blog topics differ. For example, a hot topic in the past may become a non-hot topic now, so its recommendation value is reduced. Similarly, a non-hot topic in the past may gradually become known over time and thus become a hot topic worth recommending to users. Moreover, some users like hot topics while others prefer non-hot topics. These features of micro-blog topics and user selection bring many challenges to micro-blog topic recommendation algorithms.

To solve the above problems, this paper proposes a method of micro-blog topic recommendation based on knowledge flow and user selection. The core task of this method is clustering users according to their operations on micro-blog. The current micro-blog topics on the micro-blog platform are regarded as the initial knowledge. The micro-blog topics that meet the requirements after screening are organized into a unique knowledge flow for each user group. After that, the micro-blog topics in this knowledge flow are recommended to the user group. Finally, the knowledge flow is updated according to users' selections among the recommended micro-blog topics. For convenience of description, the organization of knowledge flow is always illustrated with a single user group. The method has three steps, as shown in Fig. 1:

  • (1)

    The acquisition of the user interest vector based on user-selected micro-blog topics. First, crawler programs are used to obtain users’ previous browsing data for micro-blog topics. Second, these browsing data are analyzed to get each user's interest vector for micro-blog topics. Finally, the K-means algorithm is used to cluster the micro-blog users, yielding k user groups and the core user of each user group. These user groups have a high degree of similarity within each group and a low degree of similarity between groups.

  • (2)

    The organization of micro-blog topics based on knowledge flow. To ensure that the micro-blog topics in the knowledge flow are worth recommending, micro-blog topics are selected twice during the organization of the knowledge flow. The current micro-blog topics on the micro-blog platform are regarded as the initial knowledge, and the interestingness of these topics is calculated. If the interestingness of a topic is less than a preset threshold, the topic is removed from the initial knowledge flow. The interest vectors of all remaining micro-blog topics in the knowledge flow are then calculated, together with the Euclidean Distance (ED) between each topic's interest vector and that of the core user of the user group. If the ED is greater than a preset threshold, the topic is deleted.

  • (3)

    The implementation of micro-blog topic recommendation and the knowledge flow update algorithm. The micro-blog topics are ordered by the similarity between the topic interest vector and the core user interest vector, with the most similar topic placed first. The topics are then recommended to the user in this order, and the user chooses whether or not to browse each topic. At the same time, the interestingness and interest vector of each micro-blog topic are recalculated based on user selections on the recommended topics (including browse time, browse frequency and browsing times). The micro-blog topics in the knowledge flow are then selected again to update the knowledge flow.
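The two selections in step (2) and the ordering in step (3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary keys (`name`, `interestingness`, `vector`), the function names, the threshold values and the sample data are all assumptions made for illustration, and the interestingness values are taken as given because the paper's formula is not reproduced here.

```python
import math


def euclidean(a, b):
    """Euclidean Distance (ED) between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_knowledge_flow(topics, core_vector, min_interest, max_distance):
    """Apply both selections, then order topics by closeness to the core user.

    topics: list of dicts with the hypothetical keys
            'name', 'interestingness' and 'vector'.
    """
    # First selection: remove topics whose interestingness is below the threshold.
    kept = [t for t in topics if t["interestingness"] >= min_interest]
    # Second selection: remove topics whose ED to the core user exceeds the threshold.
    kept = [t for t in kept if euclidean(t["vector"], core_vector) <= max_distance]
    # Ordering: the smallest distance (largest similarity) comes first.
    return sorted(kept, key=lambda t: euclidean(t["vector"], core_vector))


# Made-up sample data for one user group.
topics = [
    {"name": "A", "interestingness": 0.9, "vector": [0.8, 0.1, 0.1]},
    {"name": "B", "interestingness": 0.2, "vector": [0.7, 0.2, 0.1]},
    {"name": "C", "interestingness": 0.7, "vector": [0.1, 0.1, 0.8]},
    {"name": "D", "interestingness": 0.6, "vector": [0.5, 0.3, 0.2]},
]
core = [0.7, 0.2, 0.1]  # interest vector of the group's core user (assumed)

flow = build_knowledge_flow(topics, core, min_interest=0.5, max_distance=0.5)
print([t["name"] for t in flow])
```

In this sample, topic B fails the first selection and topic C fails the second, so only A and D remain, with A ahead of D because it lies closer to the core user.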

The rest of the paper is organized as follows. Section 2 gives a brief review of related work on micro-blog topic recommendation. Section 3 introduces the basic concepts. Section 4 presents the acquisition of the user interest vector based on user-selected micro-blog topics. Section 5 introduces the organization of micro-blog topics based on knowledge flow. Section 6 presents the implementation of micro-blog topic recommendation and the knowledge flow update algorithm. Section 7 gives the experimental results and some analysis. Finally, Section 8 concludes the paper.

Section snippets

Related works

This section reviews the relevant research on recommender systems and knowledge flow.

Basic concepts

According to investigation and study, a user's interest can be analyzed through the user's historical browsing behavior [23]. First, many factors affect how a user browses micro-blog, such as age, nature of work, living environment and gender. Second, if a user is interested in a micro-blog topic, the user will spend more time on it and browse it more frequently. In everyday life, users are interested in the micro-blog topic if users spend as much time

User clustering based on user selected micro-blog topic

A crawler program is used to crawl the micro-blog topics that have been browsed by users. The average freshness, the average daily number of likes and the average daily number of reposts of these micro-blog topics can then be calculated. This paper uses the K-means algorithm to cluster users with similar interest vectors into k user groups and to obtain the core of each user group (shown in Fig. 2). The Euclidean Distance (ED) is used to calculate the similarity between user
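The clustering step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: interest vectors are assumed to have three normalized components, the centroids are seeded with the first k vectors for determinism, and the "core" of a group is assumed to be the member closest to the group centroid. The function names and the sample data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean Distance (ED) between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, max_iterations=100):
    """Plain K-means; the first k vectors seed the centroids (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each user joins the group with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def core_users(vectors, labels, centroids):
    """Assumed definition: the core user is the member nearest to the centroid."""
    cores = {}
    for c, centroid in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            cores[c] = min(members, key=lambda i: euclidean(vectors[i], centroid))
    return cores


# Made-up interest vectors for six users.
users = [
    [0.9, 0.1, 0.2], [0.1, 0.9, 0.8], [0.8, 0.2, 0.1],
    [0.2, 0.8, 0.9], [0.9, 0.2, 0.2], [0.15, 0.85, 0.85],
]
labels, centroids = kmeans(users, k=2)
cores = core_users(users, labels, centroids)
print(labels)  # group index of each user
print(cores)   # group index -> index of its core user
```

Seeding with the first k vectors keeps the sketch reproducible; a practical implementation would more likely use random or k-means++ initialization with several restarts.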

The organization for micro-blog topic based on knowledge flow

Building on the results of the user clustering above, this section studies the formation of knowledge flow from micro-blog topics. To improve the accuracy of micro-blog topic recommendation, micro-blog topics are selected twice during the organization of the knowledge flow. The knowledge flow is organized as follows.

  • (1)

    This paper uses the existing micro-blog topics on the micro-blog platform as the initial knowledge flow. Formula (2) is used to

The implementation of micro-blog topic recommendation and knowledge flow update algorithm

Based on the above micro-blog user clustering, the knowledge flow of each user group can be obtained by the knowledge flow construction algorithm. The micro-blog topic recommendation system can then recommend micro-blog topics to users quickly. Topics are recommended in turn according to their position in the knowledge flow (shown in Fig. 4). For example, the first micro-blog topic in the knowledge flow is recommended to the user first. Users can
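A rough Python sketch of recommending in knowledge-flow order and then updating the flow from user selections is given below. It rests on several assumptions: the feedback format (browse seconds and browse count per topic) and every function name are hypothetical, and `hypothetical_score` is only a placeholder, since the paper's interestingness formula is not reproduced here. Recalculating the topic interest vectors is omitted.

```python
def recommend(flow):
    """Return topic names in knowledge-flow order (the first topic is recommended first)."""
    return [topic["name"] for topic in flow]


def hypothetical_score(browse_seconds, browse_count):
    """Placeholder interestingness in [0, 1); NOT the formula used in the paper."""
    total = browse_seconds * browse_count
    return total / (total + 60.0)


def update_flow(flow, feedback, score, min_interest):
    """Re-score topics that received feedback and drop those below the threshold.

    feedback: dict mapping topic name -> (browse_seconds, browse_count).
    Topics without feedback keep their previous interestingness.
    """
    updated = []
    for topic in flow:
        new_topic = dict(topic)  # copy so the original flow is left untouched
        if topic["name"] in feedback:
            seconds, count = feedback[topic["name"]]
            new_topic["interestingness"] = score(seconds, count)
        if new_topic["interestingness"] >= min_interest:
            updated.append(new_topic)
    return updated


# Made-up knowledge flow and user feedback.
flow = [
    {"name": "A", "interestingness": 0.9},
    {"name": "D", "interestingness": 0.6},
    {"name": "E", "interestingness": 0.7},
]
feedback = {"A": (120.0, 3), "D": (5.0, 1)}  # D was barely browsed

print(recommend(flow))
updated = update_flow(flow, feedback, hypothetical_score, min_interest=0.5)
print(recommend(updated))
```

Passing the scoring function as a parameter keeps the placeholder easy to replace with the actual interestingness formula.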

The methods of experiment

We collected information about the micro-blog topics browsed by 10 Sina users, together with several Sina micro-blog topics, to verify the effect of micro-blog topic recommendation based on knowledge flow and user selection. At the beginning of the experiment, the user’s interest vector and the micro-blog topic’s interest vector are calculated from the micro-blog information browsed by the users. In order to verify the validity of the algorithm, the accuracy (i.e., the ratio of the number of interest

Conclusions

Aiming at the problem of micro-blog topic information overload, this paper proposes an algorithm of micro-blog topic recommendation based on knowledge flow and user selection to improve the accuracy and efficiency of micro-blog topic recommendation. First, the micro-blog topics that a user browsed previously are analyzed to construct the user's interest vector, and users are clustered by the K-means algorithm. Secondly, the micro-blog topic data can be obtained from

Acknowledgments

This research work was supported in part by the Natural Science Foundation of Anhui Province University (No. KJ2015A111), by the Youth Scientific Research Foundation of Anhui University of Science & Technology (Grant No. QN201321), by the National Science Foundation of China (Grant No. 61300202), by the CCF-Venustech Research Program (Grant No. CCF-VenustechRP2017006), and by the 2018 Top talent project of Anhui colleges and universities.

References (30)

  • M. Jamali et al.

    A matrix factorization technique with trust propagation for recommendation in social networks

  • W. Zhang et al.

    Combining latent factor model with location features for event-based group recommendation

  • Jianjun, Tongyu

    Combining long-term and short-term user interest for personalized hashtag recommendation

    Front. Comput. Sci.

    (2015)

  • P. Kazienko et al.

    Multidimensional social network in the social recommender system

    IEEE Trans. Syst. Man Cybernet. A: Syst. Hum.

    (2011)

  • Shaymaa Khater et al.

    Personalized microblogs corpus recommendation based on dynamic users interests

    Shunxiang Zhang received his Ph.D. degree from the School of Computing Engineering and Science, Shanghai University, Shanghai, in 2012. He is a professor at Anhui University of Science and Technology, China. His current research interests include Web Mining, Semantic Search, and Complex network.

    Wenjuan Liu (Corresponding author) received the Master of Engineering degree from the School of Computing Engineering and Science, Anhui University of Science and Technology, China, in 2006. She is a lecturer at Anhui University of Science and Technology, China. Her current research interests include Web Mining, Semantic Search, and Cognitive Science.

    Xiaolu Deng received the Bachelor of Engineering degree from the School of Computer Science and Engineering, China, in 2016. She is now an M.S. degree candidate in Computer Science and Engineering at AUST. Her current research interests are in Natural Language Processing, Data Mining and Semantic Web.

    Zheng Xu was born in Shanghai, China. He received the Diploma and Ph.D. degrees from the School of Computing Engineering and Science, Shanghai University, Shanghai, in 2007 and 2012, respectively. He is currently working at the Third Research Institute of the Ministry of Public Security and as a postdoc at Tsinghua University, China. His current research interests include topic detection and tracking, semantic Web and Web mining. He has authored or co-authored more than 70 publications, including in IEEE Trans. on Fuzzy Systems, IEEE Trans. on Automation Science and Engineering, IEEE Trans. on Cloud Computing, IEEE Trans. on Emerging Topics in Computing, IEEE Trans. on Systems, Man, and Cybernetics: Systems, IEEE Trans. on Big Data, etc.

    Kim-Kwang Raymond Choo received the Ph.D. in Information Security in 2006 from Queensland University of Technology, Australia. He currently holds the Cloud Technology Endowed Professorship at The University of Texas at San Antonio. He serves on the editorial board of Computers & Electrical Engineering, Cluster Computing, Digital Investigation, IEEE Access, IEEE Cloud Computing, IEEE Communications Magazine, Future Generation Computer Systems, Journal of Network and Computer Applications, PLoS ONE, Soft Computing, etc. He also serves as the Special Issue Guest Editor of ACM Transactions on Embedded Computing Systems (2017), ACM Transactions on Internet Technology (2016), Computers & Electrical Engineering (2017), Digital Investigation (2016), Future Generation Computer Systems (2016, 2018), IEEE Cloud Computing (2015), IEEE Network (2016), IEEE Transactions on Cloud Computing (2017), IEEE Transactions on Dependable and Secure Computing (2017), Journal of Computer and System Sciences (2017), Multimedia Tools and Applications (2017), Personal and Ubiquitous Computing (2017), Pervasive and Mobile Computing (2016), Wireless Personal Communications (2017), etc. In 2016, he was named the Cybersecurity Educator of the Year – APAC (Cybersecurity Excellence Awards are produced in cooperation with the Information Security Community on LinkedIn), and in 2015 he and his team won the Digital Forensics Research Challenge organized by Germany’s University of Erlangen-Nuremberg. He is the recipient of ESORICS 2015 Best Paper Award, 2014 Highly Commended Award by the Australia New Zealand Policing Advisory Agency, Fulbright Scholarship in 2009, 2008 Australia Day Achievement Medallion, and British Computer Society's Wilkes Award in 2008. He is also a Fellow of the Australian Computer Society, and an IEEE Senior Member.
