Evolutionary features of academic articles co-keyword network and keywords co-occurrence network: Based on two-mode affiliation network
Introduction
With the recent development and popularization of information and data analysis technology, large-scale data-intensive analysis of scientific data has become a new trend in data mining [1]. As one of the main branches of data mining, text mining has become a new method of knowledge discovery. Text mining is a useful tool for understanding the basic information provided by one or more texts through structured algorithms. It can also be used to determine the relationships among the textual elements and among the texts themselves. The results can be used for knowledge discovery and other applications. As an important tool and method for knowledge discovery, text mining has been used in many fields, such as medicine [2], biochemistry [3], and business [4]. The objects of analysis in text mining include academic literature [5], news [6], network information [7], and long texts [8]. Various technologies [9], [10], [11], [12], [13], [14], [15] and tools [16], [17], [18], [19], [20], [21] are used in this field. These technologies and tools have been enhanced not only to analyze single texts but also to analyze big and complex data.
The existing literature indicates that one of the most frequent uses of text mining methods is to conduct a literature review, which allows researchers to determine the development trends in a field. Literature reviews are also fundamental in academic research. There are currently two ways to conduct a literature review. The first is to identify important academic articles by their citation frequency and the impact factors of the journals in which they were published [22]. This method is used to identify recent developments in a field over a short period. However, because the sample is limited, it is difficult to achieve a holistic perspective with this method. The other frequently used method is content analysis [23], a research technique that involves the systematic, objective and quantitative description of a text’s content. This method has recently received increased attention. Researchers can use content analysis to find multi-text statistics and clusters by delimiting a research object and establishing a quantification standard. However, it is difficult to keep the quantitative criteria consistent because both the classification and the coding rules are based on the knowledge and experience of the researchers, which undermines the objectivity of the analysis. Content analysis is also inadequate for mining the complex relationships among texts.
There is an urgent need for a tool that tracks the trends in academic articles and helps readers rapidly understand the key points and inner relationships of a collection of texts from a holistic perspective. Keywords, an important textual element, can provide a concise overview of the important content and key points of a body of articles. Keyword analysis can also expedite text mining [24], [25]. Many scholars use tag clouds to analyze unstructured keywords because this method allows the user to highlight the most significant concepts, which facilitates navigation and visualization [26]. However, tag clouds only show the frequency of single words; they show neither the relationships among the keywords nor the relationships between the articles based on the keywords. Unlike tag clouds, complex network analysis is a young but active method for discovering the inner relationships between different entities in real or virtual systems. It is widely used in different areas, such as economic networks and biological networks. It can effectively model a network’s topological features [27], [28], [29], mine its relationships [30], and analyze its evolution [28]. As a new frontier of complex network research, the multi-mode network has been shown to represent reality better because of its heterogeneous attributes. It has been successfully applied in other areas, such as multi-mode societal ecological affiliation networks [31], [32], [33], [34], [35], fiber transmission [36] and the shareholding networks of listed companies [37].
In this paper, we study the patterns of relationships among academic articles on a given theme, complex networks, from a holistic perspective by constructing and analyzing annual articles co-keyword equivalent networks (AENs for short) and annual keywords co-occurrence equivalent networks (KENs for short). Both networks are constructed in the same way as the equivalent networks in [38], using the two-mode affiliation network. We analyze the topological features of the two networks at the annual level, as well as their evolution and stability. We then define and analyze the innovation coefficient of the networks for the given theme, complex networks.
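The construction described above can be illustrated with a minimal sketch. It assumes a toy two-mode affiliation mapping from articles to keywords; the data, the function names `article_network` and `keyword_network`, and the choice of the KEN edge weight (the number of articles in which two keywords co-occur) are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: article id -> set of keywords (the two-mode affiliation).
articles = {
    "a1": {"complex network", "text mining"},
    "a2": {"complex network", "evolution"},
    "a3": {"text mining", "complex network", "evolution"},
}


def article_network(articles):
    """AEN sketch: articles are nodes, edge weight = number of shared keywords."""
    edges = {}
    for (a, kws_a), (b, kws_b) in combinations(sorted(articles.items()), 2):
        shared = len(kws_a & kws_b)
        if shared:  # only link articles that share at least one keyword
            edges[(a, b)] = shared
    return edges


def keyword_network(articles):
    """KEN sketch: keywords are nodes; the edge weight is assumed to be the
    number of articles in which the two keywords appear together."""
    edges = Counter()
    for kws in articles.values():
        # Sorting gives every keyword pair one canonical orientation.
        for u, v in combinations(sorted(kws), 2):
            edges[(u, v)] += 1
    return dict(edges)


aen = article_network(articles)
ken = keyword_network(articles)
```

In this toy sample, articles `a1` and `a3` share two keywords, so their AEN edge has weight 2. Both one-mode networks come from the same article-keyword affiliation data, which is the sense in which they are two views of one two-mode network.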
Section snippets
Constructing the AENs and KENs
In this paper, affiliation relationships can be found between keywords and the articles in which they appear. Networks constructed according to affiliation relationships are a typical type of two-mode network called a member-network [39] or hyper-network [40]. The two-mode affiliation network is composed of a set of actors (keywords) and a set of events (articles) [41]. According to Wasserman [42] and Li et al. [38], when there are two nodes, and , that have the same relationship with
The visualization of the two different networks
The equivalent networks of each keyword and each paper were constructed based on co-keywords and keyword co-occurrence within the same period (year). The equivalent networks were then superimposed to form the AENs and KENs. Fig. 2, Fig. 3 and Fig. 4 present the visualization results and the numbers of nodes and edges of the AENs and KENs.
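As a rough sketch of the superimposition step for the keyword side, the example below treats each article's keywords as a small fully connected network, sums these networks within each year, and then counts nodes and edges per year. The records, the helper names `annual_keyword_networks` and `size`, and the weighting rule are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical records: (publication year, keywords of one article).
records = [
    (2012, ["complex network", "evolution"]),
    (2012, ["complex network", "text mining", "evolution"]),
    (2013, ["complex network", "two-mode network"]),
]


def annual_keyword_networks(records):
    """Superimpose per-article keyword cliques into one weighted network per year."""
    networks = defaultdict(Counter)
    for year, keywords in records:
        for pair in combinations(sorted(set(keywords)), 2):
            networks[year][pair] += 1  # overlapping edges add up their weights
    return networks


def size(edges):
    """Return (number of nodes, number of edges) for an edge-weight mapping.
    Keywords that never co-occur with another keyword are not counted."""
    nodes = {node for pair in edges for node in pair}
    return len(nodes), len(edges)


kens = annual_keyword_networks(records)
sizes = {year: size(edges) for year, edges in sorted(kens.items())}
```

Comparing the entries of `sizes` across years gives the kind of node and edge counts that an annual evolution analysis would start from.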
As Fig. 2 and Fig. 4 show, the relationships between the words become increasingly complex over time (nodes with the same color are more strongly
Discussion and conclusion
To capture the evolutionary features of a body of articles and their relations, we used 5944 articles on “complex networks” published between 1990 and 2013 as the sample. Based on two-mode affiliation network theory, we constructed the AENs by taking the articles as nodes, the co-keyword relationships as edges and the quantity of co-keywords as weights, and the KENs by taking the articles’ keywords as nodes, the co-occurrence relationships as edges and the
Acknowledgments
This research is supported by grants from the National Natural Science Foundation of China (Grant No. 71173199), the China Scholarship Council (File No. 201406400004), the Humanities and Social Sciences Planning Funds Project under the Ministry of Education of the PRC (Grant No. 10YJA630001), and the Fundamental Research Funds for the Central Universities (Grant No. 2-9-2014-104). The authors would like to express their gratitude to the reviewers and Xuan Huang, Xiaoqing Hao, Xiaoliang Jia,
References (50)
- et al. G-Hadoop: MapReduce across distributed data centers for data-intensive computing. Future Gener. Comput. Syst. (2013)
- et al. Knowle: a semantic link network based system for organizing large scale online news events. Future Gener. Comput. Syst. (2015)
- et al. Mining temporal explicit and implicit semantic relations between entities using web search engines. Future Gener. Comput. Syst. (2014)
- et al. Diagnosis and multi-modality treatment of adult pulmonary plastoma: Analysis of 18 cases and review of literature. Asian Pac. J. Trop. Med. (2014)
- et al. The role of fluctuating modes of autocorrelation in crude oil prices. Physica A (2014)
- et al. The evolution of communities in the international oil trade network. Physica A (2014)
- et al. On the topological properties of the cross-shareholding networks of listed companies in China: Taking shareholders’ cross-shareholding relationships into account. Physica A (2014)
- et al. Energy regulation in China: Objective selection, potential assessment and responsibility sharing by partial frontier analysis. Energy Policy (2014)
- et al. Demand-driven energy requirement of world economy 2007: A multi-region input–output network simulation. Commun. Nonlinear Sci. Numer. Simul. (2013)
- et al. Three-scale input–output modeling for urban economy: Carbon emission by Beijing 2007. Commun. Nonlinear Sci. Numer. Simul. (2013)
- Energy security, efficiency and carbon emission of Chinese industry. Energy Policy
- The shareholding similarity of the shareholders of the worldwide listed energy companies based on a two-mode primitive network and a one-mode derivative holding-based network. Physica A
- Hypernetwork sampling: Duality and differentiation among voluntary organizations. Soc. Networks
- The dynamics of a mobile phone network. Physica A
- Using textmining techniques in electronic patient records to identify ADRs from medicine use. Br. J. Clin. Pharmacol.
- A method for integrating and ranking the evidence for biochemical pathways by mining reactions from text. Bioinformatics
- The first step in the development of text mining technology for cancer risk assessment: Identifying and organizing scientific evidence in risk assessment literature. BMC Bioinformatics
- Web text data mining for building large scale language modelling corpus
- Context-based text mining for insights in long documents
- Data mining for text categorization with semisupervised agglomerative hierarchical clustering. Int. J. Intell. Syst.
- A self-adaptive clustering scheme with a time-decay function for microblogging text mining