
## About this book

This handbook presents the state of the art of quantitative methods and models to understand and assess the science and technology system. Focusing on various aspects of the development and application of indicators derived from data on scholarly publications, patents and electronic communications, the individual chapters, written by leading experts, discuss theoretical and methodological issues, illustrate applications, highlight their policy context and relevance, and point to future research directions.

A substantial portion of the book is dedicated to detailed descriptions and analyses of data sources, presenting both traditional and advanced approaches. It addresses the main bibliographic metrics and indexes, such as the journal impact factor and the h-index, as well as altmetric and webometric indicators and science mapping techniques on different levels of aggregation and in the context of their value for the assessment of research performance as well as their impact on research policy and society. It also presents and critically discusses various national research evaluation systems.

Complementing the sections reflecting on the science system, the technology section includes multiple chapters that explain different aspects of patent statistics, patent classification and database search methods to retrieve patent-related information. In addition, it examines the relevance of trademarks and standards as additional technological indicators.

The Springer Handbook of Science and Technology Indicators is an invaluable resource for practitioners, scientists and policy makers wanting a systematic and thorough analysis of the potential and limitations of the various approaches to assess research and research performance.

## Table of contents

### 1. The Journal Impact Factor: A Brief History, Critique, and Discussion of Adverse Effects

The journal impact factor (JIF) is, by far, the most discussed bibliometric indicator. Since its introduction over 40 years ago, it has had enormous effects on the scientific ecosystem: transforming the publishing industry, shaping hiring practices and the allocation of resources, and, as a result, reorienting the research activities and dissemination practices of scholars. Given both the ubiquity and impact of the indicator, the JIF has been widely dissected and debated by scholars of every disciplinary orientation. Drawing on the existing literature as well as original research, this chapter provides a brief history of the indicator and highlights well-known limitations, such as the asymmetry between the numerator and the denominator, differences across disciplines, the insufficient citation window, and the skewness of the underlying citation distributions. The inflation of the JIF and its weakening predictive power are discussed, as well as the adverse effects on the behaviors of individual actors and the research enterprise. Alternative journal-based indicators are described, and the chapter concludes with a call for responsible application and a commentary on future developments in journal indicators.
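The numerator/denominator asymmetry mentioned above can be illustrated with a minimal sketch. It assumes the usual two-year definition (citations received in one year to items from the two preceding years, divided by the citable items from those years); all counts and the `CITABLE` set are hypothetical and not taken from the chapter.

```python
def impact_factor(citations, items):
    """Ratio of citations received to items published (two-year window assumed)."""
    if items <= 0:
        raise ValueError("items must be positive")
    return citations / items

# Hypothetical counts for one journal: citations received in the census year
# to documents published in the two preceding years, split by document type.
citations_by_type = {"article": 400, "review": 30, "editorial": 15, "letter": 5}
items_by_type = {"article": 140, "review": 10, "editorial": 20, "letter": 10}
CITABLE = {"article", "review"}  # assumption: only these count in the denominator

# JIF-style ratio: citations to ALL document types over citable items only.
all_citations = sum(citations_by_type.values())
citable_items = sum(n for t, n in items_by_type.items() if t in CITABLE)
jif_style = impact_factor(all_citations, citable_items)

# Symmetric variant: citations to citable items over citable items.
citable_citations = sum(c for t, c in citations_by_type.items() if t in CITABLE)
symmetric = impact_factor(citable_citations, citable_items)

print(f"JIF-style: {jif_style:.3f}, symmetric: {symmetric:.3f}")
```

In this toy example the citations to editorials and letters raise the JIF-style value above the symmetric one, which is the asymmetry the abstract refers to.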

Vincent Larivière, Cassidy R. Sugimoto

### 2. Bibliometric Delineation of Scientific Fields

Delineation of scientific domains (fields, areas of science) is a preliminary task in bibliometric studies at the mesolevel, and it is far from straightforward in domains with high multidisciplinarity, variety, and instability. Section 2.2 shows the connection of the delineation problem to the question of disciplines versus invisible colleges, through three combinable models: ready-made classifications of science, classical information-retrieval searches, and mapping and clustering. They differ in the role and modalities of supervision. Section 2.3 sketches various bibliometric techniques against the background of information retrieval (IR), data analysis, and network theory, showing both their power and their limitations in delineation processes. The role and modalities of supervision are emphasized. Section 2.4 addresses the comparison and combination of bibliometric networks (actors, texts, citations) and the various ways to hybridize. In Section 2.5, typical protocols and further questions are proposed.

Michel Zitt, Alain Lelu, Martine Cadot, Guillaume Cabanac

### 3. Knowledge Integration: Its Meaning and Measurement

Interdisciplinary research depends on research traditions and fields originating from different research teams, different countries and regions. Its essence is knowledge integration. As a dynamic and interactive process it continuously pushes the structure of science to become a complex diverse system. In this chapter, we provide a systematic review of interdisciplinary research. Starting from a definition of interdisciplinary research, its elements, and its role for scientific progress, we particularly focus on how to identify the activity of interdisciplinary research, how to measure it and point out the limitations of existing approaches. Stating that one can measure knowledge integration implies that this notion refers to a continuum, beginning from no integration (disciplinary research) to a large degree of integration (highly interdisciplinary). Following Stirling, Rafols and Meyer we show that knowledge integration can be measured by two main factors: a diversity factor and a network coherence factor. The diversity factor itself consists of three aspects: variety (number of categories taken into account), evenness and similarity between categories. In accordance with the Jost–Leinster–Cobbold approach we prefer a so-called true diversity measure. As an illustration, we provide a simple example of a study on interdisciplinarity in the field of synthetic biology, using the true diversity measure derived from the Rao–Stirling measure. Finally, we include some suggestions for future research.
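As a rough illustration of the diversity factor, the sketch below computes a Rao–Stirling value and a true diversity derived from it. It assumes the common forms RS = Σ_{i≠j} d_ij p_i p_j (summed over ordered pairs) and true diversity = 1 / (1 − RS); the chapter's exact formulation may differ, and the category shares and dissimilarities are made up.

```python
def rao_stirling(proportions, dissimilarity):
    """Sum of d_ij * p_i * p_j over all ordered pairs i != j (assumed form)."""
    n = len(proportions)
    return sum(
        dissimilarity[i][j] * proportions[i] * proportions[j]
        for i in range(n)
        for j in range(n)
        if i != j
    )

def true_diversity(proportions, dissimilarity):
    """Assumed transformation 1 / (1 - RS), read as an effective number of categories."""
    return 1.0 / (1.0 - rao_stirling(proportions, dissimilarity))

# Hypothetical example: references of a paper set spread over three subject categories.
shares = [1 / 3, 1 / 3, 1 / 3]

# Maximally dissimilar categories (d_ij = 1 for i != j).
distinct = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# Two of the three categories are close to each other (d = 0.2).
related = [[0, 0.2, 1], [0.2, 0, 1], [1, 1, 0]]

print(true_diversity(shares, distinct))  # three equally used, fully distinct categories
print(true_diversity(shares, related))   # lower, because two categories are similar
```

With equal shares and fully distinct categories the measure equals the number of categories; similarity between categories pulls it below that, which is how variety, evenness and similarity enter a single value.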

Ronald Rousseau, Lin Zhang, Xiaojun Hu

### 4. Google Scholar as a Data Source for Research Assessment

Emilio Delgado López-Cózar, Enrique Orduña-Malea, Alberto Martín-Martín

### 5. Disentangling Gold Open Access

This chapter focuses on the analysis of current publication trends in gold open access (OA). The purpose of the chapter is to develop a full understanding of country patterns, OA journal characteristics and citation differences between gold OA and non-gold OA publications. For this, we will first review current literature regarding open access and its ostensible citation advantage. Starting with a chronological perspective we will describe its development, how different countries are promoting OA publishing, and its effects on the journal publishing industry. We will deepen the analysis by investigating the research output produced by different units of analysis. First, we will focus on the production of countries with a special emphasis on citation and disciplinary differences. A point of interest will be the identification of national idiosyncrasies and the relation between OA publication and research of local interest. This will lead to our second unit of analysis, OA journals indexed in Web of Science. Here we will focus on journal characteristics and publisher types to clearly identify factors that may affect citation differences between OA and traditional journals but that do not necessarily derive from the OA factor. Gold OA publishing, as opposed to green OA, is being encouraged in many countries. This chapter aims at fully understanding how it affects researchers' publication patterns and whether it ensures an alleged citation advantage as opposed to non-gold OA publications.

Daniel Torres-Salinas, Nicolas Robinson-García, Henk F. Moed

### 6. Science Forecasts: Modeling and Communicating Developments in Science, Technology, and Innovation

In a knowledge-based economy, science and technology are omnipresent, and their importance is undisputed. Equally evident is the need to allocate resources, both monetary and human, in an effective way to foster innovation [6.1, 6.2]. In the preceding decades, science policy has embraced data mining and metrics to gain insights into the structure and evolution of science and to devise metrics and indicators [6.3], but it has not invested significant efforts into mathematical, statistical, and computational models that can predict future developments in science, technology, and innovation (STI) in support of data-driven decision making. Recent advances in computational power combined with the unprecedented volume and variety of data concerning science and technology developments (e.g., publications, patents, funding, clinical trials, and stock market and social media data) yielded ideal conditions for the advancement of computational modeling approaches that can be not only empirically validated, but used to simulate and understand the structure and dynamics of STI in support of improved human decision making. In this chapter, we review and demonstrate the power of computational models for simulating and predicting possible STI developments and futures. In addition, we discuss novel means to visualize and broadcast STI forecasts to make them more accessible to general audiences.

Katy Börner, Staša Milojević

### 7. Science Mapping Analysis Software Tools: A Review

Scientific articles are one of the most important types of output of a researcher. In that sense, bibliometrics is an essential tool for assessing and analyzing academic research output, contributing to the progress of science in many different ways. It provides objective criteria to assess the research carried out by researchers and is increasingly valued as a tool for measuring scholarly quality and productivity. Science mapping is a bibliometric tool to analyze and mine scientific output. The aim of this chapter is to present a thorough review of science mapping software tools, showing strengths and limitations. Six software tools that meet the criteria of being free, full, and allowing the whole analysis to be performed are analyzed: BibExcel, CiteSpace II, CitNetExplorer, SciMAT, Sci² Tool, and VOSviewer. This analysis describes aspects related to data processing, analysis options, and visualization. The particular properties of each tool are presented; the choice of a particular tool depends on the type of actor to be analyzed and the output expected.

Jose A. Moral-Munoz, Antonio G. López-Herrera, Enrique Herrera-Viedma, Manuel J. Cobo

### 8. Creation and Analysis of Large-Scale Bibliometric Networks

In the more than a decade since the last Handbook of Quantitative Science and Technology Research [8.1] was published, a sea change has occurred in the creation and analysis of bibliometric networks that describe the Science & Technology (S&T) landscape. Previously, networks were typically restricted in size to hundreds or thousands of objects (papers, journals, authors, etc.) due to lack of data access and computing capacity. However, recent years have seen the increased availability of full databases, increased computing capacity, and development of partitioning and community detection algorithms that can work effectively at large scale. As a result, much larger networks, comprising millions or tens of millions of objects, are being created and analyzed. These large-scale networks have enabled analyses that were simply not possible in the past, analyses that require the context of complete networks to give accurate results. In this chapter, we focus on large-scale, global bibliometric networks, and on the types of analysis that are enabled by these networks. We start by providing a historical perspective that sets the stage for recent advances that have culminated in the ability to create and analyze large-scale bibliographic networks. We then discuss data sources and the methods that are commonly used to create large-scale networks. We review many of these networks, along with the types of unique analyses that they enable, and ways that their results can be effectively communicated. After reviewing the state of the art, we discuss our most recent large-scale topic-level model of science in detail as an example of a global bibliometric model and show how it can be used for various applications.

Kevin W. Boyack, Richard Klavans

### 9. Science Mapping and the Identification of Topics: Theoretical and Methodological Considerations

This chapter focusses on the drivers for the advancement of mapping of science and the detection of topics as often applied in scientometrics. The chapter identifies three different drivers for this advancement: technological innovation resulting in increased computational power, the improved community detection approaches available today, and advancements in scientometrics itself with respect to the actual linking of documents through citations or lexical approaches. We will show that the main drivers are the first two, with the last one somewhat lagging behind. Next, we describe severe methodological issues identified in network science in the application of these community detection techniques, namely the resolution limit and the degeneracy problem. The last section shows how different approaches are taken to enable scientometricians to create global maps of science and how they come to comparable results at higher levels of granularity, but also that the validity of more fine-grained clusters and topics suffers strongly from the problems discussed, which raises serious questions with respect to the applicability of these global techniques with a strong local focus.

Bart Thijs

### 10. Measuring Science: Basic Principles and Application of Advanced Bibliometrics

We begin with a short history of measuring science and discuss how the Science Citation Index has revolutionized the quantitative study of science and created a strong application potential. After reviewing the rationale of bibliometric analysis, we present the basic principle of the bibliometric methodology, with complex citation networks as a starting point. We show that the two main pillars of advanced bibliometric methods, citation-based analysis and science mapping, are both reducible to one and the same principle. From this basic principle we deduce a set of main indicators, particularly for the assessment of research output and international impact. Important elements include new approaches for identifying fields and research themes on the basis of a publication-level rather than a journal-level network; publication and citation counting; normalization of citation measures; the use of indicators based on averages versus those based on citation distributions; and weighting procedures and statistical reliability. In this account of the state of the art of advanced bibliometrics, we highlight in particular the developments in our Leiden institute, given its long-standing, extensive, and broad experience. The next part of this chapter deals with practical applications of indicators, particularly real-life examples of evaluation studies. We further discuss several crucial issues such as the use of journal impact factors and h-index; the relation between peer review judgment and bibliometric findings; definition and delimitation of fields; assignment of publications; the influence of open access; webometrics and altmetrics; ranking of universities; and general objections to bibliometric analysis. The second main pillar of the advanced bibliometric methodology is the development of science maps. We discuss the basic elements and the construction of both citation-relation and word-relation science maps. Further, we present a method to combine the two main pillars: the integration of citation analysis in science maps. This combined citation analysis and science mapping can be used to explore research related to socioeconomic problems. Recently developed bibliometric instruments enable tunable mapping, which opens up new analytical opportunities in monitoring scientific research. Finally, we contend that bibliometric indicators and maps are not just evaluation tools for science policymakers, research managers, and individual researchers, but also powerful instruments in the study of science.

Anthony van Raan

### 11. Field Normalization of Scientometric Indicators

When scientometric indicators are used to compare research units active in different scientific fields, there is often a need to make corrections for differences between fields, for instance, differences in publication, collaboration, and citation practices. Field-normalized indicators aim to make such corrections. The design of these indicators is a significant challenge. We discuss the main issues in the design of field-normalized indicators and present an overview of the different approaches that have been developed for dealing with the problem of field normalization. We also discuss how field-normalized indicators can be evaluated and consider the sensitivity of scientometric analyses to the choice of a field-normalization approach.
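One widely used way to make such corrections is to divide each publication's citation count by the average for its field and publication year, then average these ratios for a research unit. The sketch below shows that idea under simplifying assumptions (one field per paper, no document-type distinction); it is only one of several approaches, and the records and function names are hypothetical.

```python
from collections import defaultdict

def field_baselines(reference_set):
    """Mean citations per (field, year) in a reference set of (field, year, citations)."""
    totals = defaultdict(lambda: [0, 0])  # (field, year) -> [citation sum, paper count]
    for field, year, citations in reference_set:
        totals[(field, year)][0] += citations
        totals[(field, year)][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}

def mean_normalized_citation_score(unit_papers, baselines):
    """Average of citations / expected citations over a unit's papers."""
    ratios = [c / baselines[(field, year)] for field, year, c in unit_papers]
    return sum(ratios) / len(ratios)

# Hypothetical reference set: mathematics is cited far less than biology.
world = [
    ("math", 2015, 2), ("math", 2015, 4),
    ("bio", 2015, 10), ("bio", 2015, 30),
]
baselines = field_baselines(world)

# Hypothetical research unit with one paper in each field.
unit = [("math", 2015, 6), ("bio", 2015, 10)]
score = mean_normalized_citation_score(unit, baselines)
print(score)
```

In raw counts the biology paper looks stronger (10 versus 6 citations); after normalization the mathematics paper scores twice its field average and the biology paper half of its own.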

Ludo Waltman, Nees Jan van Eck

### 12. All Along the h-Index-related Literature: A Guided Tour

In this chapter, a survey of the literature related to the h-index (referred to as h-related literature) between 2005 and 2016 is presented. In the first section, the basic definitions and a brief historical account are given. After providing an overview of the typology of the h-related publications and some earlier reviews, the more than 3000 h-related publications collected from four databases (Web of Science, Scopus, Google Scholar and Microsoft Academic) are analyzed by bibliometric methods. Document types, publication sources, subject categories, geographical distributions, authors and institutions, citations and references are listed and mapped. Several examples of applications of the h-index, within and outside the area of scientometrics, are presented, with particular attention to the possibilities for using the h-related indices as a network measure. Among the mathematical models used to explain and interpret the index and its relatives, Hirsch's model, the Lotkaian framework, models based on extreme value theory and on fuzzy integrals, and axiomatic approaches are demonstrated.
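For readers unfamiliar with the basic definition, the h-index of a publication list is the largest number h such that h of the publications have at least h citations each. A minimal sketch, with made-up citation counts:

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break
    return h

# Hypothetical citation counts for one author's papers.
print(h_index([10, 8, 5, 4, 3]))
print(h_index([25, 8, 5, 3, 3]))
```

The second list shows a known property of the index: one very highly cited paper does not raise the value beyond what the rest of the list supports.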

András Schubert, Gábor Schubert

### 13. Citation Classes: A Distribution-based Approach for Evaluative Purposes

In this chapter, we describe a scientometric assessment tool that was first introduced as early as the second half of the 1980s, but due to the high computational requirements at that time, the method fell undeservedly into oblivion. The method is called Characteristic Scores and Scales (CSS) and is aimed at providing a more detailed picture of citation impact, with particular regard to the high end of performance. More than two decades after its introduction, the method experienced a revival as a consequence of the burning need for improved and versatile assessment tools, facilitated by the rapid development of information technology and the broad access to electronic data sources. The first part of this chapter will describe the model, its background and the statistical properties underlying this approach, while the following sections will deal with its implementation within the framework of research evaluation at different levels of aggregation and in various disciplinary and multidisciplinary contexts. Special attention is paid to the applicability to various aggregation levels, such as national research performance, the comparative analysis of institutional research output, as a tool to assist the assessment of individual researchers and as journal impact measures. A graphical sketch of possible applications is used as a road map throughout the chapter to navigate the various methodological issues and fields of use. The chapter begins with a review of previous work, but also aims at presenting new insights and applications in a systematic manner. In addition to the presentation of new results, future perspectives and possible applications of this model within and outside traditional scientometrics will be sketched and highlighted.
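The CSS idea can be sketched as iterated truncated means, assuming the usual construction: the first threshold is the mean citation rate of all papers, the second is the mean of the papers at or above the first threshold, and so on, giving four classes from poorly cited to outstandingly cited. The citation counts below are made up and the function names are hypothetical.

```python
def css_thresholds(citations, k=3):
    """Iterated means: b1 = mean of all, b2 = mean of papers >= b1, and so on."""
    thresholds = []
    subset = list(citations)
    for _ in range(k):
        if not subset:
            break
        b = sum(subset) / len(subset)
        thresholds.append(b)
        subset = [c for c in subset if c >= b]
    return thresholds

def css_class(citation_count, thresholds):
    """Class index: 0 = poorly, 1 = fairly, 2 = remarkably, 3 = outstandingly cited."""
    return sum(1 for b in thresholds if citation_count >= b)

# Hypothetical, skewed citation distribution of a reference population.
population = [0, 0, 0, 1, 1, 2, 2, 3, 4, 5, 6, 8, 12, 20, 32]
thresholds = css_thresholds(population)

class_sizes = [0, 0, 0, 0]
for c in population:
    class_sizes[css_class(c, thresholds)] += 1

print(thresholds)   # the three characteristic scores
print(class_sizes)  # papers per class, from poorly to outstandingly cited
```

Because each threshold is computed from the papers above the previous one, the classes adapt to the skewness of the distribution rather than relying on fixed percentiles.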

Wolfgang Glänzel, Bart Thijs, Koenraad Debackere

### 14. An Overview of Author-Level Indicators of Research Performance

The purpose of this chapter is to present a critical overview of author-level indicators of research production (ALIRP), discuss their appropriate application and provide a tool to support the informed use of ALIRP. A brief history of the development of ALIRP begins with a chronological discussion of the major trends in indicator development, which documents the quick adaptation of ALIRP in evaluation practice, and consequently sets the argument for the need to monitor and evaluate present-day indicator production, which is the major theme of this chapter. The characteristics and common mathematical properties of ALIRP are used to highlight the challenges we face in applying appropriate ALIRP in evaluation. The construction and validity of 69 ALIRP are analyzed, and the results presented in table form for easy reference. These tables are also available as interactive tables provided as e-material to this chapter. This analysis, combined with the deconstruction of indicators in the chapter sections, argues that ALIRP are mathematical models, and the numerical values they produce should never be confused with the reality they are trying to model in evaluation practice.

Lorna Wildgaard

### 15. Challenges, Approaches and Solutions in Data Integration for Research and Innovation

In order to be implemented by policy makers, science, technology, and innovation (STI) policies and indicator building need data. Whenever we need data, we need a method for data management, and in the era of big data, a crucial role is played by data integration. Therefore, STI policies and indicator development need data integration. Two main approaches to data integration exist, namely procedural and declarative. In this chapter, we follow the latter approach and focus our attention on the ontology-based data integration (OBDI) paradigm. The main principles of OBDI are: (i) Leave the data where they are. (ii) Build a conceptual specification of the domain of interest (ontology), in terms of knowledge structures. (iii) Map such knowledge structures to concrete data sources. (iv) Express all services over the abstract representation. (v) Automatically translate knowledge services to data services. We introduce the main challenges of data integration for research and innovation (R&I) and show that reasoning over an ontology connected to data may be very helpful for the study of R&I. We also provide examples by using Sapientia, an ontology specifically defined for multidimensional research assessment.

Maurizio Lenzerini, Cinzia Daraio

### 16. Synergy in Innovation Systems Measured as Redundancy in Triple Helix Relations

The Triple Helix (TH) of university–industry–government relations can first be considered as an institutional network. However, the correlations in the patterns of relations provide another topology: that of a vector space. Meanings are provided from positions in this latter topology and from the perspective of hindsight. Meanings can be shared, and sharing generates redundancy. Increasing redundancy provides new options and reduces uncertainty; reducing uncertainty improves the innovative climate, and the generation of options (redundancy) is crucial for innovation. The knowledge base provides an engine of the economy by evolving in terms of generating new options. The trade-off between the evolutionary generation of redundancy and the historical variation providing uncertainty can be measured as negative and positive information, respectively. In a number of studies of national systems of innovation (e.g., Sweden, Germany, Spain, China), this TH synergy indicator has been used to analyze regions and sectors in which uncertainty was significantly reduced. The quality of innovation systems can thus be quantified at different geographical scales and in terms of sectors such as high- and medium-tech manufacturing or knowledge-intensive services (KIS).
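A signed information measure of this kind can be sketched with the mutual information in three dimensions, T = H(x) + H(y) + H(z) − H(xy) − H(xz) − H(yz) + H(xyz), where negative values are read as redundancy (synergy) and positive values as uncertainty. This is an illustrative assumption about the form of the indicator; the toy records and the function name `th_synergy` are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a frequency distribution."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def th_synergy(records):
    """Mutual information in three dimensions over (x, y, z) records, in bits."""
    def h(dims):
        # Entropy of the joint distribution over the selected dimensions.
        return entropy(Counter(tuple(r[d] for d in dims) for r in records).values())
    return (
        h((0,)) + h((1,)) + h((2,))
        - h((0, 1)) - h((0, 2)) - h((1, 2))
        + h((0, 1, 2))
    )

# Toy records, e.g. firms described by three categorical dimensions.
# Case 1: each pair of dimensions looks independent, yet the three together are fully coupled.
interacting = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
# Case 2: the three dimensions are simply copies of each other.
copies = [(0, 0, 0), (1, 1, 1)]

print(th_synergy(interacting))  # negative: reduction of uncertainty at the system level
print(th_synergy(copies))       # positive
```

The sign is what matters for interpretation: the first configuration yields a negative value, which in this reading corresponds to synergy among the three dimensions.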

Loet Leydesdorff, Inga Ivanova, Martin Meyer

### 17. Scientometrics Shaping Science Policy and vice versa, the ECOOM Case

It is difficult to imagine a world without science policy. Ever since Vannevar Bush published his seminal insights on the role of science in society, science policy has become deeply ingrained in public policy. Alongside this, the discipline of scientometrics developed. It started from library and information needs, helping the ever-growing scientific community to access, retrieve and disseminate its ever-increasing output. However, along the way, scientometrics developed into a powerful set of scientifically validated data, indicators and tools. It diffused across many disciplines in the social sciences. Over time, this evolution came to the attention of policymakers. The wealth of data and indicators developed in the field of scientometrics (later extended to informetrics and webometrics) elicited interest in their use for policy purposes. A symbiosis between scientometrics and science policy was born. Using the case of the Flemish Centre for Research & Development Monitoring (ECOOM), we describe and illustrate this coevolution between scientometrics and science policy, its opportunities and its challenges, and its do's and don'ts.

Koenraad Debackere, Wolfgang Glänzel, Bart Thijs

### 18. Different Processes, Similar Results? A Comparison of Performance Assessment in Three Countries

Monitoring the scientific performance of a country, region, or organization has become a high priority for research managers and government agencies. Research assessments have been implemented to provide evidence and facilitate their decisions. They differ in the methodologies applied, the disciplinary and regional breadth, and the consequences that follow. We sought to examine the extent to which quantitative, indicator-based analysis can contribute to identifying and better understanding the effects and effectiveness of the different assessment regimes. To this end, we analyzed the publications from three countries (Australia, the United Kingdom, and Germany) with contrasting systems in place, seeking to demonstrate the possibilities and limitations of using an indicator-based methodology for determining the outcomes from different approaches to assessment. We intentionally selected three countries with different assessment regimes, expecting to see the effects of this in the bibliometric analyses we undertook. However, we found that the data alone do not allow us to conclude that any one system has a beneficial or detrimental influence on performance. Rather, the data suggest that it is not the specific system that makes a difference but the fact that performance becomes a central topic of conversation. In order to better understand the mechanisms behind changing performance, restricting scrutiny to mere numbers is insufficient. Contextual information at various levels of aggregation, within and outside the institutions, is highly relevant.

Sybille Hinze, Linda Butler, Paul Donner, Ian McAllister

### 19. Scientific Collaboration Among BRICS: Trends and Priority Areas

The political and economic partnership known as BRIC (for Brazil, Russia, India and China) was formally established in 2008. Three years later, in a joint meeting in Cape Town, a new member, South Africa, was included in the group. In this meeting, Brazil, Russia, India, China and South Africa (BRICS) delegates elaborated a list of priority areas for enhancing bi- or multilateral cooperation in the fields of science, technology and innovation. Considering the growing importance of BRICS in the global economy and other sectors, the present study investigates the performance of the group in the scientific arena before and after its formalization in 2008, looking closely at BRICS collaborative publications, in order to identify whether the priority areas established in the Cape Town declaration are actually being pursued. Data were collected during February and March 2017 from the Web of Science database, covering the period 2000–2015. To match scientific collaborations, specific searches were carried out by combining the names of two BRICS members and time periods. Various bibliometric techniques were used, including diachronic analysis, Bradford's law and journal co-citation analysis. Among the key findings highlighted here are a marked increase in BRICS participation during the period, widely varying levels of collaboration among members, and the presence of physics as a central field for most members. The chapter concludes with an in-depth discussion focusing on correlations between the fields with greater collaboration and the priority areas.

Jacqueline Leta, Raymundo das Neves Machado, Roberto Mario Lovón Canchumani

### 20. The Relevance of National Journals from a Chinese Perspective

The process of journal evaluation began in the 1930s when the famous British scholar S.C. Bradford published his study of geophysics and lubrication, which presented the empirical law now known as Bradford's law of scattering, as well as the concept of core area journals. The citation indicator system and citation analysis theory system were founded in the middle of the twentieth century, and now have extensive influence. In the 1960s, Garfield carried out a large-scale statistical analysis of citations in journal literature. Generally speaking, the journal evaluation system has been gradually improved over time, producing an evaluation result that meets the development needs of science and technology. As one of the countries producing important science and technology outputs, China has ranked second according to the statistics of the number of scientific articles in recent years. At the same time, China has over 5000 scholarly journals; however, only 4% of them have been indexed in Web of Science and 10% of them in Scopus. A similar situation is found in Russia, Japan, Korea, and other non-English-speaking countries. Therefore, China has a lot of research and practice in the field of journal evaluation with which to explore more applicable and effective ways of assessing and improving national academic journal development. We will review the development situation of scientific, technical and medical (STM) journals in China to understand the demand for a national journal evaluation system. Based on a comparative study of international and national evaluation systems and indicators of academic journals in China, we identify the characteristics of national journal evaluation under a framework of their respective evaluation purposes, evaluation methods, key features, and evaluation criteria. We introduce two cases of China's STM journal research and evaluation work: the development of the boom index and its monitoring function, and the definition and application of comprehensive performance scores (CPSs) for Chinese scientific and technical journals. English-language science and technology journals in China are more similar to international journals but are developing along a particular path. Therefore we also introduce three other cases: statistics and analysis of English-language science and technology journals in China, the communication value of Chinese-published English-language academic journals according to citation analysis, and the atomic structure model for evaluating English-language scientific journals published in non-English countries.

Zheng Ma

### 21. Bibliometric Studies on Gender Disparities in Science

Understanding gender-related disparities in science is an essential step in tackling these issues. Through the years, bibliometric studies have designed several methodologies to analyze scholarly output and demonstrate that there are significant gaps between men and women in the scientific arena. However, gender identification in itself is an enormous challenge, since bibliographic data does not reveal it. These bibliometric studies focused not only on publication output and impact, but also on cross-referencing output, promotions and tenure data, and other related curriculum vitae (CV) information. This chapter discusses the challenges of tracking gender disparities in science through bibliometrics and reviews the various approaches taken by bibliometricians to identify gender and analyze the bibliographic data in order to point to gender disparities in science.

Gali Halevi

### 22. How Biomedical Research Can Inform Both Clinicians and the General Public

This study involved the collection of clinical practice guidelines (CPGs) on five noncommunicable disease (NCD) areas from 21 European countries, and extraction of their evidence base in the form of papers in journals processed on the Web of Science (WoS). We analyzed these cited papers to see how their geographical provenance compared with European research in the respective subjects and found that European research (and that from the USA, Australia, and New Zealand) was over-cited compared with that from East Asia. In cancer, surgery and radiotherapy research made important contributions to the CPGs. We also collected medical research stories from 30 newspapers from 22 European countries and the WoS papers that they cited. There was a heavy emphasis on cancer, particularly breast cancer, and its epidemiology, genetics, and prognosis, but new treatment methods were seldom reported, particularly surgery and radiotherapy. Some of the stories quoted commentators, with those from the two UK newspapers often mentioning medical research charities, which thereby gained much free publicity. Both sets of cited research papers showed a marked tendency to be over-cited by documents from their countrymen; the ratio was higher the smaller the country's contribution to research in the subject area.

Elena Pallari, Grant Lewison

### 23. Societal Impact Measurement of Research Papers

What are the results of public investment in research from which society actually derives a benefit? The scope of research evaluations becomes broader when societal products (outputs), societal use (societal references), and societal benefits (changes in society) of research are considered. This chapter presents an overview of the literature in the area of societal impact measurement of scientific papers. It describes major research projects on societal impact measurements. Problems of societal impact assessments are discussed as well as proposals to measure societal impact. The chapter discusses the role of alternative metrics (altmetrics) in measuring societal impact. There is an ongoing debate in scientometrics as to whether altmetrics are able to measure this kind of impact.

Lutz Bornmann, Robin Haunschild

### 24. Econometric Approaches to the Measurement of Research Productivity

The measurement of research productivity is receiving more and more attention. Besides scholars that are interested in understanding how research works and evolves over time, there are supranational, national and local governments, and national evaluation agencies, as well as various stakeholders, including managers of academic and research institutions, scholars and more generally the wider public, who are interested in the accountability and transparency of the scholarly production process. The main objective of this chapter is to analyze econometric approaches to research productivity and efficiency, highlighting what econometric approaches to research assessment can offer and what their benefit is, compared to traditional bibliometric or informetric approaches. We describe the nature of, and the ambiguities connected to, the measurement of research productivity, as well as the potential of econometric approaches for research measurement and assessment. Finally, we propose a checklist when developing econometric models of research assessment as a starting point for further research.

Cinzia Daraio

### 25. Developing Current Research Information Systems (CRIS) as Data Sources for Studies of Research

Current research information systems (CRIS) are increasingly being used to standardize and ease documentation, communication, and administration of research. With broad coverage and sufficient completeness, data quality, and standardization, CRIS systems can also be used as data sources for studies of research. Making CRIS interoperable and comparable across institutions and countries is necessary for the further development of CRIS for research purposes. Integration of CRIS for administrative purposes is already on the European agenda. This chapter focuses on challenges and solutions to the development of internationally integrated CRIS. Most of the remaining challenges are not related to technical solutions, but to an efficient sharing and use of contents. The chapter starts with the situation at the international level before it moves on to an example of CRIS at the national level to describe challenges and possible solutions even more concretely. The last section of the chapter provides examples of the type of studies that can be performed if progress is made for internationally integrated CRIS.

Gunnar Sivertsen

### 26. Social Media Metrics for New Research Evaluation

This chapter approaches, from both a theoretical and practical perspective, the most important principles and conceptual frameworks that can be considered in the application of social media metrics for scientific evaluation. We propose conceptually valid uses for social media metrics in research evaluation. The chapter discusses frameworks and uses of these metrics as well as principles and recommendations for the consideration and application of current (and potentially new) metrics in research evaluation.

Paul Wouters, Zohreh Zahedi, Rodrigo Costas

### 27. Reviewing, Indicating, and Counting Books for Modern Research Evaluation Systems

In this chapter, we focus on the specialists who have helped to improve the conditions for book assessments in research evaluation exercises, with empirically based data and insights supporting their greater integration. Our review highlights the research carried out by four types of expert communities—the monitors, the subject classifiers, the indexers, and the indicator constructionists. Many challenges lie ahead for scholars affiliated with these communities, particularly the latter three. By acknowledging their unique yet interrelated roles, we show where the greatest potential is for both quantitative and qualitative indicator advancements in book-inclusive evaluation systems.

Alesia Zuccala, Nicolas Robinson-García

Twitter has unarguably been the most popular among the data sources that form the basis of so-called altmetrics. Tweets to scholarly documents have been heralded as both early indicators of citations and measures of societal impact. This chapter provides an overview of Twitter activity as the basis for scholarly metrics from a critical point of view and equally describes the potential and limitations of scholarly Twitter metrics. By reviewing the literature on Twitter in scholarly communication and analyzing 24 million tweets linking to scholarly documents, it aims to provide a basic understanding of what tweets can and cannot measure in the context of research evaluation. Going beyond the limited explanatory power of low correlations between tweets and citations, this chapter considers what types of scholarly documents are popular on Twitter, and how, when and by whom they are diffused in order to understand what tweets to scholarly documents measure. Although the chapter is not able to solve the problems associated with the creation of meaningful metrics from social media, it highlights particular issues and aims to provide the basis for advanced scholarly Twitter metrics.

Stefanie Haustein

### 30. Data Collection from the Web for Informetric Purposes

This chapter reviews the development of data collection procedures on the web with an emphasis on current practices, data cleansing and matching, data quality and transparency. There are several issues to be considered when collecting data from the web. Transparency is essential to know what is included in the data source, how recent and comprehensive the data are, what timeframe is covered, etc. Data quality relates to reliability and accuracy. Mistakes are inevitable: data providers, aggregators, and researchers all make them, but they should be reduced to a minimum so that meaningful conclusions may be reached from the data analysis. Extensive data cleansing before starting the analysis is needed to try to correct mistakes in the data. When several data sources are used, data from different sources should be matched, and duplicates should be removed.

Judit Bar-Ilan

Kayvan Kousha

### 32. Usage Bibliometrics as a Tool to Measure Research Activity

Edwin A. Henneken, Michael J. Kurtz

### 33. Online Indicators for Non-Standard Academic Outputs

This chapter reviews webometric, altmetric, and other online indicators for the impact of nonstandard academic outputs, such as software, data, presentations, images, videos, blogs, and grey literature. Although the main outputs of academics are journal articles in science and the social sciences, and monographs, chapters, or edited books to some extent in the arts and humanities, many scholars also produce other primary research outputs. For nonstandard outputs, it is important to provide evidence to justify a claim for a type of impact, and online indicators may help with this. Using the web, academics may obtain data to present as evidence for a specific impact claim. The research reviewed in this chapter describes the types of evidence that can be gathered, the nature of the claims that can be made, and methods to collect and process the raw data. The chapter concludes by discussing the limitations of online data and summarizing recommendations for interpreting impact evidence.

Mike Thelwall

### 34. Information Technology-Based Patent Retrieval Models

This chapter presents information technology (IT) based patent retrieval models. It first compares and contrasts information retrieval (IR) with patent retrieval, and highlights their key differences. For instance, IR can be considered as a precision-oriented retrieval, whereas patent retrieval can be considered as a recall-oriented retrieval. The chapter then describes the Boolean retrieval model, which was designed for IR but can be used for patent retrieval. To facilitate effective patent retrieval, a basic patent retrieval model is presented. With this model, representative keyword terms are extracted from the user query and are ranked according to their importance so that the top-$$k$$ relevant patents can be retrieved with irrelevant patents eliminated. Moreover, the chapter also presents some enhancements and extensions to the basic patent retrieval model, which include incorporation of relevance feedback, estimation of the importance of keyword terms, text preprocessing of patent documents, and handling of patent category frequency. In addition, two dynamic patent retrieval models are also described. These two models perform interactive patent retrieval via dispersion or accumulation to dynamically rank the patents. Experimental results with real-life datasets show that the models presented in this chapter outperformed many conventional search systems with respect to time and cost. While this chapter focuses on the theoretical aspects of IT-based patent retrieval models, which are of interest to IT specialists, practical illustrative examples in the chapter demonstrate the empirical aspects of patent retrieval models, which are helpful to IT practitioners.

Carson Leung, Wookey Lee, Justin Jongsu Song

### 35. The Role of the Patent Attorney in the Filing Process

The role of the legal representative in patent filing processes is, so far, under-explored in patent statistics. This chapter addresses the question of the role and the impact of the patent attorney in the filing process. One of the core assumptions is that more experienced attorneys have more in-depth knowledge of the intricacies of the patent system and, thus, are more likely to pursue more elaborate and successful filing strategies. The results show a high concentration of attorneys and filing action in absolute as well as in relative terms in some countries, namely Germany and the UK, and numbers worth mentioning also in other larger applicant countries like France, Italy, Sweden, or the Netherlands. Explanations for this biased distribution in Europe are language advantages in the case of the UK (and also Ireland) and geographical proximity to the European Patent Office (EPO), as well as economies of scale in the case of Germany. The experience of the representative has a considerable impact on the outcome. Multivariate analyses suggest that the (financial) resource endowment is a decisive factor in the hiring of patent attorneys. It was shown that the patents of more experienced representatives were significantly more often withdrawn (but neither refused nor granted with a higher probability), and they were less often opposed than the ones by less experienced attorneys.

Rainer Frietsch, Peter Neuhäusler

### 36. Exploiting Images for Patent Search

Patent offices worldwide receive considerable numbers of patent documents that aim at describing and protecting innovative artifacts, processes, algorithms, and other inventions. Apart from the main text description, these documents may contain figures, drawings, and diagrams in an effort to better explain the patented object. Two main directions are presented in this chapter: concept-based and content-based patent retrieval. Concept-based search utilizes textual and visual information, fusing them in a classification late fusion stage. Conversely, content-based retrieval is based on the shape/content information from patent images and is therefore based on the visual descriptors that are extracted from binary images. Concepts are extracted using classification techniques, such as support vector machines and random forests. Adaptive hierarchical density histograms serve as binary image retrieval techniques that combine high efficiency and effectiveness, while being compact and therefore capable of dealing with large binary image databases. Given the vast number of images included in patent documents, it is highly significant for the patent experts to be able to examine them in their attempt to understand the patent contents and identify relevant inventions. Therefore, patent experts would benefit greatly from a tool that supports efficient patent image retrieval and extends standard figure browsing and metadata-based retrieval by providing content-based search according to the query-by-example paradigm.

Ilias Gialampoukidis, Anastasia Moumtzidou, Stefanos Vrochidis, Ioannis Kompatsiaris

### 37. Methodological Challenges for Creating Accurate Patent Indicators

The chapter deals with new methodological issues of retrieval for patent indicators linked to the change of the patent system in the last 20 years and the new ways to access patent data. In particular, it describes international flows of patent applications between the US, Europe, and Southeast Asia, and illustrates methods for an appropriate cross-country comparison. A central topic of this chapter is the implications of the frequently used Patent Cooperation Treaty (PCT) route of patent applications on the conception of search strategies and the interpretation of search results. Furthermore, the possibilities of search with the new international Cooperative Patent Classification (CPC) are explained. In addition, the patenting activities of very large companies and patent value are discussed.

Ulrich Schmoch, Mosahid Khan

### 38. Using Text Mining Algorithms for Patent Documents and Publications

In this chapter we present an overview of text mining approaches that can be used to conduct science and technology studies that rely on assessing the (content) similarity between patent documents and/or scientific publications. We highlight the rationale behind vector space models, latent semantic analysis, and probabilistic topic models. In addition, several validation studies pertaining to patent documents and publications are presented. These studies reveal that choices in terms of algorithms, pre-processing, and calculation options have non-trivial consequences in terms of outcomes and their validity. As such, scholars should pay attention to the technicalities implied when engaging in text mining efforts in order for outcomes to become relevant and informative.

Bart Van Looy, Tom Magerman

### 39. Application of Text-Analytics in Quantitative Study of Science and Technology

The quantitative study of science, technology, and innovation (ST&I) has experienced significant growth with advancements in disciplines such as mathematics, computer science and information sciences. Whereas early studies relied on statistical methods, graph theory, and citation or co-authorship analysis, the state of the art in quantitative methods now leverages natural language processing and machine learning. However, there is no unified methodological approach within the research community or a comprehensive understanding of how to exploit text-mining potentials to address ST&I research objectives. Therefore, this chapter intends to present the state of the art of text mining within the framework of ST&I. The major contribution of the chapter is twofold: first, it provides a review of the literature on how text mining extended the quantitative methods applied in ST&I and highlights major methodological challenges. Second, it discusses two hands-on detailed case studies on how to implement the text analytics routine.

Samira Ranaei, Arho Suominen, Alan Porter, Tuomo Kässi

### 40. Functional Patent Classification

Patent classifications are systematically used in patent analysis for a number of purposes. Existing classifications not only shape the administrative activities of recording and reporting and the search for prior art, but also create the backbone of the construction of science and technology indicators used in economic analysis, policy making, and business and competitive intelligence. Yet the current classification system of patents, despite significant and continuous efforts to update, suffers from a number of limitations. In particular, it fails to capture the full potential of inventions to cut across industrial boundaries, does not allow fine-grained technology intelligence, and misses almost entirely the opportunities for lateral vision. We suggest integrating existing schemes with a full-scale functional classification, i.e., based on the main functions performed by a technology, rather than on the inventive solutions or their potential applications. The functional approach allows us to overcome most of the limits of traditional classification, due to the generality and abstraction of the representation of functions. In this chapter, we will first review the conceptual background of the functional approach in epistemology and analytical philosophy and illustrate its recent developments in engineering design, design theory, artificial intelligence, computational linguistics, and data mining. We then discuss three short case studies of the application of the methodology for the definition of patent sets (in particular within a technology foresight exercise), prior art analysis, and technology crossover identification and mapping.

Andrea Bonaccorsi, Gualtiero Fantoni, Riccardo Apreda, Donata Gabelloni

### 41. Computer-Implemented Inventions in Europe

The dispute between proponents and opponents of the patent system has been especially visible with regard to the patenting of computer programs. Different developments have resulted in the fact that there are large differences in the patent practices between the European Patent Office (EPO) and the U.S. Patent and Trademark Office (USPTO). While software as such is patentable at the USPTO, the EPO prohibits patenting of pure computer programs and only allows patenting of computer-implemented inventions (CII). In this chapter, we investigate the differences between the European and American patent systems with regard to patenting computer programs by also addressing the historical developments that have resulted in the national differences. Based on these considerations, a definition of CII is derived, which enables us to carry out empirical analyses. By applying a conservative estimate, our results show that the share of CII filings at the EPO lies at around 25% at present, while at the USPTO a current margin of approximately 33% is reached. Thus, at least every fourth patent at the EPO and every third patent at the USPTO is a CII filing. In order to take account of the factual (technological and economical) relevance of computer-implemented inventions, we argue for clear rules with regard to patenting CII, as they are essential to reduce uncertainties and provide the relevant incentives for innovation.

Peter Neuhäusler, Rainer Frietsch

### 42. Interplay of Patents and Trademarks as Tools in Economic Competition

Integrated manufacturing-service systems have been receiving attention recently. The phenomenon of services-to-artifacts companies, namely those specializing in intermediate goods and complex equipment, is increasingly instrumental for long-run competitiveness in fast-changing, high-quality global markets. The debate has so far remained largely qualitative, and the effective role and relevance of services is rather fuzzy. Against this background, this chapter brings in empirical evidence concerning the evolving business models of a variety of leading innovative manufacturing companies. For this purpose, over 50 manufacturing companies listed in the European Union (EU) research and development (R&D) investment scoreboard are analyzed in terms of patents and trademarks. In particular, trademark strategies are studied in greater depth, and they are sub-divided into goods and services marks and into high and low sophistication. Service marks are used as a supplement to patents, as the service component of industrial offerings is not covered by classic indicators of technical change. The economic data from the EU (EU Scoreboard R&D, sales, growth, employees, profits, or investment) are linked to the patent and trademark data in order to see which balance of goods and service capabilities leads to favorable economic results.

Sandro Mendonça, Ulrich Schmoch, Peter Neuhäusler

### 43. Post Catch-up Trajectories: Publishing and Patenting Activities of China and Korea

This chapter seeks to explore the sequential cyclical growth of science, technology, and science-based technology for two economies, China and South Korea, in the course of transitioning to the post-catching-up phase. Both China and South Korea intend to capitalize on scientific and technological knowledge in order to transition to the post-catching-up phase of development. This chapter highlights the production trajectories of science and technology towards the post-catching-up phase in terms of: (1) scientific publications, (2) granted patents, (3) copatenting pattern, (4) forward citations, and (5) science-based patents. China and South Korea have been active in terms of scientific publication and patenting activities. In regard to patenting, both economies have shown the capability to produce patents and are able to converge the growth of patents with that of publications. This chapter highlights a generic cyclical growth path for science, technology, and science-based technology in the course of transitioning to an advanced knowledge-based economy. It is nonetheless important to explore if there are different paths pursued by other emerging economies.

Chan-Yuan Wong, Hon-Ngen Fung

### 44. Standardization and Standards as Science and Innovation Indicators

The focus of innovation policies has shifted from knowledge creation and protection (e.g., by patents) to knowledge diffusion (e.g., via open access) in order to promote their implementation. This has led to an increasing need for innovation indicators that reflect the implementation of knowledge within innovative products and services. Standardization as a kind of open innovation process, and standards as its output, represents a new type of innovation indicator. In this chapter, we begin with a discussion of existing opportunities for using standards and standardization as innovation indicators, including three specific examples of input, throughput, and output indicators. Next we identify challenges that must be addressed to close the data gaps, which are still very significant when compared with patent data. In addition, the broader concept of quality infrastructure is introduced in order to point out the complexity of standards implementation, and its close link to innovation as well. The chapter concludes with examples of how decision makers in industry and policy could make use of a comprehensive database of standardization and standards to evaluate innovation policy initiatives.

Knut Blind

### Backmatter
