1 Introduction
Whether industrial invention depends on science remains an important topic for research and public policy, as does the more specific question of how science, or the interactions between science and technology, may affect the rate and direction of emerging technologies. Nightingale (1998) distinguishes between research as science, which focuses upon the creation and validation of generalizable knowledge, and practice as technology, which focuses upon solving specific practical problems. Earlier studies addressing how to categorize the technological impact of new knowledge make a key distinction between the degree of novelty involved and the degree of impact (Trajtenberg
1990; Kaplan and Vakili
2015; Wang et al.
2017). Science is represented by the degree of novelty while technology presents the economic impact and usefulness of an idea. With a focus on recombinations of technological components and their interdependencies, the literature analyzing patents and patent citations has studied how different search strategies, as contrasted with the landscape of all possible searches, can affect the type of industrial invention which later occurs (Fleming and Sorensen
2001). Fleming and Sorensen (
2004) propose that whereas technology search occurs through incremental steps with independent components, science provides advantages for distant search (for more breakthrough inventions), especially where there are highly coupled components. In contrast, Kaplan and Vakili (
2015) argue that breakthrough inventions may require both narrow recombinations of application areas as well as more distant search. McKelvey and Saemundsson (
2021) propose that generating both new scientific and technical knowledge can be conceptualized as an evolutionary problem-solving process, where there are often ambiguities—or gray zones—between research and actual use in practice. Hence, in addition to the literature considering search in relation to the rate and direction of technological change, we also need to consider the time required for new ideas to be implemented in society. Extensive studies of science and technology demonstrate a time-lag between the appearance of a novel idea in science and the socioeconomic impact of that new knowledge through technology or wider use in society (Salter and Martin
2001). We therefore focus upon the combination of time-lag through delayed recognition with impact. More generally, the concept of sleeping beauties in science is attributed to Van Raan (
2004), to highlight both delayed recognition (sleepers) and high impact (beauties), which we here apply to patents.
We hence apply and develop this concept with regard to technology, in order to better understand potential breakthrough inventions in an emerging technology. Our aim is to study whether innovation depends on long-term patterns of interactions in technology and science, using patents in nanotechnology. Specifically, we are interested in what kinds of links to science matter and what influences the delayed recognition and high impact of a technological invention, for this emerging technology.
Knowledge needs to be continually recombined, in order to be useful for new scientific outcomes and for applying technology to solve problems. Since there is no direct literature on the factors affecting the probability of a patent becoming a sleeping beauty in nanotechnology, we combined various literature streams in our deductive approach to build our hypotheses below. To do so, we adapted some concepts from scientometrics used for studying scientific fields and communities in order to study technology (as well as linkages between science and technology). Parallelisms are common between scientometrics and patentometrics (Narin 1994; Meyer 2000), and we followed this tradition.
Being able to recombine different types of knowledge in such a way as to stimulate emerging technologies matters for public policy. Although we do not directly address the topic of the antecedents and consequences of technological specialization within regions, we are aware of the vast number of studies within regional science which consider technological development (Henning and McKelvey
2020; Neffke et al.
2011; Bathelt et al.
2004; Glanzel and Garfield
2004; Hicks et al.
2001; Beise and Stahl
1999). The concept of related variety captures the details of how local knowledge bases in relation to technological and industrial specialization affect the long-term development of regions (Neffke et al.
2011; Juhász et al.
2021). Asheim and Coenen (
2005) conceptualized different regional knowledge bases as being either predominately based on analytical (scientific) or synthetic (industrial) knowledge. Bathelt and Glückler (
2003) and Bathelt et al. (
2004) represent the relational turn in economic geography, in order to understand how knowledge, geography, and networks interrelate in explaining knowledge creation and diffusion. For understanding technology-specific attributes of R&D collaboration networks, Neuländtner and Scherngell (
2020) focus on the geographical and relational effects of networks in Key Enabling Technologies (KETs), and find that for all technologies, varying degrees of network effects compensate for some geographical boundaries; nanotechnology specifically has more localized but some inter-regional links. We hope our study helps to inform a limited part of this debate, through the policy implications for science-technology linkages and by following the previous literature which considers nanotechnology as an important emerging technology.
To study long-term patterns, we developed the metaphor of sleeping beauties for patents. We do so by combining three strands of literature, namely patent citations as indicators of innovation; the concept of sleeping beauties in scientometrics; and specific studies of nanotechnology as an emerging technology. The metaphor has previously been applied to understanding science through scientific publications (van Raan
2004; Dey et al.
2017), delayed recognition of application-oriented sleeping beauties (van Raan
2015), and references to those scientific papers in patents (van Raan
2015,
2017). In studies of science, “sleeping beauties” refer to scientific papers that “sleep” (receive no citations) until they are “woken up” by a citing paper, and then receive many citations, indicating a large impact on subsequent science. Very few studies have applied this to patents. Hou and Yang (
2019) applied the concept to graphene patents in China, in order to identify different patterns of being “awoken.” We propose that the metaphor of sleeping beauties is useful in patents to disentangle the differing concepts of delayed recognition (sleepers) and high impact (beauties), which underlie the combination of the two as emerging technologies that are potential breakthrough inventions.
Taking patents as indicative of technology, we identify patents which have both delayed recognition and high impact. We do so in the empirical area of nanotechnology, because it has been identified as an emerging technology with extensive science-technology interactions (Meyer 2001; Meyer et al. 2010; Bourelos et al. 2017) and identified by the European Commission (2012) as a KET, or Key Enabling Technology. We do so based on data extracted from the European Patent Office (EPO) dataset PATSTAT for the years 1956–2018. We compare the population of nanotechnology patents with at least one citation with the population of all patents (i.e., the whole population) for these years, in relation to our concept of sleeping beauties in patents. Based on our literature review, we develop two hypotheses about two different types of linkages to science, to nuance the discussions. We expect that both direct and strong science-based linkages as well as indirect and more diverse science-based linkages will positively affect sleeping beauties in nanotechnology. Based upon our analysis, we reject both hypotheses, because we find that each of these types of linkages positively affects high impact but negatively affects delayed recognition. Contrary to expectations, we find that the science-based patents mainly have an earlier impact and possibly a more direct impact on industrial invention. Control variables of IPC application class and company ownership do matter, which suggests that companies are active in combining multiple channels and sources of knowledge into industrial inventions. One contribution is to propose that non-patent literature should not be considered a proxy for science linkages in general, but instead this reflects a search amongst various types of codified as well as informal technological and scientific knowledge. We propose that references to the non-patent literature can be conceptualized as a variety of informal and temporary ways of searching across a broader range of scientific fields and industrial application areas. Further research on these topics can elucidate a more nuanced understanding of how and why combining and recombining both scientific and technological knowledge may impact the combination of delayed recognition and high impact.
3 Data and methodology
We broke down the sleeping beauty metaphor into its two components, delayed recognition (being a sleeper) and high impact (being a beauty), in order to study patents with both characteristics. The research design is consistent with our econometric approach, with results presented in the next section, using three different models for the “sleepers,” the “beauties,” and the “sleeping beauties.”
3.1 Identifying sleeping beauties in nanotechnology
Our first step is to solve the methodological challenge of delineating an emerging field of technology, addressed in the literature by using keywords in bibliometric and patentometric studies (Chen et al.
2008; Huang et al.
2011; Li et al.
2007b; Hullmann and Meyer
2003). A challenge in scientometrics is to delimit the field of nanotechnology in order to meaningfully categorize the researchers as well as their academic and commercial work within nanoscience. There are methodological difficulties since many academics are involved in other disciplines or fields and only occasionally publish and/or patent in nanoscience. We followed the line of research using key words as indicative of community in our study, considering that words and their co-occurrence represent a Kuhnian approach to a deep, narrow community which is reasonably coherent, yet evolving (Kaplan and Vakili
2015).
This paper follows the keyword-based methodology proposed by Porter et al. (2008) for categorizing patents belonging to nanotechnology. We extracted patent data from the 2017 Spring version of PATSTAT. PATSTAT is the official patent database of the European Patent Office (EPO), and provides structured bibliographic and legal data from the EPO’s databases, covering more than 100 million patent records worldwide, from the mid-nineteenth century to the present day.
In this study we use the term “patent” to refer to a patent family with at least one citation. An invention can be applied for in several patent offices, forming what is called a patent family. Thus, we integrated citations to different patents in the same cited nanotechnological family from different patents in the same citing family from any field (nanotechnology or not) as a single citation from the citing family to the cited family. Since most patents never receive any citations, we limited our population to patent families with at least one citation. This leaves 65,759 nanotechnology patent families, with priority years ranging from 1956 to 2012.
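The family-level aggregation described above can be illustrated with a minimal Python sketch. It assumes a list of patent-level citation pairs and a mapping from each patent to its family; the function name, the identifiers, and the toy data are hypothetical and are not taken from PATSTAT.

```python
from collections import Counter


def family_forward_citations(patent_citations, family_of):
    """Count forward citations per cited family.

    Each (citing family, cited family) pair is counted once, however many
    individual patents in the two families cite each other.
    """
    family_pairs = set()
    for citing_patent, cited_patent in patent_citations:
        family_pairs.add((family_of[citing_patent], family_of[cited_patent]))
    return Counter(cited_family for _, cited_family in family_pairs)


# Hypothetical toy data: three families, five individual applications.
family_of = {"EP-A": "F1", "US-A": "F1", "EP-B": "F2", "US-B": "F2", "JP-C": "F3"}
patent_citations = [("EP-B", "EP-A"), ("US-B", "US-A"), ("JP-C", "EP-A")]

counts = family_forward_citations(patent_citations, family_of)
print(counts["F1"])  # citations received by family F1 from distinct families
```

In this toy case the two citations from family F2 to family F1 collapse into a single family-level citation, so F1 is cited by two distinct families.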
The second step is to identify nanotechnology patents which are also sleeping beauties. Several methods have been proposed in scientific publications, and here adapted to patents. “Beauty,” in this context, is seen as an impact on later science, and measured by the number of citations, which depends on a paper’s field (Van Raan
2015; Dey et al.
2017). “Sleep” can be more or less deep, depending on low frequency of annual citations and how many citations are needed before they start to make a larger impact. For example, a paper might receive very few citations (none or one) per year for many years, and then suddenly start receiving many citations every year. Of course, the more restrictive these definitions, the fewer sleeping beauties we find (Van Raan
2004). In patents, we acknowledge that backward citations in patents follow different patterns than citations in scientific publications. References to prior art limit the scope of a patent’s protection, and thus the rationales for citing are opposite. The citation lifetime of patents, the peak of citations, and the density distribution of citations over years differ from that found in papers (Narin
1994). Moreover, they vary across technical fields. That is why we used forward citations to measure impact.
We identified impact, or “beauty,” with the number of forward citations a patent receives, as a proxy for the technological value of a patent. The citation distribution curve depends on patents’ technical field (Mariani
2004): the average number of citations, as well as their time period, vary from one field to the other. In this study we limited the patent selection to a single technology, nanotechnology, so one single field. Forward citations to nanotechnology patents, though, can come from patents in any field. Citations in patents have been used to recognize breakthrough inventions as those with a particularly high impact, usually in the top quantiles of the distribution of forward citations, even though there are more advanced methods (see Castaldi et al.
2015). For this study, we defined beauties as those patents in the top 10% of the citation curve (Tur
2016). For our robustness checks, we calculated the patents in the top 1% and in the top 5% where we identified a small number of beauties. The small number of beauties in nanotechnology produced even smaller numbers of sleeping beauties at these thresholds (e.g., only three at 5% of the citation curve, with 5 years sleeping time), which made the numbers negligible for further analysis. Therefore, we chose 10% as a threshold for beauties. Furthermore, we did robustness checks with the threshold at 15%. Given this population and our analysis, “beauties” are defined as patents with 22 citations or more.
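One way a citation cutoff of this kind could be derived from a distribution is sketched below in Python. The nearest-rank percentile convention is an assumption made for illustration, since the text does not state which quantile convention was applied; the function name and the toy data are hypothetical.

```python
def top_share_threshold(values, top_percent=10):
    """Return a cutoff for the top `top_percent` of `values`.

    Assumption: nearest-rank percentile. Every observation at or above the
    returned cutoff is flagged, so with ties the flagged share can exceed
    `top_percent`.
    """
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    n = len(ordered)
    # Integer ceiling of (100 - top_percent) / 100 * n, avoiding float rounding.
    rank = ((100 - top_percent) * n + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical citation counts for 100 patent families.
citations = [1] * 80 + [22] * 15 + [40] * 5
threshold = top_share_threshold(citations, top_percent=10)
flagged_share = sum(c >= threshold for c in citations) / len(citations)
print(threshold, flagged_share)
```

Because citation counts are integers with many ties, a cutoff defined this way can flag somewhat more than 10% of the population.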
We also identified delayed recognition, or “sleep” of a patent. Patents have many different associated dates, such as the application date, the publication date, or grant date (if the patent has been granted). A patent’s priority date is the date of the first filing of an application in a patent family. We chose “priority date” because it is closest to the invention date, and less subject to bureaucratic delays such as examination, or formal delays (though there can be strategic in-company delays, as described in Kang and Bekkers
2015). We define a patent’s sleep as the period between the priority date of the cited patent and of the first citing patent. Mirroring our definition of beauty, we define sleepers as those in the top 10% of the sleep length distribution (Tur
2016). Given the population of 65,759 nanotechnology patent families, sleepers are those that do not receive any citations for 4 years or more. (We also ran robustness checks with 3 and 5 years of sleep length in the regressions.)
With these thresholds for impact and delayed recognition of an emerging technology, a sleeping beauty is defined as a patent with at least 22 citations over its lifetime (a beauty) that did not receive citations for at least 4 years after its priority date (a sleeper). The number of sleeping beauties in the database is 162, or 0.25% of the population.
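The definitions above can be expressed as a short Python sketch. It assumes priority dates are available at year granularity and that every patent family has at least one citing family, as in the population described here; the function and constant names are hypothetical.

```python
BEAUTY_MIN_CITATIONS = 22  # cutoff for high impact reported in the text
SLEEPER_MIN_YEARS = 4      # cutoff for delayed recognition reported in the text


def classify(cited_priority_year, citing_priority_years):
    """Flag a patent family as beauty, sleeper, and sleeping beauty.

    Sleep is measured from the priority year of the cited family to the
    priority year of its first citing family (year granularity is an
    assumption of this sketch).
    """
    if not citing_priority_years:
        raise ValueError("the population only contains cited families")
    sleep = min(citing_priority_years) - cited_priority_year
    beauty = len(citing_priority_years) >= BEAUTY_MIN_CITATIONS
    sleeper = sleep >= SLEEPER_MIN_YEARS
    return {"beauty": beauty, "sleeper": sleeper, "sleeping_beauty": beauty and sleeper}


# Hypothetical example: first cited five years after priority, 22 citations in total.
late_and_cited = classify(2000, [2005] * 22)
# Hypothetical example: cited after one year, so not a sleeper despite many citations.
early_and_cited = classify(2000, [2001] + [2006] * 30)
print(late_and_cited, early_and_cited)
```

Keeping the two flags separate mirrors the use of the intermediate categories of sleepers and beauties alongside the combined sleeping beauty category.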
3.2 Variables
Once we had determined our empirical strategy for identifying sleeping beauties, we then created a dummy variable, SB, which is 1 if a patent is a sleeping beauty and 0 otherwise. In addition, we created two intermediate dummy variables, beauty and sleeper, which identify patents that are highly cited and have experienced delayed recognition.
The main independent variable for Hypothesis 1 should reflect direct and strong science-base linkages to the technology, and our proxy is university ownership of the patents. Therefore, we created the dummy variable university, which is 1 if at least one of the applicants works at a university and 0 otherwise. For Hypothesis 2, the main independent variable should reflect indirect and more diverse science-base linkages, and our proxy is the number of backward citations to the non-patent literature, a new interpretation as discussed in Sect. 2.
We also included variables in our empirical model as controls, selecting the most common ones found to affect patent value, measured in citations, in order to avoid biases due to omitted variables. The size of the patent family is the number of applications related to the same invention, and indicates an invention’s market scope (Lanjouw et al.
1998; Harhoff et al.
2003). As each additional application adds to the cost of protection, applicants will only incur these costs for the inventions they deem valuable. The number of claims indicates a possibly broader invention, and has been used to measure patent value (Moore
2005; Gambardella et al.
2008). At several patent offices, fees increase with the number of claims, so only patents that the applicant considers valuable will have a high number of claims. Moreover, the number of IPC subclasses indicates a patent’s technological breadth, which has been related to patent value (Merges and Nelson
1990; van Zeebroeck et al.
2009; Petruzzelli et al.
2015): the more subclasses, the more potential applications to many areas. The number of inventors is related to the size of the research project, so a bigger team of inventors relates to more complicated developments (Mariani
2004; Bass and Kurgan
2010). The number of backward citations to patents shows the embeddedness in the current technological trajectory, and therefore also indicates patent value (Criscuolo and Verspagen
2008; Verhoeven et al.
2016).
Table 1 summarizes the explanatory variables and controls, together with their basic descriptive statistics.
Table 1
Definition of variables and descriptive statistics
Group | Variable | Definition | Type | Mean | SD | Min | Max |
Dependent | Sleeping beauty | 1 if both a beauty and a sleeper | Dummy | 0.003 | 0.057 | 0 | 1 |
Dependent | Beauty | 1 if the number of citations is at least 22 | Dummy | 0.13 | 0.34 | 0 | 1 |
Dependent | Sleeper | 1 if the length of sleep is at least 4 | Dummy | 0.14 | 0.35 | 0 | 1 |
Independent | University-owned | 1 if any patent in the family is university-owned | Dummy | 0.23 | 0.42 | 0 | 1 |
Independent | Non-patent literature | Max no. of references to non-patent literature in the patent family | Integer | 3.88 | 11.66 | 0 | 182 |
Control | Year | Earliest priority year in the family | Year | 2005.4 | 7.25 | 1956 | 2012 |
Control | Granted | 1 if any patent in family is granted | Dummy | 0.70 | 0.46 | 0 | 1 |
Control | Inventors | Max no. of inventors in family | Integer | 3.17 | 2.01 | 0 | 38 |
Control | Claims | Max no. of claims in the patent family | Integer | 3.23 | 9.02 | 0 | 248 |
Control | Backward citations | Max no. of backward citations to patents in the patent family | Integer | 10.64 | 17.30 | 0 | 205 |
Control | Company | 1 if any patent in the family is owned by a company | Dummy | 0.68 | 0.47 | 0 | 1 |
Control | Family size | Size of the DOCDB family | Integer | 2.957 | 3.815 | 1 | 379 |
Control | Number IPC | Total no. of different IPC subclasses in family | Integer | 2.25 | 1.42 | 1 | 17 |
Control | IPC Sub-classes XnnX | 1 if any patent in the family belongs to subclass XnnX | Dummy | – | – | – | – |
5 Concluding remarks and discussion
Our study examined whether innovation depends on long-term patterns of interactions in technology and science, using patents in nanotechnology. The previous literature has distinguished between the degree of novelty (science-base) and the degree of technological impact (innovation) of a technology, and stressed that links to the science-base will lead to more breakthrough inventions, due to distant recombinations (Trajtenberg
1990; Kaplan and Vakili
2015; Wang et al.
2017; Fleming and Sorensen
2001,
2004). Our paper then addresses the context of breakthrough industrial inventions in an emerging technology, focusing on the case of nanotechnology. To discover long-term patterns, we developed an empirical strategy to study nanotechnology patents through the metaphor of sleeping beauties, which highlights the combination of delayed recognition and high impact of an emerging technology. Nanotechnology was chosen as an emerging technology, known to be a science-based technology with extensive linkages (Meyer
2000,
2006; Dang et al.
2010; Shapira and Wang
2009; Thursby and Thursby
2011; Guan and Ma
2007; Alencar et al.
2007). Our initial empirical comparison suggests that sleeping beauties occur more frequently in nanotechnology than in the general population of patents.
One of our contributions is that the linkages between science and technology can be studied at a more fine-grained level, namely distinguishing two types of science linkages, “direct and strong science-base” and “indirect and more diverse science-base” as impacting recombinations of technology. Methodologically, we proxy these two types of science linkages as, respectively, university ownership of patents and backward citations to the non-patent literature.
In general, we found that science matters for explaining high impact patents within nanotechnology, in the sense that both types of science linkages lead to more forward citations. This part of our results was expected given a fairly robust finding in the literature across technology fields and IPC classes that science affects technological value. This has been proxied in different ways—including the two we use—and with robust results for university-owned patents (Henderson et al.
1998; Sampat et al.
2003; Bacchiocchi and Montobbio
2009; Crespi et al.
2010; Ljungberg and McKelvey
2012; Czarnitzki et al.
2012) and backward citations to non-patent literature (Carpenter et al.
1981; Reitzig
2003; Nagaoka
2007). Hence, in this science-based technology, having links to science does matter for explaining the impact of that technology. Our interpretation is that these close or dense relationships to science likely signal a breakthrough invention, which later impacts a technological trajectory. This may also be because the same individual academics are highly influential in articles and patents (Bourelos et al.
2017).
Interestingly, given our contribution on the different types of search, both our hypotheses were rejected, which suggests that our main results on delayed recognition and high impact technologies were not as expected, relative to debates in the current literature. We find that both proxies, university-owned patents as well as the non-patent literature references, have a significant but negative effect on being a “sleeper” and are not significant for explaining “sleeping beauties.” For the first hypothesis, for patents with direct and strong science-base linkages to the technology, we expected, based on reasoning by Fleming and Sorensen (
2001,
2004) that this is due to expected recombinations of highly coupled, interdependent components; these inventions therefore may take longer, because they are further away from the current localized search. For the second hypothesis, for patents with indirect and diverse science-base linkages, we expected, based on Kaplan and Vakili (
2015), that recombining technologies that depend on highly coupled, interdependent components may require an indirect link to science, in order to try out a wider number of possible combinations with a local search around application areas. Thus, these inventions may take longer, because more combinations need to be tested.
We go beyond the existing literature, in order to stress the importance of having indirect and diverse science-base linkages to the technology for signaling technological impact, which in turn may lead to potential future breakthrough inventions in nanotechnology. Our proxy is backward citations to the non-patent literature, as we have argued these are informal and temporary ways of searching across a broader range of scientific fields and industrial application areas. Our reasoning is that making recombinations of technology that depend on highly coupled, interdependent components may require an indirect link to science, in order to try out a wider range of possible combinations with local search around application areas, and hence these inventions may take longer, because more combinations need to be tested. We propose that the non-patent literature should not be considered science linkages in general, which can be considered an extension of arguments also found in (Callaert et al.
2006), and more specifically, we propose that they reflect a search among various types of codified and informal technological and scientific knowledge.
Contrary to expectations, we find that science-based patents in nanotechnology mainly have an earlier impact, and possibly also a more direct impact on industrial invention. Within such science-based technologies, we therefore propose that the long-term patterns of delayed recognition and high impact may instead be explained by a more nuanced understanding, to be developed in future research, of the role of firms in combining multiple knowledge sources, seeing these firms as the knowledge-intensive firms required for innovation. Firms have to combine different types of scientific and technological knowledge; industrial invention and innovation within firms therefore require the firm to manage diverse cognitive communities, as argued by Nightingale (1998), and to manage an evolutionary problem-solving process with multiple ambiguities and gray zones, as argued by McKelvey and Saemundsson (2021). This interpretation is in line with our other findings, in that our control variables of IPC application class and company-owned patents show some degree of significance. We conceptualize that these represent a firm context, and specifically the need to combine many technologies within firms’ industrial inventions. Unlike the representation of narrow application areas such as by Kaplan and Vakili (
2015), the application areas for nanotechnology are proposed as quite dense and complex, suggesting the need to recombine multiple application areas to achieve delayed breakthrough inventions, in a corporate setting. Hence, within the science-based technology of nanotechnology, delayed recognition and high impact specifically require multiple technologies, specializations, and industrial applications. Regarding implications for regions interested in developing nanotechnology (and other emerging technologies), our interpretation suggests the need to have a supportive regional knowledge base, predominately based on analytical (scientific) or synthetic (industrial) knowledge, identified by Asheim and Coenen (
2005), e.g., a density of connected universities and knowledge-intensive firms. Thus, we suggest that it is important to further understand the knowledge base and impacts of large companies for stimulating potential breakthrough inventions in emerging technologies.
We acknowledge that this study has many limitations due to data and approach, and as such, this article makes limited contributions, but it also opens up interesting areas for future research. A first group of limitations has to do with choices related to the dataset. We are limited to EPO data, and it is possible to examine backward citations from different patent offices, in order to follow previous literature (Alcacer and Gittelman
2006; Criscuolo and Verspagen
2008) and study the effect of whether assigning citations in patents comes from the examiners or the inventors. This study compares nanotechnology to all patents, but our research design only used EPO data for nanotechnology as indicative of other emerging technologies for our questions. Comparisons could be made with different types of technologies or else with other key enabling technologies identified by the EU or other policy makers, as done by Neuländtner and Scherngell (
2020), to examine the generalizability of our findings. Moreover, in our definition of beauties, we aggregated citations from a citing DOCDB family to a cited DOCDB family, which is a much better approach than taking citations from a citing individual patent to a cited individual patent. Nonetheless, it does not take into account that some patent offices add more references to their patents, so that patents filed there receive more forward citations. Thus, the bigger the family, the more likely it is to contain at least one patent from such a patent office, and thus receive more citations. We have corrected for this by controlling for the size of the family in our analysis, but a more advanced approach could define highly cited patents by patent office, before aggregating the patent family, in future research. A second group of issues calls for future research that could better explain our findings and proposed explanations. We acknowledge that the paper looks at the overall phenomenon of delayed-recognition, high-impact technology, which could be further understood through more fine-grained analysis. Hou and Yang (
2019) have identified different patterns of delayed recognition in patents. Future research could focus on the characteristics and position of the first citing patent. We can think of this as the one which “awakens” the sleeping beauty, a so-called prince in this metaphor. The first citing patent—or the first large cluster of citing patents—should be analyzed, because it signals the point in the technological development where our studied technology starts to greatly impact the overall technological trajectory. Our study does not explain why delayed recognition and high impact may occur together. Future research could delve into explicit explanations. Reasonable alternative explanations can be formulated as follows. A new advance at the frontier of technological progress may be ahead of its time and remain latent until complementary knowledge that builds on it has been developed. An alternative hypothesis would be that the social network of inventors is determinant for the diffusion of inventions and isolated actors with a weak social position lack the means to make their inventions noticed. This second explanation would reveal a shortcoming of the technology system, since important developments are ignored, delaying further technological development. In such a case, there may be scope for policy action to correct this flaw in the diffusion of technologies. A complementary study along these lines could investigate more details of how and why knowledge, geography, and networks are interrelated in explaining knowledge creation and diffusion in this case (in line with Henning and McKelvey
2020; Bathelt and Glückler
2003; Bathelt et al.
2004).