Introduction
As research on entrepreneurship grows, several streams have unfolded, leading to a theoretical expansion and comprehensive understanding of the entrepreneurship phenomenon within its ecosystem. Two significant factors contributing to the advancement of entrepreneurship in both practice and academia are technological advancement and digital transformation. This transformation has opened multiple possibilities to understand the complexities of the entrepreneurial environment and provide insights into the development of innovation, decision-making processes, the behavioural patterns of entrepreneurs, and forecasts (Mazzoni et al., 2021; Nambisan, 2017; Omorede, 2023; Xiao et al., 2023).
One such technological advancement is the integration of big data and big data analytics into research and practice (George et al., 2014; McAfee et al., 2012). Research has highlighted the significance of integrating big data analysis in the advancement of entrepreneurship and management research, not just because of its interdisciplinary nature but also because of the possibility of changing how research and practice interact (Khan, 2020; Obschonka & Audretsch, 2020). Therefore, research has called for the integration of artificial intelligence (AI) and big data into entrepreneurship research (Giuggioli & Pellegrini, 2022; Lévesque et al., 2022). These technologies have become increasingly relevant as they offer innovative tools to tackle challenges in understanding and predicting entrepreneurial dynamics.
The emergence of big data and artificial intelligence (AI) technologies is significantly changing entrepreneurship research and practice. Rather than providing only incremental improvements, these developments signify a change in the basic assumptions behind how we interpret entrepreneurial activities and how these activities are evaluated and encouraged. Integrating big data analytics and AI into entrepreneurship research offers unprecedented opportunities: deeper insights, improved predictive capabilities, and the ability to recognise new trends and obstacles in the entrepreneurial ecosystem.
Furthermore, big data analysis offers the potential to gain a deeper understanding of, and enhance predictive indicators for, the entrepreneurial process (how entrepreneurs search for opportunities, how they initiate their start-ups, their strategies for development and growth, and their strategies for terminating their ventures; Omorede, 2014) and its impact on entrepreneurial success through forecasting and anticipating changes in market trends, identifying emerging opportunities and challenges, and implementing new strategies (Wamba et al., 2015). These predictive analytics can provide insights into venture sustainability and performance (Calvard, 2016). Additionally, the scalability of big data allows researchers to investigate broader patterns across industries and regions, thus offering a global perspective on entrepreneurial trends. In short, big data analysis contributes in three ways. First, it makes it possible to understand the entrepreneurial process on a large scale. Second, it improves our capacity to forecast entrepreneurial results precisely. Third, it facilitates the recognition of new opportunities and difficulties that might not be visible using more conventional research techniques.
To completely understand the field’s revolutionary potential, it is essential to examine specific instances of the use of big data in entrepreneurship research. One noteworthy instance is the research of Schwab and Zhang (2019), who showed how machine learning techniques can be used to predict the success of startups. They achieved unprecedented accuracy in identifying the critical elements that influence venture performance by examining sizable datasets of start-up characteristics and outcomes. Similarly, Prüfer and Prüfer (2020) showed how big data can help us understand how technology disruption affects entrepreneurial ecosystems by using big data approaches to analyse the effect of artificial intelligence on entrepreneurship. Obschonka et al. (2020) provide yet another groundbreaking use of big data in entrepreneurship studies. They mapped geographical variances in entrepreneurial personality using social media data. Their study serves as an example of how big data can shed light on the psychological and cultural underpinnings of entrepreneurship at a local level. Wang et al. (2017) used big data analysis to examine entrepreneurial social networks, providing fresh insights into the interpersonal elements of entrepreneurship. These examples demonstrate state-of-the-art big data applications for entrepreneurship research and pave the way for novel contributions in this field.
Although integrating big data into entrepreneurship research can lead to endless possibilities, several barriers and challenges may arise in its implementation (Calvard, 2016). A major challenge in implementing big data is its associated ethical implications, which may violate individual data protection (Boyd & Crawford, 2012). Another challenge is how real-time analysis is conducted at different stages of the venture lifecycle (Kim et al., 2016). Other challenges associated with the integration of big data are how the data are gathered, analysed, and interpreted to yield adequate results (Kitchin, 2014). Bouwman et al. (2019) point out the methodological difficulties associated with using big data analytics in entrepreneurship research, such as the need for sophisticated analytical abilities and problems with integration and quality. Additionally, as mentioned by Obschonka et al. (2020), the use of social media data for research presents significant privacy and consent issues that require careful consideration in subsequent investigations.
Notwithstanding these challenges, big data and AI have much to offer in entrepreneurship research. These techniques offer previously difficult-to-capture insights into regional differences in entrepreneurial activities. Furthermore, by considering the quick changes that frequently occur in these settings, it is possible to create more dynamic models of entrepreneurial ecosystems with the ability to analyse large-scale datasets in real-time (Stam, 2017). However, these advancements should be considered complementary to traditional methodologies to ensure a holistic approach to entrepreneurship research.
This study aims to explore the use of big data analysis in entrepreneurship research by highlighting several factors that entrepreneurship researchers must consider when embarking on this type of research. Considering the rapidly evolving field of big data applications in entrepreneurship research, this study addresses the following research questions:
i. How can big data analytics and AI be effectively integrated into entrepreneurship research to enhance our understanding of entrepreneurial processes and outcomes?
ii. What are the key benefits and challenges of using big data in entrepreneurship research?
iii. How can emerging technologies in big data analytics be applied to address the current gaps in entrepreneurship research?
iv. What ethical considerations must be addressed when applying big data analytics in entrepreneurship research?
The research questions above link to several streams of research within entrepreneurship. First, research question one is supported by the theoretical framework of interdisciplinary approaches to entrepreneurship, including the dynamic capabilities theory and resource-based view. Second, research question two is grounded in theories of innovation and dynamic capabilities, as highlighted by Wamba et al. (2015) and Teece (2018). Third, research question three links disruptive innovation theory and network theory to explore opportunities for technological advancement (Christensen et al., 2013; Hayter, 2013; Obschonka & Audretsch, 2020). Finally, the fourth question is framed by data governance theories and privacy guidelines, including Boyd and Crawford (2012).
This study highlights the benefits and enablers that can facilitate the use of big data and further acknowledges several challenges and possible barriers when using this approach. Thus, this study contributes to the current discussions on conducting interdisciplinary research within entrepreneurship, as well as the discussion about the implementation of data analytics and artificial intelligence (AI) to research entrepreneurship. Ultimately, this study seeks to advance theoretical and practical understanding, providing a foundation for future studies in the field. The following sections present how big data is conceptualised and implemented in entrepreneurship research, followed by discussions about its benefits and challenges. The paper concludes with recommendations for implementation and considerations for future research.
Literature review and theoretical background
Overview of the state of the art
Several authors have conceptualised the meaning of big data. De Mauro et al. (2016) describe big data using four essential elements: Information (the fuel of big data), Technology (a prerequisite for using big data), Methods (techniques for processing big data), and Impact (big data touches our lives pervasively). They define big data as ‘the information asset characterised by such a High Volume, Velocity, and Variety to require specific Technology and Analytical Methods for its transformation into Value’ (p. 127). They emphasise that the definition of big data encompasses big data technology and methods. Furthermore, researchers such as Laney (2001) adopt the same main dimensions of Volume, Velocity, and Variety. However, other researchers have defined big data as a popular term used to describe the exponential growth, availability, and use of information, both structured and unstructured (George et al., 2014; Khan, 2020). This information includes, but is not limited to, video and audio data originating from various sources, such as social media, transaction records, and sensors.
Big data applications are becoming increasingly prevalent in various sectors owing to their potential to offer competitive advantages. These include healthcare, finance, marketing, supply chain management, and manufacturing (Benjelloun et al., 2015; Zhong et al., 2016; Sravanthi & Reddy, 2015). These applications use big data analytics to improve decision-making, optimise resources, and enhance strategies (Benjelloun et al., 2015). The adoption of big data technologies has led to the development of new business models, with organisations adapting their approaches to address the challenges and opportunities presented by large-scale data processing (Canan et al., 2013) in combination with other technologies such as Artificial Intelligence. As a result, both the service and manufacturing sectors have benefited from ongoing research and investment supported by the public and private sector (Benjelloun et al., 2015; Zhong et al., 2016).
Furthermore, big data has emerged as a significant factor in business management and entrepreneurship research, offering new insights and opportunities for business growth, innovation, and business ideas (Sood et al., 2021; Shan et al., 2022). The integration of Big Data analytics affects various aspects of business activities, such as decision-making, innovation, and customer experience optimisation (Lull et al., 2024). Moreover, insights derived from Big Data analytics can lead to improved business strategy decisions (Kaur & Cheema, 2017; Paredes-Moreno, 2015). Big Data can empower knowledge-intensive entrepreneurship by aligning business processes with customer needs and market trends. As research in this field has advanced, the potential to revolutionise business practices and drive economic development has evolved. Big Data is seen as an essential investment for organisations seeking to maintain high quality and productivity, with 92% of executives reporting satisfaction with the results (Nascimento et al., 2021).
Recent research has also highlighted the growing importance of big data in entrepreneurship studies specifically. Big data analytics can enhance enterprise performance (Shan et al., 2022), and psychological big data offers insights into entrepreneurial culture and regional economic development (Obschonka, 2017). While big data presents new opportunities for advancing entrepreneurship theory and practice, researchers must navigate methodological challenges to ensure high-quality studies (Schwab & Zhang, 2019). As the field evolves, researchers are encouraged to employ sophisticated analytical methods to fully leverage the potential of big data in entrepreneurship research (Obschonka, 2017; Schwab & Zhang, 2019). This demonstrates the potential for both the service and manufacturing sectors, in which ongoing research and development efforts are supported by public-private development programs (Benjelloun et al., 2015; Zhong et al., 2016). In conclusion, the evolving role of big data in entrepreneurship research necessitates a dual focus: leveraging existing methodologies to maximise analytical capabilities and addressing gaps in the literature by integrating emerging trends and technologies. These considerations provide a robust foundation for the objectives of this study and its contribution to the field.
Theoretical foundations of big data in entrepreneurship research
In entrepreneurship research, big data analytics is used not only to obtain meaningful insights into the current state of events but also to make accurate predictions of future events. For instance, such analytics can be used to assess and mitigate risk rather than relying on heuristics, and to formulate strategies for opportunity identification, evaluation, and exploitation that achieve social good and sustainable change through social innovation (Kitchin, 2014; Pappas et al., 2017). Insights from fields such as economics, marketing, and business management further contextualise these data within the entrepreneurial ecosystem, including understanding demand, organisational behaviour, strategic management for competitive advantage, operational challenges, and indicators of economic growth (Porter, 1991; Varian, 2014). An interdisciplinary approach to entrepreneurship research using big data creates further possibilities for innovative methodological rigor and yields a richer understanding of entrepreneurial phenomena.
The effective application and implementation of big data in entrepreneurship research entails an interdisciplinary approach that can incorporate knowledge from data science, economics, business studies, and marketing (Khan, 2020). Through this interdisciplinary lens, researchers can take advantage of various methodologies and perspectives, leading to a thorough understanding of entrepreneurial phenomena. The core of data science, encompassing statistical analysis, machine learning, and computational techniques, offers tools to gather and interpret complex data and enrich entrepreneurship research with innovative methodological rigor (Provost & Fawcett, 2013; Shepherd & Majchrzak, 2022).
The interdisciplinary approach to applying big data in entrepreneurship research also involves the application of theories and frameworks from other research fields, which cannot be overemphasised (Khan, 2020). For instance, in theories such as the resource-based view (Foss et al., 2008; Kellermanns et al., 2016) and knowledge-based view (Grant, 1996; Hayter, 2013), resources and knowledge are the data themselves, and the insights derived from these resources can be used to make strategic decisions, advance innovation capabilities, and gain competitive advantages (Wamba et al., 2015). Additionally, as a dynamic capability (Teece et al., 1997; Zahra et al., 2006), big data can aid organisations in not only sensing opportunities and reacting to threats more quickly, but also in seizing these opportunities and gaining competitive advantages. Therefore, research on corporate and strategic entrepreneurship can benefit from such integration. Moreover, in extremely dynamic markets, Big Data Analytics Capabilities (BDAC) have become essential tools for business competitiveness (Ciampi et al., 2021). For instance, BDAC can improve a company’s capacity to create both incremental and radical innovation, which is particularly important for entrepreneurial ventures (Mikalef et al., 2019). These capabilities allow for a more in-depth analysis of ecosystem dynamics, market trends, and entrepreneurial activities. Finally, disruptive innovation theory (Christensen et al., 2013) and network theory can be used to integrate big data into entrepreneurship research. Big data can reveal insights into the entrepreneurial ecosystem by facilitating the identification of collaboration opportunities, stakeholder relationships, and disruptive innovations, offering organisational tools to effectively navigate evolving markets.
Recent studies of entrepreneurship using big data have examined the connection between local entrepreneurial activities and media coverage. For example, von Bloh et al. (2020) found that news coverage of entrepreneurial events can serve as a catalyst for regional venture creation and development. This demonstrates how big data analytics can be used to comprehend the complex relationships between media narratives and entrepreneurial activities. Furthermore, big data applications in entrepreneurship education are becoming increasingly popular. For instance, Ma et al. (2020) proposed a hierarchical framework that prioritises elements such as opportunity recognition, institutional environments, and psychological factors in their assessment of the use of big data technology in entrepreneurship education. These applications emphasise the transformative potential of big data, enabling researchers to uncover new patterns and forecast trends, and provide actionable insights for academics and practitioners.
Methodology
Our research sought to identify crucial studies published in peer-reviewed publications on entrepreneurship and big data analytics, focusing on articles written in English up to July 2024. The review process encompassed publications in various areas, including entrepreneurship, management, information systems, and data science. This methodology was employed in similar studies by Guttentag (2019) and Ip et al. (2011). To ensure the relevance of each article, we assessed it individually to confirm its focus on the confluence of entrepreneurship research and big data analytics.
The review process for investigating the utilisation of big data in entrepreneurship was structured into five key phases. First, data sources and search strategies were identified using databases such as Web of Science, Scopus, and Google Scholar. Subsequently, search terms including but not limited to ‘entrepreneur*’, ‘big data’, and ‘digital entrepreneur*’ were applied, incorporating only articles specifically analysing the utilisation of big data in entrepreneurship. In the second phase, an initial review of the titles and abstracts was conducted, followed by a comprehensive text review. Subsequently, data were extracted and synthesised, and content analysis was performed to identify common trends and gaps in the literature. Finally, a quality assessment was conducted to ensure that only high-quality studies aligned with the research objectives were included. Figure 1 outlines the steps involved in the selection of the papers analysed.
Fig. 1 Review process for big data in entrepreneurship research.
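The title-and-abstract screening step described above can be illustrated with a minimal sketch. It assumes that the truncated search terms are translated into regular expressions and that a record is shortlisted only when it mentions both an entrepreneurship term and a big data term. The function name, the inclusion rule, and the example records are hypothetical and are not taken from the review itself.

```python
import re

# Hypothetical inclusion rule: a record must mention an entrepreneurship
# term AND a big data term. The patterns mirror the truncated search terms
# named in the text; 'digital entrepreneur*' is covered by the first pattern.
ENTREPRENEUR = re.compile(r"\bentrepreneur\w*", re.IGNORECASE)
BIG_DATA = re.compile(r"\bbig\s+data\b", re.IGNORECASE)


def passes_initial_screen(title: str, abstract: str) -> bool:
    """Return True when title or abstract matches both term groups."""
    text = f"{title} {abstract}"
    return bool(ENTREPRENEUR.search(text)) and bool(BIG_DATA.search(text))


# Illustrative records (invented for this sketch, not from the review).
records = [
    {"title": "Big data and entrepreneurial ecosystems", "abstract": "..."},
    {"title": "Supply chain sensors", "abstract": "A big data study of logistics."},
    {"title": "Digital entrepreneurship", "abstract": "We analyse Big  Data from platforms."},
]

shortlisted = [
    r["title"] for r in records if passes_initial_screen(r["title"], r["abstract"])
]
print(shortlisted)
```

An automated filter of this kind would only support the first pass; the full-text review and quality assessment described above still require human judgement.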
Findings
Methodological framework for big data research in entrepreneurship
When processing big data for entrepreneurship research, the use of an appropriate research design cannot be overemphasised. Big data involves a high volume, velocity, and variety of information, so the design for conducting such research must be structured to gather meaningful insights from the data. Several research designs can be adapted to use big data in entrepreneurship research. For instance, an exploratory design can be used to identify new patterns, ideas, trends, and previously unrelated factors or concepts, as well as to test and confirm hypotheses (Shah et al., 2012). Exploratory designs can include data mining and machine learning algorithms (MLA) to detect these patterns. One aspect that could be of interest to entrepreneurship researchers is the exploration of firm failure and success factors using big data. A predictive research design is another strategy through which researchers can adopt big data analytics. Specifically, predictive analytics such as machine learning algorithms can use historical data to forecast future trends in the market, potential venture performance, and possible investment outcomes (Provost & Fawcett, 2013). Additionally, using big data in a longitudinal research design can facilitate tracking a venture’s activity and provide insights into what it can improve, eliminate, or implement to maintain its performance (George et al., 2014).
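The logic of a predictive design, in which a model is fitted to historical data and then checked against an observation it has not seen, can be sketched as follows. The example uses ordinary least squares as a simple stand-in for the machine learning algorithms mentioned above. The revenue figures are invented and deliberately noise-free so that the arithmetic is easy to verify, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented quarterly revenue figures for one hypothetical venture.
quarters = [1, 2, 3, 4, 5, 6]
revenue = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

# Hold out the last observation so the forecast can be checked against it.
train_x, train_y = quarters[:-1], revenue[:-1]
intercept, slope = fit_line(train_x, train_y)
forecast = intercept + slope * quarters[-1]
error = abs(forecast - revenue[-1])
print(f"forecast={forecast:.2f}, actual={revenue[-1]:.2f}, abs_error={error:.2f}")
```

Holding out the most recent observation, rather than a random one, respects the time ordering that forecasting designs depend on.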
For such designs, the methods by which data are collected and curated are also important. Data for this type of research can be extracted from diverse sources using both structured and unstructured methods. A structured means of big data collection can involve accessing the database of the organisation to extract customer transaction logs, financial records, digital footprints of customers, videos, images, and the social media of ventures and organisations to make predictive analytics and strategic decisions. Data collection tools such as application programming interfaces (APIs), Internet of Things (IoT) devices like sensors and actuators, and web scraping tools such as Bright Data, Scrapingdog, and AvesAPI can be used to track information and interactions and gather data to understand different dimensions within entrepreneurship research (Fan & Gordon, 2014).
Considering the volume of data collected through big data analytics, it is also important to clean, organise, and structure the data in suitable formats so that they can be managed effectively. Data management ensures the reliability and validity of the collected data (Kitchin, 2014) because it establishes grounds for the accuracy, completeness, and quality of the data. In doing so, it becomes less complicated to sort through the data, seek connections, and identify patterns so that relevant information can be extracted. Artificial intelligence (AI) in the form of natural language processing (NLP) tools and libraries such as Gensim and spaCy can be used to interpret and quantify qualitative data from social media and customer reviews as well as analyse visual content related to entrepreneurial products and services (Halevy et al., 2009). In today’s digital age, entrepreneurship researchers will benefit from adopting some of these tools and methods to gather information about factors that are important to advance the research field (Gandomi & Haider, 2015).
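The cleaning and quantification steps described above can be illustrated with a small sketch that uses only the Python standard library. It is a deliberately simplified stand-in for dedicated NLP libraries: the stop-word list, the cleaning rules, and the example reviews are all assumptions made for illustration.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; real NLP libraries ship fuller ones.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "it", "to", "of", "but"}


def clean(text: str) -> str:
    """Lower-case the text, strip URLs, and collapse whitespace."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list:
    """Split cleaned text into alphabetic tokens, dropping stop words."""
    return [t for t in re.findall(r"[a-z]+", clean(text)) if t not in STOP_WORDS]


# Invented customer reviews; the third repeats the first.
reviews = [
    "The delivery was FAST and the support is great  https://example.com/r/1",
    "Great product, but the delivery was slow",
    "The delivery was FAST and the support is great  https://example.com/r/1",
]

# Remove exact duplicates after cleaning, preserving first-seen order.
unique_reviews = list(dict.fromkeys(clean(r) for r in reviews))

# Quantify the qualitative text as term frequencies.
term_counts = Counter(t for r in unique_reviews for t in tokenize(r))
print(term_counts.most_common(3))
```

Term counts of this kind are only a starting point; topic models or sentiment classifiers would normally be built on top of a cleaned corpus like this one.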
Analytical techniques in using big data for entrepreneurship research
Quantitative and statistical analyses are commonly associated with big data research as they involve large datasets with computational algorithms for pattern identification, hypothesis testing, and prediction (Fan & Bifet, 2013). For example, when working with structured data, a quantitative approach is more suitable for effectively handling large volumes of numerical data. Given the nature of big data, entrepreneurship research can use analytical tools such as regression analysis, cluster analysis, and network analysis to identify correlations within large populations and generalise such findings (Fan & Bifet, 2013), provide insights into venture startup performance and operational efficiency, facilitate data-driven decision-making (Wamba et al., 2015), and ensure the rigor and objectivity of findings. Although quantitative analysis has proven to be an adequate analytical tool for big data analysis, researchers can also adopt qualitative approaches, such as content analysis and thematic analysis. These can be used to synthesise unstructured text, images, and videos (Fan & Gordon, 2014), to interpret patterns and themes, and to understand the underlying factors behind a phenomenon. These analytical tools can provide further understanding of why and how a phenomenon occurs.
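As a brief illustration of one of the techniques named above, the following sketch computes degree centrality for a small network. The ventures, investors, and ties are invented, and degree centrality is only one of many possible network measures.

```python
from collections import defaultdict

# Invented co-investment ties between hypothetical ventures and investors.
edges = [
    ("VentureA", "InvestorX"),
    ("VentureB", "InvestorX"),
    ("VentureC", "InvestorX"),
    ("VentureC", "InvestorY"),
]

# Build an undirected adjacency structure.
neighbours = defaultdict(set)
for left, right in edges:
    neighbours[left].add(right)
    neighbours[right].add(left)

# Degree centrality: number of ties divided by the maximum possible (n - 1).
n = len(neighbours)
centrality = {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

most_central = max(centrality, key=centrality.get)
print(most_central, round(centrality[most_central], 2))
```

In this toy network the most connected actor is an investor, which is the kind of structural observation that network analysis can surface at scale.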
Other advanced tools that are applicable to entrepreneurship research (the MLA and NLP tools discussed earlier) create avenues to analyse data that would be difficult to handle with traditional methods of analysis. These tools not only provide information on long-term success, venture funding decisions, and potential market opportunities, but can also offer insights into customer interaction, leadership communications, and access to new markets, which are vital aspects of entrepreneurship research (Müller et al., 2018).
Benefits of big data in entrepreneurship research
There are several benefits to using big data to research entrepreneurship. Some prominent ones are highlighted below.
Enhancing analytical depth and breadth
Integrating big data analysis in entrepreneurship research can help understand several elements of this field, especially because it provides a comprehensive analysis that may be challenging for traditional methods to implement. Owing to the attributes of the 4Vs (velocity, variety, volume, and veracity), big data can enable entrepreneurship research and practice to draw nuanced, holistic, and timely insights into the studied phenomena. More specifically, with its volume, entrepreneurship research can examine a complex entrepreneurial phenomenon that cuts across countries, customers, and professionals in different contexts. For instance, Gupta and George (2016) emphasise that big data can provide insights from initial contact points to long-term purchasing behaviour, uncovering insights that drive personalised marketing and customer retention strategies. Furthermore, the variety of big data can provide a range of numeric transactional data as well as text-based reports from social media, thereby collecting both quantitative and qualitative data. This information provides an avenue for addressing research questions using diverse approaches, as it incorporates everything from real-time microeconomic trends to macroeconomic trends that affect the entrepreneurial ecosystem (Gandomi & Haider, 2015; Kitchin, 2014). Additionally, the veracity of big data presents an opportunity for entrepreneurship researchers to apply techniques to clean and process data before analysis, enabling them to draw accurate insights from the data. This ensures that the findings not only reflect the real world but also enhance the validity of the research (Boyd & Crawford, 2012; Kitchin, 2014). Finally, the capacity of big data to support real-time analysis cannot be overemphasised. In relation to its velocity, entrepreneurship research and practice can use real-time data to track trends and make quick decisions (Bello-Orgaz et al., 2016). Making use of MLA and NLP tools would not only allow for the testing of theories and hypotheses in near real-time, but also facilitate rapid iteration and refinement of entrepreneurial strategies (George et al., 2014; Mikalef et al., 2018).
Innovative insights and predictive modelling
One main benefit of big data is how it contributes to predictive insights and foresight. This enables researchers to forecast trends with greater accuracy, which is a result of the pattern algorithm analysis of big data. Using big data analysis for forecasting has been shown to help organisations manage future uncertainty (Klievink et al., 2017). For instance, entrepreneurs can use available models to gain a competitive edge by swiftly and efficiently adapting to market changes (Provost & Fawcett, 2013). Big data also facilitates the generation of innovative insights, especially for entrepreneurship in practice. With its ability to process and analyse datasets at scale, entrepreneurship researchers can further understand factors relating to startup success, the impact of social networks on venture funding, industry shifts, the role of technology adoption in the growth of ventures, the influence of regional policies on startup survival rates, economic changes, and overall firm performance (Brynjolfsson et al., 2011; Einav & Levin, 2014).
Challenges and limitations in using big data for entrepreneurship research
Given its extensive possibilities, applying big data to entrepreneurship research is not without its limitations and challenges. Challenges related to using big data in entrepreneurship research can create bias when there is a lack of transparency in the research process, undermining the accuracy and credibility of the findings. Some prominent challenges are discussed below.
Data quality, integrity, and representativeness
The “presence of noise” is one main challenge that researchers usually deal with when working with big data (Boyd & Crawford, 2012). Such noise can be generated from errors in the data and often affects their quality, for instance through errors in data collection, incorrect data entry, inconsistent data entry and formatting, and improper treatment of outliers and anomalies. Researchers who use big data must ensure that the information gathered, irrespective of where and how it is gathered (e.g. from different sources, with different scales, definitions, and methods of data collection), is properly treated. The risk of missing or poor-quality data, especially when working with large-scale data, is that the results become misrepresented and flawed, which tends to lead to discrepancies in results (Boyd & Crawford, 2012).
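Two of the noise sources listed above, inconsistent formatting and untreated outliers, can be illustrated with a short sketch. The modified z-score based on the median absolute deviation is one common screening rule and is used here as an assumption, as are the threshold of 3.5, the helper names, and the invented records.

```python
from statistics import median


def normalise_name(raw: str) -> str:
    """Harmonise inconsistent entry formats (letter case, stray spaces)."""
    return " ".join(raw.split()).title()


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score, based on the median absolute
    deviation (MAD), exceeds the threshold. The value 3.5 is a common
    rule of thumb and is treated here as an assumption."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing can be flagged.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Invented records with inconsistent formatting and one implausible value.
names = ["acme labs", "ACME  Labs", " Acme Labs "]
employees = [12, 15, 11, 14, 13, 1300]

print({normalise_name(n) for n in names})
print(flag_outliers(employees))
```

Flagged values should be inspected rather than deleted automatically, since an extreme value may be a data entry error or a genuinely exceptional venture.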
Furthermore, the integrity of data, which is related to authenticity and trustworthiness, is another challenge that entrepreneurship researchers must consider. This is crucial for entrepreneurship research in practice because decisions are often data-driven, and it is especially important in longitudinal studies, where the results are affected over an extended period (Wang & Strong, 1996). Poor data integrity would not only lead to misguided conclusions but also to the wrong strategies being set for future actions. Such integrity issues can arise because of data manipulation, degradation over time, and the merging of datasets from incompatible sources. Therefore, it is important to understand where the data originate, how they have been handled, and by whom, to ensure reliable data and valid results (Baesens et al., 2014).
Regarding the representativeness of the data, researchers must ask themselves whether big datasets are representative of the broader population and the phenomena under study. Depending on the context in which the research is conducted (e.g. a particular geographical region, or a business venture whose end users are not visible online), big data can often produce skewed results. One way to address this challenge is to use other forms of data collection that can supplement big data and mitigate biases (Japec et al., 2015). Finally, when dealing with big data, the skills of researchers are important. The nature and complexity of big data analytics often require competencies in statistics, machine learning, data mining, and computer science. As this may be beyond the skills of some entrepreneurship researchers, they can bridge this gap by engaging in interdisciplinary collaboration. In addition, faculties can invest in training staff members to acquire the analytical knowledge and skills needed to navigate big data and its analysis.
Analytical and interpretive complexities
The complexities associated with analytics are often due to certain elements of big data (e.g. volume, velocity, and variety). First, the size of the data can be overwhelming, and conducting an analysis using traditional statistical software such as SPSS may be insufficient. This also means that researchers may have to rely on machine learning techniques to work with large datasets (Kaisler et al., 2013). Second, the velocity of the data adds a layer of complexity because data are generated and processed at extreme speed, and the data change constantly owing to the continuous inflow of real-time information. Entrepreneurship researchers aiming to use big data must therefore employ an adaptive approach, which is often resource-intensive and challenging to maintain. Nonetheless, adopting real-time analytical techniques presents possibilities for providing timely and accurate insights (Marr, 2015). Third, to adequately assess the diverse range of types and sources of structured, semi-structured, and unstructured data, researchers in entrepreneurship must consider the data management challenges associated with varied data such as social media content, images, video, and text. To handle such analytical complexities, the literature recommends the use of NLP, image recognition, and MLA to convert these data into suitable and workable materials for analysis (Gandomi & Haider, 2015).
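The velocity challenge described above can be made concrete with a minimal sketch of an incrementally updated statistic. It uses Welford's algorithm, a standard technique that is not named in the sources cited here, to update a mean and variance one observation at a time without storing the full stream. The class name and the example values are hypothetical.

```python
class RunningStats:
    """Incrementally updated mean and variance (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one new observation into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values have arrived."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


# Invented stream of daily order counts for one hypothetical venture.
stats = RunningStats()
for daily_orders in [4.0, 6.0, 8.0]:
    stats.update(daily_orders)
print(stats.count, stats.mean, stats.variance)
```

Because each update uses constant memory, summaries of this kind remain feasible even when the data arrive continuously.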
Before results and conclusions can be presented, gathered, and analysed the data must be interpreted. This poses another challenge for researchers who intend to use big data to understand entrepreneurial phenomena. Given its nature as a large dataset, there is a risk of having a multitude of patterns and correlations, where an understanding of the underlying mechanisms for the causal relationship is lacking. In particular, it is challenging to distinguish between corrections and patterns that are meaningful and those that are not (Castellani & Rajaram, 2022). To address these changes, entrepreneurship researchers must apply a combination of skills, substantive domain knowledge, and critical thinking skills. Collaborative work and interdisciplinary approaches should be encouraged.
Ethical considerations in using big data for entrepreneurship research
Ethical considerations are critical to ensuring that entrepreneurship research contributes positively to society and respects the rights and dignity of individuals. Researchers must consider several ethical concerns when conducting entrepreneurship research using big data.
First, research using big data is complex and often involves real individuals and industries. This makes privacy an important issue. There is a risk that certain information will potentially infringe on people’s right to privacy. This is because big data entails collecting material such as social media, images, and text that contain detailed and/or sensitive information about individuals and personal information that individuals may not intentionally want to publish. Although some authors argue that data anonymity is possible, Boyd and Crawford (2012) emphasise that privacy issues cannot be managed through anonymisation alone but require a comprehensive approach to data governance. This argument stems from the notion that even when data are anonymised and do not contain personal information, individuals can be re-identified through the process of data triangulation (Barocas & Nissenbaum, 2014; Tene & Polonetsky, 2012). Researchers must ensure that they treat the data carefully to avoid identifying individuals in the dataset.
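The re-identification risk described above can be made concrete with a k-anonymity check, sketched below. It counts how many records share each combination of quasi-identifiers, that is, attributes that are not names but could be matched against other datasets. This is a screening heuristic only and, consistent with Boyd and Crawford's argument, does not replace data governance. The field names, records, and threshold are hypothetical.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count records per combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def smallest_group(records, quasi_identifiers):
    """The k of k-anonymity: size of the smallest matching group."""
    return min(group_sizes(records, quasi_identifiers).values())

def at_risk(records, quasi_identifiers, k):
    """Records whose quasi-identifier combination occurs fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records
            if sizes[tuple(r[q] for q in quasi_identifiers)] < k]

# Hypothetical anonymised founder records (no names or contact details).
records = [
    {"city": "Lagos", "sector": "fintech", "founded": 2019},
    {"city": "Lagos", "sector": "fintech", "founded": 2019},
    {"city": "Lagos", "sector": "agritech", "founded": 2021},
    {"city": "Oslo", "sector": "fintech", "founded": 2019},
    {"city": "Oslo", "sector": "fintech", "founded": 2019},
]
fields = ["city", "sector", "founded"]
print(smallest_group(records, fields))
print(at_risk(records, fields, k=2))
```

In this example the single agritech venture in Lagos is unique on the three fields combined, so it could be singled out by triangulation even though no personal information is stored.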
Second, a further ethical issue closely related to data privacy is the informed consent of the participants whose data are used. In traditional data collection, research ethics recommends that participants be informed of the purpose of the research and how their data will be used. Because big data are gathered from diverse sources, often in large amounts and from third parties, obtaining individual informed consent is frequently unfeasible (Metcalf & Crawford, 2016). In such situations, it is important for entrepreneurship researchers who intend to use big data to explore other forms of consent. For instance, they can seek community consent for certain types of data (Metcalf & Crawford, 2016).
Third, entrepreneurship researchers must also be mindful of the biases that can arise when cleaning, analysing, and interpreting the data. Algorithms trained on biased data risk producing inaccurate results, which may be detrimental to entrepreneurial ventures and the context of the study. Researchers must therefore adopt strategies that carefully identify and correct for biases to ensure that their findings are accurate and do not promote discriminatory practices (O’Neil, 2017).
Finally, data ownership is another ethical issue that researchers may face when using big data to conduct research. With big data, information is often gathered by several other platforms and services; therefore, the question of who owns the data becomes a grey area. Some researchers have argued that data are open and free to use. Nevertheless, before using the data, entrepreneurship researchers must ensure that they comply with the terms of service and copyright laws where applicable (Einav & Levin, 2014; Purtova et al., 2018).
Navigating legal frameworks and ensuring compliance with data protection laws and regulations are critical aspects of big data in entrepreneurship research. For instance, the European Union (EU) has the General Data Protection Regulation (GDPR), which sets rules for handling individuals' personal data (Voigt & Von dem Bussche, 2017). Although not all countries have regulations comparable to those of the EU, it is paramount that researchers are aware of the specific regulations that apply when transferring or working with data across borders, as the data may be subject to different regulatory regimes.
Discussion
This study explored the implementation, benefits, and challenges of using big data in entrepreneurship research, uncovering important insights that enhance current understanding in this field. Our results add to the continuing discussion on incorporating big data analytics in entrepreneurship research and present a fresh viewpoint on its potential impact. This discussion reiterates these findings by answering our research questions and simultaneously presenting our theoretical contributions. Thus, providing insights into the potential benefits and challenges of using big data analytics in entrepreneurship research adds to the growing conversation about the topic.
Expanding the integration of artificial intelligence (AI) and big data analytics in entrepreneurship research
Although earlier research has indicated the potential of big data in the study of entrepreneurship (Obschonka & Audretsch, 2020; Khan, 2020), our study adds to this understanding by proposing a multidisciplinary strategy that combines domain-specific knowledge, entrepreneurship, and data science expertise. By illustrating how big data analytics and AI can be successfully incorporated into entrepreneurship research, this study responds to the first research question. Our findings point to a fundamental shift in the conceptualisation and study of entrepreneurial processes and outcomes as part of the integration of big data and AI, which goes beyond simple data analysis. For example, to gain insights into how cultural factors affect entrepreneurial cognition and decision-making, we suggest that natural language processing techniques be used to analyse entrepreneurial narratives in various cultural contexts (Hirschberg & Manning, 2015). Building on the concepts introduced by Obschonka et al. (2020) and Lévesque et al. (2022), this study extends earlier research by providing a more complex cross-cultural perspective on entrepreneurial phenomena. Finally, our findings expand on the resource-based view (RBV) by suggesting that big data can serve as a significant resource for improving organisational capabilities. The business sector can achieve a competitive edge by using big data, for example through better management and allocation of resources and prediction of market trends (Wamba et al., 2017). This work can also be viewed as supporting dynamic capability theory, as big data enables swift responses to changes in increasingly complex business ecosystems (Teece, 2018).
Novel insights into the benefits and challenges of using big data in entrepreneurship research
To address our second research question, we offer detailed insights into the benefits and challenges of utilising big data in entrepreneurship research. While most previous studies concentrated on the analytical benefits of big data (Gupta & George, 2016; Wamba et al., 2015), our study goes a step further by considering the strategic implications of big data. On this basis, we propose that big data analytics can function as a meta-capability, facilitating the faster development and improvement of other capabilities for entrepreneurship. This viewpoint expands on how big data is currently understood to play a role in entrepreneurship by indicating that its influence may be more extensive than previously thought, and it is consistent with the findings of Ciampi et al. (2021) on the capabilities of big data analytics and innovation business models. Our study also goes beyond the issues raised by earlier studies by contextualising the requirements and limitations of entrepreneurship research. For instance, we emphasise the distinct challenge of ensuring that data accurately represent recent ventures and informal entrepreneurial activities, which may not be well represented in traditional large-scale data sources. This extends the concerns about data quality expressed by Kitchin (2014) and Boyd and Crawford (2012), but applies them specifically to the field of entrepreneurship.
Future directions and emerging studies
Our study addresses the third research question by identifying emerging technologies that can fill the current gaps in entrepreneurship research. Although previous studies have focused on social media data and web scraping techniques (von Bloh et al., 2020; Schwab & Zhang, 2019), we propose integrating Internet of Things (IoT) data and advanced AI techniques to broaden this perspective. Reiterating the argument of Atzori et al. (2010), we propose that IoT data can offer real-time insights into entrepreneurial work patterns and decision-making processes. This approach could provide a more comprehensive understanding of the entrepreneurial journey, connecting the digital and physical aspects of entrepreneurship and expanding on Nambisan’s (2017) work on digital entrepreneurship.
Ethical consideration: a proactive approach
To address the last research question, we build on earlier discussions of ethical considerations in big data research by offering a proactive, entrepreneurship-specific ethical framework. While previous research has mainly focused on general ethical issues, such as privacy and consent, we advocate for a more contextualised approach that considers the unique ethical challenges in entrepreneurship research. For instance, we suggest that researchers create new approaches to identifying and addressing entrepreneurship-specific ethical concerns that align with the fast-paced and informal nature of entrepreneurial activities. This proposition builds upon existing ethical guidelines by recognising the specific requirements and limitations of entrepreneurship research, drawing from research on data protection regulations (Purtova, 2018) and algorithm bias (O’Neil, 2017). Additionally, we stress that researchers studying entrepreneurship should consider how their big data-driven research may affect society, echoing the concerns expressed by Calvard (2016) regarding the moral ramifications of big data analytics in corporate settings. One aspect of this is being aware of how big data-driven predictive models may affect venture capital allocations and policy choices, which could potentially perpetuate current disparities in the entrepreneurial ecosystem.
In summary, this research adds to the expanding body of literature on big data in entrepreneurship studies by presenting fresh insights into implementation tactics, recognising the innovative uses of emerging technologies, and advocating a proactive stance on ethical considerations. These perspectives lay the groundwork for further studies to better exploit big data while addressing its distinctive challenges in entrepreneurship research. As recommended by George et al. (2014) and Shepherd and Majchrzak (2022), future research should concentrate on developing new methodological approaches and ethical principles specific to big data research in entrepreneurship, ensuring that the field continues to develop in a manner that is not only responsible but also influential.
Future directions in big data and entrepreneurship research
Given the limitations of using big data in entrepreneurship research discussed earlier, there are several factors to consider in the advancement of research. We evaluate two future directions: first, the need to integrate emerging trends and technologies in entrepreneurship research, and second, the need to continuously adapt to evolving AI and natural language processing techniques in entrepreneurship research. We discuss these aspects below.
The need to integrate emerging trends and technologies in entrepreneurship research: As technology continues to advance, entrepreneurship research also needs to advance by applying new technologies to maintain relevance and rigor. Several technological trends may be useful in entrepreneurship research. The first is the use of the Internet of Things (IoT). The inflow of data from IoT devices can be leveraged by both practice and research to gain insights into topics such as the entrepreneurial ecosystem, success and failure factors, the best withdrawal plans, and the best time to enter the market, to mention just a few possibilities. IoT data can also be leveraged to study the innovation capabilities of entrepreneurial ventures because it can assist researchers in tracking the diffusion of innovations in real time, especially when environmental factors are present (Atzori et al., 2010).
Second, advancements in AI, MLA, and NLP will not only allow for more accurate predictive conclusions and better recognition of analytical patterns, but can also lead to better understanding of the nuances of human language to provide accurate contextual analysis of data generated from text. Specific to NLP, entrepreneurship researchers can further analyse the narrative around emerging markets and public perception of entrepreneurship ecosystems, providing a depth of understanding that goes beyond quantitative data (Hirschberg & Manning, 2015). Adapting to these trends also means that entrepreneurship researchers develop more skills in applying these approaches and seek even more collaboration across disciplines.
Practical implications and recommendations
Although an ambitious endeavour, applying big data in entrepreneurship research has the potential to advance the study of entrepreneurship. Doing so requires guidance, and we therefore offer actionable plans for entrepreneurship researchers intending to apply big data to their research. As with traditional research methods, researchers must begin with a clear plan and purpose for their research. This includes defining research questions that big data can answer, ensuring data quality and representativeness, and adopting advanced analytical techniques that mitigate bias.
The first step is to develop clear objectives and define the research questions. Owing to the complexity of working with big data, having a clear, narrow scope reduces the complexity of applying it. Entrepreneurship researchers should utilise the relevant literature and provide theory-driven hypotheses that guide their data collection and analysis (George et al., 2014). Additionally, identifying the research questions that big data can answer, as well as its limitations, will direct researchers in data gathering and analysis.
Second, when gathering data, researchers must consider the challenges associated with data quality, integrity, and representativeness. They must ensure that the data they gather are reliable and relevant to their research questions. To ensure reliability, the data must be scrutinised for accuracy, representativeness, and completeness. One focus of this stage is how researchers clean and store their data, which requires tools capable of handling large datasets. Furthermore, entrepreneurship researchers must avoid the trap of assuming that all data are relevant, which stems from the belief that more data leads to better insight. It is therefore important to use validation techniques to avoid this problem (Kitchin, 2014). At this stage, ethical considerations must be at the forefront of research to ensure that individuals’ data privacy is protected and that data protection guidelines are adhered to (Martin, 2019).
Third, when processing data, researchers must be aware of the biases that may arise during this stage. Questions such as how we process the data and how we determine the quality and integrity of the data are important to ensure that biases are mitigated. Researchers must consider the implementation of rigorous cleaning and processing protocols to deal with missing values, outliers, and duplicated data accurately and completely. The use of automation tools that are trained to avoid bias is relevant to ensure data quality (Batini et al., 2009). Another recommendation is for researchers to use a combination of data sources and analytical techniques to triangulate findings and apply advanced statistical methods to adjust for and mitigate bias (Boyd & Crawford, 2012).
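The cleaning protocol described above can be sketched as a small, auditable routine. The example below removes exact duplicates, sets aside records with missing values, and flags outliers for manual review using Tukey's 1.5 × IQR rule. The outlier rule is an assumption made for illustration, not a method prescribed by the sources cited, and the field names and revenue figures are hypothetical.

```python
import statistics

def clean(records, field):
    """Deduplicate, separate missing values, and flag outliers in `field`."""
    # 1. Drop exact duplicate records while preserving their order.
    seen, unique = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # 2. Set aside records where the field is missing instead of guessing.
    missing = [r for r in unique if r.get(field) is None]
    complete = [r for r in unique if r.get(field) is not None]

    # 3. Flag (do not delete) values outside Tukey's fences.
    q1, _, q3 = statistics.quantiles([r[field] for r in complete], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [r for r in complete if not low <= r[field] <= high]

    return {"unique": unique, "missing": missing, "outliers": outliers}

# Hypothetical venture records; revenue in arbitrary units.
revenues = [10, 11, 12, 12, 13, 14, 15, 500]
records = [{"venture_id": f"v{i}", "revenue": x}
           for i, x in enumerate(revenues, start=1)]
records.append({"venture_id": "v1", "revenue": 10})    # exact duplicate
records.append({"venture_id": "v9", "revenue": None})  # missing value

report = clean(records, "revenue")
print(len(report["unique"]), report["missing"], report["outliers"])
```

Returning the flagged records, instead of silently dropping them, keeps each cleaning decision visible and documented, which supports the transparency and replicability that the recommendations call for.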
Fourth, analysing big data requires the adequate use of tools and techniques matched to the research aim. For instance, machine learning can be considered when the aim is prediction, and NLP is appropriate when the aim is to extract sentiment from text. Network analysis can be considered when the aim is to reveal interconnectivity within the entrepreneurial ecosystem. Researchers must understand the underlying assumptions and limitations of the tools applied to analysing big data (Provost & Fawcett, 2013). This emphasises the need for researchers to have the skills and competencies required to apply the suggested tools appropriately.
Finally, for the actionable application of big data in entrepreneurship research, a multidisciplinary team must be established to engage experts and ensure the relevance and applicability of big data analysis in entrepreneurship research. Additionally, clear data governance that ensures data management must be created. This will ensure that researchers have policies that guide the collection, storage, processing, and sharing of big data. Moreover, maintaining transparency in the research process cannot be overemphasised. Detailed records must be kept not only to maintain transparency but also to allow for the replicability of the research. A schematic representation of a roadmap for the implementation of Big Data in entrepreneurship is shown in Fig. 2.
Fig. 2
Guidance and actionable plans for applying big data in entrepreneurship research. Source: the authors
Conclusion
This study explored the use of big data and data analytics in entrepreneurship research and defined big data by considering variables such as sheer volume, velocity, and the variety of data available. Different methodological frameworks and analytical techniques to be used, such as predictive modelling and natural language processing (NLP), were introduced in this research to guide researchers in the application of big data in their future work. These tools are built on the foundation of the ability to discover patterns, predict trends, and extract valuable information in real time, thus facilitating a deeper understanding of business and entrepreneurship phenomena.
This study highlights several challenges, especially in terms of data quality, representativeness, and the interpretative complexity of massive datasets. Ethical issues related to privacy, consent, and data ownership are also crucial, given that big data are often sourced from online platforms and social networks. A further problem is that the correlations found in large datasets do not always provide a clear understanding of the underlying causal relationships, which makes it challenging to ensure methodological rigor.
From a future research perspective, the integration of emerging technologies such as the Internet of Things (IoT) and artificial intelligence (AI) could further expand the capabilities of big data analytics in entrepreneurship. Fostering interdisciplinary collaboration between entrepreneurship researchers, data scientists, and AI experts is therefore key to making the most of these technologies. In this way, research in this field will be able to advance not only in ethical soundness and methodological rigor, but also in its impact on the theories and practices applicable to entrepreneurship.
Although this study presents a literature overview of the application of big data in entrepreneurship research, it is not without its limitations. First, the research, as discussed above, has not applied any form of primary data, such as collecting big datasets in qualitative or quantitative settings. Future research can, therefore, attempt to use our suggestions and recommendations to test the use of big data in entrepreneurship research while highlighting the further benefits and limitations of its application. Second, the research is limited to the field of entrepreneurship; other fields of research, such as marketing and business management, can apply these recommendations in their fields. Finally, future research can further apply big data analytics to determine the extent of its usage in developing the field of entrepreneurship research.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.