3.1 Data description
In this paper, we use three main data types: (i) cryptocurrency price time series, (ii) cryptocurrency metadata describing projects’ technological features and/or their use cases and functionalities, and (iii) data on investment rounds in cryptocurrency projects.
Market data (i) and cryptocurrency metadata (ii) were extracted from the website Coinmarketcap [38]. The data covers 1324 cryptocurrency projects over eight years, spanning from 2014 to 2022. It is important to note that the term ‘cryptocurrency’ here encompasses various types of blockchain-based digital assets. This includes traditional cryptocurrencies like Bitcoin and Litecoin, which are standalone digital currencies operating on their own blockchains, and blockchain-based tokens, such as the previously mentioned ERC-20 tokens on the Ethereum blockchain and analogous tokens on other platforms. These tokens have a range of applications, and they can represent various assets or functionalities within decentralized applications. A notable example within this group is stablecoins, which are typically designed to minimize price volatility by being pegged to more stable assets such as fiat currencies.
Market data consists of each cryptocurrency’s opening price, closing price, and traded volume, sampled weekly.
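To make the structure of a weekly record concrete, the sketch below builds weekly bars from finer-grained observations. It is only an illustration under stated assumptions: the text does not say how the weekly series was obtained, so the daily input, the ISO-week grouping, and the function name `weekly_bars` are hypothetical choices, and the numbers are toy values.

```python
from datetime import date
from itertools import groupby


def weekly_bars(daily):
    """Aggregate (date, open, close, volume) rows, sorted by date, into weekly bars.

    Assumption: weeks are ISO weeks; the source does not specify a convention.
    """
    bars = []
    # Group consecutive rows that share the same (ISO year, ISO week) pair.
    for week, rows in groupby(daily, key=lambda r: r[0].isocalendar()[:2]):
        rows = list(rows)
        bars.append({
            "week": week,
            "open": rows[0][1],                  # first open of the week
            "close": rows[-1][2],                # last close of the week
            "volume": sum(r[3] for r in rows),   # total traded volume
        })
    return bars


# Toy daily rows (hypothetical values): two days in one week, one in the next.
daily = [
    (date(2021, 8, 2), 10.0, 11.0, 100.0),
    (date(2021, 8, 3), 11.0, 12.5, 150.0),
    (date(2021, 8, 9), 12.5, 12.0, 80.0),
]
bars = weekly_bars(daily)
```

Each resulting record carries the three fields named above: opening price, closing price, and traded volume.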
Coinmarketcap also assigns tags describing the main features of the different cryptocurrencies. Metadata can be broadly classified into three categories. The first is technology-related specifications, which refer to the underlying blockchain technology that the cryptocurrency employs (e.g., Proof-of-Work vs. Proof-of-Stake algorithms, which are different methods used to validate transactions and create new blocks in the blockchain). The second is ecosystem-related information, indicating whether the cryptocurrency operates on an independent blockchain or as part of an existing one, as well as whether it is part of decentralized finance (DeFi) projects. The third category relates to the use case, or the specific purpose and utility of the cryptocurrency (e.g., it could be used for facilitating distributed storage, as a fan token for a particular brand or celebrity, or simply as a digital store of value, like digital gold). See Appendix A.5 for a list of available tags used to categorize these aspects and their respective frequency. The dataset contains 226 unique tags. Cryptocurrencies’ tags might change over time as, for instance, a project pivots its scope or new categories are invented. Thus, the data we collected and used in the analysis should be understood as a snapshot of the cryptocurrency environment at the time it was gathered (August 2021).
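As a minimal sketch of how tag frequencies and the number of unique tags could be tabulated from such a snapshot, consider the following. The project symbols, the tag strings, and the function name `tag_frequencies` are hypothetical and serve only as illustration; they are not taken from the dataset.

```python
from collections import Counter


def tag_frequencies(project_tags):
    """Count, for each tag, the number of projects carrying it."""
    counts = Counter()
    for tags in project_tags.values():
        counts.update(set(tags))  # a tag counts at most once per project
    return counts


# Toy snapshot (hypothetical projects and tags).
project_tags = {
    "AAA": ["pow", "store-of-value"],
    "BBB": ["pos", "defi"],
    "CCC": ["pos", "defi", "defi"],  # duplicate tag within one project
}
counts = tag_frequencies(project_tags)
n_unique_tags = len(counts)
```

Here `n_unique_tags` plays the role of the unique-tag count reported above, and `counts` corresponds to the per-tag frequencies listed in the appendix.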
Coinmarketcap also provides cryptocurrencies’ webpage URLs, which are used to merge market-related data with investment data.
Finally, the investment data (iii) was gathered from Crunchbase [39], a commercial database covering innovative companies worldwide and accessed by 75M users each year. The data is sourced through two main channels: an extensive investor network and community contributors. Investors commit to keeping their portfolios updated in exchange for free access to the dataset. More than 600k executives, entrepreneurs, and investors update over 100k company, people, and investor profiles per month. Crunchbase processes the data with machine learning algorithms to ensure accuracy and scan for anomalies; the results are ultimately verified by a team of data experts at Crunchbase. Due to its broad coverage, the data has been used in thousands of scholarly articles and technical reports [39, 40]. Information on Crunchbase includes an overview of the company’s activities, number of employees, and detailed information on funding rounds, including investors and, more rarely, amounts raised. We provide detailed information on the features contained in this dataset in Appendix A.4.
We merged the Crunchbase data on investment rounds with Coinmarketcap data via the companies’ webpage URLs. After merging, the dataset includes 4395 investments made in 1458 rounds by 1767 investors in 1324 cryptocurrency projects appearing on Crunchbase. The investments total \(\$13\) billion (USD) over the period considered (2008–2022). After also merging with the time series data, we can still track 624 cryptocurrency projects.
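The URL-based merge can be sketched as follows. This is a minimal illustration, not the exact procedure used: the text does not state how URLs were normalized, so reducing each URL to a lowercase host without a leading `www.` is an assumption, and the field names (`website`, `company_website`, `symbol`, `round_id`, `investor`) and all records are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a homepage URL to a bare lowercase host (assumed matching key)."""
    url = url.strip().lower()
    if "://" not in url:
        url = "//" + url  # lets urlsplit parse scheme-less text as a host
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def merge_on_url(projects, investments):
    """Attach a project symbol to every investment whose URL matches a project."""
    by_host = {}
    for p in projects:
        host = normalize_url(p["website"])
        if host:
            by_host.setdefault(host, p)  # keep the first project per host
    merged = []
    for inv in investments:
        match = by_host.get(normalize_url(inv["company_website"]))
        if match is not None:
            merged.append({**inv, "symbol": match["symbol"]})
    return merged


# Toy records (hypothetical): one row per investor participating in a round.
projects = [
    {"symbol": "AAA", "website": "https://www.alpha.example/"},
    {"symbol": "BBB", "website": "beta.example"},
]
investments = [
    {"round_id": 1, "investor": "Fund X", "company_website": "http://alpha.example"},
    {"round_id": 1, "investor": "Fund Y", "company_website": "http://alpha.example/about"},
    {"round_id": 2, "investor": "Fund X", "company_website": "https://gamma.example"},
]
merged = merge_on_url(projects, investments)

# Summary counts analogous to those reported for the merged dataset.
n_investments = len(merged)
n_rounds = len({m["round_id"] for m in merged})
n_investors = len({m["investor"] for m in merged})
n_projects = len({m["symbol"] for m in merged})
```

Matching on the host alone is deliberately permissive; a stricter key (for example, host plus path) would avoid collisions between projects hosted under a shared domain, at the cost of missing cosmetic URL variants.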