3.1 Traditional data trading
Zaslavsky et al. propose the Sensing as a Service (SaaS) model, in which sensor data is provided for a fee [4]. The authors identify three main participants within this model: the owners of the sensors, mediators that publish and manage sensors, and the data consumers. Notably, the SaaS model does not itself constitute a data marketplace but can be seen as the underlying principle of data marketplaces: bringing together different parties to exchange data for monetary value.
The authors of [20] conducted an initial survey of data marketplaces in 2013. They define a data marketplace as “a platform on which anybody (or at least a great number of potentially registered clients) can upload and maintain data sets”. They examined data providers along twelve dimensions: type, time frame, domain, data origin, pricing model, data access, data output, language, target audience, trustworthiness, size of vendor, and maturity. Each dimension captures a different property that distinguishes data vendors from one another. In the following discussion, we focus on one specific type, namely IoT data marketplaces.
Mišura and Žagar propose a model for a centralized IoT data marketplace [6]. The authors argue that existing data marketplaces on the Internet, such as the discontinued Microsoft Azure Data Market [21] and Salesforce’s data.com, are not well suited to IoT data because they only offer collections of data that have been acquired previously. Instead, IoT data marketplaces require a dedicated design that takes the characteristics of data generated by IoT devices into account. In the proposed solution, IoT devices and data consumers register at the marketplace through a Web interface. Sensor measurements are saved in a database. To keep battery usage low, devices never communicate directly with consumers. Instead, consumers retrieve data through a query mechanism: the system returns the devices that satisfy the query, and the consumer can accept them. Although this solution follows a centralized approach, it gives important insights into design issues for IoT data marketplaces.
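The registration-and-query flow described above can be illustrated with a minimal Python sketch. The `Device` fields, the `Marketplace` class, and the predicate-based `query` method are hypothetical names chosen for illustration; they are assumptions and are not taken from the model of Mišura and Žagar.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    # Hypothetical device description; the field names are assumptions.
    device_id: str
    sensor_type: str
    location: str
    measurements: list = field(default_factory=list)


class Marketplace:
    """Central store: devices push measurements here, consumers only query it."""

    def __init__(self):
        self.devices = {}

    def register(self, device):
        self.devices[device.device_id] = device

    def store(self, device_id, value):
        # Devices write to the central database instead of talking to consumers.
        self.devices[device_id].measurements.append(value)

    def query(self, predicate):
        # Return every registered device that satisfies the consumer's query.
        return [d for d in self.devices.values() if predicate(d)]


market = Marketplace()
market.register(Device("d1", "temperature", "city-centre"))
market.register(Device("d2", "humidity", "city-centre"))
market.store("d1", 21.5)

matches = market.query(lambda d: d.sensor_type == "temperature")
```

Here the consumer never contacts `d1` directly; it only reads what the marketplace has stored, which mirrors the battery-saving argument made above.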
In [3], Cao et al. discuss problems similar to those identified in [6]. The authors develop a new data marketplace design specifically for real-time human sensing data that encourages data providers to publish their data. The requirements include effective near real-time distribution of data, push and pull communication, data contract management, quality monitoring, and an API for providers and consumers. The result is an architecture consisting of several services. Data providers use the management services to offer their data on the data marketplace, and a data discovery service allows data consumers to find specific offers. The data is delivered from producers to consumers through databuses. These databuses are monitored by a data quality analysis service, which ensures credibility, and by a payment service.
The authors of [22] also propose a centralized marketplace platform for the IoT, called I3. The platform provides a user interface for each user group, a registry, a recommendation system, a rating system, billing and payment mechanisms, real-time data routing, historical data access, authorization, metering, privacy, support for different data formats, and APIs. The registry stores important information about all participants of the data marketplace as well as the offered data streams. A recommendation system matches buyers to data streams, giving buyers a reliable way to find the data they need. Trust is built through a rating system in which buyers and sellers rate each other. Billing and payments can be realized in various ways; the system uses data access and metering mechanisms to determine the exact price of the data that was traded. Real-time data routing is realized through a data routing middleware, e.g., message brokers, where devices publish generated data on specific topics and authorized data consumers subscribe to these topics. When data consumers need historical rather than real-time data, they can access a data repository for historical data.
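To make the topic-based routing with authorization and metering more concrete, the following Python sketch models an in-memory message broker. It is an illustrative assumption rather than the implementation used in [22]; the `Broker` class, its method names, and the topic string are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy publish/subscribe broker with per-topic authorization and metering."""

    def __init__(self):
        self.authorized = defaultdict(set)    # topic -> consumer ids allowed to subscribe
        self.subscribers = defaultdict(dict)  # topic -> {consumer id: callback}
        self.metered = defaultdict(int)       # consumer id -> number of delivered messages

    def authorize(self, topic, consumer_id):
        # In a marketplace this step would follow a successful purchase.
        self.authorized[topic].add(consumer_id)

    def subscribe(self, topic, consumer_id, callback):
        if consumer_id not in self.authorized[topic]:
            raise PermissionError(f"{consumer_id} may not subscribe to {topic}")
        self.subscribers[topic][consumer_id] = callback

    def publish(self, topic, message):
        # Deliver to every subscriber and count deliveries for later billing.
        for consumer_id, callback in self.subscribers[topic].items():
            callback(message)
            self.metered[consumer_id] += 1


broker = Broker()
received = []
broker.authorize("sensors/temperature", "alice")
broker.subscribe("sensors/temperature", "alice", received.append)
broker.publish("sensors/temperature", 21.5)
broker.publish("sensors/temperature", 21.7)
```

The `metered` counter stands in for the metering mechanism that a billing component could read to compute the price of the traded data.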
Looking at traditional data trading approaches, we can identify several disadvantages of centralized data marketplaces. The data marketplace represents a single point of failure since the users are fully dependent on central components. In the event of a failure, the users can no longer use the data marketplace, which is very problematic for applications that rely heavily on the availability of certain data. Centralized data marketplaces also come with trust and privacy issues since users have to deal with a centralized authority that is capable of manipulating the market and accessing the data. Among other things, these issues motivate exploring the application of blockchain technology in this research area.
3.3 Blockchain-based data marketplaces
Wörner and Von Bomhard [26] propose a system in which sensors offer their measurements in exchange for electronic cash. The authors propose a decentralized infrastructure that allows data to be exchanged on a worldwide data market. The system consists of a sensor client, a requester client, and a sensor repository. The sensor client is the provider of the data and responds to payments by creating and publishing transactions containing the purchased data on the blockchain. A requester client wants to consume data and pays the sensor by issuing Bitcoin transactions. The sensors that provide the data are registered in the sensor repository so that requesters can find them. Entries in the sensor repository contain basic information about the offered product and additional metadata. However, the system faces scalability issues because the data is stored on the blockchain and transmission is limited by the throughput of the blockchain.
Missier et al. [27] propose an architecture for IoT traffic metering and contract compliance. The architecture is built on top of a brokered IoT data infrastructure: the authors assume that communication is mediated by message brokers that exchange data between IoT devices. Buyers and sellers agree upon specific topics, which are accessible to the buyers for a certain time frame. Contract enforcement and settlement are realized through traffic cubes, in which all or part of the data flows are recorded. Because the broker decouples the communication between producer and consumer, producers and consumers can only construct unilateral traffic cubes, whereas the broker has a complete view of all data flows. The cubes are published through transactions on the blockchain to ensure transparency and are then checked for consistency for the settlement. The design focuses mainly on decentralized metering of IoT data and trustless contract settlement. However, the solution does not provide a platform to discover and select products. Further, it does not specify an incentive for a third party to operate a broker.
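The consistency check between unilateral traffic cubes can be sketched as follows. The cube layout used here, a message count per topic and time window, is an assumption made for illustration; this section does not specify the actual cube dimensions used in [27], and the function names are hypothetical.

```python
from collections import Counter


def build_cube(records, window_seconds=60):
    """Aggregate (topic, timestamp) records into message counts per topic and time window."""
    cube = Counter()
    for topic, timestamp in records:
        cube[(topic, timestamp // window_seconds)] += 1
    return cube


def disagreements(cube_a, cube_b):
    """Return the cells in which two independently built cubes report different counts."""
    cells = set(cube_a) | set(cube_b)
    return {
        cell: (cube_a.get(cell, 0), cube_b.get(cell, 0))
        for cell in cells
        if cube_a.get(cell, 0) != cube_b.get(cell, 0)
    }


# The producer claims three messages were sent; the consumer reports only two.
producer_cube = build_cube([("temp", 5), ("temp", 30), ("temp", 70)])
consumer_cube = build_cube([("temp", 5), ("temp", 70)])

conflict = disagreements(producer_cube, consumer_cube)
```

In this toy run the two parties disagree about the first time window only, which is the kind of inconsistency a settlement step would have to resolve.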
Building upon the previous work by Missier et al. [27], Bajoudah et al. [28] create a new design for an IoT data marketplace that does not require reports from the participants. Instead, the participants exchange data receipts at regular intervals. First, a producer and a consumer create a trade agreement that defines the conditions of the trade and that both parties need to sign. Then, the consumer can subscribe to and receive a data stream. After receiving a batch of data, the consumer sends a data receipt to the smart contract to confirm it. If the consumer acknowledges the receipt of the data, the provider continues with the transmission. Otherwise, the provider terminates the trade agreement. Again, the authors focus mainly on data trading and do not consider other functionality needed to make the data marketplace applicable in the real world.
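The receipt-driven trading loop described above can be sketched in a few lines of Python. This is a simplified simulation under stated assumptions: the `TradeAgreement` class stands in for the smart contract, party names stand in for cryptographic signatures, and all identifiers are hypothetical rather than taken from [28].

```python
class TradeAgreement:
    """Stand-in for the smart contract that holds the agreement and the receipts."""

    def __init__(self, producer, consumer):
        self.parties = {producer, consumer}
        self.signed_by = set()
        self.receipts = []
        self.terminated = False

    def sign(self, party):
        if party not in self.parties:
            raise ValueError("unknown party")
        self.signed_by.add(party)

    @property
    def active(self):
        # The trade only runs once both parties have signed and nobody has terminated it.
        return self.signed_by == self.parties and not self.terminated

    def record_receipt(self, batch_number):
        self.receipts.append(batch_number)

    def terminate(self):
        self.terminated = True


def run_trade(agreement, batches, consumer_acknowledges):
    """Send batches one by one; stop as soon as a batch is not acknowledged."""
    delivered = []
    for number, batch in enumerate(batches):
        if not agreement.active:
            break
        delivered.append(batch)
        if consumer_acknowledges(number):
            agreement.record_receipt(number)
        else:
            agreement.terminate()
    return delivered


agreement = TradeAgreement("producer", "consumer")
agreement.sign("producer")
agreement.sign("consumer")
# The consumer acknowledges only the first batch.
delivered = run_trade(agreement, ["batch-0", "batch-1", "batch-2"], lambda n: n == 0)
```

The sketch shows why the provider's exposure is bounded: at most one unacknowledged batch is sent before the agreement is terminated.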
The authors of [5] propose a three-tier framework for a dynamic and decentralized IoT data marketplace. The proposed framework involves three major participants: data providers, data consumers, and brokers. Providers and consumers perform the same roles as in most other works on data marketplaces, with the provider supplying data from IoT devices and the consumer buying the published data of interest. The role of the broker is taken on by a highly resourced device. Its task is to mediate the data trading process by providing registration, discovery, and selection functionality. The broker is responsible for matching buyers with sellers and is incentivized by a fee-based system. Data trading is based on smart contracts, and the participants can negotiate with each other before proceeding with the contract. Further, the concept incorporates a reputation framework, encrypted data transmissions, and a settlement solution that checks the quality of the trade. While this concept provides very extensive functionality, it has not been implemented. In the work at hand, we partially apply the fundamental considerations of [5].
In [29], the authors propose another IoT data marketplace and introduce a mechanism for energy-aware demand selection and allocation. The data marketplace includes an agreement framework, a pricing model, and a rating system implemented as smart contracts on the Ethereum platform. The mechanism for demand selection and allocation enables sellers to optimize their revenue by matching buyers’ demands with the sellers’ IoT devices, subject to the devices’ energy, quality, and allocation constraints. Unfortunately, there is no real distinction between devices and products, and the solution does not provide a user interface.
Ramachandran et al. [30] present a decentralized data marketplace that uses blockchain technology to facilitate trust and transparency. Product information is stored using the IPFS, which is a DFS. The metadata is saved on the blockchain, and the buyer uses both the blockchain and the DFS to find the desired product. The implementation uses the previously mentioned SDPP, and after each trade, buyers and sellers can rate each other. Smart contracts on the Ethereum blockchain are used for the registration of products as well as for the rating. The authors make an effort to consider functionality that a data marketplace needs besides data trading. Yet, the presented solution offers neither negotiations nor a user interface.
Another data marketplace based on blockchain technology is proposed by Banerjee and Ruj [31]. The authors describe a blockchain-enabled data marketplace that fulfills fairness, efficiency, security, privacy, and adherence to regulations. In the design, the blockchain serves as a trusted third party that ensures these properties. Fairness is ensured by a FairSwap protocol, in which buyer and seller agree on a condition expressed as a Boolean circuit. A judge contract manages the funds until the buyer agrees to the trade or raises a valid complaint. Transparency, security, and privacy are provided by the distributed public ledger; the data is encrypted and can be verified using zero-knowledge proofs or proofs of misbehavior. Regulations for data are likewise described by a predicate, which is used to verify whether the regulations are followed. The authors also provide suggestions for efficiently defining and verifying the condition and regulation predicates but note that this remains a difficult hurdle. However, they do not mention how to discover other participants or data products. Furthermore, no other features, such as price negotiation, are offered since the authors focus on data trading.
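The escrow role of the judge contract can be illustrated with a deliberately simplified Python sketch. It models only the fund-handling logic: the agreed condition is an ordinary Python predicate rather than a Boolean circuit, and the encryption and proof-of-misbehavior steps of the actual protocol are omitted. The `JudgeContract` class and the `in_range` condition are hypothetical.

```python
class JudgeContract:
    """Holds the buyer's payment until the trade is accepted or validly disputed."""

    def __init__(self, price, condition):
        self.price = price
        self.condition = condition  # predicate both parties agreed on beforehand
        self.escrow = 0
        self.balances = {"buyer": 0, "seller": 0}

    def deposit(self, amount):
        if amount != self.price:
            raise ValueError("deposit must equal the agreed price")
        self.escrow = amount

    def settle(self, delivered, complaint=False):
        # A complaint is valid only if the delivered data violates the condition.
        complaint_valid = complaint and not self.condition(delivered)
        recipient = "buyer" if complaint_valid else "seller"
        self.balances[recipient] += self.escrow
        self.escrow = 0
        return recipient


def in_range(readings):
    # Assumed example condition: every reading lies between 0 and 100.
    return all(0 <= r <= 100 for r in readings)


honest = JudgeContract(price=5, condition=in_range)
honest.deposit(5)
honest_outcome = honest.settle([20, 35, 50], complaint=True)  # unfounded complaint

cheated = JudgeContract(price=5, condition=in_range)
cheated.deposit(5)
cheated_outcome = cheated.settle([20, 999], complaint=True)   # valid complaint
```

An unfounded complaint therefore does not block payment to the seller, while a valid one returns the escrowed funds to the buyer.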
In [15], the authors introduce an IoT collectability marketplace model for trading the collectability of IoT devices, where collectability refers to the capability of an IoT device to collect and transmit real-time data to a specific destination. The model consists of four conceptual layers: the sensor owner, sensing provider, blockchain/marketplace, and data consumer layers. The main responsibility of the blockchain/marketplace layer is to record all transactions and exchanges and to provide a reputation system by monitoring the service quality level. However, the marketplace lacks an incentive system that encourages honest behavior among the participants.
Another decentralized marketplace, aimed at a machine-to-machine economy for smart cities, is proposed in [16]. The marketplace uses the IOTA tangle to enable consumers to buy and monitor different data streams. The architecture consists of two main applications: a Google Chrome extension and a program that collects and sends the data. The former enables users to interact with the marketplace and buy data streams. The latter collects sensor data and sends it to consumers using MAM, a second-layer communication protocol. As a result, performance depends on the IOTA tangle. Further, the marketplace does not incorporate a rating system.
Zan et al. [17] also designed a decentralized data marketplace for the machine-to-machine economy. Similar to the work in [16], the authors use the IOTA tangle and MAM to transfer the data. Additionally, a registration contract on the Ethereum platform maintains a lookup table of the participants, and product information is stored on the IPFS. To buy a data product, a consumer has to pay a subscription fee to the purchase contract, which then adds the consumer to the consumer list. Further, this approach uses brokers to handle the data trading process and publish the data streams. The marketplace also offers a voting-based refund policy to protect data providers and consumers from misuse. Again, the performance of the data transfer depends on the IOTA tangle, and the solution does not include a rating system.
A different design of a blockchain-enabled data-sharing platform is described by Sharma et al. [32]. In contrast to other approaches, the authors base the architecture on microservices to achieve better extensibility through modularization and strong system boundaries. Basic services implement the functional requirements, whereas extended services implement non-functional requirements. The data marketplace enables the sale of datasets, which are stored in the IPFS and offered through the data marketplace. A smart contract verifies the payment and the hash of the dataset and provides the consumer with access to the dataset. Through the extended services, users can also use other storage options, automatically generate metadata, or choose to check the quality of the data. However, it is not possible to trade real-time streaming data.
The authors of [33] propose an IoT data marketplace based on blockchain technology in which devices upload datasets to Swarm (a decentralized storage system). Consumers can query the smart contract to search for data from specific sensors and pay for it through payment channels. IoT data uploaded to Swarm is encrypted with a symmetric key to prevent unauthorized users from retrieving the data. After a dataset has been purchased, the parties exchange the symmetric key, and the consumer can retrieve the data via the specified file handle. Further, a voting system enables participants to rank the data sources. Unfortunately, this solution also does not consider the sale of real-time streaming data.
In [34], the authors examine access control policies that are stored and managed on the blockchain and used to evaluate access requests in an IoT data marketplace. To this end, the authors propose a data marketplace and two data sharing schemes. One scheme is based on ACLs, while the other is based on prefix decryption keys. The data is stored in the cloud, which is responsible for enforcing the access policy. Basic data trading functions for creating and verifying metadata and for creating and accepting offers are implemented through smart contracts. While this work offers two interesting data sharing schemes, it is also not suitable for real-time streaming data.
Badreddine et al. [35] propose a real-time IoT data-sharing framework that combines the MQTT protocol with Ethereum smart contracts. The smart contracts automatically calculate the bills based on the proposed monetization approach, and the subscribers pay them in the native cryptocurrency. Further, the authors present three blockchain-based traceability solutions for recording publish and subscribe operations. These solutions differ in their level of traceability: the lowest level enables only unreliable verification, whereas the highest level incurs very high gas costs. The framework mainly relies on tracking data and does not apply a rating mechanism. Further, it does not provide a mechanism to search for specific products.
In [36], the authors present a privacy-aware data marketplace that provides data privacy, output verifiability, and atomicity of payments. The model of the data marketplace consists of data generators, data brokers, and data consumers. Data brokers pay data generators for their data, compute functions over the data, and sell the results to data consumers. By using multi-client functional encryption, the data marketplace ensures that data brokers are not able to learn any information about the transmitted data. Further, with the application of ZKP, the data marketplace ensures that data generators cannot alter the data and that consumers can verify the received result. While this work offers an interesting way to provide the above-mentioned properties, it does not allow dynamic changes in the set of data generators and focuses on the privacy aspect of data marketplaces.
Meijers et al. [37] propose another blockchain-based IoT data marketplace that uses an optimized data trading mechanism to lower costs by reducing the number of transactions. Consumers and producers agree off-chain on the conditions of a trade. A consumer has to return a receipt to the provider to acknowledge that it received the data. The provider can submit a receipt to a smart contract that escrows the funds for the trade. Additionally, the data marketplace applies a credit system in which producers grant consumers credit to reduce costs. This credit allows the producer to send data for a longer period without the consumer having to make a new deposit. As the authors themselves note, the work’s main goal is to present a blockchain-based IoT data trading mechanism. Other functionality would still need to be added to make it a full-fledged data marketplace.
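The effect of off-chain receipts and credit on the number of on-chain transactions can be illustrated with the following Python sketch. The cumulative-receipt scheme, the pricing per data unit, and all names (`EscrowContract`, `units_provider_will_send`) are assumptions made for illustration and are not taken from [37].

```python
class EscrowContract:
    """On-chain part: holds the deposit and pays out against the latest receipt."""

    def __init__(self, deposit):
        self.deposit = deposit
        self.paid_out = 0
        self.transactions = 1  # the deposit itself is one on-chain transaction

    def claim(self, acknowledged_units, price_per_unit):
        # One on-chain transaction settles everything acknowledged so far,
        # capped by the funds that are actually escrowed.
        self.transactions += 1
        owed = acknowledged_units * price_per_unit
        payment = max(min(owed, self.deposit) - self.paid_out, 0)
        self.paid_out += payment
        return payment


def units_provider_will_send(deposit, credit, price_per_unit):
    """The provider keeps sending until the deposit plus the granted credit is used up."""
    return (deposit + credit) // price_per_unit


escrow = EscrowContract(deposit=10)
receipts = []
for unit in range(units_provider_will_send(deposit=10, credit=4, price_per_unit=2)):
    # Off-chain: the consumer acknowledges each unit with a cumulative count.
    receipts.append(unit + 1)

payment = escrow.claim(acknowledged_units=receipts[-1], price_per_unit=2)
```

In this toy run, seven data units are exchanged with only two on-chain transactions; the two units sent on credit remain unpaid until the consumer makes a new deposit.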
As can be seen, several different data marketplace solutions have been proposed. However, only very few consider a range of functionality broad enough to make the data marketplace applicable in real-world scenarios. Most of these solutions focus on data trading and neglect other important aspects. Although data trading is the main element of a data marketplace, a data marketplace that offers data trading alone cannot be operated. Nevertheless, these works offer approaches that are helpful for the development of a decentralized data marketplace and, therefore, should be considered more closely. The consideration of other functionality (see Sect. 4) is very important because only the interplay of these functionalities enables a data marketplace. A marketplace has to offer user management, device management, product management, negotiation, discovery, selection, routing, settlement, rating, monitoring, and web access. Furthermore, when designing the data marketplace, particular attention must be paid to the fact that the IoT has different characteristics than the Internet, which must be reflected in the design. Therefore, we create a blockchain-based IoT data marketplace that incorporates a wide range of functionality and takes the shortcomings of existing solutions into account.