ABSTRACT
JSON, the most popular data format for sending API requests and responses, still lacks a standardized schema or meta-data definition that allows developers to specify the structure of JSON documents. JSON Schema is an attempt to provide a general-purpose schema language for JSON, but it is still a work in progress, and the formal specification has not yet been agreed upon. Why this could be a problem becomes evident when examining the behaviour of the numerous tools for validating JSON documents against this initial schema proposal: although they agree on most general cases, they tend to differ significantly when presented with the greyer areas of the specification. In this paper we provide the first formal definition of the syntax and semantics of JSON Schema and use it to show that implementing this layer on top of JSON is feasible in practice. This is done both by analysing the theoretical aspects of the validation problem and by showing how to set up and validate a JSON Schema for Wikidata, the central storage for Wikimedia.
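To make the validation problem concrete, the following is a minimal sketch of checking a JSON document against a schema. It assumes a tiny subset of JSON Schema keywords (`type`, `required`, `properties`) and is not the formal semantics defined in the paper. The function name `validate` and the field names `id` and `labels` are hypothetical, chosen only to resemble a Wikidata-style record.

```python
import json


def validate(doc, schema):
    """Return True if doc satisfies an illustrative subset of JSON Schema.

    Only the keywords "type", "required" and "properties" are handled;
    everything else in the schema is ignored (an assumption of this sketch).
    """
    type_checks = {
        "object": lambda d: isinstance(d, dict),
        "array": lambda d: isinstance(d, list),
        "string": lambda d: isinstance(d, str),
        # bool is a subclass of int in Python, so exclude it explicitly
        "number": lambda d: isinstance(d, (int, float)) and not isinstance(d, bool),
        "boolean": lambda d: isinstance(d, bool),
        "null": lambda d: d is None,
    }

    expected = schema.get("type")
    if expected is not None and not type_checks[expected](doc):
        return False

    if isinstance(doc, dict):
        # Every key listed under "required" must be present.
        for key in schema.get("required", []):
            if key not in doc:
                return False
        # Keys that are present must satisfy their sub-schema, recursively.
        for key, sub_schema in schema.get("properties", {}).items():
            if key in doc and not validate(doc[key], sub_schema):
                return False

    return True


# Hypothetical schema and documents, not the Wikidata schema from the paper.
schema = json.loads(
    '{"type": "object", "required": ["id"],'
    ' "properties": {"id": {"type": "string"}, "labels": {"type": "object"}}}'
)
good = json.loads('{"id": "Q42", "labels": {}}')
bad = json.loads('{"id": 42}')

print(validate(good, schema))  # True
print(validate(bad, schema))   # False
```

Even in this small subset an implementer must make choices that the schema text alone does not dictate, such as whether a boolean counts as a number. Choices of this kind are where a formal definition is useful.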
Index Terms
- Foundations of JSON Schema