ABSTRACT
Document stores have become some of the most popular NoSQL systems, mainly due to their semi-structured data model and well-developed query capabilities. Their semi-structured nature permits database designs that go beyond traditional normalization theory, which complicates design decisions by opening up a myriad of possibilities. As a result, database design for document stores has resorted to ad-hoc trial-and-error methods. However, a good database design is essential to any data storage system’s performance, and bad design decisions cannot always be compensated for by adding more powerful hardware. In this work, we therefore propose DocDesign, a decision aid tool for document store database design. DocDesign allows its users to evaluate different database designs for given data storage requirements under a particular workload. Through DocDesign, users can make informed decisions about a design by evaluating its estimated storage statistics and query runtimes without testing it on an actual document store. DocDesign also generates design-specific queries for the input workload. This not only cuts down the time and effort spent on design decision-making and development but also saves money spent on fixing poor designs in the long run. On-site, we will showcase how DocDesign facilitates the design decision-making process for MongoDB with both synthetic and real-world examples.