Abstract
At UC Irvine, we are building a next-generation parallel database system, called ASTERIX, as our approach to addressing today's "Big Data" management challenges. ASTERIX aims to combine time-tested principles from parallel database systems with those of the Web-scale computing community, such as fault tolerance for long-running jobs. In this demo, we present a whirlwind tour of ASTERIX, highlighting a few of its key features. We will demonstrate examples of our data definition language for modeling semi-structured data, and examples of interesting queries written in our declarative query language. In particular, we will show the capabilities of ASTERIX for answering geo-spatial queries and fuzzy queries, as well as ASTERIX's data feed construct for continuously ingesting data.
Index Terms
- ASTERIX: an open source system for "Big Data" management and analysis (demo)