Abstract
Over the last decade, the content stored on computer storage systems has evolved from relational to semi-structured, i.e., unstructured data accompanied by relational metadata. Average data volumes have grown from a few hundred megabytes to hundreds of terabytes. Data feed rates have risen at the same time, driven by increases in processor, storage, and network bandwidth. Data growth trends appear to follow Moore's law, implying exponential growth in content volumes and rates in the years to come. Data management systems will therefore need to offer unlimited scalability in execution, availability, recoverability, and storage usage for semi-structured content.
Traditionally, filesystems have been preferred over database management systems for storing unstructured data, while databases have been the preferred choice for managing relational data. The lack of a consolidated architecture for managing semi-structured content compromises security, availability, recoverability, and manageability, among other features. We introduce a system without compromises, the Oracle SecureFiles System, designed to provide highly scalable storage and access execution of unstructured and structured content as first-class objects within the Oracle relational database management system. Oracle SecureFiles breaks the performance barrier that has kept such content out of databases. The architecture maximizes storage utilization through compression and de-duplication, and achieves robustness by preserving the transactional atomicity, durability, availability, read-consistent queryability, and security of the database management system.