Abstract
Azure SQL Database and the upcoming release of SQL Server introduce a novel database recovery mechanism that combines traditional ARIES recovery with multi-version concurrency control to achieve database recovery in constant time, regardless of the size of user transactions. Additionally, our algorithm enables continuous transaction log truncation, even in the presence of long-running transactions, thereby allowing large data modifications using only a small, constant amount of log space. These capabilities are particularly important for any Cloud database service given (a) the constantly increasing database sizes, (b) the frequent failures of commodity hardware, (c) the strict availability requirements of modern, global applications, and (d) the fact that software upgrades and other maintenance tasks are managed by the Cloud platform, introducing unexpected failures for the users. This paper describes the design of our recovery algorithm and demonstrates how it allowed us to improve the availability of Azure SQL Database by guaranteeing consistent recovery times of under 3 minutes for 99.999% of recovery cases in production.