DOI: 10.1145/2367589.2367605
research-article

Understanding data survivability in archival storage systems

Published: 04 June 2012

ABSTRACT

Preserving data over long periods in the face of faults, large and small, is crucial to the design of reliable archival storage systems. However, the survivability of data differs from the reliability of storage, because data are typically stored on more than one storage system at any given moment. Previous reliability studies ignore data survivability. We present a framework that relates data survivability to storage reliability, and use it to gauge the impact of rare but large-scale events on data survivability. We also present a method for tracking, without interruption over the whole lifetime of the data, all copies of the data and the condition of all the online and offline media, devices, and systems on which they are stored. With this method, the survivability of the data can be closely monitored and potential dangers handled in a timely manner. A better understanding of data survivability can help eliminate unnecessary data replicas and thus reduce cost.
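
The distinction between storage reliability and data survivability can be illustrated with a small numerical sketch. This is not the paper's framework. It is a toy model under stated assumptions: each copy is lost independently with a given probability over some period, and a single rare large-scale event destroys every copy in a chosen set with certainty. The function name `survivability` and the parameters `copy_loss_probs`, `event_prob`, and `affected` are hypothetical names introduced only for this illustration.

```python
from math import prod


def survivability(copy_loss_probs, event_prob=0.0, affected=()):
    """Probability that at least one copy of the data survives the period.

    Assumptions of this toy model (not taken from the paper):
    - copy i is lost independently with probability copy_loss_probs[i];
    - with probability event_prob, one large-scale event occurs and
      destroys every copy whose index is listed in `affected`.
    """
    affected = set(affected)

    # If the event occurs, affected copies are certainly lost, so the data
    # are lost exactly when all remaining copies also fail.
    remaining = [p for i, p in enumerate(copy_loss_probs) if i not in affected]
    loss_with_event = prod(remaining)  # prod([]) == 1: every copy was affected

    # If the event does not occur, all copies must fail independently.
    loss_without_event = prod(copy_loss_probs)

    loss = event_prob * loss_with_event + (1.0 - event_prob) * loss_without_event
    return 1.0 - loss


if __name__ == "__main__":
    probs = [0.1, 0.1]  # two copies, each lost with probability 0.1
    print(survivability(probs))                                   # independent failures only
    print(survivability(probs, event_prob=0.01, affected=(0, 1))) # both copies co-located
    print(survivability(probs, event_prob=0.01, affected=(0,)))   # one copy off-site
```

In this model the per-copy loss probabilities are identical in all three calls. Only the exposure of the copies to the shared event changes, and that alone changes how likely the data are to survive.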


  • Published in

    SYSTOR '12: Proceedings of the 5th Annual International Systems and Storage Conference
    June 2012
    183 pages
    ISBN: 9781450314480
    DOI: 10.1145/2367589

    Copyright © 2012 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate: 94 of 285 submissions, 33%
