ABSTRACT
Erasure coding can reduce the space and band width overheads of redundancy in fault-tolerant data storage and delivery systems. But it introduces the fundamental difficulty of ensuring that all erasure-coded fragments correspond to the same block of data. Without such assurance, a different block may be reconstructed from different subsets of fragments. This paper develops a technique for providing this assurance without the bandwidth and computational overheads associated with current approaches. The core idea is to distribute with each fragment what we call homomorphic fingerprints. These fingerprints preserve the structure of the erasure code and allow each fragment to be independently verified as corresponding to a specific block. We demonstrate homomorphic fingerprinting functions that are secure, efficient, and compact.
- M. K. Aguilera, R. Janakiraman, and L. Xu. Using erasure codes efficiently for storage in a distributed system. In Proceedings of the International Conference on Dependable Systems and Networks, pages 336--345. IEEE Computer Society, 2005. Google ScholarDigital Library
- M. Bellare, O. Goldreich, and S. Goldwasser. Incremental cryptography: The case of hashing and signing. In Advances in Cryptology - CRYPTO '94, pages 216--233. Springer-Verlag, 1994. Google ScholarCross Ref
- M. Bellare and P. Rogaway. Random oracles are practical: A paradigm for designing efficient protocols. In Proceedings of the 1st ACM Conference on Computer and Communications Security, pages 62--73. ACM Press, 1993. Google ScholarDigital Library
- A. Z. Broder. Some applications of Rabin's fingerprinting method. Sequences II: Methods in Communications, Security, and Computer Science, pages 143--152, 1993.Google Scholar
- C. Cachin and S. Tessaro. Asynchronous verifiable information dispersal. In Proceedings of the 24th IEEE Symposium on Reliable Distributed Systems, pages 191--202. IEEE Press, 2005. Google ScholarDigital Library
- C. Cachin and S. Tessaro. Optimal resilience for erasure-coded Byzantine distributed storage. In Proceedings of the International Conference on Dependable Systems and Networks, pages 115--124. IEEE Computer Society, 2006. Google ScholarDigital Library
- J. L. Carter and M. N. Wegman. Universal classes of hash functions (extended abstract). In Proceedings of the 9th ACM Symposium on Theory of Computing, pages 106--112. ACM Press, 1977. Google ScholarDigital Library
- M. Castro, P. Druschel, A. Kermarrec, A. Nandi, A. Rowstron, and A. Singh. Splitstream: High-bandwidth multicast in cooperative environments. In Proceedings of the 19th ACM Symposium on Operating Systems Principles, pages 298--313. ACM Press, 2003. Google ScholarDigital Library
- B. Chor, S. Goldwasser, S. Micali, and B. Awerbuch. Verifiable secret sharing and achieving simultaneity in the presence of faults. In Proceedings of the 26th Annual Symposium on the Foundations of Computer Science, pages 383--395. IEEE Press, 1985.Google ScholarDigital Library
- P. Feldman. A practical scheme for non-interactive verifiable secret sharing. In Proceedings of the 28th Annual Symposium on the Foundations of Computer Science, pages 427--437. IEEE Press, 1987.Google ScholarDigital Library
- B. Gladman. SHA1, SHA2, HMAC and key derivation in C. http://fp.gladman.plus.com/cryptography technology/sha.Google Scholar
- L. Gong. Securely replicating authentication services. In Proceedings of the 9th International Conference on Distributed Computing Systems, pages 85--91. IEEE Computer Society, 1989.Google ScholarCross Ref
- G. R. Goodson, J. J. Wylie, G. R. Ganger, and M. K. Reiter. Efficient Byzantine-tolerant erasure-coded storage. In Proceedings of the International Conference on Dependable Systems and Networks, pages 135--144. IEEE Computer Society, 2004. Google ScholarDigital Library
- G. R. Goodson, J. J. Wylie, G. R. Ganger, and M. K. Reiter. The safety and liveness properties of a protocol family for versatile survivable storage infrastructures. Technical Report CMU-PDL-03-105, Parallel Data Laboratory, Carnegie Mellon University, 2004.Google Scholar
- H. Krawczyk. Distributed fingerprints and secure information dispersal. In Proceedings of the 12th ACM Symposium on Principles of Distributed Computing, pages 207--218. ACM Press, 1993. Google ScholarDigital Library
- H. Krawczyk. LFSR-based hashing and authentication. In Advances in Cryptology - CRYPTO '94, pages 129--139. Springer-Verlag, 1994. Google ScholarCross Ref
- M. N. Krohn, M. J. Freedman, and D. Mazieres. On-the-fly verification of rateless erasure codes for efficient content distribution. In Proceedings of the IEEE Symposium on Security and Privacy. IEEE Press, 2004.Google ScholarCross Ref
- J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, C. Wells, and B. Zhao. Oceanstore: An architecture for global-scale persistent storage. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 190--201. ACM Press, 2000. Google ScholarDigital Library
- P. Maymounkov. Online codes. Technical Report TR2002-833, Secure Computer Systems Group, New York University, 2002.Google Scholar
- K. Mehlhorn and U. Vishkin. Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories. Acta Informatica, 21(4):339--374, 1984. Google ScholarDigital Library
- M. Naor, B. Pinkas, and O. Reingold. Distributed pseudo-random functions and KDCs. In Advances in Cryptology - EUROCRYPT '99, pages 327--346. Springer-Verlag, 1999.Google ScholarCross Ref
- W. Nevelsteen and B. Preneel. Software performance of universal hash functions. In Advances in Cryptology -EUROCRYPT '99, pages 24--41. Springer-Verlag, 1999.Google ScholarCross Ref
- T. Pedersen. Distributed provers with applications to undeniable signatures. In Advances in Cryptology -EUROCRYPT '91, pages 221--242. Springer-Verlag, 1991.Google ScholarCross Ref
- M. O. Rabin. Fingerprinting by random polynomials. Technical Report TR-15-81, Center for Research in Computing Technology, Harvard University, 1981.Google Scholar
- M. O. Rabin. Efficient dispersal of information for security, load balancing, and fault tolerance. Journal of the ACM, 36(2):335--348, 1989. Google ScholarDigital Library
- I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. SIAM Journal of Applied Mathematics, 8:300--304, 1960.Google ScholarCross Ref
- P. Rogaway and T. Shrimpton. Cryptographic hash-function basics: Definitions, implications, and separations for preimage resistance, second-preimage resistance, and collision resistance. In Proceedings of the 11th International Workshop on Fast Software Encryption. Springer-Verlag, 2004.Google ScholarCross Ref
- Y. Saito, S. Frølund, A. Veitch, A. Merchant, and S. Spence. FAB: Building distributed enterprise disk arrays from commodity components. In Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 48--58. ACM Press, 2004. Google ScholarDigital Library
- T. Schwarz. Verification of parity data in large scale storage systems. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications. CSREA Press, 2004.Google Scholar
- V. Shoup. On fast and provably secure message authentication based on universal hashing. In Advances in Cryptology - CRYPTO '96, pages 313--328. Springer-Verlag, 1996. Google ScholarCross Ref
- M. A. Stadler. Publicly verifiable secret sharing. In Advances in Cryptology - EUROCRYPT '96, pages 190--199. Springer-Verlag, 1996.Google ScholarCross Ref
- D. Travis. On irreducible polynomials in Galois fields. The American Mathematical Monthly, 70(10):1089--1090, 1963.Google ScholarCross Ref
Index Terms
- Verifying distributed erasure-coded data
Recommendations
A Layered Architecture for Erasure-Coded Consistent Distributed Storage
PODC '17: Proceedings of the ACM Symposium on Principles of Distributed ComputingMotivated by emerging applications to the edge computing paradigm, we introduce a two-layer erasure-coded fault-tolerant distributed storage system offering atomic access for read and write operations. In edge computing, clients interact with an edge-...
Exploiting Redundancies and Deferred Writes to Conserve Energy in Erasure-Coded Storage Clusters
We present a power-efficient scheme for erasure-coded storage clusters---ECS2---which aims to offer high energy efficiency with marginal reliability degradation. ECS2 utilizes data redundancies and deferred writes to conserve energy. In ECS2 parity ...
A "hitchhiker's" guide to fast and efficient data reconstruction in erasure-coded data centers
SIGCOMM'14Erasure codes such as Reed-Solomon (RS) codes are being extensively deployed in data centers since they offer significantly higher reliability than data replication methods at much lower storage overheads. These codes however mandate much higher ...
Comments