A parallel general-purpose synthetic data generator

Published: 01 March 2007

Abstract

PSDG is a parallel synthetic data generator designed to generate "industrial sized" data sets quickly using cluster computing. PSDG depends on SDDL, a synthetic data description language that provides flexibility in the types of data we can generate.

Published in: ACM SIGMOD Record, Volume 36, Issue 1, March 2007 (60 pages)
ISSN: 0163-5808
DOI: 10.1145/1276301
Copyright © 2007 Authors
Publisher: Association for Computing Machinery, New York, NY, United States
