skip to main content
research-article

SCOUT: prefetching for latent structure following queries

Authors Info & Claims
Published:01 July 2012Publication History
Skip Abstract Section

Abstract

Today's scientists are quickly moving from in vitro to in silico experimentation: they no longer analyze natural phenomena in a petri dish, but instead they build models and simulate them. Managing and analyzing the massive amounts of data involved in simulations is a major task. Yet, they lack the tools to efficiently work with data of this size.

One problem many scientists share is the analysis of the massive spatial models they build. For several types of analysis they need to interactively follow the structures in the spatial model, e.g., the arterial tree, neuron fibers, etc., and issue range queries along the way. Each query takes long to execute, and the total time for executing a sequence of queries significantly delays data analysis. Prefetching the spatial data reduces the response time considerably, but known approaches do not prefetch with high accuracy.

We develop SCOUT, a structure-aware method for prefetching data along interactive spatial query sequences. SCOUT uses an approximate graph model of the structures involved in past queries and attempts to identify what particular structure the user follows. Our experiments with neuro-science data show that SCOUT prefetches with an accuracy from 71% to 92%, which translates to a speedup of 4x-15x. SCOUT also improves the prefetching accuracy on datasets from other scientific domains, such as medicine and biology.

References

  1. T. Achenbach et al. Accuracy of automatic airway morphometry in computed tomography-correlation of pathological findings. European Journal of Radiology, 81(1):183--188, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  2. R. Agrawal and R. Srikant. Mining sequential patterns. In ICDE, pages 3--14, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. D. Arthur et al. K-means has polynomial smoothed complexity. In FOCS, pages 405--414, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Azuma and G. Bishop. A frequency-domain analysis of head-motion prediction. In SIGGRAPH, pages 401--408, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. A. Chan, R. Lau, and A. Si. A motion prediction method for mouse-based navigation. In CGI, pages 139--146, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. S. Chen, A. Ailamaki, P. Gibbons, and T. Mowry. Improving hash join performance through prefetching. In ICDE, pages 116--127, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. J. Chim et al. On caching and prefetching of virtual objects in distributed virtual environments. In MULTIMEDIA, pages 171--180, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Choi, S. H. Noh, S. L. Min, and Y. Cho. Towards application file-level characterization of block references: A case for fine-grained buffer management. In SIGMETRICS, pages 286--295, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. T. T. Fu, S.-S. Hung, H.-L. Lin, D. Tsaih, and J. T. Chen. Intelligent-based latency reduction in 3d walkthrough. In ISTASC, pages 218--226, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. V. Gaede et al. Multidimensional access methods. ACM Computing Surveys, 30(2):170--231, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. L. Grinberg et al. Large-scale simulation of the human arterial tree. Clinical and Experimental Pharmacology and Physiology, 36(2):194--205, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  12. N. Knafla. A prefetching technique for object-oriented databases. In BNCOD, pages 154--168, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. D. Lee et al. Adaptation of a neighbor selection markov chain for prefetching tiled web gis data. In ADVIS, pages 213--222, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. S. T. Leutenegger, M. A. Lopez, and J. Edgington. STR: a Simple and Efficient Algorithm for R-tree Packing. In ICDE, pages 497--506, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. F. Li et al. On trip planning queries in spatial databases. In SSTD, pages 273--290, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. C.-K. Luk and T. C. Mowry. Compiler-based prefetching for recursive data structures. In ASPLOS, pages 222--233, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. H. Markram. The blue brain project. Nature Reviews Neuroscience, 7(2):153--160, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  18. M. McAllister and J. Snoeyink. Medial axis generalization of river networks. Cartography and Geographic Information Science, 27(2):129--138, 2000.Google ScholarGoogle ScholarCross RefCross Ref
  19. S. Minglong et al. Multimap: Preserving disk locality for multidimensional datasets. In ICDE, pages 926--935, 2007.Google ScholarGoogle Scholar
  20. A. Nanopoulos, D. Katsaros, and Y. Manolopoulos. A data mining algorithm for generalized web prefetching. IEEE Transaction on Knowledge and Data Engineering, 15(5):1155--1169, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. S. Papadomanolakis et al. Efficient query processing on unstructured tetrahedral meshes. In SIGMOD, pages 551--562, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. D.-J. Park and H.-J. Kim. Prefetch policies for large objects in a web-enabled gis application. Data and Knowledge Engineering, 37(1):65--84, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. E. Perlman, R. Burns, M. Kazhdan, R. R. Murphy, W. P. Ball, and N. Amenta. Organization of data in non-convex spatial domains. In SSDBM, pages 342--359, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. M. Rabinovich and O. Spatschek. Web caching and replication. SIGMOD Record, 32(4):107--108, 2002.Google ScholarGoogle Scholar
  25. A. Roth, A. Moshovos, and G. S. Sohi. Dependence based prefetching for linked data structures. In ASPLOS, pages 115--126, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. E. G. Said, E. B. Omar, and L. Robert. Data prefetching algorithm in mobile environments. European Journal of Scientific Research, 28(3):478--491, 2009.Google ScholarGoogle Scholar
  27. F. Tauheed et al. Accelerating range queries for brain simulations. In ICDE, pages 218--230, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. E. Thereska, M. Abd-El-Malek, J. J. Wylie, D. Narayanan, and G. R. Ganger. Informed data distribution selection in a self-predicting storage system. In ICAC '06, pages 187--198, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. A. Walker et al. A bayesian framework for automated dataset retrieval in geographic information systems. In MMM, pages 138--150, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Z. Xiaohong, F. Shengzhong, and F. Jianping. A history-based motion prediction method for mouse-based navigation in 3d digital city. In GCC, pages 686--690, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. Zhang and S. You. Dynamic tiled map services: Supporting query visualization of large-scale raster geospatial data. In COM.Geo, pages 191--198, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 5, Issue 11
    July 2012
    608 pages

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 July 2012
    Published in pvldb Volume 5, Issue 11

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader