Abstract
Data scientists rely on visualizations to interpret the data returned by queries, but finding the right visualization remains a manual task that is often laborious. We propose a DBMS that partially automates the task of finding the right visualizations for a query. In a nutshell, given an input query Q, the new DBMS optimizer will explore not only the space of physical plans for Q, but also the space of possible visualizations for the results of Q. The output will comprise a recommendation of potentially "interesting" or "useful" visualizations, where each visualization is coupled with a suitable query execution plan. We discuss the technical challenges in building this system and outline an agenda for future research.