2016 | OriginalPaper | Book chapter

Redhyte: Towards a Self-diagnosing, Self-correcting, and Helpful Analytic Platform

Written by: Wei Zhong Toh, Kwok Pui Choi, Limsoon Wong

Published in: Intelligent Information and Database Systems

Publisher: Springer Berlin Heidelberg

Abstract

We present Redhyte, an interactive platform whose name is short for “Rapid exploration of data and hypothesis testing”. Redhyte aims to augment the conventional statistical hypothesis testing framework with data-mining techniques, in a bid for more wholesome and efficient hypothesis testing. The platform is self-diagnosing (it can detect whether the user is doing a valid statistical test), self-correcting (it can propose and make corrections to the user’s statistical test), and helpful (it can search for promising or interesting hypotheses related to the initial user-specified hypothesis). In Redhyte, hypothesis mining consists of several steps: context mining, mined-hypothesis formulation, mined-hypothesis scoring on interestingness, and statistical adjustments. To capture and evaluate specific aspects of interestingness, we developed and implemented various hypothesis-mining metrics. Redhyte is an R Shiny web application available online at https://tohweizhong.shinyapps.io/redhyte, and the source code is hosted in a GitHub repository at https://github.com/tohweizhong/redhyte.
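To make the hypothesis-mining steps concrete, the following is a minimal toy sketch in Python. It is not Redhyte's implementation (Redhyte is written in R) and it does not reproduce any of the paper's metrics. The data, the attribute values `c1` and `c2`, the function names, and the scoring rule (absolute change in the rate difference) are all assumptions made for illustration. The sketch takes an initial hypothesis comparing success rates between two groups, restricts it to each value of a context attribute, and flags contexts where the direction of the comparison reverses.

```python
# Toy illustration only: NOT Redhyte's code, data, or interestingness metrics.


def make_rows(group, context, successes, total):
    """Build (group, context, outcome) records; outcome 1 = success, 0 = failure."""
    return [(group, context, 1)] * successes + [(group, context, 0)] * (total - successes)


# Invented counts, chosen so that the overall comparison and the
# within-context comparisons point in opposite directions.
ROWS = (
    make_rows("A", "c1", 18, 20)
    + make_rows("A", "c2", 24, 80)
    + make_rows("B", "c1", 64, 80)
    + make_rows("B", "c2", 4, 20)
)


def success_rate(rows, group, context=None):
    """Proportion of successes for one group, optionally within one context."""
    outcomes = [o for g, c, o in rows if g == group and (context is None or c == context)]
    return sum(outcomes) / len(outcomes)


def rate_difference(rows, context=None):
    """Initial hypothesis as a number: rate(A) minus rate(B)."""
    return success_rate(rows, "A", context) - success_rate(rows, "B", context)


def mine_contexts(rows):
    """Formulate one mined hypothesis per context value and score it.

    The score is a hypothetical stand-in for an interestingness metric:
    how far the within-context difference moves from the overall one.
    """
    overall = rate_difference(rows)
    mined = []
    for context in sorted({c for _, c, _ in rows}):
        diff = rate_difference(rows, context)
        mined.append(
            {
                "context": context,
                "difference": diff,
                "score": abs(diff - overall),
                "reversed": diff * overall < 0,  # direction flips within this context
            }
        )
    mined.sort(key=lambda m: m["score"], reverse=True)
    return overall, mined


if __name__ == "__main__":
    overall, mined = mine_contexts(ROWS)
    print(f"overall difference (A - B): {overall:+.2f}")
    for m in mined:
        print(f"context={m['context']}: {m['difference']:+.2f} reversed={m['reversed']}")
```

In this invented data set, group B has the higher overall success rate, yet group A has the higher rate within each context. A platform that reports only the overall test would miss this reversal. A real system would also need the statistical adjustments mentioned above, for example a correction for testing many mined hypotheses, which this sketch omits.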

Metadata
Title
Redhyte: Towards a Self-diagnosing, Self-correcting, and Helpful Analytic Platform
Written by
Wei Zhong Toh
Kwok Pui Choi
Limsoon Wong
Copyright year
2016
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-49390-8_1