2016 | OriginalPaper | Book chapter

Efficient Mining of Pan-Correlation Patterns from Time Course Data

Authors: Qian Liu, Jinyan Li, Limsoon Wong, Kotagiri Ramamohanarao

Published in: Advanced Data Mining and Applications

Publisher: Springer International Publishing

Abstract

A time course data set can exhibit several types of correlation patterns between its variables: positive correlations, negative correlations, time-lagged correlations, and correlations interrupted by small gaps. These correlations usually hold only on a subset of the time points, rather than over the whole time span that traditional correlation definitions require. Because each type of pattern reflects a different trend in how the data move, mining all of them is an important step toward a broad view of the dependencies among the variables. In this work, we prove that all of these diverse types of correlation patterns can be represented by a generalized form of positive correlation patterns. We also prove a correspondence between positive correlation patterns and sequential patterns. We then present an efficient single-scan algorithm for mining all of these types of correlations. This “pan-correlation” mining algorithm is evaluated on synthetic time course data sets and on yeast cell cycle gene expression data sets. The results indicate that: (i) the running time of our mining algorithm grows linearly with the number of variables; (ii) negative correlation patterns are abundant in real-world data sets; and (iii) correlation patterns with time lags and gaps are also abundant. Existing methods have discovered only incomplete forms of many of these patterns and have missed some important patterns completely.
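The idea that negative and time-lagged correlations can be expressed as positive ones can be illustrated with a small sketch. This is not the chapter's mining algorithm, which the abstract does not spell out; it only shows, on toy data, that negating a series or shifting it by a lag turns the corresponding relationship into an ordinary positive correlation. The helper names `pearson`, `negate`, and `align_lag`, and the example series, are assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def negate(series):
    """Flip the sign of every value (hypothetical helper)."""
    return [-v for v in series]


def align_lag(xs, ys, lag):
    """Pair xs[t] with ys[t + lag] for a non-negative lag (hypothetical helper)."""
    if lag == 0:
        return list(xs), list(ys)
    return list(xs[:-lag]), list(ys[lag:])


# Toy data, invented for this illustration.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
y_neg = [10.0 - v for v in x]  # moves opposite to x at every time point
y_lag = [0.0] + x[:-1]         # repeats x one time point later

# A negative correlation becomes positive once one series is negated.
r_negative = pearson(x, y_neg)
r_negated = pearson(x, negate(y_neg))

# A lagged correlation becomes a plain positive one after shifting.
x_aligned, y_aligned = align_lag(x, y_lag, 1)
r_lagged = pearson(x_aligned, y_aligned)

print(r_negative, r_negated, r_lagged)
```

On this toy input the first coefficient is close to -1 and the other two are close to +1. The sketch computes correlation over all aligned time points; restricting it to a subset of time points, as the abstract describes, would need an additional selection step that is not shown here.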

Metadata
Title
Efficient Mining of Pan-Correlation Patterns from Time Course Data
Authors
Qian Liu
Jinyan Li
Limsoon Wong
Kotagiri Ramamohanarao
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-49586-6_16