
2014 | Original Paper | Book Chapter

6. Large Scale Data

Authors: Harry Strange, Reyer Zwiggelaar

Published in: Open Problems in Spectral Dimensionality Reduction

Publisher: Springer International Publishing


Abstract

In this chapter the problems of applying spectral dimensionality reduction to large scale datasets are outlined, along with various solutions to these problems. The computational complexity of several spectral dimensionality reduction algorithms is examined in detail. Many of the solutions presented here overlap with those discussed previously in the context of incremental learning. Finally, parallel and GPU-based implementation aspects are discussed.
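To make the scaling problem concrete, the following is a minimal sketch (not taken from the chapter) of a classical-MDS-style spectral embedding. It illustrates why large n is problematic for spectral methods in general: the pairwise distance matrix costs O(n²) memory and the dense eigendecomposition costs O(n³) time, which is the bottleneck the chapter's approximation and parallelisation strategies target. The function name is illustrative.

```python
import numpy as np

def spectral_embedding(X, dim=2):
    """Toy classical-MDS-style spectral embedding.

    The n x n Gram matrix and its dense eigendecomposition are the
    O(n^2) memory / O(n^3) time bottleneck for large scale data.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances: O(n^2 d) time, O(n^2) memory.
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Double-centre the squared distances to obtain a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J
    # Dense symmetric eigendecomposition: O(n^3), infeasible for large n.
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]  # top `dim` eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

rng = np.random.default_rng(0)
Y = spectral_embedding(rng.normal(size=(100, 5)))
print(Y.shape)  # (100, 2)
```

Landmark and Nyström-style approaches avoid this cost by eigendecomposing only a small subsample of the full matrix and extrapolating the embedding to the remaining points.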


Metadata
Title
Large Scale Data
Authors
Harry Strange
Reyer Zwiggelaar
Copyright Year
2014
DOI
https://doi.org/10.1007/978-3-319-03943-5_6