
2018 | OriginalPaper | Book chapter

2. Machine Learning with Shallow Neural Networks

Author: Charu C. Aggarwal

Published in: Neural Networks and Deep Learning

Publisher: Springer International Publishing


Abstract

Conventional machine learning often uses optimization and gradient-descent methods for learning parameterized models. Examples of such models include linear regression, support vector machines, logistic regression, dimensionality reduction, and matrix factorization. Neural networks are also parameterized models that are learned with continuous optimization methods.
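As a minimal illustration of what learning a parameterized model with gradient descent looks like, the sketch below fits a one-feature linear regression by batch gradient descent on the mean squared error. The function name `fit_linear`, the toy data, the learning rate, and the step count are assumptions made for this example; none of them come from the chapter.

```python
# Minimal sketch: fit y ~ w*x + b by batch gradient descent on squared error.
# Data, learning rate, and step count are illustrative assumptions.

def fit_linear(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training point.
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1, no noise
w, b = fit_linear(xs, ys)
```

On this noiseless toy data the learned parameters approach w = 2 and b = 1. The other models named in the abstract differ mainly in the loss function that replaces the squared error.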


Footnotes
1
In recent years, the sigmoid unit has fallen out of favor compared to the ReLU.
 
2
In order to obtain exactly the same direction as the Fisher method with Equation 2.8, it is important to mean-center both the feature variables and the binary targets. Therefore, each binary target will be one of two real values with different signs. The real values will contain the fraction of instances belonging to the other class. Alternatively, one can use a bias neuron to absorb the constant offsets.
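The effect of mean-centering can be checked numerically. The sketch below uses the standard closed-form least-squares solution (not necessarily the form of Equation 2.8, which is not reproduced on this page) on a small hypothetical 2-D data set, and compares the resulting direction with the Fisher direction. All helper names and data values are assumptions for illustration.

```python
# Sketch: with mean-centered features and targets, the least-squares direction
# is parallel to the Fisher discriminant direction. Toy 2-D data (assumed).

def mean(vals):
    return sum(vals) / len(vals)

def solve2(a, rhs):
    # Solve a 2x2 linear system by Cramer's rule.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det,
            (a[0][0] * rhs[1] - rhs[0] * a[1][0]) / det]

def scatter(points, center):
    # 2x2 scatter matrix of the points around the given center.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - center[0], p[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

class0 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.5)]
class1 = [(4.0, 5.0), (5.0, 4.5), (6.0, 7.0)]
points = class0 + class1
labels = [0.0] * len(class0) + [1.0] * len(class1)

# Mean-center the targets: class 1 becomes +4/7 (the fraction of class 0),
# class 0 becomes -3/7 (minus the fraction of class 1).
y_bar = mean(labels)
targets = [y - y_bar for y in labels]

# Least squares on centered data: (Xc^T Xc) w = Xc^T yc.
mu = [mean([p[0] for p in points]), mean([p[1] for p in points])]
xtx = scatter(points, mu)
xty = [sum((p[k] - mu[k]) * t for p, t in zip(points, targets))
       for k in range(2)]
w_ls = solve2(xtx, xty)

# Fisher direction: S_w^{-1} (mu1 - mu0), with S_w the within-class scatter.
mu0 = [mean([p[0] for p in class0]), mean([p[1] for p in class0])]
mu1 = [mean([p[0] for p in class1]), mean([p[1] for p in class1])]
s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
s_w = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
w_fisher = solve2(s_w, [mu1[0] - mu0[0], mu1[1] - mu0[1]])

# Parallel 2-D vectors have a zero cross product.
cross = w_ls[0] * w_fisher[1] - w_ls[1] * w_fisher[0]
```

The two vectors differ only by a scale factor, so `cross` is zero up to floating-point error. The centered targets also show the two real values of opposite sign described in the footnote.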
 
3
This subspace is defined by the top-k singular vectors of singular value decomposition. However, the optimization problem does not impose orthogonality constraints, and therefore the columns of V might use a different non-orthogonal basis system to represent this subspace.
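A short worked identity shows why the basis is not unique. Assume, for illustration, that the factorization is written \(D \approx UV^{T}\) with \(k\) columns in each factor (this notation is an assumption here). Then for any invertible \(k \times k\) matrix \(M\):

\[
(UM)\,\bigl(VM^{-T}\bigr)^{T} \;=\; U M M^{-1} V^{T} \;=\; UV^{T}.
\]

The pair \((UM,\, VM^{-T})\) therefore attains the same reconstruction error as \((U, V)\). The columns of \(VM^{-T}\) span the same subspace as the columns of \(V\), but they are generally not orthogonal unless \(M\) is chosen specially.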
 
4
There is no loss in reconstruction accuracy in several special cases like the single-layer case discussed here, even on the training data. In other cases, the loss of accuracy is only on the training data, but the autoencoder tends to better reconstruct out-of-sample data because of the regularization effects of parameter footprint reduction.
 
5
The t-SNE method works on the principle that it is impossible to preserve all pairwise similarities and dissimilarities with the same level of accuracy in a low-dimensional embedding. Therefore, unlike dimensionality reduction or autoencoders that try to faithfully reconstruct the data, it has an asymmetric loss function in terms of how similarity is treated versus dissimilarity. This type of asymmetric loss function is particularly helpful for separating out different manifolds during visualization. Therefore, t-SNE might perform better than autoencoders at visualization.
 
6
The work in [287] does point out a number of implicit relationships with matrix factorization, but not the more direct ones pointed out in this book. Some of these relationships are also pointed out in [6].
 
7
There is a slight abuse of notation in the updates adding \(\overline{u}_{i}\) and \(\overline{v}_{j}\). This is because \(\overline{u}_{i}\) is a row vector and \(\overline{v}_{j}\) is a column vector. Throughout this section, we omit the explicit transposition of one of these two vectors to avoid notational clutter, since the updates are intuitively clear.
 
8
This fact is not evident in the toy example of Figure 2.17. In practice, the degree of a node is a tiny fraction of the total number of nodes. For example, a person might have 100 friends in a social network of millions of nodes.
 
9
The weighted degree of node \(j\) is \(\sum_{r} c_{rj}\).
 
References
[3]
C. Aggarwal. Data mining: The textbook. Springer, 2015.
[4]
C. Aggarwal. Recommender systems: The textbook. Springer, 2016.
[6]
C. Aggarwal. Machine learning for text. Springer, 2018.
[40]
C. M. Bishop. Pattern recognition and machine learning. Springer, 2007.
[41]
C. M. Bishop. Neural networks for pattern recognition. Oxford University Press, 1995.
[48]
H. Bourlard and Y. Kamp. Auto-association by multilayer perceptrons and singular value decomposition. Biological Cybernetics, 59(4), pp. 291–294, 1988.
[62]
S. Chang, W. Han, J. Tang, G. Qi, C. Aggarwal, and T. Huang. Heterogeneous network embedding via deep architectures. ACM KDD Conference, pp. 119–128, 2015.
[64]
J. Chen, S. Sathe, C. Aggarwal, and D. Turaga. Outlier detection with autoencoder ensembles. SIAM Conference on Data Mining, 2017.
[67]
Y. Chen and M. Zaki. KATE: K-Competitive Autoencoder for Text. ACM KDD Conference, 2017.
[77]
A. Coates, A. Ng, and H. Lee. An analysis of single-layer networks in unsupervised feature learning. AAAI Conference, pp. 215–223, 2011.
[82]
C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3), pp. 273–297, 1995.
[94]
M. Denil, B. Shakibi, L. Dinh, M. A. Ranzato, and N. de Freitas. Predicting parameters in deep learning. NIPS Conference, pp. 2148–2156, 2013.
[97]
F. Despagne and D. Massart. Neural networks in multivariate calibration. Analyst, 123(11), pp. 157R–178R, 1998.
[99]
C. Ding, T. Li, and W. Peng. On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing. Computational Statistics and Data Analysis, 52(8), pp. 3913–3927, 2008.
[110]
A. Elkahky, Y. Song, and X. He. A multi-view deep learning approach for cross domain user modeling in recommendation systems. WWW Conference, pp. 278–288, 2015.
[120]
R. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, pp. 179–188, 1936.
[139]
F. Girosi and T. Poggio. Networks and the best approximation property. Biological Cybernetics, 63(3), pp. 169–176, 1990.
[164]
A. Grover and J. Leskovec. node2vec: Scalable feature learning for networks. ACM KDD Conference, pp. 855–864, 2016.
[166]
M. Gutmann and A. Hyvarinen. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. AISTATS, 1(2), pp. 6, 2010.
[178]
T. Hastie and R. Tibshirani. Generalized additive models. CRC Press, 1990.
[181]
S. Hawkins, H. He, G. Williams, and R. Baxter. Outlier detection using replicator neural networks. International Conference on Data Warehousing and Knowledge Discovery, pp. 170–180, 2002.
[186]
X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T. S. Chua. Neural collaborative filtering. WWW Conference, pp. 173–182, 2017.
[190]
G. Hinton. Connectionist learning procedures. Artificial Intelligence, 40(1–3), pp. 185–234, 1989.
[198]
G. Hinton and R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5766), pp. 504–507, 2006.
[206]
T. Hofmann. Probabilistic latent semantic indexing. ACM SIGIR Conference, pp. 50–57, 1999.
[224]
C. Johnson. Logistic matrix factorization for implicit feedback data. NIPS Conference, 2014.
[253]
Y. Koren. Factor in the neighbors: Scalable and accurate collaborative filtering. ACM Transactions on Knowledge Discovery from Data (TKDD), 4(1), 1, 2010.
[272]
Q. Le and T. Mikolov. Distributed representations of sentences and documents. ICML Conference, pp. 1188–1196, 2014.
[273]
Q. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, and A. Ng. On optimization methods for deep learning. ICML Conference, pp. 265–272, 2011.
[274]
Q. Le, W. Zou, S. Yeung, and A. Ng. Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. CVPR Conference, 2011.
[275]
Y. LeCun. Modeles connexionnistes de l’apprentissage. Doctoral Dissertation, Universite Paris, 1987.
[284]
H. Lee, C. Ekanadham, and A. Ng. Sparse deep belief net model for visual area V2. NIPS Conference, 2008.
[287]
O. Levy and Y. Goldberg. Neural word embedding as implicit matrix factorization. NIPS Conference, pp. 2177–2185, 2014.
[288]
O. Levy, Y. Goldberg, and I. Dagan. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3, pp. 211–225, 2015.
[295]
D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7), pp. 1019–1031, 2007.
[305]
L. Maaten and G. E. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9, pp. 2579–2605, 2008.
[310]
A. Makhzani and B. Frey. Winner-take-all autoencoders. NIPS Conference, pp. 2791–2799, 2015.
[320]
P. McCullagh and J. Nelder. Generalized linear models. CRC Press, 1989.
[322]
G. McLachlan. Discriminant analysis and statistical pattern recognition. John Wiley & Sons, 2004.
[327]
T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. NIPS Conference, pp. 3111–3119, 2013.
[329]
G. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K. J. Miller. Introduction to WordNet: An on-line lexical database. International Journal of Lexicography, 3(4), pp. 235–312, 1990. https://wordnet.princeton.edu/
[332]
A. Mnih and G. Hinton. A scalable hierarchical distributed language model. NIPS Conference, pp. 1081–1088, 2009.
[333]
A. Mnih and K. Kavukcuoglu. Learning word embeddings efficiently with noise-contrastive estimation. NIPS Conference, pp. 2265–2273, 2013.
[344]
F. Morin and Y. Bengio. Hierarchical Probabilistic Neural Network Language Model. AISTATS, pp. 246–252, 2005.
[357]
J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Ng. Multimodal deep learning. ICML Conference, pp. 689–696, 2011.
[371]
J. Pennington, R. Socher, and C. Manning. Glove: Global Vectors for Word Representation. EMNLP, pp. 1532–1543, 2014.
[372]
B. Perozzi, R. Al-Rfou, and S. Skiena. Deepwalk: Online learning of social representations. ACM KDD Conference, pp. 701–710.
[397]
S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y. Bengio. Contractive auto-encoders: Explicit invariance during feature extraction. ICML Conference, pp. 833–840, 2011.
[400]
R. Rifkin. Everything old is new again: a fresh look at historical approaches in machine learning. Ph.D. Thesis, Massachusetts Institute of Technology, 2002.
[401]
R. Rifkin and A. Klautau. In defense of one-vs-all classification. Journal of Machine Learning Research, 5, pp. 101–141, 2004.
[405]
F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386, 1958.
[406]
D. Ruck, S. Rogers, and M. Kabrisky. Feature selection using a multilayer perceptron. Journal of Neural Network Computing, 2(2), pp. 40–88, 1990.
[408]
D. Rumelhart, G. Hinton, and R. Williams. Learning representations by back-propagating errors. Nature, 323(6088), pp. 533–536, 1986.
[414]
R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted Boltzmann machines for collaborative filtering. ICML Conference, pp. 791–798, 2007.
[436]
S. Sedhain, A. K. Menon, S. Sanner, and L. Xie. Autorec: Autoencoders meet collaborative filtering. WWW Conference, pp. 111–112, 2015.
[442]
A. Shashua. On the equivalence between the support vector machine for classification and sparsified Fisher’s linear discriminant. Neural Processing Letters, 9(2), pp. 129–139, 1999.
[448]
S. Shalev-Shwartz, Y. Singer, N. Srebro, and A. Cotter. Pegasos: Primal estimated sub-gradient solver for SVM. Mathematical Programming, 127(1), pp. 3–30, 2011.
[465]
Y. Song, A. Elkahky, and X. He. Multi-rate deep learning for temporal recommendation. ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 909–912, 2016.
[468]
N. Srivastava and R. Salakhutdinov. Multimodal learning with deep Boltzmann machines. NIPS Conference, pp. 2222–2230, 2012.
[472]
F. Strub and J. Mary. Collaborative filtering with stacked denoising autoencoders and sparse inputs. NIPS Workshop on Machine Learning for eCommerce, 2015.
[499]
A. Tikhonov and V. Arsenin. Solution of ill-posed problems. Winston and Sons, 1977.
[506]
P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol. Extracting and composing robust features with denoising autoencoders. ICML Conference, pp. 1096–1103, 2008.
[512]
D. Wang, P. Cui, and W. Zhu. Structural deep network embedding. ACM KDD Conference, pp. 1225–1234, 2016.
[513]
H. Wang, N. Wang, and D. Yeung. Collaborative deep learning for recommender systems. ACM KDD Conference, pp. 1235–1244, 2015.
[521]
K. Weinberger, B. Packer, and L. Saul. Nonlinear Dimensionality Reduction by Semidefinite Programming and Kernel Matrix Factorization. AISTATS, 2005.
[529]
J. Weston and C. Watkins. Multi-class support vector machines. Technical Report CSD-TR-98-04, Department of Computer Science, Royal Holloway, University of London, May, 1998.
[531]
B. Widrow and M. Hoff. Adaptive switching circuits. IRE WESCON Convention Record, 4(1), pp. 96–104, 1960.
[535]
Y. Wu, C. DuBois, A. Zheng, and M. Ester. Collaborative denoising auto-encoders for top-n recommender systems. Web Search and Data Mining, pp. 153–162, 2016.
[547]
W. Yu, W. Cheng, C. Aggarwal, K. Zhang, H. Chen, and W. Wang. NetWalk: A flexible deep embedding approach for anomaly detection in dynamic networks. ACM KDD Conference, 2018.
[548]
W. Yu, C. Zheng, W. Cheng, C. Aggarwal, D. Song, B. Zong, H. Chen, and W. Wang. Learning deep network representations with adversarially regularized autoencoders. ACM KDD Conference, 2018.
[558]
D. Zhang, Z.-H. Zhou, and S. Chen. Non-negative matrix factorization on kernels. Trends in Artificial Intelligence, pp. 404–412, 2006.
[564]
C. Zhou and R. Paffenroth. Anomaly detection with robust deep autoencoders. ACM KDD Conference, pp. 665–674, 2017.
Metadata
Title
Machine Learning with Shallow Neural Networks
Author
Charu C. Aggarwal
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-94463-0_2
