Inductive transfer with context-sensitive neural networks

Silver, Daniel L.; Poirier, Ryan; Currie, Duane

doi:10.1007/s10994-008-5088-0

Published: 21 October 2008

Machine Learning, Volume 73, pages 313–336 (2008)

Abstract

Context-sensitive Multiple Task Learning, or csMTL, is presented as a method of inductive transfer which uses a single output neural network and additional contextual inputs for learning multiple tasks. Motivated by problems with the application of MTL networks to machine lifelong learning systems, csMTL encoding of multiple task examples was developed and found to improve predictive performance. As evidence, the csMTL method is tested on seven task domains and shown to produce hypotheses for primary tasks that are often better than standard MTL hypotheses when learning in the presence of related and unrelated tasks. We argue that the reason for this performance improvement is a reduction in the number of effective free parameters in the csMTL network brought about by the shared output node and weight update constraints due to the context inputs. An examination of IDT and SVM models developed from csMTL encoded data provides initial evidence that this improvement is not shared across all machine learning models.
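The csMTL encoding described in the abstract can be illustrated with a minimal sketch. It merges the examples of several tasks into one training set for a single-output model by attaching context inputs that identify the task each example belongs to. The one-hot form of the context inputs, the function name csmtl_encode, and the toy values are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def csmtl_encode(task_data):
    """Merge per-task (X, y) pairs into one single-output training set.

    Each example is prefixed with a context vector identifying its task.
    A one-hot context vector is assumed here for illustration only.
    """
    n_tasks = len(task_data)
    rows, targets = [], []
    for k, (X, y) in enumerate(task_data):
        X = np.asarray(X, dtype=float)
        # Context inputs: 1.0 in the column of task k, 0.0 elsewhere.
        context = np.zeros((X.shape[0], n_tasks))
        context[:, k] = 1.0
        # Context inputs and primary inputs form one input vector.
        rows.append(np.hstack([context, X]))
        targets.append(np.asarray(y, dtype=float))
    # All tasks share a single target column (one output node).
    return np.vstack(rows), np.concatenate(targets)


# Two hypothetical tasks defined over the same two primary inputs.
task_a = ([[0.1, 0.9], [0.4, 0.2]], [1, 0])
task_b = ([[0.7, 0.3]], [1])

X_cs, y_cs = csmtl_encode([task_a, task_b])
print(X_cs.shape, y_cs.shape)
```

By contrast, a standard MTL network would keep only the two primary inputs and use one output node per task. In this encoding the task identity moves to the input side, so any single-output learner can in principle be trained on X_cs and y_cs.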

Author information

Authors and Affiliations

Jodrey School of Computer Science, Acadia University, Wolfville, NS, Canada, B4P 2R6
Daniel L. Silver, Ryan Poirier & Duane Currie

Corresponding author

Correspondence to Daniel L. Silver.

Additional information

Editor: Risto Miikkulainen

About this article

Cite this article

Silver, D.L., Poirier, R. & Currie, D. Inductive transfer with context-sensitive neural networks. Mach Learn 73, 313–336 (2008). https://doi.org/10.1007/s10994-008-5088-0

Received: 25 February 2007
Revised: 08 September 2008
Accepted: 17 September 2008
Published: 21 October 2008
Issue Date: December 2008
DOI: https://doi.org/10.1007/s10994-008-5088-0
