DOI: 10.1145/279943.279964

Learning agents for uncertain environments (extended abstract)

Published: 24 July 1998

References

  1. Astrom, K. J. (1965). Optimal control of Markov decision processes with incomplete state estimation. J. Math. Anal. Applic., 10, 174-205.
  2. Bertsekas, D. C., & Tsitsiklis, J. N. (1996). Neuro-dynamic programming. Athena Scientific, Belmont, Mass.
  3. Binder, J., Koller, D., Russell, S., & Kanazawa, K. (1997a). Adaptive probabilistic networks with hidden variables. Machine Learning, 29, 213-244.
  4. Binder, J., Murphy, K., & Russell, S. (1997b). Space-efficient inference in dynamic probabilistic networks. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97) Nagoya, Japan. Morgan Kaufmann.
  5. Boyen, X., & Koller, D. (1998). Tractable inference for complex stochastic processes. In Proc. 14th Annual Conference on Uncertainty in AI (UAI), to appear.
  6. Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5(3), 142-150.
  7. Doya, K., & Sejnowski, T. (1995). A novel reinforcement model of birdsong vocalization learning. In Tesauro, G., Touretzky, D., & Leen, T. (Eds.), Advances in Neural Information Processing Systems, Vol. 8, pp. 101-8 Denver, CO. MIT Press.
  8. Farley, C. T., & Taylor, C. R. (1991). A mechanical trigger for the trot-gallop transition in horses. Science, 253(5017), 306-308.
  9. Friedman, N., Murphy, K., & Russell, S. (1998). Learning the structure of dynamic probabilistic networks. In Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference Madison, Wisconsin. Morgan Kaufmann.
  10. Hoyt, D., & Taylor, C. (1981). Gait and the energetics of locomotion in horses. Nature, 292, 239-240.
  11. Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285.
  12. Kanazawa, K., Koller, D., & Russell, S. (1995). Stochastic simulation algorithms for dynamic probabilistic networks. In Eleventh Conference, pp. 346-351 Montreal, Canada. Morgan Kaufmann.
  13. Keeney, R. L., & Raiffa, H. (1976). Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Wiley, New York.
  14. McCallum, A. R. (1993). Overcoming incomplete perception with utile distinction memory. In Proceedings of the Tenth International Conference on Machine Learning, pp. 190-196 Amherst, Massachusetts. Morgan Kaufmann.
  15. Montague, P. R., Dayan, P., Person, C., & Sejnowski, T. J. (1995). Bee foraging in uncertain environments using predictive hebbian learning. Nature, 377, 725-728.
  16. Parr, R., & Russell, S. (1995). Approximating optimal policies for partially observable stochastic domains. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95) Montreal, Canada. Morgan Kaufmann.
  17. Parr, R., & Russell, S. (1998). Reinforcement learning with hierarchies of machines. In Kearns, M. (Ed.), Advances in Neural Information Processing Systems 10. MIT Press, Cambridge, Massachusetts.
  18. Russell, S. J., & Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, New Jersey.
  19. Rust, J. (1994). Do people behave according to Bellman's principle of optimality? Submitted to Journal of Economic Perspectives.
  20. Sargent, T. J. (1978). Estimation of dynamic labor demand schedules under rational expectations. Journal of Political Economy, 86(6), 1009-1044.
  21. Schmajuk, N. A., & Zanutto, B. S. (1997). Escape, avoidance, and imitation: a neural network approach. Adaptive Behavior, 6(1), 63-129.
  22. Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
  23. Touretzky, D. S., & Saksida, L. M. (1997). Operant conditioning in Skinnerbots. Adaptive Behavior, 5(3-4), 219-47.

Published in

COLT '98: Proceedings of the eleventh annual conference on Computational learning theory
July 1998
304 pages
ISBN: 1581130570
DOI: 10.1145/279943

Copyright © 1998 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 35 of 71 submissions, 49%
