
2016 | Original Paper | Book Chapter

Systematic Selection of N-Tuple Networks for 2048

Authors: Kazuto Oka, Kiminori Matsuzaki

Published in: Computers and Games

Publisher: Springer International Publishing


Abstract

The puzzle game 2048, a single-player stochastic game played on a \(4 \times 4\) grid, is the most popular among similar slide-and-merge games. One of the strongest computer players for 2048 uses temporal difference learning (TD learning) with N-tuple networks, and the design of those networks matters a great deal. In this paper, we study N-tuple networks for the game 2048. In the first set of experiments, we conduct TD learning by selecting 6- and 7-tuples exhaustively and evaluate the usefulness of those tuples. In the second set of experiments, we conduct TD learning with high-utility tuples, varying the number of tuples. The best player, with ten 7-tuples, achieves an average score of 234,136 and a maximum score of 504,660. It is worth noting that this player uses no game-tree search and plays a move in about 12 \(\upmu\)s.


Footnotes
1
Since it requires 30 GB of memory to conduct the experiment, we used a PC with 32 GB memory for this additional experiment.
 
Metadata
Title
Systematic Selection of N-Tuple Networks for 2048
Authors
Kazuto Oka
Kiminori Matsuzaki
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-50935-8_8
