Author: | Gusmão, António |
Title: | Reinforcement Learning In Real-Time Strategy Games |
Publication type: | Master's thesis |
Publication year: | 2011 |
Pages: | 132 |
Language: | eng |
Department/School: | Tietotekniikan laitos |
Main subject: | Informaatiotekniikka (T-61) |
Supervisor: | Oja, Erkki ; Monteiro, José Carlos |
Instructor: | Raiko, Tapani |
OEVS: | Electronic archive copy is available via Aalto Thesis Database.
|
Location: | P1 Ark Aalto 7131 | Archive |
Keywords: | reinforcement learning ; real-time strategy games ; artificial intelligence ; UCT ; planning ; continuous reinforcement learning |
Abstract (eng): | We consider the problem of effective and automated decision-making in modern real-time strategy (RTS) games through the use of reinforcement learning techniques. RTS games constitute environments with large, high-dimensional and continuous state and action spaces with temporally-extended actions. For such environments, value functions are represented using function approximators. Due to approximation errors, temporal-difference methods suffer from stability issues. This thesis proposes Exlos, a stable, model-based Monte-Carlo method which borrows ideas from several existing algorithms including prioritized sweeping and upper confidence trees (UCT). Contrary to existing model-based algorithms, Exlos assumes models are imperfect, reducing their influence in the decision-making process. Experimental results in a testing environment show the superiority of Exlos in large discrete state spaces when compared to traditional reinforcement learning methods such as Q-learning and Sarsa. Furthermore, Exlos is shown to be effective and efficient when operating over value functions represented by approximators. Its effectiveness is further improved by including a novel online search procedure in the control policy. As an additional result, we present an improved version of UCT, denoted UCTO, which is experimentally shown to outperform UCT. |
ED: | 2011-12-14 |
INSSI record number: 43254