Contribution to the study and the design of reinforcement functions

Juan Miguel Santos

Contribution to the study and the design of reinforcement functions

Autores/as

Juan Miguel Santos Universidad de Buenos Aires, Argentina

Palabras clave:

behavior-based approach, autonomous robot, robot learning, reinforcement learning, Reinforcement function

Resumen

We have studied the Reinforcement Function Design Process in two steps. For the first one we have considered the translation of a natural language description into an instance of our proposed Reinforcement Function General Expression. For the second step, we have gone deeply into the tuning of the parameters in this expression. It allowed us to obtain optimal definitions of the reinforcement function (relative to exploration). Since the General Expression is based on constraints, we have indentified them according to the type of state variable estimator on which they act, in particular: position and velocity. Using a particular, but representative Reinforcement Function (RF) expression, we study the relation between the Sum of each reinforcement type and the RF parameters during the exploration phase of the learning. For linear relations, we propose an analytic method to obtain the RF parameters values (no experimentation requires). For non-linear, but monotonous relations, we propose the Update Parameter Algorithm (UPA) and show that UPA can efficiently adjust the proportion of negative and positive reinforcements received during the exploratory phase of the learning. Additionally, we study the feasibility and consequences of adapting the RF during the learning process so as to improve the learning convergence of the system. Dynamic-UPA allows the whole learning process to maintain a desired ratio of positive and negative rewards. Thus, we introduce an approach to undertake the exploration-exploitation dilemma - a necessary step for efficient Reinforcement Learning. We show, with several experiments involving robots (mobile and arm), the performance of the proposed design methods. Finally, we emphasize the main conclusions and present some future directions of research.

Descargas

Los datos de descarga aún no están disponibles.

Descargas

PDF (Inglés)

Publicado

2019-04-10

Número

Vol. 4 (2002): Electronic Journal of SADIO (EJS) Vol. 4 (2002)

Sección

Papers

Licencia

Derechos de autor 2002 Juan Miguel Santos

Esta obra está bajo una licencia internacional Creative Commons Atribución-NoComercial-CompartirIgual 4.0.

Aquellos autores/as que tengan publicaciones con esta revista, aceptan los términos siguientes:

Los autores/as conservarán sus derechos de autor y garantizarán a la revista el derecho de primera publicación de su obra, el cuál estará simultáneamente sujeto a la Creative Commons Atribución-NoComercial-CompartirIgual 4.0 Internacional (CC BY-NC-SA 4.0) que permite a terceros compartir la obra siempre que se indique su autor y su primera publicación esta revista, no hagan uso comercial de ella y las obras derivadas de hagan bajo la misma licencia.
Los autores/as podrán adoptar otros acuerdos de licencia no exclusiva de distribución de la versión de la obra publicada (p. ej.: depositarla en un archivo telemático institucional o publicarla en un volumen monográfico) siempre que se indique la publicación inicial en esta revista.
Se permite y recomienda a los autores/as difundir su obra a través de Internet (p. ej.: en archivos telemáticos institucionales o en su página web) antes y durante el proceso de envío, lo cual puede producir intercambios interesantes y aumentar las citas de la obra publicada. (Véase El efecto del acceso abierto).

Cómo citar

Santos, J. M. (2019). Contribution to the study and the design of reinforcement functions. SADIO Electronic Journal of Informatics and Operations Research, 4. https://revistas.unlp.edu.ar/ejs/article/view/17523