Modelling uncertainty in reinforcement learning

Palladino M
2019-01-01

Abstract

This paper discusses the model problem presented in “A model for system uncertainty in reinforcement learning”, Systems and Control Letters, 2018, for certain tasks in reinforcement learning. The model provides a framework for situations in which the system dynamics are not known, and it encodes the available information about the state dynamics as a measure on the space of functions. This measure is updated over time, taking into account all previous measurements of the state variable and extracting new information from them. Here we focus mainly on the differences between the present model and central algorithms used in reinforcement learning (namely value iteration and Thompson sampling).
Year: 2019
ISBN: 978-1-7281-1398-2
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12571/7627