Partial observability under noisy sensors - from model-free to model-based Academic Article


  • Agents learning to act in a partially observable domain may need to overcome noisy output from their sensors. Research in the area has focused on model-free methods, which learn a policy without learning a model of the world. When the agent's sensors provide deterministic output, model-free methods produce close to optimal results. However, as the noise in the sensors increases, these methods produce less accurate policies (Shani & Brafman, 2004). Another, less explored option is the model-based approach: learning a POMDP model of the world and obtaining an optimal policy from the learned model. In this paper we explore the advantages of model-based techniques over model-free methods, focusing on the ability to handle noisy sensors. We show how two important model-free algorithms, internal memory (Peshkin et al., 1999), and Utile Suffix …

publication date

  • August 1, 2005