Scaling Up: Solving POMDPs through Value Based Clustering. Academic Article

abstract

  • Partially Observable Markov Decision Processes (POMDPs) provide an appropriately rich model for agents operating under partial knowledge of the environment. Since finding an optimal POMDP policy is intractable, approximation techniques have been a main focus of research, among them point-based algorithms, which scale up relatively well, handling up to thousands of states. An important decision in a point-based algorithm is the order of backup operations over belief states. Prioritization techniques for ordering the sequence of backup operations reduce the number of needed backups considerably, but involve significant overhead. This paper suggests a new way to order backups, based on a soft clustering of the belief space. Our novel soft clustering method relies on the solution of the underlying MDP. Empirical evaluation verifies that our method rapidly computes a good order of backups, showing orders of magnitude improvement in runtime over a number of benchmarks.
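To make the idea concrete, here is a minimal sketch of one way the abstract's ingredients could fit together: solve the underlying MDP by value iteration, project each belief point onto the MDP value function, softly cluster belief points by that projected value, and visit high-value clusters first when scheduling backups. The clustering criterion and all function names below are assumptions for illustration; the paper's actual algorithm may differ.

```python
import numpy as np

def mdp_value_iteration(P, R, gamma=0.95, iters=200):
    """Value iteration for the underlying (fully observable) MDP.
    P: (A, S, S) transition tensor, R: (A, S) reward matrix."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        # Bellman optimality backup: V(s) = max_a [ R(a,s) + gamma * sum_s' P(a,s,s') V(s') ]
        V = np.max(R + gamma * (P @ V), axis=0)
    return V

def soft_cluster_backup_order(beliefs, V, n_clusters=3, temp=1.0):
    """Softly cluster belief points by projected MDP value
    V_MDP(b) = sum_s b(s) V(s), then order backups so that
    beliefs in high-value clusters are processed first.
    (Illustrative criterion, not the paper's exact method.)"""
    vals = beliefs @ V                                  # projected MDP value per belief
    centers = np.linspace(vals.min(), vals.max(), n_clusters)
    # soft membership: nearer value centers get exponentially more weight
    w = np.exp(-np.abs(vals[:, None] - centers[None, :]) / temp)
    w /= w.sum(axis=1, keepdims=True)
    dominant = w.argmax(axis=1)                         # hardened cluster per belief
    # sort primarily by cluster value (descending), then by belief value
    order = np.lexsort((-vals, -centers[dominant]))
    return order, w
```

A planner would then sweep the belief points in `order`, performing point-based backups cluster by cluster instead of maintaining a per-backup priority queue, which is where the reduced overhead would come from.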

publication date

  • July 22, 2007