Publication - Professor Weiru Liu

    Resource-based Dynamic Rewards for Factored MDPs

    Citation

    Killough, R., Bauters, K., McAreavey, K., Liu, W. & Hong, J., 2018, ‘Resource-based Dynamic Rewards for Factored MDPs’, in 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI 2017): Proceedings of a meeting held 6-8 November 2017, Boston, Massachusetts, USA. Institute of Electrical and Electronics Engineers (IEEE), pp. 1320-1327.

    Abstract

    Factored MDPs provide an efficient way to reduce the complexity of large, real-world domains by exploiting structure within the state space. This avoids the need to fully enumerate the state space, which is impractical in large domains. However, defining a reward function over state transitions is difficult in a factored MDP, since transitions are not known prior to execution. In this paper, we provide a novel method for deriving rewards from information within the states in order to determine intermediate rewards for state transitions. We do this by treating certain state variables as resources, allowing costs and rewards to be inferred from changes to those resources and ensuring the agent is resource-aware while also being goal-oriented. To facilitate this, we propose a novel variant of Dynamic Bayesian Networks designed specifically for modelling action transitions and capable of dealing with relative changes to real-valued state variables (such as resources) in a compact fashion. We also propose a number of reward functions that model resource types commonly found in real-world situations. Finally, we show that our proposed framework improves on existing reward-function techniques for factored MDPs, increasing both the efficiency and the decision quality of online planners operating on these models.
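
    The core idea of the abstract, inferring transition rewards from changes to resource-valued state variables, can be illustrated with a minimal sketch. This is not the paper's implementation: the State type, the resource names and weights, and the goal test below are all hypothetical, and the actual framework pairs such rewards with the proposed Dynamic Bayesian Network variant for modelling relative changes.

```python
# Illustrative sketch only (hypothetical names throughout): the reward for a
# state transition is inferred from deltas in resource-valued variables,
# rather than read from an enumerated transition-reward table.

from typing import Dict

State = Dict[str, float]  # factored state: variable name -> value

# Hypothetical per-resource weights: gains are rewarded, consumption penalised.
RESOURCE_WEIGHTS: Dict[str, float] = {"fuel": 2.0, "battery": 1.0}

GOAL_REWARD = 100.0  # bonus for reaching a goal state (goal-oriented component)


def is_goal(state: State) -> bool:
    # Hypothetical goal test over a single state variable.
    return state.get("at_destination", 0.0) >= 1.0


def transition_reward(prev: State, curr: State) -> float:
    """Reward for the transition prev -> curr, derived from resource changes."""
    reward = 0.0
    for var, weight in RESOURCE_WEIGHTS.items():
        delta = curr.get(var, 0.0) - prev.get(var, 0.0)
        reward += weight * delta  # resource-aware component
    if is_goal(curr):
        reward += GOAL_REWARD
    return reward


if __name__ == "__main__":
    s0 = {"fuel": 10.0, "battery": 5.0, "at_destination": 0.0}
    s1 = {"fuel": 8.5, "battery": 5.0, "at_destination": 1.0}
    # Consumes 1.5 fuel (penalty 3.0) but reaches the goal: 100.0 - 3.0 = 97.0
    print(transition_reward(s0, s1))
```

    Because the reward is computed from variables present in the predecessor and successor states, no transition-level reward table is required, which is the motivation the abstract gives for factored domains.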

    Full details in the University publications repository