Whence reward

September 17, 2020, admin

How should reward be defined in a reinforcement learning framework?

Programming

Coding: translate the goals of the behaviour into reward values by writing a function that takes states and outputs rewards.
Human-in-the-loop: the source of reward is a person, so the reward is non-stationary.
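A minimal sketch of both options, assuming a hypothetical grid world whose states are (row, col) pairs. The names GOAL, PIT, coded_reward and TiringTrainer are made up for illustration, and the trainer is only a stand-in for a real person: its praise fades with repetition, which is one way a human-provided reward can be non-stationary.

```python
from typing import Tuple

State = Tuple[int, int]  # (row, col) in a hypothetical grid world

GOAL: State = (3, 3)  # assumed goal cell
PIT: State = (1, 2)   # assumed hazard cell


def coded_reward(state: State) -> float:
    """Hand-coded reward: takes a state, outputs a reward value."""
    if state == GOAL:
        return 1.0
    if state == PIT:
        return -1.0
    return -0.01  # small step cost so shorter paths score higher


class TiringTrainer:
    """Stand-in for a person whose praise fades each time it is given."""

    def __init__(self) -> None:
        self.times_praised = 0

    def __call__(self, state: State) -> float:
        if state != GOAL:
            return 0.0
        self.times_praised += 1
        return 1.0 / self.times_praised


trainer = TiringTrainer()
first = trainer(GOAL)   # reward for reaching the goal the first time
second = trainer(GOAL)  # same state, smaller reward: non-stationary
```

The coded function returns the same value for the same state every time, while the trainer's reward for the same state depends on the interaction history.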

Example

Mimic reward: copy the reward that the trainer gives.
Inverse reinforcement learning: the learner figures out which rewards the trainer must have been maximizing for the demonstrated behaviour to be optimal.
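The inverse reinforcement learning idea can be illustrated with a toy sketch. It assumes a hypothetical three-state chain with deterministic left/right moves, a demonstration in which the trainer always moves right, and a small hand-picked set of candidate rewards. It simply enumerates the candidates and keeps those under which the demonstration is optimal, so it is a brute-force illustration rather than a standard IRL algorithm.

```python
from typing import Dict, List

N_STATES = 3      # hypothetical chain: states 0, 1, 2
ACTIONS = (-1, +1)  # move left or right
GAMMA = 0.9       # assumed discount factor


def step(state: int, action: int) -> int:
    """Deterministic move, clipped at both ends of the chain."""
    return min(max(state + action, 0), N_STATES - 1)


def q_values(reward: List[float], sweeps: int = 200) -> List[Dict[int, float]]:
    """Value iteration; the reward is paid for the state that is entered."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        v = [max(reward[step(s, a)] + GAMMA * v[step(s, a)] for a in ACTIONS)
             for s in range(N_STATES)]
    return [{a: reward[step(s, a)] + GAMMA * v[step(s, a)] for a in ACTIONS}
            for s in range(N_STATES)]


def explains(reward: List[float], demo: Dict[int, int], tol: float = 1e-6) -> bool:
    """True if every demonstrated action is optimal under this reward."""
    q = q_values(reward)
    return all(q[s][a] >= max(q[s].values()) - tol for s, a in demo.items())


# Demonstration: the trainer moves right in every state.
demo = {0: +1, 1: +1, 2: +1}

# Candidate rewards: reward 1 for entering exactly one of the states.
candidates = [[1.0 if s == g else 0.0 for s in range(N_STATES)]
              for g in range(N_STATES)]

consistent = [r for r in candidates if explains(r, demo)]
```

Only the candidate that rewards the rightmost state makes "always move right" optimal. Note that an all-zero reward would also pass the check, since every behaviour is optimal under it; that ambiguity is why the candidate set here is restricted.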

Indirect approaches, optimization

Evolutionary optimization: given a high-level behaviour that we can create a score for, the optimization searches for a reward that encourages that behaviour.
Meta RL: learning at the evolutionary level that creates better ways of learning at the individual level.
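A rough sketch of the two-level structure, under several assumptions: a hypothetical five-state chain, a behaviour score equal to how far right the agent gets from state 0, and a simple mutate-and-keep-the-best search standing in for the evolutionary level. The inner level here is a planner (value iteration) rather than a true learner, so this illustrates the outer search over rewards more than Meta RL proper.

```python
import random
from typing import List, Tuple

N_STATES = 5        # hypothetical chain: states 0..4
ACTIONS = (-1, +1)  # move left or right
GAMMA = 0.9         # assumed discount factor


def step(state: int, action: int) -> int:
    """Deterministic move, clipped at both ends of the chain."""
    return min(max(state + action, 0), N_STATES - 1)


def greedy_policy(reward: List[float], sweeps: int = 200) -> List[int]:
    """Inner level: plan with the candidate reward via value iteration."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        v = [max(reward[step(s, a)] + GAMMA * v[step(s, a)] for a in ACTIONS)
             for s in range(N_STATES)]
    return [max(ACTIONS, key=lambda a, s=s: reward[step(s, a)] + GAMMA * v[step(s, a)])
            for s in range(N_STATES)]


def behaviour_score(reward: List[float], horizon: int = 10) -> float:
    """High-level score: furthest state reached from state 0, scaled to [0, 1]."""
    policy = greedy_policy(reward)
    state, furthest = 0, 0
    for _ in range(horizon):
        state = step(state, policy[state])
        furthest = max(furthest, state)
    return furthest / (N_STATES - 1)


def evolve_reward(generations: int = 30, offspring: int = 8,
                  sigma: float = 0.5, seed: int = 0) -> Tuple[List[float], float]:
    """Outer level: mutate the reward and keep a child only if it scores higher."""
    rng = random.Random(seed)
    best = [0.0] * N_STATES
    best_score = behaviour_score(best)
    for _ in range(generations):
        for _ in range(offspring):
            child = [w + rng.gauss(0.0, sigma) for w in best]
            child_score = behaviour_score(child)
            if child_score > best_score:
                best, best_score = child, child_score
    return best, best_score


best_reward, best_score = evolve_reward()
```

The search never sees the reward directly as a target; it only sees the behaviour score, and the reward is whatever vector happens to produce high-scoring behaviour in the inner level.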
