Tabular reinforcement learning algorithms – sc2qsr.rl.tabular

https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/

class sc2qsr.rl.QLambda(actions, learning_rate=0.01, reward_decay=0.9, epsilon=0.1, trace_decay=0.9)

Q(λ) algorithm

learn(s, a, r, s_, a_)

Update the Q-table

Parameters
  • s – current state

  • a – action taken

  • r – reward signal

  • s_ – observed future state

  • a_ – action selected in the future state
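
For orientation, a minimal sketch of one Q(λ) update in the Watkins style follows. It is not the sc2qsr implementation: the function name q_lambda_update, the dict-based tables and the n_actions argument are assumptions made for illustration, and only the default values are taken from the constructor signature above.

```python
from collections import defaultdict


def q_lambda_update(q, e, s, a, r, s_, a_, n_actions,
                    learning_rate=0.01, reward_decay=0.9, trace_decay=0.9):
    """One Watkins-style Q(lambda) step on dict tables q and e (illustrative only)."""
    # The TD error bootstraps from the greedy action in the next state.
    greedy = max(range(n_actions), key=lambda b: q[(s_, b)])
    delta = r + reward_decay * q[(s_, greedy)] - q[(s, a)]
    # Decide before updating q whether the action actually taken next is greedy
    # (ties count as greedy).
    is_greedy = q[(s_, a_)] == q[(s_, greedy)]
    e[(s, a)] += 1.0  # accumulating eligibility trace
    for key in list(e):
        q[key] += learning_rate * delta * e[key]
        # Traces decay after a greedy action and are cut after an exploratory one.
        e[key] = reward_decay * trace_decay * e[key] if is_greedy else 0.0
    return delta


# Hypothetical usage: unseen (state, action) pairs default to 0.0.
q_table, traces = defaultdict(float), defaultdict(float)
q_lambda_update(q_table, traces, "s0", 0, 1.0, "s1", 1, n_actions=2)
```

Cutting the traces after an exploratory action is what distinguishes this variant from Sarsa(λ); whether sc2qsr does the same is not stated in this page.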

class sc2qsr.rl.QLearning(actions, policy='egreedy', learning_rate=0.01, reward_decay=0.9, epsilon=0.1)

Tabular Q-Learning algorithm

learn(s: str, a: int, r: float, s_: str)

Update the Q-table

Parameters
  • s – current state

  • a – action taken

  • r – reward signal

  • s_ – observed future state
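
The standard tabular Q-learning update that a method with this signature would typically perform can be sketched as below. The function q_learning_update, the dict-based table and the n_actions argument are hypothetical and not part of sc2qsr; the defaults mirror the constructor above.

```python
from collections import defaultdict


def q_learning_update(q, s, a, r, s_, n_actions,
                      learning_rate=0.01, reward_decay=0.9):
    """Apply one off-policy Q-learning update to the table q in place."""
    # Bootstrap from the best action value available in the next state.
    best_next = max(q[(s_, b)] for b in range(n_actions))
    td_target = r + reward_decay * best_next
    # Move Q(s, a) a fraction learning_rate of the way toward the target.
    q[(s, a)] += learning_rate * (td_target - q[(s, a)])
    return q[(s, a)]


# Hypothetical usage: string states, integer actions, unseen entries default to 0.0.
q_table = defaultdict(float)
q_learning_update(q_table, "s0", 0, 1.0, "s1", n_actions=2)
```

Because the target uses the maximum over next actions, no a_ argument is needed, which is consistent with the four-argument learn signature.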

class sc2qsr.rl.Sarsa(actions, policy='egreedy', learning_rate=0.01, reward_decay=0.9, epsilon=0.1)

SARSA algorithm

learn(s: str, a: int, r: float, s_: str)

Update the Q-table

Parameters
  • s – current state

  • a – action taken

  • r – reward signal

  • s_ – observed future state
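
As a point of comparison, the textbook SARSA update is sketched below. Textbook SARSA bootstraps from the action actually taken in the next state, so this sketch takes a_ as an explicit argument; the documented learn signature has no such argument, and how sc2qsr obtains the next action is not stated here. The name sarsa_update and the dict-based table are assumptions.

```python
from collections import defaultdict


def sarsa_update(q, s, a, r, s_, a_, learning_rate=0.01, reward_decay=0.9):
    """Apply one on-policy SARSA update to the table q in place."""
    # The target uses the value of the next action actually selected,
    # not the maximum over actions as in Q-learning.
    td_target = r + reward_decay * q[(s_, a_)]
    q[(s, a)] += learning_rate * (td_target - q[(s, a)])
    return q[(s, a)]


# Hypothetical usage: unseen (state, action) pairs default to 0.0.
q_table = defaultdict(float)
sarsa_update(q_table, "s0", 0, 1.0, "s1", 1)
```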

class sc2qsr.rl.SarsaLambda(actions, learning_rate=0.01, reward_decay=0.9, epsilon=0.1, trace_decay=0.9)

Sarsa(λ) algorithm

learn(s, a, r, s_, a_)

Update the Q-table

Parameters
  • s – current state

  • a – action taken

  • r – reward signal

  • s_ – observed future state

  • a_ – action selected in the future state
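
A minimal sketch of a Sarsa(λ) step with accumulating eligibility traces is given below to show how trace_decay enters the update. The function sarsa_lambda_update and the dict-based tables are hypothetical; the actual sc2qsr code may use a different table layout or trace type.

```python
from collections import defaultdict


def sarsa_lambda_update(q, e, s, a, r, s_, a_,
                        learning_rate=0.01, reward_decay=0.9, trace_decay=0.9):
    """One Sarsa(lambda) step on dict tables q and e (illustrative only)."""
    # On-policy TD error, based on the next action actually selected.
    delta = r + reward_decay * q[(s_, a_)] - q[(s, a)]
    e[(s, a)] += 1.0  # accumulating eligibility trace
    for key in list(e):
        # Every recently visited pair shares the TD error in proportion to its trace.
        q[key] += learning_rate * delta * e[key]
        e[key] *= reward_decay * trace_decay
    return delta


# Hypothetical usage: a reward on the second step also credits the first pair.
q_table, traces = defaultdict(float), defaultdict(float)
sarsa_lambda_update(q_table, traces, "s0", 0, 0.0, "s1", 1)
sarsa_lambda_update(q_table, traces, "s1", 1, 1.0, "s2", 0)
```

In an episodic setting the trace table would normally be reset at the start of each episode.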

class sc2qsr.rl.TabularRLAlgorithm(actions, policy='egreedy', learning_rate=0.01, reward_decay=0.9, epsilon=0.1)

Superclass for all tabular reinforcement learning algorithms
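
The policy='egreedy' default presumably refers to ε-greedy action selection controlled by epsilon. A minimal sketch of that rule, under this assumption, is shown below; the function epsilon_greedy and its rng argument are hypothetical and not part of sc2qsr.

```python
import random


def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Pick an action index from a list of action values (illustrative only)."""
    # With probability epsilon, explore uniformly at random.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Otherwise exploit, breaking ties between equally good actions at random.
    best = max(q_values)
    return rng.choice([i for i, v in enumerate(q_values) if v == best])


# Hypothetical usage: with epsilon=0.0 the choice is always greedy.
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```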

learn(s: str, a: int, r: float, s_: str)

Update the Q-table

Parameters
  • s – current state

  • a – action taken

  • r – reward signal

  • s_ – observed future state

class sc2qsr.rl.TabularRLLambdaAlgorithm(actions, learning_rate=0.01, reward_decay=0.9, epsilon=0.1, trace_decay=0.9)