Update stateless.md
In lines 111 and 126: a non-chosen action should have an action-value smaller than every possible reward value (because of the maximisation over returns). Sometimes it's nice to give negative rewards. I think -\infty is a better initial value than zero, but then there is a problem with updating Q_t at the beginning (caused by n = 0).
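A minimal sketch of what I mean, assuming the usual sample-average update; the variable names (`Q`, `N`, `update`) are illustrative and not taken from stateless.md:

```python
import numpy as np

n_actions = 10
Q = np.full(n_actions, -np.inf)   # unchosen actions rank below every possible reward
N = np.zeros(n_actions, dtype=int)

def update(action, reward):
    """Sample-average update, guarding the first visit so the -inf sentinel never enters the mean."""
    N[action] += 1
    if N[action] == 1:
        Q[action] = reward                                # overwrite -inf on the first pull
    else:
        Q[action] += (reward - Q[action]) / N[action]     # incremental mean for n > 1
```

Without the `N[action] == 1` guard, the incremental update would mix the -\infty sentinel into the average (or divide by zero in the plain sum/n form), which is the n = 0 problem mentioned above.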