Update policy-evaluation.md

Hendrik Weiß requested to merge henwe--fh-zwickau.de-main-patch-46518 into main

Section "Loss Function and Gradient": the training sample is used inconsistently. The role of the weight factors is unclear: if n is the sample size, no weight factors are needed. If the weight factors are the frequencies of the state-action-target triples, then we need the frequency table of the sample including the target value (triples, not pairs), and n is not the sample size but the number of distinct triples.
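To illustrate the two readings, here is a minimal sketch. It assumes a mean squared-error loss, which this merge request does not state, and the toy sample, the estimates `q`, and both function names are hypothetical. It computes the loss once as a plain average over all n samples (no weights) and once as a frequency-weighted sum over the distinct state-action-target triples.

```python
from collections import Counter

# Hypothetical toy sample of (state, action, target) triples; values are made up.
sample = [
    ("s0", "a0", 1.0),
    ("s0", "a0", 1.0),
    ("s0", "a0", 0.5),
    ("s1", "a1", 2.0),
    ("s1", "a1", 2.0),
]

# Hypothetical current estimates q(s, a).
q = {("s0", "a0"): 0.8, ("s1", "a1"): 1.5}


def loss_over_sample(sample, q):
    """Reading 1: n is the sample size, every sample counts once, no weights."""
    n = len(sample)
    return sum((y - q[(s, a)]) ** 2 for s, a, y in sample) / n


def loss_over_distinct_triples(sample, q):
    """Reading 2: sum over distinct triples, weighted by relative frequency."""
    counts = Counter(sample)          # frequency table keyed by the full triple
    total = sum(counts.values())      # equals the sample size
    return sum(
        (c / total) * (y - q[(s, a)]) ** 2
        for (s, a, y), c in counts.items()
    )


loss_1 = loss_over_sample(sample, q)
loss_2 = loss_over_distinct_triples(sample, q)
num_distinct = len(Counter(sample))
print(loss_1, loss_2, num_distinct)
```

Under these assumptions both readings give the same loss up to floating-point rounding, but the second sum runs over 3 distinct triples rather than 5 samples. Note that `("s0", "a0")` appears with two different targets, so a frequency table keyed only by the state-action pair would not be enough.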
