
Update policy-evaluation.md

Closed Hendrik Weiß requested to merge henwe--fh-zwickau.de-main-patch-46518 into main
  1. May 23, 2024
    • Update policy-evaluation.md · 88197946
      Hendrik Weiß authored
       #Loss Function and Gradient: inconsistent usage of the training sample.
      The role of the weight factors is not clear. If n is the sample size, the sum already has one term per observation and no weight factors are needed. If the weight factors are the frequencies of state-action-target triples, then we need the frequency table of the sample including the target value (triples, not pairs), and n is not the sample size but the number of distinct triples.
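
      The two readings can be compared in a small sketch. It assumes a squared-error loss and a tabular estimate `q`; the names `loss_per_sample`, `loss_per_triple`, `q`, and the toy sample are hypothetical and not taken from policy-evaluation.md. It only illustrates that one term per observation (n = sample size, no weights) and frequency weights over distinct triples describe the same quantity, provided the two conventions are not mixed.

      ```python
      from collections import Counter

      def loss_per_sample(sample, q):
          """Mean squared error with one term per observation; n is the sample size."""
          n = len(sample)
          return sum((q[(s, a)] - y) ** 2 for s, a, y in sample) / n

      def loss_per_triple(sample, q):
          """Same loss summed over distinct (state, action, target) triples,
          each weighted by its frequency in the sample."""
          counts = Counter(sample)        # frequency table of triples, not pairs
          total = sum(counts.values())    # equals the sample size
          return sum(c * (q[(s, a)] - y) ** 2 for (s, a, y), c in counts.items()) / total

      # Hypothetical toy data: the triple ("s0", "a0", 1.0) occurs twice.
      sample = [("s0", "a0", 1.0), ("s0", "a0", 1.0), ("s0", "a0", 0.0), ("s1", "a1", 2.0)]
      q = {("s0", "a0"): 0.5, ("s1", "a1"): 1.0}

      print(len(sample), len(Counter(sample)))   # sample size vs. number of distinct triples
      print(loss_per_sample(sample, q), loss_per_triple(sample, q))
      ```

      Here the sample size is 4 while there are 3 distinct triples, so the meaning of n depends on which form is used.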