1Cademy - Reward functions and performance metrics (Using deep reinforcement learning for personalizing review sessions on e-learning platforms with spaced repetition)

Different reward functions were used for different goals.

1. Goal: maximize the expected number of recalled items: $R(s, \cdot) = \sum_{i=1}^{n} P[Z_i = 1 | s]$
2. Goal: maximize the likelihood of recalling all items: $R(s, \cdot) = \sum_{i=1}^{n} \log P[Z_i = 1 | s]$

In the paper, the authors define the reward function as the average of the sum of correct answers at every time step: $R(s, \cdot) = \sum_i Z_i$, where $Z_i \sim P_i(\cdot | s)$.

The reward function for the LSTM is $R_{RNN} = \sum_{i=0}^{n} P_{RNN}(Z_{i}^{j} | o_{0:j-1})$. Here $n$ denotes the number of items, $j$ the current interaction step, $P_{RNN}$ the probability that the user answers item $i$ correctly, and $o_t = (Z_{i}^{j}, i)$.
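
To make the difference between these rewards concrete, here is a minimal Python sketch that evaluates the first three for a given list of recall probabilities. It is an illustration only, not the authors' implementation: the function names, the `recall_probs` list, and its values are hypothetical, and the probabilities are assumed to come from some student model (for the LSTM variant they would be the network's predictions). Note that the sum of log-probabilities equals the log-probability of recalling all items only if the recall events are treated as conditionally independent given the state.

```python
import math
import random


def expected_recall_reward(recall_probs):
    """Sum of P[Z_i = 1 | s]: the expected number of recalled items."""
    return sum(recall_probs)


def log_likelihood_reward(recall_probs):
    """Sum of log P[Z_i = 1 | s].

    Under an independence assumption this is the log-probability of
    recalling every item. A zero probability yields -inf.
    """
    if any(p == 0.0 for p in recall_probs):
        return float("-inf")
    return sum(math.log(p) for p in recall_probs)


def sampled_reward(recall_probs, rng):
    """Sum of Z_i with Z_i ~ Bernoulli(P_i): items recalled in one draw."""
    return sum(1 if rng.random() < p else 0 for p in recall_probs)


# Hypothetical recall probabilities for three items in state s.
recall_probs = [0.9, 0.5, 0.2]
rng = random.Random(0)  # seeded so the sampled reward is reproducible

print(expected_recall_reward(recall_probs))
print(log_likelihood_reward(recall_probs))
print(sampled_reward(recall_probs, rng))
```

The log-based reward penalizes any single item with a low recall probability much more heavily than the expectation-based reward does, which is why it suits the goal of recalling all items rather than as many as possible.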