Figure3
Figure 3. Illustration of positive reward sparsity for HER. (a) In the task that does not contain the manipulated object, achieved goal is directly affected by the behavior of the agent and constantly changes in each rollout. In this case, HER can generate valuable learning experiences. (b) For the task containing the manipulated object, achieved goal remains unchanged until the agent comes into contact with the object. In this case, all the experience generated by HER includes positive rewards but has no substantial help to the learning of the agent.