hardmaru Dec 9
Reinforcement Learning Upside Down: Don't Predict Rewards -- Just Map Them to Actions

“This Imitate-Imitator concept may actually explain why biological evolution has resulted in parents who imitate the babbling of their babies.”