The true story of how GPT-2 became maximally lewd

Illustrating Reinforcement Learning from Human Feedback

https://openai.com/research/learning-from-human-preferences#:~:text=For example%2C a robot which was supposed to grasp items instead positioned its manipulator in between the camera and the object so that it only appeared to be grasping it%2C as shown below.