AI Alignment - Course

My notes from the AI Alignment Course by BlueDot Impact

Week 0 - Intro to ML

Week 1 - AI and the Years Ahead

Week 2 - What is AI safety?

Week 3 - Reinforcement Learning from Human/AI Feedback

Week 4 - Scalable Oversight

Week 5 - Mechanistic Interpretability

Week 6 - Technical Governance Approaches

[OLD]

Week 2 - Reward Misspecification and Instrumental Convergence