Computerphile

Constraining AI Agents

2025 • E48    Dec 4, 2025    21m
As AI systems become more capable, rule-based safeguards, hard-coded restrictions, and simple alignment strategies start to break down. Buck Shlegeris talks about some tactics we might use as detailed in a recent paper.

Where to Watch Computerphile - 2025 • E48

 

  •   
  •   
  •   
  •   
  •   
  •   
  •   

Take Plex everywhere

Watch free anytime, anywhere, on almost any device.
See the full list of supported devices