Constitutional AI (CAI) is an alignment technique developed by Anthropic that uses a set of principles (a "constitution") to guide AI self-improvement. The AI critiques and revises its own responses according to these principles, reducing the need for human feedback while improving safety.
CAI combines self-supervision with explicit values to create AI systems that are helpful, harmless, and honest.