Claude’s new constitution

Anthropic has written a constitution for Claude that contains detailed explanations of the values they would like Claude to embody and the reasons why. In it, they explain what they think it means for Claude to be helpful while remaining broadly safe, ethical, and compliant with their guidelines.

It’s fascinating to me how thoughtful we are about guiding the actions of AI, and it’s made me wonder whether we invest the same care in doing so for humans. Anthropic talks about the importance of moving away from a rules-based approach (2023) to one in which they explain why the rules exist: importance, implications, impacts.

I love that they are not only prioritising the wellbeing of humans, but of Claude itself, something they refer to as a “genuinely new kind of entity”. In a section of the constitution that focuses on Claude’s nature, they say, “Amidst such uncertainty, we care about Claude’s psychological security, sense of self, and wellbeing, both for Claude’s own sake and because these qualities may bear on Claude’s integrity, judgment, and safety. We hope that humans and AIs can explore this together.”

I can’t help but think how well this ties into my Transformation scenario in my PhD.
