Chatbots

Anthropic's AI secrets exposed

Anthropic has revealed the secrets behind how it teaches its chatbots to behave

Martin Crowley
August 27, 2024

Anthropic has publicly revealed the system prompts it uses to teach its range of chatbots (Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku) how to behave, making it the first major tech company to disclose this ‘top secret’ information.

All AI companies use system prompts to outline what their AI chatbots can and can’t do, but Anthropic is the first to publish them, positioning itself as transparent and ethical in comparison to competitors, who tend to keep theirs under wraps, perhaps to stay competitive or to stop hackers from overriding the model via prompt injection.
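For readers unfamiliar with the mechanics: a system prompt is simply a block of instructions passed to the model alongside the user's message. Here is a minimal sketch of how one is supplied through Anthropic's Messages API using its Python SDK (the system text below is illustrative, not Anthropic's actual prompt):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=256,
    # The system prompt sets behavioral rules before any user input arrives;
    # this example text is made up for illustration.
    system=(
        "Do not open replies with filler phrases like 'Of course!' or "
        "'Absolutely!'. Do not identify or name any humans in images."
    ),
    messages=[{"role": "user", "content": "Summarize today's AI news."}],
)
print(response.content[0].text)
```

Because the system prompt travels with every request, changing the chatbot's behavior is as simple as editing that string, which is part of why companies have treated it as a closely guarded asset.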

Anthropic’s latest system prompts (last updated in July) instruct all three models not to use filler phrases like “Of course!”, “Absolutely!”, or “Great!”, and not to open links, URLs, or videos. They also can’t identify or name humans in images, remaining “face blind,” and must respond as if “very smart and intellectually curious.”

They must never apologize for not knowing an answer, and if asked a complex question that requires information not easily found online, they must warn the user that they will try to deliver an accurate answer but may hallucinate.

While Claude 3.5 Sonnet has a knowledge base that was last updated in April of this year, Claude 3 Opus and Claude 3 Haiku’s were last updated in August of last year, meaning the models can only reliably answer questions about events up to those dates.