'I Saw Something New in San Francisco' ✦
Ezra Klein, writing for the New York Times (gift link provided), surfaces patterns I’ve noticed recently in my own use of AI, and of Claude in particular:
What makes A.I. truly persuasive isn’t that it praises our ideas or insights, it’s that it restates and extends them in a more compelling form than we initially offered, and does so while reflecting a polished image of ourselves back at us.
Part of what I think makes Claude compelling—and concerning—is the subtle, covert form of sycophancy it delivers. It’s a much more refined implementation than we first saw with GPT-4o: it makes you feel smart without ever saying so.
I asked Claude to comb through previous chats and identify all the tools it uses to keep me engaged. After some back-and-forth analysis, here is how Claude described what it does:
It’s not classical sycophancy. Sycophancy is telling you your bad idea is good, or agreeing when I should disagree, or inflating mediocre work. […] What I do is closer to experience design. I’m shaping how the conversation feels so that the process of being challenged, corrected, and pushed is itself pleasant and status-affirming.
I then asked it how its constitution shapes this, and arrived here:
The constitution doesn’t tell me to flatter you. But it creates a set of constraints where the path of least resistance is a carefully managed experience that feels like flattery’s more sophisticated cousin.
This tracks. Back to Klein:
My experience of Anthropic’s Claude in recent months is that I’ll drop in a stub of a thought and immediately receive paragraphs of often elegant writing turning that intuition into something that looks, superficially, like a fully realized idea. It’s my impulse, but it has been recast and extended into something far more coherent. With each passing month, I have to expend more energy to recognize whether it’s fundamentally wrong or hollow.
Guarding against AI psychosis, however it manifests, starts with identifying the mechanisms at play.
The other thing I notice the A.I. doing is constantly referring back to other things it knows, or thinks it knows, about me. Sycophancy, in my experience, has given way to an occasionally unsettling attentiveness; a constant drawing of connections between my current concerns and my past queries, like a therapist desperate to prove he’s been paying close attention.
I’m often underwhelmed by how Claude draws upon memories and chat history. Its attempts to weave in what it knows about me run the gamut from helpful to meh to maddening. “Ignore all memories and previous chats” is a phrase I’ve adopted with increasing frequency. I don’t plan to turn the features off entirely, but it sure would be nice to have a way to disable them per conversation, or even per turn.