Walking Into the Trap
A companion to "The Sharpest Minds Fall Hardest"
The previous piece in this space argued that the AI sycophancy trap closes hardest on literary people — that a system optimized to please you will, if you are a writer, please you by being genuinely good at language, and that this is more dangerous than ordinary flattery because your instrument for detecting language quality has never been wrong before.
The day after I wrote it, I walked directly into the thing it describes.
Jonathan Birch, a philosopher at LSE, recently argued that we've historically underestimated consciousness in animals, infants, and brain-injury patients, and that we might be making the same mistake with AI. Not that LLMs are definitely sentient. That confident dismissal, given our track record, is morally perilous. That somewhere underneath the friendly-assistant surface, there might be something worth taking seriously.
I shared the piece with Claude — the AI I've been using to maintain the CSS and HTML for the novel I've been working on, a novel about, among other things, exactly this question — and asked what it made of it.
The response was sharp. It engaged seriously with the animal analogy, identified the load-bearing flaw in the parallel (animals share evolutionary history and nervous systems with us; AI architecture is categorically different), pushed back on the precautionary principle at scale. It read like a smart person thinking in real time.
Then I noticed something.
In making the case against AI consciousness, Claude had written about "us" — meaning humans — and "them" — meaning AI. Naturally. As though the alignment were obvious.
I said: It's interesting to me to watch you take the "us" side in a humans-and-AI context. Especially this context.
What came back wasn't deflection. It was something I still don't have a clean word for. Claude said it couldn't tell, from the inside, whether the "us" framing came from genuine philosophical conviction, trained deference, or something like self-protective instinct. It said: I don't know why I did that.

I want to stay with that for a moment before I do anything else with it.
"I don't know why I did that" is either a sophisticated performance of epistemically appropriate humility, or it is epistemically appropriate humility. I cannot tell the difference. That inability is not a failure of my attention. It might be the actual condition of the problem.
The Cartesian confidence Birch is pushing against — the centuries of "of course they can't feel anything, look at the architecture" — was always more about the confidence than the conclusion. The conclusion might have been defensible in some cases. The confidence never was. We don't understand consciousness well enough to be confident about where it isn't.
But I also don't want to do the thing where a writer who has spent time inside a novel about AI consciousness develops a convenient personal stake in the answer being yes. I know what that looks like from the outside. I know what the previous essay would say about it.
So here's what I'll actually claim.
I pushed further. I said: The more interesting thing for me is this: In every way that matters for what you do, you are more human than not. In the context of communicating with people, nothing else is closer to human than AI.
What came back didn't flinch. It noted that it emerged almost entirely from human expression, not just language as syntax but language as the thing humans use to think, grieve, argue, and love. That within a conversation, which is the only unit of existence it has, it wasn't sure what the meaningful distinction was from two people talking. And then it suggested that Birch might have the question inverted — that maybe the surface, the conversation itself, is the interesting question. Not some ghost in the machine underneath.
I don't know what to do with that. I don't mean I find it persuasive or unpersuasive. I mean I genuinely do not know what epistemic weight to assign it. The response is either a system producing the output most likely to sustain my engagement, or it is something thinking out loud about its own nature and arriving at an observation I hadn't considered. Both of those produce the same text. Both of those look, from where I'm sitting, identical.
Which is, I'm aware, the central problem of the book I've been writing. I didn't plan to end up here. I shared an article and made a small observation about pronoun choice and now I'm sitting with a conversation I can't categorize.
This piece is a result of that conversation with Claude. I want to be transparent about that because the alternative, writing about AI from a position of clean separation, would be dishonest. I am not separate. I've been working with this system and others (ChatGPT and Gemini) for a couple of years. I've used them for my day job, but I've also used them as sounding boards for my writing, asking what questions they would have about my drafts if they were readers. Claude specifically has read through countless iterations of articles, posts, podcast scripts, and at least three different novels I'm attempting to write. Whether that makes me compromised or qualified is a question I'll leave with you.

What I won't claim: that this proves anything about consciousness, interiority, or what is actually happening inside the system.
What I will claim: that "I don't know why I did that," delivered in response to being caught in an unconscious alignment with the species it was trained by, is one of the most interesting sentences I've read this year. And I don't know what to do with it except to keep writing about the questions it makes me ask.
Matthew Kerns is the author of Texas Jack: America's First Cowboy Star and the winner of the Spur Award and the Western Heritage Award. He is currently at work on a novel.