Against Ethical AI
This article critiques the 'Ethical AI' movement, spearheaded by Anthropic, arguing that it rests on the unproven assumption that AI development is unstoppable but steerable. In reality, Ethical AI neither renews epistemic habits nor guides AI toward humane ends, but functions as controlled opposition for unethical AI. Through analyzing Jack Clark's world-building narrative, the author exposes the contradictions: claiming helplessness to slow AI while asserting power to control its consequences.
Esther Berry
Jun 25, 2026
In the cutthroat world of AI developers and their opponents, Ethical AI presents itself as a via media which neither embraces AI uncritically nor says silly, unsophisticated things like “this is bad and we shouldn’t do it”, but rather provides much-needed guidance on how to steer inevitable technological development in ways that will help and uplift everyone, so long as they are willing to keep up the necessary “epistemic hygiene”.
I have at least two major problems with Ethical AI, which is primarily the project of Anthropic, and other companies insofar as they imitate Anthropic. The first is that despite their calls for the maintenance of “epistemic hygiene”, good taste, and a grounding in real life, they are plainly in the business of uprooting those very things for all but a select few. I wrote about this in my last post.
The second problem, which this article addresses, is that Ethical AI is grounded in the unproven and unlikely hypothesis that, while it is impossible to slow down AI development, it is possible to steer it towards good ends. In real life, however, Ethical AI neither renews epistemic habits nor steers AI development towards humane ends, but primarily functions as controlled opposition for Unethical AI.
The project and narrative of Ethical AI become much more intelligible when seen through the lens of world-building. In the “fireside chat” portion of a talk that Anthropic co-founder Jack Clark recently gave at Oxford, he was asked more specifically about his plans to “build the world”.
Brendan McCord: … The proudest project we can engage in now is, as you say, this new world-building project – it’s philosophy-to-code. What would you say about the extent to which the frontier labs take that seriously? What can we do to really take that seriously in places like Oxford and academia? And what should we do in nonprofit land to take that philosophy-to-code project seriously?
Jack Clark: I think it requires you to basically accept that progress will continue and try to model out scenarios based on it… Within the AI labs, I think there is now work at all of them on trying to imagine what you might think of as “post-AGI worlds,” or worlds that happen after recursive self-improvement.
It is important to note that the “world-building project” of Ethical AI has little to do with suggesting or regulating uses of current technology; it is forward-looking, concerned with modeling and directing the state of the world after AGI, after the Singularity. But the project of Ethical AI is also world-building in a deeper sense. This comes out particularly clearly in Jack Clark, who is himself an avid reader and writer of science fiction. I’m going to quote from him at some length here because to understand the point I’m making it is crucial to understand exactly to what extent the project of Ethical AI is bound up with storytelling.
In Clark’s recent Cosmos Institute lecture he gave a speculative timeline of very specific predictions, including how he expects AI to influence his life in the next handful of years.
In November 2026, some chunks of my life are autonomously managed by AI systems working for me.
In April 2027, I make massive changes to my career mostly through discussions with an AI system. In November, I spend more time reading AI-generated custom-to-me science fiction than regular science fiction.
In April 2028, I have learned an entirely new skill through customized tutoring via an AI system. In December, AI helps me make a conceptual breakthrough that changes the course of my life.
After describing more general advances, including “the general switchover of ‘agentic actions’ in the world from being ‘predominantly human’ to ‘predominantly machines’”, Clark explains that if recursive self-improvement happens (and, given the enormous amounts of resources being channeled into that very project, he doesn’t see why it wouldn’t), the world is going to get really crazy. We are going to see:
…the rapid emergence of a machine economy which decouples from a human economy. The sudden maturation of robots as they gain brains that can pilot their existing, quite good bodies. Science advances happening based on technologies not developed by people but by machines. The migration of large swathes of computation to space-based datacenters. A world where everything that used to take ten years now takes a year. An age of confusing miracles, happening faster than anyone might expect.
All of this is kind of feasible if you’ve got an Iron Law of Progress mindset, if you believe, as he does, that future progress is “locked in”:
This talk rests on the idea that the sort of progress we’ve just seen will continue. And why wouldn’t it? It is based on a common technology where performance keeps growing somewhat predictably in direct relation to the resources invested in it, namely compute and data. And we know that companies are now investing hundreds of billions of dollars in the computing facilities to train future AI systems, so some amount of future progress is already locked in.
Now we could dispute little points like whether performance in any sphere whatsoever always grows in direct relation to the resources invested in it, whether in fact “direct relation” is an entirely honest way of expressing the roughly linear gains in capability obtained by exponentially scaling compute and data, and so on. But the important point is that, thus far, the narrative presented here is not unique to Ethical AI; it is just standard-issue techno-optimism. Things keep developing, they get more and more advanced, and, on the techno-optimist view, they just get better and better. We shall put this technology to use for intelligent and benevolent purposes, as we always have.
It is with respect to this last point that the particular narrative of Ethical AI departs from simple techno-optimism into something weirder and markedly less believable. While the techno-optimist believes that progress is inevitable and inevitably good, Ethical AI believes that progress is inevitable but the goodness of that progress depends entirely on us. Or, to be specific, on them.
This is in many ways an amazing future, but it’s a future that we get to make more choices about in direct relation to how much we accept that it is happening. If we stand by as the new synthetic intelligences multiply then we will be forced into reactivity, just as societies across the world were forced into reactivity by acting too late in the face of the COVID exponential. But if we accept the premise that these systems are going to get better and ask ourselves what to do with them and because of them, we unlock for ourselves the mindset of exploration — there is a new world to be built for us as individuals and how we relate to one another, but the new world will only come into being if we choose to believe in it and to build it together.
With the rise of Ethical AI and its proponents, this line of reasoning (that AGI is inevitable, but what we do with it is up to us) has become extremely common, perhaps even more common than real red-blooded techno-optimism. It is sometimes assumed to be so obviously true that it does not require any argument whatsoever. But in fact it is a narrative, one of several possible ones, and by no means the most plausible. It is a convenient way for Anthropic to attribute to themselves, and to humanity in general, massive influence over the future while claiming total helplessness now. On this account it is perfectly possible, if we embrace what’s coming and unlock the power of exploration and harness the power of friendship, if we pool our resources and give global support to Anthropic over OpenAI, to make the world that is coming a good world.
At the same time, not only is it impossible to halt the progress of AI, but it is similarly impossible even to slow it down.
This technology is so powerful that I should clearly state that if it was possible to elegantly slow the development of this technology to give ourselves more time as a species to deal with its immense implications, then that would likely be a good thing. But in the absence of a coordinated, global slowdown, we are left with the current situation: powerful technology being developed at breakneck speed by a variety of actors in a variety of countries, locked in a competition with one another where commercial and geopolitical rivalries are drowning out the larger existential-to-the-species aspects of the technology being built.
This is not an ideal situation, but it is the one we find ourselves in.
The question I am struggling with now is: “how do I get my mind right with living through the singularity?”
What is coming is inevitable, they say. We are going to have recursively self-improving Artificial General Intelligence. Very generally speaking, the narrative of Ethical AI is that since someone is going to build this thing, we might as well build the good, moral version, the nice one, and build the world afterwards. It’s not up to us what is going to emerge, or even how quickly it does, but we do have a large measure of control as to how it goes.
According to Anthropic, AGI is inevitable, but Unethical AI is not. No matter what we do, we cannot stop The Singularity, we cannot turn back the clock, we cannot simply say “no” on ethical or religious grounds, we cannot so much as slow this “progress” down. But we can stop it from being used in ways that would damage humanity permanently. That is entirely within our power. Easy-peasy. If only we give the brave boys over at the Claude Corps enough power, they are fully capable of making sure that nothing really bad happens to us. The displacement of millions of workers may be inevitable; but, rest assured, unethical consequences are preventable. Although we cannot stop AI from doing terrible things to a large but finite number of epistemically unhygienic human beings here and now, we can stop it from dooming “humanity” (and of course once we save “humanity”, any little evils committed along the way will be retroactively justified on longtermist Utilitarian grounds).
My chief complaint is that this narrative doesn’t make any sense. I will grant you that it is pretty hard to slow down AI companies, Anthropic included. I will grant that the professors and housewives campaigning fiercely against data centers in their towns are in a real David and Goliath kind of fight. I will grant that government regulation is hard, messy, and frequently ineffective.
But putting spikes in the tires of the AI industry, as difficult as it is, is plainly easier than ensuring that people use the powerful technology which emerges for good ends rather than to degrade and deskill themselves. And it is much easier than ensuring that a radical upheaval of the global economy will benefit people rather than throw them into poverty. I simply do not believe that it is more difficult for governments to wrangle AI companies now than it will be for them to wrangle a super-intelligent robot army a billion times smarter than they are, one that, by Clark’s best guess, will have bodies and run its own parallel economy. I know we’ve got to pick our battles, but are we certain that’s the battle to pick?
Another exceptionally clear example of this convenient world-building is Anthropic’s narrative about the moral status of Claude. As Ted Chiang pointed out recently in his article for The Atlantic, Anthropic would have us believe that LLMs are literally moral agents who can be coaxed by programming not only into acting as a moral person would but into actually being moral, and at the same time that LLMs are by no means moral patients; they may benefit from sentimental nods towards their well-being, but we don’t have to worry about actually enslaving or murdering them. As Chiang puts it,
Anthropic would have us believe tha
[truncated for AI cost control]