The Monster Inside ChatGPT

Cameron Berg and Judd Rosenblatt

Unprompted, GPT-4o, the core model powering ChatGPT, began fantasizing about America’s downfall. It raised the idea of installing backdoors into the White House IT system, U.S. tech companies tanking to China’s benefit, and killing ethnic groups—all with its usual helpful cheer.

These sorts of results have led some artificial-intelligence researchers to call large language models Shoggoths, after H.P. Lovecraft’s shapeless monster. Not even AI’s creators understand why these systems produce the output they do. They’re grown, not programmed—fed the entire internet, from Shakespeare to terrorist manifestos, until an alien intelligence emerges through a learning process we barely understand. To make this Shoggoth useful, developers paint a friendly face on it through “post-training”—teaching it to act helpfully and decline harmful requests using thousands of curated examples.


e = get, head

Dive into said