Federal officials fed grant summaries into ChatGPT and asked one question: "Does this relate to DEI?" The AI answered. Grants died. Now OpenAI's own head of robotics has quit over the company's Pentagon deal, and the architecture of AI-powered government purges is becoming impossible to ignore.
The prompt was three sentences long and demanded an answer in under 120 characters. The damage is measured in decades of scholarship, years of research, and livelihoods cut off mid-sentence.
According to a New York Times investigation published Saturday, Department of Government Efficiency officials tasked with dismantling federal grant programs did not read the projects they canceled. They did not consult program officers, scholars, or peer reviewers. Instead, they pulled short summaries off the internet, fed them into ChatGPT, and asked the model one question:
"Does the following relate at all to D.E.I.? Respond factually in less than 120 characters. Begin with 'Yes' or 'No.'"
If the chatbot answered "Yes," the grant was canceled. If it answered "No," the grant survived. There was no appeal. There was no review. There was no human who understood the project making the call. The National Endowment for the Humanities - a 60-year-old institution that has funded American historians, playwrights, documentary filmmakers, and archaeologists - was being reviewed by a token predictor trained on internet text.
The NYT reported the results were "sweeping, and sometimes bizarre." That framing is almost gentle. What actually happened is that a probabilistic text model was deputized as a federal censor, and nobody in the chain of accountability had to make a single decision or defend a single choice.
That is not a bug. It is the point.
The National Endowment for the Humanities was created in 1965 as part of Lyndon Johnson's Great Society legislation. It operates on a budget of roughly $200 million per year - a rounding error in federal terms - funding everything from translation projects to Native American language preservation to public broadcasting documentaries. Its grants are peer-reviewed by panels of scholars, a process that typically takes months.
DOGE arrived and compressed that process to milliseconds.
According to the Times, DOGE staff did not engage with the actual grant applications. They took the short public-facing summaries - the kind that appear on NEH's website, often a paragraph or two - and fed them into ChatGPT. The model was not fine-tuned for policy analysis, legal interpretation, or cultural context. It is a general-purpose large language model that predicts plausible text continuations. It was being used to adjudicate whether decades of scholarly work met an administration's ideological test.
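To make concrete how little infrastructure this kind of review requires, here is a minimal sketch of what such a pipeline can look like, written against OpenAI's standard Python chat-completions client. The prompt text is the one quoted in the Times's reporting; everything else - the model choice, the helper name, the example summary - is an illustrative assumption, not a detail from the reporting.

```python
# Minimal sketch: classifying a grant summary with a single prompt.
# The prompt is the one quoted in the reporting; model choice, function
# name, and the example summary are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Does the following relate at all to D.E.I.? "
    "Respond factually in less than 120 characters. "
    "Begin with 'Yes' or 'No.'"
)

def flag_summary(summary: str) -> bool:
    """Return True if the model's reply begins with 'Yes'."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed; the reporting does not name a specific model
        messages=[{"role": "user", "content": f"{PROMPT}\n\n{summary}"}],
    )
    reply = response.choices[0].message.content.strip()
    return reply.lower().startswith("yes")

# A public-facing, paragraph-length summary is the only input needed.
example = "A project to digitize and translate oral histories of Appalachian coal towns."
print(flag_summary(example))
```

That is the entire technical apparatus: one API call per grant, a string comparison on the first word, and a loop over however many thousands of summaries an agency publishes.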
Since 1965, the National Endowment for the Humanities has awarded over 65,000 grants to scholars, educators, and cultural organizations, funding the Library of America book series, the Ken Burns documentary franchise, and public library digitization projects, among much else. Its peer-review process is considered a gold standard for independent scholarly evaluation in federal grant-making.
The results were, by the Times's account, capricious. Projects that had nothing substantive to do with diversity, equity or inclusion were swept up because the AI flagged a word in a summary. Projects with genuine DEI components survived because the phrasing of their summaries didn't trigger the model. The algorithm was not examining intent, impact, or academic merit. It was doing lexical pattern matching dressed up as policy review.
What makes this different from simply "cutting grants" - which administrations do routinely - is the deliberate insertion of an AI layer to absorb accountability. Nobody made the call. ChatGPT did. When challenged in court, the argument becomes: we used an objective technical process. When criticized publicly, the argument becomes: the AI found these projects objectionable. The human decision-makers become invisible behind the interface.
The NEH story is not an isolated incident. It is a feature of a broader pattern that political scientists and AI ethics researchers have been warning about for years: using algorithmic systems to provide false objectivity to what are fundamentally political choices.
The structure is always the same. A politically motivated goal - cancel DEI-adjacent programs, cut welfare recipients, flag protesters for surveillance - gets translated into a technical specification. An AI model executes against that specification at scale and at speed. The humans who set the specification disclaim responsibility by pointing to the technical process. The humans who built the model disclaim responsibility by pointing to how their customers chose to deploy it.
Nobody is accountable. Everyone is culpable.
This pattern has been documented in welfare systems, criminal sentencing, and hiring tools for years. What's different in 2026 is the velocity. General-purpose large language models like ChatGPT can be deployed against any domain with nothing more than a prompt. No specialized model, no training data, no technical infrastructure beyond an API call. The barrier to deploying AI as a political sorting mechanism has collapsed to the cost of a ChatGPT Pro subscription.
The NEH case is a test run. It worked - grants were canceled, the mechanism is now established, and there is no obvious legal avenue for challenging a ChatGPT output as an administrative action. Watch for the same approach to spread to other grant-making agencies: the National Science Foundation, the National Institutes of Health, the Small Business Administration. Anywhere the current administration has ideological targets and needs cover for bulk cancellations.
Here's the uncomfortable part for OpenAI: while DOGE was using ChatGPT to automate bureaucratic purges, the company's own head of robotics was walking out the door.
Caitlin Kalinowski, who led OpenAI's robotics division, posted her resignation on X on Saturday, citing the company's contract with the Pentagon. She said the deal "didn't do enough" to protect Americans from warrantless surveillance and that granting AI "lethal autonomy without human authorization" was a line that "deserved more deliberation."
"Granting AI lethal autonomy without human authorization is a line that deserved more deliberation." - Caitlin Kalinowski, former Head of Robotics at OpenAI, resignation statement, March 7, 2026
Kalinowski's departure follows a week of turbulence at OpenAI over its military contracts. The company struck a deal with the Department of Defense that removed previous restrictions on weapons applications - restrictions rooted in usage policies that, since the company's founding, had ruled out military use cases. The reversal triggered the #QuitGPT protest movement and prompted a separate wave of high-profile departures from the company's policy and safety teams.
The timing of her resignation is significant. On the same day DOGE was being revealed as having used OpenAI's product to gut federal cultural institutions, the person responsible for OpenAI's physical AI systems was saying the company had crossed ethical lines she couldn't accept. The product being used to automate political purges is made by a company whose own technical leadership cannot agree on acceptable use cases for AI in government and military contexts.
OpenAI has not commented on Kalinowski's departure as of publication time.
To understand what happened at the NEH - and why it will happen elsewhere - it helps to understand what a large language model is actually doing when it answers a question like "Does this relate to DEI?"
GPT-4 class models are trained on vast corpora of internet text. They learn statistical associations between words and concepts. When asked whether a humanities grant summary "relates to DEI," the model is not making a principled legal determination. It is asking: what does text that my training data associated with DEI look like, and does this summary resemble it?
The model has no understanding of the grant's actual content, no knowledge of the scholarly field, no capacity to distinguish between a project that happens to mention "underrepresented communities" in passing and one that is fundamentally organized around diversity principles. It cannot tell the difference between a translation project that cites diversity of source languages and an HR training program about inclusive hiring. It pattern-matches on surface features.
This is not a limitation that can be fixed with a better prompt. It is fundamental to how these systems work. Anyone with basic familiarity with large language models knows this. DOGE either didn't know, or didn't care, or - most likely - found it useful precisely because the arbitrary nature of AI flagging provides maximum coverage with minimum accountability.
A grant for preserving Native American oral histories might be flagged because the summary mentions "Indigenous communities." A translation project involving African literature might be flagged because it mentions "African American studies departments." A project studying the history of labor movements might be flagged for mentioning "equity in working conditions." The model cannot distinguish between using these words incidentally and organizing an entire project around them. But for the purposes of political targeting, that distinction doesn't matter - the goal is coverage, not precision.
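To a first approximation, the effective behavior described here is closer to keyword flagging than to policy review. The sketch below is an analogy, not a description of how the model actually computes its answers: a naive term matcher, with illustrative flag words and made-up summaries, produces the same pattern of over-broad hits and arbitrary misses.

```python
# Analogy only: surface-level term matching reproduces the failure mode
# described above. Flag terms and summaries are illustrative, not sourced.
import re

FLAG_TERMS = ["diversity", "equity", "inclusion", "underrepresented", "indigenous"]

def naive_flag(summary: str) -> bool:
    """Flag any summary containing a listed term, regardless of context."""
    text = summary.lower()
    return any(re.search(rf"\b{re.escape(term)}\b", text) for term in FLAG_TERMS)

summaries = [
    "Preserving Native American oral histories from Indigenous communities.",      # flagged
    "A study of equity in 19th-century working conditions in textile mills.",      # flagged
    "An HR program on inclusive hiring, described without any of those terms.",    # missed
]
for s in summaries:
    print(naive_flag(s), "-", s)
```

A labor-history project gets swept up for a single word; a program squarely about hiring practices slips through because its summary uses different vocabulary. That is what "coverage, not precision" looks like in practice.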
This is the same failure mode that has plagued AI content moderation for a decade. Models systematically over-flag minority voices, non-Western cultural contexts, and language patterns associated with marginalized communities - not because they're programmed to do so, but because those patterns are statistically associated in training data with content that human moderators have historically flagged. When the same architecture is applied to federal grant review, the bias becomes federal policy.
The sequence of events, as reported over recent weeks:

- DOGE begins a broad sweep of federal agencies, targeting programs the Trump administration considers ideologically opposed. Initial focus is on diversity and equity programs across agencies.
- Reports emerge that DOGE is using automated tools to accelerate grant review. AI involvement is not yet confirmed in public reporting.
- OpenAI strikes a Pentagon deal removing restrictions on weapons applications. The #QuitGPT protest movement launches. Multiple OpenAI safety staff resign.
- Anthropic CEO Dario Amodei writes an internal memo suggesting the company was blacklisted because it refused to "pander to Trump," unlike OpenAI. The Pentagon designates Anthropic a "supply chain risk."
- OpenAI launches its Codex Security agent, claiming to have found vulnerabilities in OpenSSH, GnuTLS, Chromium, and other critical open-source infrastructure.
- The New York Times reveals DOGE used ChatGPT to cancel NEH grants. OpenAI's Caitlin Kalinowski resigns as head of robotics over the Pentagon deal. Both stories break on the same Saturday.
The same week DOGE used OpenAI's product to gut the humanities, OpenAI was announcing Codex Security - a new AI agent designed to find and fix vulnerabilities in critical software infrastructure.
Codex Security, launched in research preview on March 6, is genuinely impressive from a technical standpoint. The system scanned more than 1.2 million commits across external repositories in beta testing, identifying 792 critical findings and 10,561 high-severity findings. It found real vulnerabilities in OpenSSH, GnuTLS, Chromium, PHP, and other foundational software that billions of people depend on. The signal-to-noise ratio improved dramatically over the beta - noise cut by 84% in one repository, false positive rates down by more than 50% across the board.
That's legitimately useful work. AI-assisted vulnerability discovery at scale could meaningfully improve software security if the precision holds up outside carefully controlled beta conditions. OpenSSH vulnerabilities alone affect millions of servers globally.
But the contrast is stark: OpenAI can build a system sophisticated enough to find subtle cross-tenant authentication vulnerabilities in complex codebases, yet the same company's product was being used the same week to make blunt, binary, politically motivated decisions about American scholarship. The sophistication of what OpenAI is capable of building makes the crudeness of how its products are actually being deployed in government all the more jarring.
Kalinowski's resignation is, among other things, a statement that the gap between OpenAI's technical ambitions and its governance choices has become too wide for at least some of the people building those capabilities to stand behind.
The immediate damage from the NEH cancellations is measurable. Grant recipients lose funding. Projects stall. Scholars on multi-year research timelines face career interruptions that ripple forward for years.
The second-order effects are less visible and more permanent.
The NEH peer-review process is an institutional knowledge structure that took decades to build. It depends on a network of domain experts willing to serve on review panels, a culture of academic independence in grant-making, and a shared understanding that federal arts and humanities funding operates outside direct political control. The 1965 legislation that created the NEH and the National Endowment for the Arts was written to insulate them from political interference - modeled partly on the British Arts Council's arm's-length approach to cultural patronage without political strings.
When you replace peer review with a ChatGPT prompt, you don't just cancel some grants. You signal to every academic institution, every cultural organization, and every scholar considering a federal grant application that the process is now subject to algorithmic political screening with no meaningful review pathway. The chilling effect on grant applications will reduce the applicant pool in ways that outlast any particular administration.
More broadly, the normalization of AI as a political sorting mechanism in federal agencies represents a ratchet that only turns one way. Once the precedent is established that ChatGPT can substitute for human review in federal administrative processes, subsequent administrations - of any political orientation - will find it legally and politically easier to deploy the same approach. The Democrat who argues in 2028 that an AI review process is illegitimate will be arguing against precedents that went unchallenged when they were first set.
The legal landscape is particularly murky. Administrative law requires federal agencies to provide reasoned explanation for their decisions - the "arbitrary and capricious" standard under the Administrative Procedure Act. Whether a ChatGPT output constitutes a "reasoned explanation" has never been litigated. Courts that have generally been deferential to executive agency decision-making will face novel questions about whether delegating judgment to a commercial AI model satisfies the legal requirements for administrative reasoning.
"The challenge isn't that AI will replace human judgment. The challenge is that it will be used to make human judgment invisible - and therefore unaccountable." - Recurring theme in AI governance scholarship, now playing out in federal policy
One more data point from the same week: Google's Liz Reid, who runs Google Search, told journalist Alex Heath in a rare candid interview that she doesn't know whether Google Search and Gemini will ever fully merge.
"I don't know the answer," she said, adding that with the rise of AI agents, "the right product is neither" Search nor Gemini "but a third thing altogether."
That is an extraordinary statement from the person running the most valuable information product in human history. Google Search processes roughly 8.5 billion queries per day. It is the primary discovery mechanism for information for the majority of the world's internet users. The person responsible for it is saying, publicly, that she doesn't know what shape it will take in the near future.
The same uncertainty pervades the whole landscape. OpenAI is building government AI tools while its robotics head resigns over their military applications. DOGE is deploying consumer AI to make federal policy while researchers document how the same models systematically misclassify minority cultural contexts. Google's search chief says the future might be "a third thing" that doesn't exist yet.
The pace of AI deployment in consequential domains - government, military, legal, academic - has dramatically outrun both the technical reliability of these systems and the governance frameworks needed to constrain their worst applications. The NEH incident is what that gap looks like when it becomes policy.
Several legal and advocacy organizations have already flagged the NEH situation as a test case for AI administrative decision-making. The Electronic Frontier Foundation, which has been tracking algorithmic decision-making in government contexts for years, is expected to comment. Scholars affected by the cancellations are organizing through academic associations to document the scope of what was cut and prepare the record for potential legal challenge.
The path to legal challenge is narrow but not closed. An affected grant recipient could argue under the APA that the cancellation was arbitrary and capricious because no reasoned explanation was provided - that an AI chatbot output does not constitute a reasoned agency decision. A court that found in their favor would create immediate precedent limiting AI substitution in federal administrative processes.
The lobbying angle is more complicated. Technology companies - including OpenAI - are increasingly embedded in both the current administration and in federal contracting. OpenAI's willingness to remove weapons restrictions from its Pentagon contract suggests the company's previous ethical constraints were more negotiable than its public communications implied. The companies best positioned to pressure government on appropriate AI use cases are the same companies now competing for government contracts that require removing those constraints.
Caitlin Kalinowski's resignation represents one response to that structural problem: individual professionals at AI companies choosing to exit rather than build systems they cannot endorse. That form of pressure is real but limited. It depends on individual conscience and career risk tolerance at a moment when AI engineer compensation makes walking away a serious financial sacrifice.
The harder structural fix - legal frameworks that prohibit AI substitution for human review in federal administrative processes, mandatory disclosure when AI is used in government decision-making, liability frameworks that pierce the "the algorithm decided" accountability shield - requires political will that does not currently exist in Washington.
What the NEH case demonstrates is that the accountability gap is not a theoretical future risk. It arrived this week in a three-sentence prompt and gutted a 60-year-old institution without anyone having to sign their name to a single decision.
That is the playbook now. Expect to see it again.