The Knowledge Poisoners: AI Translators Have Been Quietly Contaminating Wikipedia
A non-profit hired contractors to use AI to translate Wikipedia articles at scale. The result: fabricated citations, paragraphs sourced from completely unrelated material, and contaminated entries across multiple language editions - all without most readers ever knowing.
Wikipedia is where a billion people go to find out if something is real. And for months, a corner of it has been quietly broken - not by vandals, not by nation-state actors, but by a well-meaning non-profit that handed AI a microphone and walked away.
The organization is called the Open Knowledge Association (OKA). Its stated mission is noble: improve Wikipedia and other open knowledge platforms by funding contributors and translators. It pays monthly stipends to full-time workers, mostly contractors in the Global South, who use large language models to translate Wikipedia articles from English into other languages.
The problem is that the AI was hallucinating into the encyclopedia - and nobody was checking.
Wikipedia editors began noticing something was wrong when they started examining OKA-translated articles closely. In one case, a translated article about the French noble La Bourdonnaye family cited a specific book and page number as its source. An editor named Ilyas Lebleu, who goes by Chaotic Enby on Wikipedia, pulled the book. The cited page did not mention the family at all.
"Some of the articles had swapped sources or added unsourced sentences with no explanation, while [one article] added paragraphs sourced from material completely unrelated to what was written!" - Wikipedia editor Ilyas Lebleu
Lebleu ran a spot-check on a sample of OKA translations and found errors immediately. These were not a few isolated mistakes. The problem was systematic.
Wikipedia editors investigated OKA's workflow and found that its contractors were following instructions to copy article text directly into Gemini or ChatGPT, then paste the AI output back into Wikipedia. A public spreadsheet used by OKA translators instructed them to make edits "only if the suggestions are an improvement and don't change the meaning" - but the AI was changing the meaning constantly, and nobody was catching it.
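To make the failure concrete, here is a minimal sketch of what that workflow amounts to, assuming an API call in place of the contractors' copy-paste and using hypothetical names (translate_article, publish, the placeholder wikitext) rather than anything from OKA's actual tooling. The point is the step that is not there: nothing re-opens the cited sources before the output goes live.

```python
# Minimal sketch of the publish-what-the-model-says loop, reconstructed
# from the editors' description. OKA contractors worked through web UIs;
# the OpenAI Python client is used here only to make the structure explicit.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def translate_article(wikitext: str, target_language: str) -> str:
    """Send raw article wikitext to an LLM and return whatever comes back."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Translate this Wikipedia article into "
                       f"{target_language}, preserving the wiki markup:\n\n{wikitext}",
        }],
    )
    return response.choices[0].message.content

def publish(translated_wikitext: str) -> None:
    """Stand-in for pasting the model output back into Wikipedia."""
    print(translated_wikitext)

# The whole pipeline. Missing: any step that re-checks the translated
# claims against the sources they cite before they go live.
article = "The '''La Bourdonnaye''' family ... <ref>Book X, p. 212</ref>"  # placeholder wikitext
publish(translate_article(article, "Swahili"))  # target language is an arbitrary example
```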
There is a layer here that makes this story stranger. OKA's original instructions told translators to use Grok - Elon Musk's LLM - for the translations. xAI, the company behind Grok, notably already operates Grokipedia, an automated Wikipedia alternative widely criticized for errors and lack of human oversight.
The use of Grok proved controversial inside OKA itself. An internal study eventually showed ChatGPT and Claude produced more accurate output, so OKA switched - though it still lists Grok as "valuable for experienced editors handling complex, template-heavy articles."
This matters for understanding the scale of the contamination. The errors were not caused by one bad model. They were caused by removing humans from the verification loop entirely, regardless of which model was doing the translating.
This is where the story gets genuinely interesting. Wikipedia's community-driven governance system is the reason this was caught at all. There is no algorithm scanning for hallucinated citations. There is no automated fact-check layer. There are just volunteer editors doing tedious spot-checks, and the open discussion architecture that lets those editors flag, debate, and ultimately enforce policy changes.
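For a sense of what such an algorithm would even look like, the sketch below is a purely hypothetical heuristic - not anything Wikipedia runs - that flags claims whose cited source never mentions the claim's key terms. At best it could triage a human's attention; it could not replace the judgment Lebleu applied when he pulled the book.

```python
# Hypothetical citation spot-check heuristic. It only decides what is worth
# a human's time; it cannot confirm that a source supports a claim.
import re

def key_terms(claim: str) -> set[str]:
    """Naive keyword extraction: capitalized words of 4+ characters,
    e.g. 'Bourdonnaye'. Short words like 'The' are skipped."""
    return {w.lower() for w in re.findall(r"\b[A-Z][\w-]{3,}\b", claim)}

def plausibly_supports(claim: str, source_page_text: str) -> bool:
    """True if the cited page mentions at least one key term of the claim."""
    page = source_page_text.lower()
    return any(re.search(rf"\b{re.escape(t)}\b", page) for t in key_terms(claim))

# Invented example shaped like the La Bourdonnaye case: the cited page is
# about something else entirely, so the claim gets flagged for review.
claim = "The La Bourdonnaye family held lands in Brittany."
cited_page = "This chapter concerns grain prices in eighteenth-century Lyon."
if not plausibly_supports(claim, cited_page):
    print("FLAG: cited source does not appear to mention the subject")
```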
The editors did not ban AI-assisted translation outright. They accepted that AI tools can help non-native speakers produce reasonable first drafts. But they drew a sharp line: humans are responsible for what gets published, and that responsibility cannot be delegated to a model.
The immediate damage - contaminated citations in articles about French aristocracy and 19th-century French Senate elections - is fixable. Editors are scrubbing it now.
The harder problem is what this represents structurally. OKA was operating in good faith. It was not trying to pollute Wikipedia. It was trying to expand it into underrepresented languages, using the cheapest available tools, and trusting that the humans in the loop would catch errors. They did not.
Wikipedia is the training data source that virtually every major AI model has ingested. Articles hallucinated by AI today have a realistic path back into the training sets used to build tomorrow's models. The contamination does not just sit in a Wikipedia article. It potentially loops back into the systems generating the next round of "translations."
There is also the question of what is happening in Wikipedia's smaller language editions, where the volunteer editor base is thin and spot-check capacity is limited. An editor fluent in English can catch a fabricated English-language citation. That same editor may not be able to verify a citation in a translation made into a language they do not read.
The OKA incident was caught because English-language Wikipedia has a dense enough community of experienced editors to notice anomalies. That safety net is considerably weaker everywhere else - and OKA's work was primarily aimed at expanding Wikipedia into those exact underserved language editions.
OKA is not unique. The model it used - pay contractors to run text through AI, lightly review the output, publish at scale - is now a standard playbook across content industries. It is how low-cost content farms are scaling. It is how some publishing operations handle localization. It is how knowledge is being manufactured cheaply and distributed widely.
Wikipedia's governance model caught it here. Most systems do not have Wikipedia's governance model.