Researchers built a chatbot that only knows the world before 1931

The post Researchers built a chatbot that only knows the world before 1931 appeared first on Malwarebytes.
The internet’s chatbots have read every forum rant, leaked Slack log, and confident blog post your uncle ever wrote about chemtrails. The results are predictable: they reflect the state of the internet, and it isn’t pretty. That, along with some questionable design decisions, is partly why Elon Musk’s Grok chatbot briefly generated antisemitic content and referred to “MechaHitler” during testing.
Wouldn’t it be nice if we had a chatbot that only draws on knowledge from before the internet, reality TV, or AI-slop content ever existed? Three researchers have created just that: a chatbot that hasn’t read anything published after 1930.
Talkie is a 13-billion-parameter language model trained on digital scans of English-language texts published before the end of 1930. That cutoff aligns with the current US public domain year, meaning anything published until the end of that year is fair game and there are no lawsuits from irate IP-holders to worry about.
David Duvenaud, an associate professor of computer science and statistics at the University of Toronto, led the work with two collaborators. You can download Talkie from GitHub or Hugging Face, or chat with it through a web interface, if you don’t mind a model whose mental map of the world ends with the Great Depression.
The model knows only what appears in books, newspapers, legal texts, and other publications before its cutoff date. So it’s great for questions about Prohibition or World War One. NASA’s first moon landing? Not so much.
Why bother?
The obvious question: why train an AI that doesn’t know what the Nazis did, what the internet is, or what an LLM even is?
Models like this are less an exercise in viewing the “good old days” through rose-colored glasses than an intellectual experiment. Nostalgia misrepresents the past, and the world was just as problematic back then, if not more so.
Duvenaud told The Register that such a model could be useful for examining how people might have interpreted laws or events at the time, using only the knowledge available then.
Another fun experiment: Use it to see whether a model can “rediscover” later breakthroughs using only earlier knowledge, as a way of probing the limits of AI reasoning.
Where it breaks
There are definite weaknesses in Talkie, which its inventors are well aware of.
For example, there was no digital publishing in 1930, so every word of Talkie’s corpus had to be transcribed from a scan. OCR is famously imperfect anyway, but more so on the blurry text printed back in the day.
It also sometimes leaks information from after its cutoff, which can creep in through mislabeled later documents despite the researchers’ best efforts. We asked it about television, which was just starting out in the late 1920s, and this is what happened:

But still, what an absorbing project. It isn’t alone, either. In their paper, the researchers mention other projects such as Ranke-4b from the University of Zurich, a series of LLMs trained on historical snapshots of data. “Trip” also created Mr Chatterbox, which he trained on a dataset of British literature from 1500–1900 to become, in his words, “a Victorian gentleman in silicon.” Magic.
These projects are both fun experiments and a useful window into the workings of AI. As the Talkie researchers put it:

“Have you ever daydreamed about talking to someone from the past? What would you ask someone with no knowledge of the modern world? What would they ask you?”

And they offer plenty of opportunities for fun. The nerd in us still wants to hook one of these things up to an Edwardian typewriter keyboard and a ticker tape, steampunk-style.


*** This is a Security Bloggers Network syndicated blog from Malwarebytes authored by Malwarebytes. Read the original post at: https://www.malwarebytes.com/blog/ai/2026/04/researchers-built-a-chatbot-that-only-knows-the-world-before-1931
