philipbohun.com


The Death of Information

2025-08-05

Thesis: LLM companies are destroying the sources of information they rely on to make their product.

Bots Everywhere

Bots continuously scrape every part of the internet, without regard or respect for the resources they are consuming. The goal is to gobble up every bit of information in existence in order to feed the ever more voracious LLMs (Large Language Models) with training data.

Generated Content

Recently I went looking for information about the differences between NTFS and exFAT, particularly in relation to file copy speed. In the past, I might have come across someone's personal technical blog that talked all about the file system formats and included their own charts and graphs of performance benchmarks that they carefully ran and documented.

Instead, every link I clicked to a "tech website" had the whiff of LLM-generated content. One of the articles even made directly contradictory claims from one paragraph to the next. The history and technical descriptions of the file systems were shallow at best, and some details were simply incorrect.

Scraping

Scraping content from sites has a number of unintended side effects. Before, people would write their own blogs because they loved a subject and wanted to share information. If you had a great blog you could become well known in the community for providing quality information. This could lead to new connections, business opportunities, jobs, and friendships. You could also make money directly from little sidebar ads on your blog. Marshall Brain created an entire business based on this idea, called HowStuffWorks.

Today, this type of social behavior is dead. Search results are bought by advertisers, so there might as well only be a handful of websites on the internet. You'll never search for a topic and get some small but insightful and relevant blog or site at the top.

LLMs have put the final nail in the coffin. Any chance of building a website to share information is now dead. You will be scraped and the information will go into answering someone's question directly from an AI assistant response. There is no chance to become known for providing good information, and to build business and personal relationships. Your info will be scraped and put into the LLM soup fed directly to "customers".

Consequences

The consequences of this will be disastrous. The number of people who publish will decline further, while the amount of AI slop will increase dramatically. What will happen when 99% of the content that companies scrape is AI-generated, self-contradictory, and ever more detached from reality?

I don't have an immediate solution for this, but it seems like mass media is quickly becoming useless. The only places to find good content and actually have relationships will be small, closed communities where trust can be formed. This area of tech is underexplored, likely because there's little to no money in creating small forums. Even so, small forums can still be infiltrated by bots. Bots to spam, bots to manipulate people politically, bots to troll, bots to spy. Small forums and Discord servers are not immune to this. Solving the problem of forming trusted relationships online seems like an important field of study for the immediate future.

If we don't figure this out, it might be the death of information. The only things left will be "vibes" and "narratives".