RAG-Enabled WordPress in Core Could Transform WordPress from CMS to AIMS

Sometimes you see an idea in an area you’ve been thinking about deeply, and it doesn’t just click. It’s a puzzle piece that absolutely thunders into place. You take a step back, the puzzle is complete, and what was an abstract image has crystallized into a clear vision of an inevitable future. That just happened with a tweet by James LePage from 5 hours ago.

I’ve been spending a huge amount of time reading about AI, as have many of us. I’ve also been working hard with a product team, which gives me the opportunity to turn what I’m learning into applied knowledge. The benefit of this is that you take a hard look at the tech landscape, identify real problems, and match them to applied solutions using the latest (meaning this week’s) ideas and technology.

I suspect that James mis-tweeted when saying WordPress IS a vector database, but I think we all know what he means – and the concept here is incredibly exciting. WordPress:

  • Is called a content management system, but can also be described as a document management system.
  • Is one of perhaps the two most popular document management systems on Earth, with Google Docs being the other.
  • Has more documents under management than perhaps any other platform, already uploaded, categorized, and tagged, with metadata for each doc.
  • Has an app-store-like dev ecosystem in the plugin repository.
  • Has several hundred million installs, with billions of human visitors to those installs (websites).
  • Is an online application that runs 24 hours a day and provides an interface for humans and machines to interact with.
  • Has a mature security ecosystem (shout out to my company Wordfence) with many vendors and solutions.
  • Has solved high-performance document storage and retrieval at scale, on a live site, with live editing.

Retrieval Augmented Generation, or RAG, is the process of turning documents into embeddings (arrays of numbers) that represent the meaning of each doc or chunk of text, storing those numbers in a database with an index referencing each doc or chunk, and then retrieving the documents or chunks to augment prompts sent to an AI. It works like this:

A company has specific knowledge in a big document database. They vectorize the whole lot (generate embeddings for the docs or chunked docs), store it all, and create a RAG application. You come along and interact with, let’s say, a chatbot. Here’s what happens:

  • Your question is turned into an embedding (an array of numbers) representing its meaning.
  • That embedding is used to retrieve chunks of text in the vector database that are related to your question.
  • The docs related to your question are used to create an AI prompt that reads something like “Here is a user’s question and a bunch of documents related to the question. Use this knowledge along with your own to answer the user’s question as best you can.”

That’s it. That’s RAG. But two things here are super powerful:

  1. That we can represent semantic meaning in a way that lets us retrieve by similarity in meaning. That’s breakthrough number one.
  2. That we can retrieve documents similar in meaning to a question and augment the knowledge of an AI model as part of our question to it (sketched in code below).
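
To make that flow concrete, here’s a minimal sketch in PHP, since that’s WordPress’s own language. Everything in it is illustrative: embed_text() is a placeholder for whatever embedding model or API you choose, $indexed_chunks is assumed to already hold pre-computed chunk embeddings, and retrieval is a naive cosine-similarity scan rather than a real vector index.

<?php
// Illustrative only: embed_text() stands in for your embedding model or API,
// and $indexed_chunks is assumed to hold pre-computed chunk embeddings.

// Cosine similarity between two embedding vectors.
function cosine_similarity( array $a, array $b ): float {
    $dot = 0.0; $na = 0.0; $nb = 0.0;
    foreach ( $a as $i => $v ) {
        $dot += $v * $b[ $i ];
        $na  += $v * $v;
        $nb  += $b[ $i ] * $b[ $i ];
    }
    return $dot / ( sqrt( $na ) * sqrt( $nb ) );
}

// Return the $k chunks closest in meaning to the question embedding.
function retrieve_chunks( array $question_embedding, array $indexed_chunks, int $k = 5 ): array {
    $scored = [];
    foreach ( $indexed_chunks as $chunk ) {
        // Each chunk: [ 'text' => '...', 'embedding' => [ floats ] ], built ahead of time.
        $scored[] = [
            'text'  => $chunk['text'],
            'score' => cosine_similarity( $question_embedding, $chunk['embedding'] ),
        ];
    }
    usort( $scored, fn( $x, $y ) => $y['score'] <=> $x['score'] );
    return array_slice( $scored, 0, $k );
}

// 1. Turn the user's question into an embedding.
$question = "How do I reset my API key?";
$q_vector = embed_text( $question ); // placeholder: call your embedding model here

// 2. Retrieve the chunks related in meaning to the question.
$related = retrieve_chunks( $q_vector, $indexed_chunks );

// 3. Augment the prompt with that specific knowledge before calling the AI model.
$context = implode( "\n---\n", array_column( $related, 'text' ) );
$prompt  = "Here is a user's question and a bunch of documents related to the question.\n"
         . "Documents:\n{$context}\n\nQuestion: {$question}\n"
         . "Use this knowledge along with your own to answer the question as best you can.";

A real implementation would obviously be much smarter about chunking, indexing, and model choice, but the shape of the flow is exactly those three steps.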

Put differently: individuals and organizations can immediately put their SPECIFIC KNOWLEDGE to work, and that can be a differentiator for them in this world of AI models. It’s not just your little business providing an interface into Sam Altman’s latest model. It’s your little business with its specific knowledge providing a differentiated AI-powered application, because your AI knows things that others don’t and perhaps never will. That’s why RAG rocks.

Here’s the thing about WordPress: every WordPress website is a collection of specific knowledge that in many cases is extremely high quality and has taken years to accumulate. If you could put that specific knowledge to work in an AI context, WOW! You don’t even need to collect and organize the docs. You’ve already done the work to collect and index the knowledge. You just need to use RAG to feed it to an AI for each prompt, whether you’re generating new content, answering user questions, or whatever.

So what James tweeted is a user interface, probably in working condition, for RAG in WP, and I think he’s doing it in the context of adding this to WP core, since he works for Automattic. But the message here is really: hey, consider turning (one of) the largest document management systems in the world into an enabler for RAG apps, using the existing dev ecosystem, massive deployed base, and massive collection of documents, and overnight make this the largest RAG AI dev ecosystem in the world.

What I mean specifically is this:

  • Document vectorization would be as easy as it appears in James’ screenshots.
  • RAG retrieval would be available in the core API.
  • WordPress plugins would immediately be able to build applications around fetching chunks of text from the existing knowledge a site has, and sending RAG-augmented prompts to any AI interface, whether it’s self-hosted, open source, closed source, behind a REST endpoint, or running locally (see the sketch after this list).
  • In other words, WP developers would be able to put the specific knowledge that hundreds of millions of websites have spent years collecting to immediate work for the benefit of the site owner.
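
To be clear about how speculative this is: none of the functions below exist in WP core today. It’s just a sketch of what a plugin might look like if core shipped embedding and retrieval APIs; wp_rag_retrieve() and my_plugin_call_llm() are invented names for illustration.

<?php
// Purely hypothetical: wp_rag_retrieve() does not exist in WP core today, and
// my_plugin_call_llm() is an invented plugin function. This only sketches the
// shape a plugin could take if core exposed embedding and retrieval APIs.

add_shortcode( 'ask_this_site', function ( $atts ) {
    $atts     = shortcode_atts( [ 'q' => '' ], (array) $atts );
    $question = sanitize_text_field( $atts['q'] );
    if ( '' === $question ) {
        return '';
    }

    // Imagined core API: fetch the chunks of site content closest in meaning to the question.
    $chunks  = wp_rag_retrieve( $question, [ 'limit' => 5 ] ); // hypothetical core function
    $context = implode( "\n---\n", wp_list_pluck( $chunks, 'text' ) );

    // Build a RAG-augmented prompt and send it to whatever AI interface the site owner
    // configured: self-hosted, open source, closed source, a REST endpoint, or a local model.
    $prompt = "Answer using this site's own content where possible.\n"
            . "Content:\n{$context}\n\nQuestion: {$question}";
    $answer = my_plugin_call_llm( $prompt ); // hypothetical plugin function

    return esc_html( $answer );
} );

In that imagined world, dropping a shortcode into a page is all it takes to put a site’s accumulated knowledge to work answering questions.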

The value of a WordPress website immediately increases by an order of magnitude.

There are challenges that need to be solved. Specifically:

  • What model is generating the embeddings and where is it run? Locally? An API endpoint? Does it have vendor lock-in or is it open source? Does the host have vendor lock-in or is it open source? Ideally it would run on CPU and be usable directly from PHP so no new ops dependencies are introduced.
  • Is the orchestration of vectorization and retrieval in PHP? Or is it in Python, which may not be available on a WP site? Ideally it’s all PHP so that Python is not a dependency for existing sites.
  • How is retrieval being done? pgvector, which adds PostgreSQL as a dependency? Or some kind of MySQL/MariaDB magic I’m not aware of? (MySQL/MariaDB doesn’t support vector retrieval/indexing.) Ideally you wouldn’t add a new DB engine as a dependency (a brute-force fallback is sketched after this list).
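
On that last point, here’s a rough sketch of a no-new-dependency fallback: store each embedding as serialized post meta in the existing MySQL/MariaDB database and compute similarity in PHP with a brute-force scan. It reuses the cosine_similarity() helper from the earlier sketch, embed_text() is again a placeholder for your embedding model, and chunking is skipped for brevity.

<?php
// Sketch of a no-new-dependency fallback: embeddings live in wp_postmeta in the
// existing MySQL/MariaDB database, and similarity is computed in PHP, not in the DB.
// embed_text() and cosine_similarity() are the placeholder/helper from the earlier sketch.

// Indexing side: store an embedding alongside each post.
function my_rag_index_post( int $post_id ): void {
    $text      = wp_strip_all_tags( get_post_field( 'post_content', $post_id ) );
    $embedding = embed_text( $text ); // placeholder for your embedding model/API
    update_post_meta( $post_id, '_my_rag_embedding', $embedding ); // stored as a serialized array
}

// Query side: brute-force cosine similarity over every stored embedding.
function my_rag_search( string $question, int $k = 5 ): array {
    $q_vector = embed_text( $question );
    $post_ids = get_posts( [
        'post_type'   => 'any',
        'numberposts' => -1,
        'meta_key'    => '_my_rag_embedding',
        'fields'      => 'ids',
    ] );

    $scored = [];
    foreach ( $post_ids as $post_id ) {
        $embedding          = get_post_meta( $post_id, '_my_rag_embedding', true );
        $scored[ $post_id ] = cosine_similarity( $q_vector, $embedding );
    }
    arsort( $scored );                          // highest similarity first
    return array_slice( $scored, 0, $k, true ); // post ID => similarity score
}

A full scan per query obviously won’t scale the way a real vector index would, so this trades performance for zero new dependencies, which is exactly the trade-off core would have to weigh.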

If you can eliminate the dependencies, you can deploy a new version of WP core overnight and enable this on every site, worldwide, for immediate use.

There’s a massive opportunity here if hosting providers collaborate with WP core to provide local GPU resources for generating embeddings, along with pgvector for retrieval. It’s a whole new source of revenue for them. At the last WCUS I literally went around and asked every host whether they’re providing GPU hosting, and only one said they are. It’s wide open for the taking and it’s worth billions, since GPU is far more expensive than CPU to rent.

It’s quite possible that James’ tweet will be a harbinger of the transformation of WordPress from CMS to AIMS (AI management system).
