Entity SEO & LLM Optimization: Improving Content for AI Chatbots

Entity SEO and Entity Optimization for LLMs go hand in hand. By understanding your content at an entity level, you can help train more effective chatbots and position yourself better in search engines.

User interacting with website on phone while chatbot gathers site information for user.
How are you helping your website users and your AI chatbot understand your content better?

Entity SEO is a relatively new concept in search engine optimization (SEO), but it's quickly gaining traction as businesses realize its potential to help them rank higher in search results.

Entity SEO is all about building up the authority of your website's entities, which are essentially the things that your website is about.

For example, if you have a website about coffee, your entities might include coffee beans, coffee shops, and coffee recipes.

By building up the authority of your website's entities, you can improve your website's ranking for relevant search terms. This can lead to more traffic, more leads, and more sales.

There are a few different ways to build up the authority of your website's entities. One way is to create high-quality content that is relevant to your entities.

You can also build links to your website from other websites that are relevant to your entities.

Beyond SEO, this is also a huge opportunity to optimize your site so that large language models (LLMs) can quickly understand what your content is about.

Why Entity SEO Matters

Entity SEO is a powerful tool that can help you improve your website's ranking in search results. If you're not already using it, now is the time to start.

Here are some additional benefits of entity SEO:

  • It can help you attract more qualified leads and customers.
  • It can help you build brand awareness and authority.
  • It can help you improve your website's overall performance.

If you're looking to improve your website's SEO, entity SEO is a great place to start.

Here are some additional tips for using entity SEO effectively:

  • Make sure your website's content is high-quality and relevant to your target audience.
  • Build links to your website from other websites that are relevant to your entities.
  • Use relevant keywords and phrases throughout your website's content.
  • Optimize your website for mobile devices.
  • Track your website's performance and make adjustments as needed.

Entity SEO is a complex topic, but it's worth learning about if you want to improve your website's ranking in search results. By following the tips above, you can start using entity SEO to your advantage.

What does Entity SEO have to do with LLMs?

If you’re talking with ChatGPT, chances are it is drawing on what it learned from its training data to answer questions about the entity you’re asking about.

Chatbots like this (yes, I’m referring to ChatGPT as a chatbot) function by taking the text you submit, parsing and encoding it into a vector representation, and then using the patterns learned from training data to calculate the most probable response.

This is a very high-level description of how it works, so if you want more detail on the process, check out this article on dev.to.

Why does entity SEO matter for chatbots? Because we need to make it as easy as possible to provide the correct entities to the AI tool.

If you were writing an article about Abraham Lincoln, you want to make sure you include all the correct entities related to Lincoln the person.

Otherwise, if you are too broad, the chatbot might think you are talking about Lincoln the car.

This is the kind of response you would not want a potential customer to see if you wrote a blog post referencing the former President and they asked the chatbot about it:

"I'm not sure what you mean by 'Emancipation Proclamation.' Are you talking about the one that freed the slaves, or the one that allowed you to drive a hybrid?"

Clarity is Key.
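To make the Lincoln example concrete, here is a toy sketch of how the surrounding words can tip a system toward one sense of an entity. It is only an illustration under loud assumptions: the context profiles, function names, and sample sentences are all made up for this post, and a simple word-count comparison stands in for the far richer representations a real LLM uses.

```python
import math
import re
from collections import Counter

# Hypothetical context profiles: words that tend to appear near each sense of "Lincoln".
ENTITY_PROFILES = {
    "Abraham Lincoln (person)": "president emancipation proclamation civil war gettysburg address union",
    "Lincoln (car brand)": "car suv hybrid vehicle dealership navigator engine drive",
}

def to_vector(text: str) -> Counter:
    """Encode text as a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_senses(text: str) -> list[tuple[str, float]]:
    """Score the text against every profile, best match first."""
    vec = to_vector(text)
    scores = {name: cosine(vec, to_vector(profile)) for name, profile in ENTITY_PROFILES.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# A vague sentence shares no words with either profile, so nothing separates the senses.
vague = "Lincoln is known for reliability and a long history."
# A specific sentence names related entities, so one sense clearly wins.
specific = "Lincoln issued the Emancipation Proclamation during the Civil War as president."

for label, text in (("vague", vague), ("specific", specific)):
    print(label, rank_senses(text))
```

The vague sentence gives the system nothing to separate the two senses, while the specific one scores well above zero for the person and zero for the car. That is the whole argument for naming related entities in your content.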

How to Optimize for Chatbot Success

The first step in this project is to identify the entities that your content discusses. We can do that by leveraging a few different tools.

BingChat

With zero technical knowledge, you can extract the entities from your pages for comparison using BingChat.

We opened up Bing, navigated to this article about PTSD and Unemployability on this veteran disability law firm's website, and then asked BingChat, “List all the known entities of great importance that are mentioned on this page.”

The response was:

"Hello, this is Bing. I can help you with finding the named entities of great importance that are mentioned on this page. According to the web search results, named entity recognition (NER) is a natural language processing technique that automatically identifies named entities in a text and classifies them into predefined categories such as person names, organizations, locations, etc.
Based on this definition, some of the named entities of great importance that are mentioned on this page are:
  • Military, war zone deployment, training accidents and military sexual trauma (MST) as types of traumatic events that may affect veterans.
  • Post-traumatic stress disorder (PTSD) as a mental health disorder that may result from exposure to trauma.
  • Veterans and female veterans as groups of people who may suffer from PTSD.
  • Iraq and Afghanistan as locations where veterans may have served in war.
  • VA Individual Unemployability (IU) and TDIU as benefits that may help veterans with PTSD.
  • VA disability benefits, Compensation and Pension (C&P) examination, SSDI and SSI as terms related to the financial support for veterans with PTSD.
  • Hill & Ponton as an organization that provides legal representation for veterans with PTSD.
  • The Road to VA Compensation Benefits as a book that offers guidance for veterans with PTSD.
I hope this answers your question. If you need more information, please let me know. 😊"

That’s pretty good! We can now get a high-level look at the article and understand some of its important concepts.

The downside of this approach is that it’s tough to compare to competitor articles.

Google NLP API

This is a more technical approach. Here, we set up a system to feed the content into Google’s NLP (Natural Language Processing) API (Application Programming Interface).

For more details and to test it out, visit the Google NLP page.

With this tool, we can feed in the body content of the page and get back a few pieces of data that help us understand the page better.

  • The page classification: What Google thinks the main subject of the page is. For example, the same page we asked BingChat about returned a classification of /Law & Government/Military.
  • Entities on the page: Google extracts the entities (person, place, organization, etc.) from the article. For this page, we see entities like “PTSD,” “veterans,” “employment,” and “VA,” each with a salience score. The salience score ranges from 0 to 1; the closer an entity's score is to 1, the more important that entity is to the document.
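As a rough sketch of what working with the entity data could look like, the snippet below builds a request body for the entity analysis endpoint and ranks the returned entities by salience. The endpoint URL and field names reflect our understanding of the Cloud Natural Language REST API, so check them against Google's current documentation. The helper names, the 0.05 cutoff, and the sample response (including its scores) are assumptions made up for illustration, not real output for the page above. No network call is made here; sending the request requires your own API credentials.

```python
import json

# Entity analysis endpoint of the Cloud Natural Language REST API (verify against current docs).
ANALYZE_ENTITIES_URL = "https://language.googleapis.com/v1/documents:analyzeEntities"

def build_request_body(page_text: str) -> str:
    """Build the JSON body for an entity analysis request on plain text."""
    return json.dumps({
        "document": {"type": "PLAIN_TEXT", "content": page_text},
        "encodingType": "UTF8",
    })

def top_entities(response: dict, min_salience: float = 0.05) -> list[tuple[str, str, float]]:
    """Return (name, type, salience) rows at or above the cutoff, most salient first."""
    rows = [
        (e["name"], e.get("type", "UNKNOWN"), float(e.get("salience", 0.0)))
        for e in response.get("entities", [])
    ]
    rows = [r for r in rows if r[2] >= min_salience]
    return sorted(rows, key=lambda r: r[2], reverse=True)

# Made-up response in the general shape the API returns; the scores are illustrative only.
sample_response = {
    "entities": [
        {"name": "veterans", "type": "PERSON", "salience": 0.21},
        {"name": "PTSD", "type": "OTHER", "salience": 0.34},
        {"name": "VA", "type": "ORGANIZATION", "salience": 0.12},
        {"name": "employment", "type": "OTHER", "salience": 0.03},
    ],
    "language": "en",
}

for name, kind, salience in top_entities(sample_response):
    print(f"{salience:.2f}  {kind:<12}  {name}")
```

Sorting by salience puts the entities Google considers central to the page at the top, which makes it easy to spot when a page's top entities don't match the topic you intended to write about.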

ChatGPT / OpenAI GPT-4

As of this writing, ChatGPT has removed the ability to search through Bing while that feature is being reworked. That means you need another way to copy the body content of the page and deliver it to ChatGPT or GPT-4 (if you’re using the API).

After that step, though, this works essentially the same as BingChat. Ask it to summarize the entities in the content and it will return what it found, which is still helpful for understanding your content.
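If you would rather not copy and paste by hand, one possible way to pull the body text out of a page and wrap it in an entity question is sketched below. It uses only Python's standard library. The class and function names, the list of skipped tags, and the 8,000-character limit are our own assumptions for illustration, and the question is adapted from the one we asked BingChat. Sending the prompt to ChatGPT or the GPT-4 API is left out.

```python
from html.parser import HTMLParser

class BodyTextExtractor(HTMLParser):
    """Collect visible text while skipping scripts, styles, and site chrome."""
    SKIP = {"script", "style", "nav", "header", "footer", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)

def extract_body_text(html: str) -> str:
    """Return the page's visible text as one space-separated string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

def build_entity_prompt(page_text: str, max_chars: int = 8000) -> str:
    """Wrap the (possibly truncated) page text in an entity-extraction question."""
    question = "List all the known entities of great importance that are mentioned in this content."
    return f"{question}\n\nContent:\n{page_text[:max_chars]}"

# Hypothetical page markup, used only to show the helpers working together.
sample_html = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<nav>Home</nav><h1>PTSD and Unemployability</h1>"
    "<p>Veterans may qualify for TDIU.</p><script>var x = 1;</script>"
    "</body></html>"
)
print(build_entity_prompt(extract_body_text(sample_html)))
```

Stripping navigation, scripts, and styles first keeps the model focused on the article itself rather than on menu labels and markup.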

Anthropic Claude

This is one AI company you may not be as familiar with. Anthropic is an interesting AI company that focuses on security and safety as a primary differentiator. Using their Claude 2 AI chat, you can upload documents (e.g., PDFs) and chat with the content.

At this point, we still recommend not sharing client data, PII, or anything else that could expose sensitive information. However, blog post drafts are a great thing to share and chat with. Before you publish, ask about entities, and ask Claude to edit the draft or suggest topic gaps.

This is another AI tool that is worth bookmarking for the team.

Why does this help?

Because we now have a much better understanding of how a machine learning system sees our content. While these tools are not the Google Search algorithm, they are very similar in nature.

What Do We Do With This Information?

It’s great that we now have some solid information about what the machine thinks our content is talking about. Now it’s time to improve it!

The feedback we are getting from these tools will tell us either that the content focus is on the money or that it’s off the mark. We can use that to improve our piece, which will hopefully help it rank better, help onsite chatbots source this content more effectively, and even set us up for better performance when Google fully releases its Search Generative Experience.

Some additional ideas for how to use this information:

  • Compare related pieces of content on your site to check for cannibalization and overlap
  • Study competitor pieces that are performing better to understand what you’re missing
  • Improve your content so the bot sources specific pieces for common questions
  • Extract questions from chatbot logs to understand the core topic more effectively
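The first two ideas above come down to comparing entity lists, which can be sketched in a few lines. This is a minimal example under our own assumptions: the function names are hypothetical, the entity lists are made up, and Jaccard overlap (shared entities divided by total distinct entities) is simply one reasonable way to measure how much two pages cover the same ground.

```python
def normalize(entities) -> set[str]:
    """Lowercase and trim entity names so 'Veterans' and 'veterans' match."""
    return {e.strip().lower() for e in entities if e.strip()}

def overlap_score(page_a, page_b) -> float:
    """Jaccard overlap between two entity lists: 0.0 is no overlap, 1.0 is identical."""
    a, b = normalize(page_a), normalize(page_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def missing_entities(ours, theirs) -> list[str]:
    """Entities a competitor page covers that our page does not."""
    return sorted(normalize(theirs) - normalize(ours))

# Made-up entity lists, standing in for output from any of the tools above.
our_page = ["PTSD", "veterans", "TDIU", "VA disability benefits"]
sibling_page = ["PTSD", "Veterans", "TDIU", "C&P examination"]
competitor_page = ["PTSD", "veterans", "TDIU", "SSDI", "military sexual trauma"]

print(f"Overlap with sibling page: {overlap_score(our_page, sibling_page):.2f}")
print("Missing vs. competitor:", missing_entities(our_page, competitor_page))
```

A high overlap score between two of your own pages is a hint to check for cannibalization, while the missing list against a better-performing competitor suggests topics worth adding.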

What is LawOps Doing With This?

We're implementing exactly what we outlined above. Using the tech we shared in this post and more, we are building tools for the law firms that want to work with us.

The future could be one where people don’t actually search through multiple pages on your site to find the information they need. They’ll land on one page, talk to an AI-driven chatbot, and get answers to all their follow-up questions before contacting you.

This is about making the information on your website more accessible to everyone. People hate searching and reading through blog posts to hopefully find an answer to a specific question (or interpret the answer from your content).

We are looking ahead to a future where we can provide more proactive help to everyone visiting the site.

AI chatbots can be a scary thought, especially for a law firm. However, if your content is optimized correctly so the bot can understand it easily, then you have very little to worry about.