This post originally appeared on Searchenginewatch.com
Entities are by no means a secret, and their role in SEO has been heavily documented – entity optimization just isn’t the trendy topic you see every time you check your Twitter timeline.
We’d much rather discuss less impactful concepts, like whether content within a subfolder will rank better than a subdomain or whether it’s important for an SEO to learn Python (am I right?).
But entity optimization should be getting the same amount of press as the other topics and concepts we SEOs drive into the ground week after week. I want to help us understand why, and how to approach content with entities in mind.
What is an entity?
Google defines an entity as, “A thing or concept that is singular, unique, well-defined and distinguishable.” An entity can be an event, idea, book, person, company, place, brand, a domain, and so much more. You might ask, “Isn’t that the definition of a keyword? What’s the difference?”
An entity isn’t bound by language or spelling; rather, it is a universally understood concept or thing. And at the core of an entity is its relation to other entities. Google uses an illustration of “nodes” and “edges” to explain entities, with entities as nodes and relationships as edges. Let’s look at a search to see how this plays out:
A search for “Justin Trudeau” displays a knowledge panel where he carries the title “Prime Minister of Canada”. And a search for “prime minister of Canada” displays a knowledge panel of Justin Trudeau. So we know that Justin Trudeau is associated with Prime Minister of Canada and vice versa. Trudeau is the current prime minister, so what if we search for the same entities with a different relationship?
Here we see a different set of results, based on a different relationship between the nodes.
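To make the node-and-edge picture concrete, here is a minimal Python sketch of a tiny entity graph. It is an illustration only, not how Google stores its Knowledge Graph; the relation labels (`holds_office`, `head_of_government_of`, `capital_of`) and the `related` helper are hypothetical names chosen for this example.

```python
# Toy entity graph: each edge is (subject, relation, object).
# Relation labels are made up for illustration.
EDGES = [
    ("Justin Trudeau", "holds_office", "Prime Minister of Canada"),
    ("Prime Minister of Canada", "head_of_government_of", "Canada"),
    ("Ottawa", "capital_of", "Canada"),
]


def related(entity, relation=None):
    """Return entities connected to `entity`, optionally filtered by relation.

    Edges are followed in both directions, since a search can start
    from either node.
    """
    results = []
    for subject, rel, obj in EDGES:
        if relation is not None and rel != relation:
            continue
        if subject == entity:
            results.append(obj)
        elif obj == entity:
            results.append(subject)
    return results


print(related("Justin Trudeau"))
print(related("Prime Minister of Canada", "holds_office"))
print(related("Prime Minister of Canada", "head_of_government_of"))
```

Asking about the same node through a different relation returns a different neighbor, which is the same effect the searches above show.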
How are entities used by search engines?
We believe Google uses a model called Word2Vec (referenced in this patent regarding keyword extraction) to break down entities, map them to a graph, and assign a unique ID. In a sense, Word2Vec turns language into a mathematical computation, allowing Google to properly identify concepts and map them appropriately – regardless of language – in a way traditional models simply can’t.
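As a rough illustration of what “turning language into a mathematical computation” means, the sketch below compares hand-made three-dimensional vectors using cosine similarity. The vectors are invented for this example and are not output from Word2Vec or any Google model; real embeddings are learned from large amounts of text and have far more dimensions.

```python
import math

# Invented toy vectors; real embeddings are learned, not hand-written.
VECTORS = {
    "sneaker": [0.9, 0.1, 0.0],
    "basketball shoe": [0.8, 0.3, 0.1],
    "prime minister": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(term):
    """Return the other term whose vector points in the closest direction."""
    others = (t for t in VECTORS if t != term)
    return max(others, key=lambda t: cosine_similarity(VECTORS[term], VECTORS[t]))


print(most_similar("sneaker"))
```

Once concepts are represented as numbers, “how related are these two things?” becomes a calculation rather than a string match.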
We don’t know exactly how entities fit into search results right now, but based on a model introduced in a patent titled “Ranking search results based on entity metrics”, we know one of the biggest factors is relatedness.
Relatedness is judged primarily by something called co-occurrence (the linked patent is still pending, but helpful in understanding co-occurrence). Co-occurrence judges the strength of relationships based on the frequency of the entities appearing together in documents around the web. The more frequently two entities are mentioned together, and the more authoritative the document that mentions them, the stronger the relation.
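The idea of co-occurrence weighted by authority can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the method described in the patent: the documents, the `authority` weights, and the `relatedness_scores` helper are all invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical corpus: each document lists the entities it mentions and an
# authority weight between 0 and 1. All values are invented for illustration.
DOCUMENTS = [
    {"entities": {"basketball", "shoes", "Atlanta"}, "authority": 0.9},
    {"entities": {"basketball", "shoes"}, "authority": 0.5},
    {"entities": {"shoes", "Atlanta"}, "authority": 0.2},
]


def relatedness_scores(documents):
    """Sum document authority for every pair of entities that co-occur."""
    scores = defaultdict(float)
    for doc in documents:
        # Sort so each pair is always keyed in the same order.
        for pair in combinations(sorted(doc["entities"]), 2):
            scores[pair] += doc["authority"]
    return dict(scores)


scores = relatedness_scores(DOCUMENTS)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

Pairs mentioned together more often, and in more authoritative documents, end up with higher scores, which mirrors the two inputs described above.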
Are entities a ranking factor?
Entities aren’t necessarily a ranking factor – at least in the traditional sense. And we don’t really know exactly how much weight they carry as quality signals. But we know there are two key categories of ranking factors (among many others) heavily influenced by the entity graph: content relevance and links.
Keywords have historically been the judge of the relevance and quality of content. Keywords aren’t dead, but entities give better insight to search engines on the relationship between words in a search.
For example, let’s look at the search “best shoes for basketball in Atlanta.” Sure, we could create a post and stuff it with the keyphrase. But in a world of entity-based indexing, Google is looking for semantics around each of these entities, and signals that indicate their relationships.
You might recall the explosion of “LSI keywords”. Whether or not latent semantic indexing is used in Google’s algorithm, this fascination with semantics is rooted in entities. All search is now semantic.
It’s pretty common knowledge in the world of SEO that not all links are created equal. Entity-based indexing amplifies this sentiment. A post aiming to rank for “best shoes for basketball in Atlanta” needs links and references from authoritative sources on shoes, basketball, and the city of Atlanta in order to really own that SERP.
How long have entities been used in algorithms?
We’ve seen patents on entities surfacing for over ten years, and most believe entities have played a role in search algorithms for quite a long time. The question is when did entities become core to indexing?
Cindy Krum of MobileMoxie wrote a brilliant five-part series on entities. She makes a strong case that entities became a major ranking signal at the same time Google rolled out Mobile-First Indexing. In fact, she terms the entire update Entity-First Indexing.
BERT and entities
Did BERT have anything to do with entities? Though I believe BERT got a little more attention than it probably deserved, its use in Google’s algorithm can help us understand the importance of entities.
BERT (Bidirectional Encoder Representations from Transformers) is a Natural Language Processing model that Google introduced in 2018 and began rolling out in October 2019. BERT can consider the full context of a word based on the words that come both before and after it, including any named entities.
We won’t dive deep, but we’ll look at an example Google gave to help us understand what BERT means for search. Google called out the query “2019 Brazil traveler to USA needs a visa” in a recent post. The preposition “to” is crucial here, and more crucial is its relationship to the entities found before and after it. Before BERT, Google would have returned results about US citizens traveling to Brazil. Post-BERT, Google can recognize that nuance and return a more relevant and helpful result:
Entities are at the core of Natural Language Processing models like BERT.
How to optimize content for entities
Before we dive into some actionable tips, know that entities have far more implications than content. Entity optimization is crucial for building brands, establishing domains, and all kinds of other online endeavors. Having said that, there are massive implications for content.
*Quick preface: I’ve used this approach to rank articles and have seen success, but it is by no means foolproof or battle-tested. I don’t at this time have or know of research that proves a direct correlation between an approach like this and high rankings. Nonetheless, I believe in it and believe a knowledge of entities gives SEOs a leg up.
Choose and research a topic
For starters, we need a topic and keyphrase for which we want to rank. We won’t dive into how to do keyword research or topic research, but let’s stick with our example above and aim to rank for “best shoes for basketball.”
If we want to aim to rank for this keyphrase, we need to gather insight on what other topics and concepts Google deems related in their entity graph. Where can we gain insight like this? A few places:
Wikipedia: We know entities are the foundation of Google’s Knowledge Graph – and we know Wikipedia fuels a lot of their knowledge on entities. We can assume that if Google leans on Wikipedia to help them understand topics, the attributes and sources found within Wikipedia may help guide our content.
Google Images is another goldmine for entity insight:
Beneath the search bar, we find entities Google positively associates with “best shoes for basketball.” These aren’t shoes or shoe attributes you must list in your article, but it stands to reason that mentioning these topics will help Google associate your article with them.
“People Also Ask” is another helpful source for entity optimization. These are the other topics and questions Google associates with your target keyphrase:
Use Google’s NLP API demo to analyze the competition
Identify the top two or three ranking articles for your target keyphrase. Now we will look at how Google views the entities found within their articles. We’re going to use Google’s NLP API demo:
This is just a sample demo of their NLP cloud product. Nonetheless, it provides really valuable data. Before we dive in, we need to define a key term.
Google’s API demo looks at a handful of things: salience, sentiment, syntax, and categories. We’re really only focusing on salience in this article.
Salience is a score of how important the entity is in the context of the whole text. The higher the score, the more salient the entity is. We’ll use salience to help guide our content. Here’s what to do:
- Click on one of your competing posts in the SERP
- Copy and paste the content into the demo editor
- Click “Analyze”
- Check which entities Google reports with high salience
We see the entities with the highest salience are “player,” “best basketball shoes,” and “basketball shoes.” Since Google ranks this page well for the keyphrase we want, we can reasonably infer these are entities we should optimize for in our post.
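The demo is a sample of Google’s Cloud Natural Language product, so the same check can be scripted for several competing articles at once. The sketch below assumes the `google-cloud-language` Python client is installed and credentials are configured, neither of which the demo workflow requires. `fetch_entity_salience` and `top_entities` are hypothetical helper names, and the sample scores are placeholders rather than real API output.

```python
def fetch_entity_salience(text):
    """Return (entity name, salience) pairs from the Cloud Natural Language API.

    Assumes `pip install google-cloud-language` and configured credentials.
    """
    # Imported here so the rest of the sketch runs without the library.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_entities(request={"document": document})
    return [(entity.name, entity.salience) for entity in response.entities]


def top_entities(pairs, limit=3):
    """Sort (name, salience) pairs from most to least salient."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:limit]


# Placeholder scores for illustration; in practice, pass in
# fetch_entity_salience(article_text) for a competitor's article.
sample = [
    ("basketball shoes", 0.15),
    ("ankle support", 0.04),
    ("player", 0.30),
    ("best basketball shoes", 0.20),
]
for name, salience in top_entities(sample):
    print(f"{salience:.2f}  {name}")
```

Keeping the API call separate from the sorting step makes it easy to paste in scores copied from the demo instead of calling the API.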
Provide context throughout
How can you optimize for these entities? As you begin writing, your goal should be to establish the relationship between the entities you’re targeting in your keyphrase and give Google all the context you can to associate your target keywords with their entity graph. This isn’t done by keyword stuffing, but by using some of the language and semantics we’ve gleaned from the above sources.
Google Images and Wikipedia should help you choose semantically related keywords and language to use throughout your article, while “People Also Ask” can help guide your overall topics and headings. Again, the aim is not to stuff keywords in, but to have a toolbox of individual words, phrases, language, and topics to guide our writing in a way that prioritizes our target entities.
Once you’ve finished writing, run your own article through Google’s NLP API demo to get a feel for how you stack up. If the desired entities show low salience, it may be worth going back to the drawing board. At the very least, you can analyze articles that show more entity success to gain insight into how Google associates your targets.
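One way to make this comparison less of a gut call is to list the entities where a competing article scores noticeably higher than yours. The sketch below is a simple heuristic, not an established method: `salience_gaps`, the `threshold` cutoff, and the scores are all hypothetical, with the scores standing in for numbers you would copy from the demo.

```python
def salience_gaps(mine, competitor, threshold=0.05):
    """List entities the competitor emphasizes more than we do.

    `mine` and `competitor` map entity name -> salience (0 to 1).
    An entity is reported when the competitor's score exceeds ours
    by more than `threshold`; the cutoff is arbitrary.
    """
    gaps = []
    for name, their_score in competitor.items():
        our_score = mine.get(name, 0.0)  # missing entities count as 0
        if their_score - our_score > threshold:
            gaps.append((name, round(their_score - our_score, 2)))
    # Largest gap first.
    return sorted(gaps, key=lambda gap: gap[1], reverse=True)


# Placeholder scores, not real API output.
competitor = {"player": 0.30, "best basketball shoes": 0.20, "basketball shoes": 0.15}
mine = {"basketball shoes": 0.25, "player": 0.05}

print(salience_gaps(mine, competitor))
```

Entities at the top of the list are candidates for more context in your article, not a cue to repeat the phrase more often.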
Update content as needed
Because entity optimization is a bit more complex than keyword optimization, there’s a stronger case for updating content on a regular basis as new topics arise around your entities. For example, as new basketball shoes come out, and Google establishes their place in the entity graph, it would help the salience of your entities to add them to your post.
BERT is another great example. Once it blew up across the internet, Google would likely expect a post on Natural Language Processing to mention it.
The future of search
The industry and I still have a lot to learn about entity optimization. And again, the implications expand far beyond content optimization.
But I do believe a focus on entities has already begun, and the signals will only grow in prominence for Google and other search engines.
Here’s to better content, more relevant SERPs, and the future of search.
Brooks Manley is a Digital Marketing Specialist and SEO Lead at Engenius, a marketing agency in Greenville, SC. When he’s not panicking about ranking drops and algorithm updates, you can find him watching NBA games and eating tacos.