The Advanced Guide to Keyword Clustering


Posted by tomcasano

If your goal is to grow your organic traffic, you have to think about SEO in terms of “product/market fit.”

Keyword research is the “market” (what users are actually searching for) and content is the “product” (what users are consuming). The “fit” is optimization.

To grow your organic traffic, your content needs to mirror what users are actually searching for. Your content planning and creation, keyword mapping, and optimization should all align with the market.

Why bother with keyword grouping?

One web page can rank for multiple keywords. So why aren’t we hyper-focused on planning and optimizing content that targets dozens of similar and related keywords?

Why target only one keyword with one piece of content when you can target 20?

The impact of keyword clustering on organic traffic is not only underrated, it is largely ignored. In this guide, I’ll share the proprietary process we’ve pioneered for keyword grouping, so you can not only do it yourself but also maximize the number of keywords your amazing content can rank for.

Here’s a real-world example of a handful of the top keywords that this piece of content is ranking for. The full list is over 1,000 keywords.

17 different keywords one page is ranking for

Why should you care?

It’d be foolish to focus on only one keyword, as you’d lose out on 90%+ of the opportunity.

Here’s one of my favorite examples of all of the keywords that one piece of content could potentially target:

List of ~100 keywords one page ranks for

Let’s dive in!

Part 1: Keyword collection

Before we start grouping keywords into clusters, we first need a dataset of keywords to group.

In essence, our job in this initial phase is to find every possible keyword. In the process of doing so, we’ll also be inadvertently getting many irrelevant keywords (thank you, Keyword Planner). However, it’s better to have many relevant and long-tail keywords (and the ability to filter out the irrelevant ones) than to only have a limited pool of keywords to target.

For any client project, I typically say that we’ll collect anywhere from 1,000 to 6,000 keywords. But truth be told, we’ve sometimes found 10,000+ keywords, and sometimes (in the instance of a local, niche client), we’ve found fewer than 1,000.

I recommend collecting keywords from about 8–12 different sources. These sources fall into the following categories:

  1. Your competitors
  2. Third-party data tools (Moz, Ahrefs, SEMrush, AnswerThePublic, etc.)
  3. Your existing data in Google Search Console/Google Analytics
  4. Brainstorming your own ideas and checking against them
  5. Mashing up keyword combinations
  6. Autocomplete suggestions and “Searches related to” from Google
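
As a rough illustration of item 5, mashing up keyword combinations can be sketched in a few lines of Python. The seed terms and modifiers below are hypothetical examples, and the helper name `mash_up` is my own; this is one simple way to generate candidates, not a description of any particular tool.

```python
from itertools import product

def mash_up(seeds, modifiers):
    """Combine every modifier with every seed term, modifier first."""
    return [f"{modifier} {seed}" for seed, modifier in product(seeds, modifiers)]

# Hypothetical seeds and modifiers, for illustration only.
seeds = ["protein powder", "deodorant"]
modifiers = ["best", "natural"]

combos = mash_up(seeds, modifiers)
for combo in combos:
    print(combo)
```

Every generated combination still needs to be checked against real search volume data, since many mashed-up phrases are never actually searched.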

There’s no shortage of sources for keyword collection, and more keyword research tools exist now than ever did before. Our goal here is to be so extensive that we never have to go back and “find more keywords” in the future — unless, of course, there’s a new topic we are targeting.

The prequel to this guide will expand upon keyword collection in depth. For now, let’s assume that you’ve spent a few hours collecting a long list of keywords, you have removed the duplicates, and you have semi-reliable search volume data.
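
If you would rather remove duplicates with a script than in a spreadsheet, a minimal Python sketch could look like this. It assumes that keywords differing only in letter case or extra spaces should count as duplicates, which is my assumption rather than a rule from the process above.

```python
def dedupe_keywords(keywords):
    """Lowercase, collapse whitespace, and drop repeats, keeping first-seen order."""
    seen = set()
    unique = []
    for kw in keywords:
        normalized = " ".join(kw.lower().split())
        if normalized and normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique

# Hypothetical raw list merged from several sources.
raw_keywords = [
    "Best Natural Protein Powder",
    "best  natural protein powder ",
    "how to make natural deodorant",
    "",
]
clean_keywords = dedupe_keywords(raw_keywords)
print(clean_keywords)
```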

Part 2: Term analysis

Now that you have an unmanageable list of 1,000+ keywords, let’s turn it into something useful.

We begin with term analysis. What the heck does that mean?

We break each keyword apart into its component terms, so we can see which terms occur most frequently.

For example, the keyword “best natural protein powder” is composed of four terms: “best,” “natural,” “protein,” and “powder.” Once we break all of the keywords into their component parts, we can more readily see which terms recur the most in our keyword dataset.

Here’s a sampling of 3 keywords:

  • best natural protein powder
  • most powerful natural anti inflammatory
  • how to make natural deodorant

Take a closer look, and you’ll notice that the term “natural” occurs in all three of these keywords. If this term is occurring very frequently throughout our long list of keywords, it’ll be highly important when we start grouping our keywords.
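
To make the idea concrete, here is a small Python sketch that counts terms across the three sample keywords above. It simply splits on spaces, which is an assumption; it is a rough stand-in for a word frequency counter, not a reproduction of any specific tool.

```python
from collections import Counter

def term_frequencies(keywords):
    """Split each keyword on whitespace and count how often each term occurs."""
    counts = Counter()
    for kw in keywords:
        counts.update(kw.lower().split())
    return counts

keywords = [
    "best natural protein powder",
    "most powerful natural anti inflammatory",
    "how to make natural deodorant",
]
term_counts = term_frequencies(keywords)

# Print terms from most to least frequent.
for term, count in term_counts.most_common():
    print(count, term)
```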

You will need a word frequency counter to give you this insight. The ultimate free tool for this is WriteWords’ Word Frequency Counter. It’s magical.

Paste in your list of keywords, click submit, and you’ll get something like this:

List of keywords and how frequently they occur

Copy and paste your list of recurring terms into a spreadsheet. You can obviously remove prepositions and terms like “is,” “for,” and “to.”

You don’t always get the most value by just looking at individual terms. Sometimes a two-word or three-word phrase gives you insights you wouldn’t have otherwise. In this example, you see the terms “milk” and “almond” appearing, but it turns out that this is actually part of the phrase “almond milk.”

To gather these insights, use the Phrase Frequency Counter from WriteWords and repeat the process for phrases that have two, three, four, five, and six terms in them. Paste all of this data into your spreadsheet too.
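
The same phrase counting can be approximated in Python. This sketch counts every run of `n` consecutive terms; the keyword list is hypothetical and the function name `phrase_frequencies` is my own.

```python
from collections import Counter

def phrase_frequencies(keywords, n):
    """Count every run of n consecutive terms across the keyword list."""
    counts = Counter()
    for kw in keywords:
        terms = kw.lower().split()
        for i in range(len(terms) - n + 1):
            counts[" ".join(terms[i:i + n])] += 1
    return counts

# Hypothetical keywords, for illustration only.
keywords = ["almond milk recipe", "how to make almond milk", "best almond milk"]

two_word = phrase_frequencies(keywords, 2)
print(two_word.most_common(3))

# Repeat for phrases of two to six terms, as described above.
all_phrases = {n: phrase_frequencies(keywords, n) for n in range(2, 7)}
```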

A two-word phrase that occurs more frequently than a one-word phrase is an indicator of its significance. To account for this, I use the COUNTA function in Google Sheets to show me the number of terms in a phrase:

=COUNTA(SPLIT(B2," "))

Now we can look at our keyword data with a second dimension: not only the number of times a term or phrase occurs, but also how many words are in that phrase.

Finally, to give more weighting to phrases that recur less frequently but have more terms in them, I put an exponent on the number of terms with a basic formula:

=(C4^2)*A4

In other words, take the number of terms and raise it to a power, and then multiply that by the frequency of its occurrence. All this does is give more weighting to the fact that a two-word phrase that occurs less frequently is still more important than a one-word phrase that might occur more frequently.

As I never know just the right power to raise it to, I test several and keep re-sorting the sheet to try to find the most important terms and phrases in the sheet.
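
Here is a short Python sketch of the same weighting, useful for seeing how the ranking shifts as the power changes. The frequencies below are made up for illustration; only the formula (number of terms raised to a power, multiplied by frequency) comes from the spreadsheet above.

```python
def weighted_score(phrase, frequency, power=2):
    """(number of terms ^ power) * frequency, mirroring =(C4^2)*A4."""
    term_count = len(phrase.split())
    return (term_count ** power) * frequency

# Hypothetical (phrase, frequency) rows, for illustration only.
rows = [("natural", 40), ("almond milk", 15), ("where can i buy", 4)]

# Try several powers and re-sort each time to compare the rankings.
for power in (1, 2, 3):
    ranked = sorted(rows, key=lambda r: weighted_score(r[0], r[1], power), reverse=True)
    print(power, [phrase for phrase, _ in ranked])
```

With these sample numbers, a power of 1 keeps the single term on top, while a power of 2 or 3 pushes the longer phrases above it.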

Spreadsheet of keywords and their weighted importance

When you look at this now, you can already see patterns start to emerge and you’re already beginning to understand your searchers better.

In this example dataset, we are going from a list of 10k+ keywords to an analysis of terms and phrases to understand what people are really asking. For example, “what is the best” and “where can i buy” are phrases we can absolutely understand searchers using.

I mark off the important terms or phrases. I try to keep this number under 50, with a maximum of around 75; otherwise, grouping will get hairy in Part 5.

Part 3: Hot words

What are hot words?

Hot words are the terms or phrases from that last section that we have deemed to be the most important. We’ve explained hot words in greater depth here.

Why are hot words important?

We explain:

This exercise provides us with a handful of the most relevant and important terms and phrases for traffic and relevancy, which can then be used to create the best content strategies — content that will rank highly and, in turn, help us reap traffic rewards for your site.

When developing your hot words list, we identify the highest frequency and most relevant terms from a large range of keywords used by several of your highest-performing competitors to generate their traffic, and these become “hot words.”

When working with a client (or doing this for yourself), there are generally 3 questions we want answered for each hot word:

  1. Which of these terms are the most important for your business? (0–10)
  2. Which of these terms are negative keywords (we want to ignore or avoid)?
  3. Any other feedback about qualified or high-intent keywords?

We narrow down the list, removing any negative keywords or keywords that are not really important for the website.

Once we have our final list of hot words, we organize them into broad topic groups like this:

Organized spreadsheet of hot words by topic

The different colors have no meaning, but just help to keep it visually organized for when we group them.

One important thing to note is that word stems play an important part here.

For example, consider that all of these words below have the same underlying relevance and meaning:

  • blog
  • blogs
  • blogger
  • bloggers
  • blogging

Therefore, to treat “blog,” “blogging,” and “bloggers” as part of the same cluster when grouping keywords, we’ll use the word stem “blog” for all of them. Word stems are our best friend when grouping. Synonyms, which are basically two different ways of saying the same thing with the same user intent (such as “build” and “create,” or “search” and “look for”), can be organized in a similar way.
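
For readers who like to see this in code, the following Python sketch shows one simple way to treat stems and synonyms. It uses plain prefix matching at a word boundary rather than a real linguistic stemmer, and the synonym map is a hypothetical example; both are assumptions for illustration.

```python
import re

# Hypothetical synonym map: each variant points at one canonical hot word.
SYNONYMS = {"create": "build", "look for": "search"}

def matches_stem(keyword, stem):
    """True if any word in the keyword starts with the stem (blog, blogs, blogger...)."""
    return re.search(rf"\b{re.escape(stem)}", keyword.lower()) is not None

def canonicalize(keyword):
    """Replace known synonyms with their canonical form."""
    result = keyword.lower()
    for variant, canonical in SYNONYMS.items():
        result = re.sub(rf"\b{re.escape(variant)}\b", canonical, result)
    return result

print(matches_stem("best blogging platforms", "blog"))
print(canonicalize("how to create a website"))
```

Prefix matching is crude: a stem like “blog” will not catch irregular word forms, so the stem list still needs a human review.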

Part 4: Preparation for keyword grouping

Now we’re going to get ourselves set up for our Herculean task of clustering.

To start, copy your list of hot words and transpose them horizontally across a row.

Screenshot of menu in spreadsheet

List your keywords in the first column.

Screenshot of keyword spreadsheet

Now, the real magic begins.

After much research and noodling around, I discovered the function in Google Sheets that tells us whether a stem or term is in a keyword or not. It uses RegEx:

=IF(RegExMatch(A5,"health"),"YES","NO")

This simply tells us whether this word stem or word is in that keyword or not. You have to individually set the term for each column to get your “YES” or “NO” answer. I then drag this formula down to all of the rows to get all of the YES/NO answers. Google Sheets often takes a minute or so to process all of this data.
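
If the sheet becomes too slow, the same YES/NO check can be sketched in Python. This version uses `re.search`, which, like the formula above, reports whether the pattern appears anywhere in the keyword. The keywords and hot words below are hypothetical.

```python
import re

def build_match_matrix(keywords, hot_words):
    """For each keyword, record "YES" or "NO" per hot word, like the sheet formula."""
    matrix = {}
    for kw in keywords:
        matrix[kw] = {
            hw: "YES" if re.search(hw, kw) else "NO"
            for hw in hot_words
        }
    return matrix

# Hypothetical keywords and hot words, for illustration only.
keywords = ["natural health tips", "matcha tea", "health benefits of matcha"]
hot_words = ["health", "matcha"]

matrix = build_match_matrix(keywords, hot_words)
for kw, row in matrix.items():
    print(kw, row)
```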

Next, we have to “hard code” these formulas so we can remove the NOs and be left with only a YES if that term exists in that keyword.

Copy all of the data and “Paste values only.”

Screenshot of spreadsheet menu

Now, use “Find and replace” to remove all of the NOs.

Screenshot of Find and Replace popup

What you’re left with is nothing short of a work of art. You now have the most powerful way to group your keywords. Let the grouping begin!

Screenshot of keyword spreadsheet

Part 5: Keyword grouping

At this point, you’re now set up for keyword clustering success.

This part is half art, half science. No wait, I take that back. To do this part right, you need:

  • A deep understanding of who you’re targeting, why they’re important to the business, user intent, and relevance
  • Good judgment to make tradeoffs when breaking keywords apart into groups
  • Good intuition

This is one of the hardest parts for me to train anyone to do. It comes with experience.

At the top of the sheet, I use the COUNTA function to show me how many times each word stem has been found in our keyword set:

=COUNTA(C3:C10000)

This is important because as a general rule, it’s best to start with the most niche topics that have the least overlap with other topics. If you start too broadly, your keywords will overlap with other keyword groups and you’ll have a hard time segmenting them into meaningful groups. Start with the most narrow and specific groups first.

To begin, you want to sort the sheet by word stem.

The word stems that occur only a handful of times won’t have a large amount of overlap. So I start by sorting the sheet by that column, and copying and pasting those keywords into their own new tab.
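
The “start with the most niche stems” rule can be sketched in Python as follows. This is a mechanical simplification under my own assumptions (plain substring matching, each keyword assigned to the rarest stem it contains); the real process relies on judgment that a script cannot replace.

```python
def group_by_rarest_stem(keywords, stems):
    """Peel keywords off into groups, starting with the least frequent stems."""
    # Count how many keywords contain each stem (the COUNTA step).
    counts = {s: sum(s in kw for kw in keywords) for s in stems}
    ordered = sorted((s for s in stems if counts[s] > 0), key=lambda s: counts[s])

    remaining = list(keywords)
    groups = {}
    for stem in ordered:
        matched = [kw for kw in remaining if stem in kw]
        if matched:
            groups[stem] = matched
            remaining = [kw for kw in remaining if stem not in kw]
    return groups, remaining

# Hypothetical keywords and stems, for illustration only.
keywords = [
    "matcha tea benefits",
    "natural matcha powder",
    "natural deodorant",
    "natural protein powder",
    "natural tea",
]
groups, leftover = group_by_rarest_stem(keywords, ["natural", "matcha", "tea"])
print(groups)
print(leftover)
```

Because the narrow “matcha” stem is handled before the broad “natural” stem, “natural matcha powder” lands in the matcha group instead of being swallowed by the broader one.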

Now you have your first keyword group!

Here’s a first group example: the “matcha” group. This could be a project in its own right: for instance, if a website were all about matcha tea and its tangentially related keywords.

Screenshot of list of matcha-related keywords

As we continue breaking apart one keyword group and then another, by the end we’re left with many different keyword groups. If the groups you’ve arrived at are too broad, you can subdivide them further into narrower keyword subgroups for more focused content pieces. Simply repeat the same process on the broad group, dividing its keywords into smaller groups based on word stems.

We can create an overview of the groups to see the volume and topical opportunities from a high level.

Screenshot of spreadsheet with keyword group overview

We want to not only consider search volume, but ideally also intent, competitiveness, and so forth.
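
A group overview like this can also be produced with a few lines of Python. The groups and search volumes below are invented for illustration, and the sketch only covers keyword count and total volume; intent and competitiveness would need their own data.

```python
def summarize_groups(groups, volumes):
    """Return (group, keyword count, total search volume), highest volume first."""
    rows = []
    for name, kws in groups.items():
        total = sum(volumes.get(kw, 0) for kw in kws)
        rows.append((name, len(kws), total))
    return sorted(rows, key=lambda r: r[2], reverse=True)

# Hypothetical groups and monthly search volumes, for illustration only.
groups = {
    "matcha": ["matcha tea benefits", "natural matcha powder"],
    "deodorant": ["natural deodorant"],
}
volumes = {
    "matcha tea benefits": 900,
    "natural matcha powder": 300,
    "natural deodorant": 2000,
}

overview = summarize_groups(groups, volumes)
for name, keyword_count, total_volume in overview:
    print(name, keyword_count, total_volume)
```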

Voilà!

You’ve successfully taken a list of thousands of keywords and grouped them into relevant keyword groups.

Wait, why did we do all of this hard work again?

Now you can finally attain that “product/market fit” we talked about. It’s magical.

You can take each keyword group and create a piece of optimized content around it, targeting dozens of keywords, exponentially raising your potential to acquire more organic traffic. Boo yah!

All done. Now what?

Now the real fun begins. You can start planning out new content that you never knew you needed to create. Alternatively, you can map your keyword groups (and subgroups) to existing pages on your website and add in keywords and optimizations to the header tags, body text, and so forth for all those long-tail keywords you had ignored.

Keyword grouping is underrated, overlooked, and ignored at large. It creates a massive new opportunity to optimize for terms where none existed. Sometimes it’s just adding one phrase or a few sentences targeting a long-tail keyword here and there that will bring in that incremental search traffic for your site. Do this dozens of times and you will keep getting incremental increases in your organic traffic.

What do you think?

Leave a comment below and let me know your take on keyword clustering.

Need a hand? Just give me a shout. I’m happy to help.


Bear Design - WordPress Development

Bear Design provides website development and design, creating content uploaded websites and improving web page placements and web traffic. Bear Design websites are unique, easy to use and responsive. Site owners can easily edit the content, or can trust the Bear Design & Communications to keep them up to date and supply quality content regularly.


GET IN TOUCH
160 City Road, EC1V 2NX London, United Kingdom
Monday – Thursday: 9:00 AM – 5:00 PM
Friday: 9:00 AM – 2:00 PM

WE ARE IN LONDON

© Made with by Bear Design

