How to Discover and Monitor Bad Backlinks

Posted by rjonesx.

Identifying bad backlinks has become easier over the past few years with better tool sets, bigger link indexes, and increased knowledge, but for many in our industry it’s still crudely implemented. While the ideal scenario would be to have a professional poring over your link profile and combing each link one-by-one for concerns, for many webmasters that’s just too expensive (and, frankly, overkill).

I’m going to walk through a simple methodology using Link Explorer and Excel (although you could do this with Google Sheets just as easily) that combines Moz Link Explorer, Keyword Explorer Lists, and finally Link Lists into a comprehensive link audit.

The basics

There are several components involved in determining whether a link is “bad” and should potentially be removed. Ultimately, we want to be able to measure the riskiness of the link (how likely is Google to flag the link as manipulative, and how much do we depend on the link for value?). Let me address three common factors used by SEOs to determine this score:

Trust metrics:

There are a handful of metrics in our industry that are readily available to help point out concerning backlinks. The two that come to mind most often are Moz Spam Score and Majestic Trust Flow (or, better yet, the difference between Citation Flow and Trust Flow). These two scores actually work quite differently. Moz’s Spam Score predicts the likelihood a domain is banned or penalized based on certain site features. Majestic Trust Flow determines the trustworthiness of a domain or page based on the quality of links pointing to it. While calculated quite differently, the goal is to help webmasters identify which sites are trustworthy and which are not. However, while these are a good starting point, they aren’t sufficient on their own to give you a clear picture of whether a link is good or bad.

Anchor text manipulation:

One of the first things an SEO learns is that using valuable anchor text can help increase your rankings. The very next thing they learn is that using valuable anchor text can bring on a penalty. The reason for this is pretty clear: the likelihood that a webmaster will give you valuable anchor text out of the goodness of their heart is very low, so over-optimization sticks out like a sore thumb. So, how do we measure anchor text manipulation? If we look at anchor text with our own eyes, this seems to be rather intuitive, but there’s a better way to do it in an automated, at-scale fashion that will allow us to better judge links.

Low authority:

Finally, low-authority links — especially when you would expect higher authority based on the domain — are concerning. A good link should come from an internally well-linked page on a site. If the difference between the Domain Authority and Page Authority is very high, it can be a concern. It isn’t a strong signal, but it is one worth looking at. This is especially obvious in certain types of spam, like paginated comment spam or forum profile spam.

So, let’s jump into how we can pull together a quick backlink analysis taking into account these various features of a bad backlink profile. If you’d like to follow along with this tutorial, hop into Link Explorer in another tab:

Follow along with Link Explorer

Step 1: Get the backlink data

The first and easiest step is just to get your backlink data from Link Explorer’s huge backlink index. With nearly 30 trillion links in our index, you can rest assured that we will find most of the bad backlinks with which you should be concerned. To begin, visit the Link Explorer > Inbound Links section and enter the domain or page that you wish to analyze.

Because we aren’t concerned with nofollow links, you will want to set the “follow” filter so that we only export followed links. We also aren’t concerned with deleted links, so we can set the Link Status to “Active.”

Once you have set these filters, hit the “Export” button. You will have a couple of choices. If your site has fewer than 1,000 backlinks, go ahead and choose the immediate download. However, if your link profile is larger, choose the largest setting and wait for the download to be prepared. We can keep going with other steps of the project in the meantime, but you don’t want to miss out on bad links, which means you need to export them all.

A lot of SEOs will stop at this point. With PA, DA, and Spam Score included in the standard export, you can do a damn good job of finding bad links. Link Explorer does all of that out-of-the-box for you. But for our purposes here, we want to go a step further and do “anchor text qualification.” This is especially valuable for large link profiles.

Step 2: Get anchor text

Getting anchor text out of the new Link Explorer is incredibly simple. Just visit Link Explorer > Anchor Text and hit the Export button. No extra filters will be needed here.

Step 3: Measure anchor text value

Now here is a quick trick where we can take advantage of Moz Keyword Explorer’s Keyword Lists to find anchor text that appears to be manipulated. First, we want to remove some of the extraneous anchor text which we know absolutely won’t be concerning, such as URLs as anchor text. This step isn’t completely necessary, but it will save you some credits in Moz Keyword Explorer, so it might be worth it.
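
If you would rather script this cleanup than do it by hand, here is a minimal Python sketch. The pattern is only a heuristic of my own for what counts as URL-like anchor text, and the function name `drop_url_anchors` is hypothetical, so tune it to your own export before relying on it.

```python
import re

# Heuristic (an assumption, not a Moz rule): an anchor is URL-like if it
# starts with a scheme or "www.", or is a single token that ends in a dot
# plus a 2+ letter suffix, optionally followed by a path (e.g. "example.com").
URL_LIKE = re.compile(
    r"^(?:https?://|www\.)\S+$|^[\w-]+(?:\.[\w-]+)*\.[a-z]{2,}(?:/\S*)?$",
    re.IGNORECASE,
)

def drop_url_anchors(anchors):
    """Return the anchors that do not look like URLs, skipping blank ones."""
    kept = []
    for anchor in anchors:
        text = anchor.strip()
        if text and not URL_LIKE.match(text):
            kept.append(text)
    return kept
```

Whatever survives the filter is what you would paste into the keyword list in the next step.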

After you’ve removed the extraneous anchor text, we’ll just copy and paste our anchor text into a new keyword list for Keyword Explorer.

By putting the anchor text into Keyword Explorer, we’ll be able to sort anchor text by search volume. It isn’t very common that anchor text happens to have a high search volume, but when webmasters are trying to manipulate search results they often use the keyword for which they’d like to rank in the anchor text. Thus, we can use the search volume of anchor text as a proxy for manipulated anchor text. In fact, when working with Remove’em before I joined Moz, we discovered that anchor text manipulation was the most predictive factor in link penalties.

Step 4: Merge, filter, sort, & model

We will now merge the data (backlinks export and keyword list export) to finally get that list of concerning backlinks. Let’s start with the backlink export. We’ll open it up in Excel and then remove duplicate domain-anchor text pairs.

I’ll start by showing you a quick trick to extract out the domains from a long list of URLs. I copied the list of URLs from the first column to the last column in Excel, and then chose Data > Text to Columns > Delimited > Other > /. This will cause the URLs to be split into different columns wherever the slash occurs, leaving you with the 4th new column being just the domain names.
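
The same domain extraction can be done outside Excel. The sketch below uses Python’s standard `urllib.parse` module; the function name `extract_domain` is my own, and I’m assuming the export holds full URLs (bare values such as `example.com/page` are handled as a fallback).

```python
from urllib.parse import urlsplit

def extract_domain(url):
    """Return the lowercase host of a URL, or "" if none can be parsed."""
    # urlsplit only fills in the host when a scheme or a leading "//" is
    # present, so prepend "//" to bare values such as "example.com/page".
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname
    return host or ""
```

Unlike splitting on the slash, this also drops any port number and lowercases the host, which makes the de-duplication that follows a little more reliable.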

Once you have completed this step, we are going to remove duplicate domain-anchor text pairs. Notice that we aren’t going to limit ourselves to one link per domain, which is what many SEOs do. This would be a mistake, since there could be multiple concerning links on the site with different anchor text.
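
For those working in a script instead of a spreadsheet, here is a rough equivalent of de-duplicating on the domain-anchor text pair. The dictionary keys `"domain"` and `"anchor"` are placeholder names of mine, not the headers in the Link Explorer export, and trimming whitespace before comparing is my own choice.

```python
def dedupe_domain_anchor(rows):
    """Keep the first row seen for each (domain, anchor text) pair."""
    seen = set()
    unique = []
    for row in rows:
        # Compare case-insensitively and ignore stray whitespace so that
        # trivially different copies of the same pair collapse into one.
        key = (row["domain"].strip().lower(), row["anchor"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Note that two links from the same domain survive as long as their anchor text differs, which is exactly the point made above.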

After choosing Data > Remove Duplicates, I select the column of Anchor Text and the column of Domain. With the duplicates removed, we are now left with the links we want to judge as good or bad. We need one more thing, though. We need to merge in the search volume data we got from Keyword Explorer. Hit the export button on the keyword list you created from anchor text in Keyword Explorer.

Open up the export and then copy and paste the data into a second sheet in Excel, next to the backlinks sheet you already created and filtered. In this case, I named the two sheets “Raw Data” and “Anchor Text Data.”

You’ll then want to do a VLOOKUP on the backlinks spreadsheet to create a column with the search volume for the anchor text on each link. Here is the VLOOKUP formula I used, but yours will look a little different depending upon the names of the sheets and the exact columns you’ve created.

Excel formula: =IF(ISNA(VLOOKUP(C2,'Anchor Text Data'!$A$1:$I$402,3,FALSE)),0,VLOOKUP(C2,'Anchor Text Data'!$A$1:$I$402,3,FALSE))

It looks a little complicated, but that’s simply because I’m using two VLOOKUPs: the first, wrapped in ISNA, checks whether the anchor text is missing from the keyword sheet, and if it is, the formula returns 0 instead of #N/A. You can always manually put in 0 wherever #N/A shows up.
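
If you are merging the two exports in a script, the same lookup-with-a-default is a one-liner. This is only a sketch: `attach_search_volume`, the `"anchor"` and `"volume"` keys, and the assumption that `volume_by_keyword` is a dictionary keyed by lowercase keyword are all mine, not part of either Moz export.

```python
def attach_search_volume(links, volume_by_keyword):
    """Return copies of the link rows with a "volume" field, defaulting to 0."""
    merged = []
    for link in links:
        row = dict(link)  # copy so the caller's rows are left untouched
        # dict.get plays the role of IF(ISNA(VLOOKUP(...)), 0, VLOOKUP(...)):
        # anchors missing from the keyword export get a volume of 0.
        key = link["anchor"].strip().lower()
        row["volume"] = volume_by_keyword.get(key, 0)
        merged.append(row)
    return merged
```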

Now it’s time for the fun part: modeling. First, I recommend sorting by the volume column you just created so you can see the most concerning anchor text at the top. It’s amazing to see links with anchor text like “ring” or “jewelry” automatically populate at the top of the list, since they’re also keywords with high search volume.

Second, we’ll create a new column with a formula that takes into account the quality of the link, the riskiness of the anchor text, and the Spam Score:

Excel formula: =D11+(F11-E11)+(LOG(G11+1)*10)+(LOG(O11+1)*10)

Let’s break down that formula quickly:

  • D11: This is simply the Spam Score
  • (F11-E11): This is the Domain Authority minus the Page Authority. (This is a bit debatable — some people might just prefer to choose 100-E11)
  • (Log(G11+1)*10): This is a fancy way of converting the number of times this anchor text link occurs into a consistent number for our equation. Without taking the log(), having a high number here could overcome the other signals.
  • (Log(O11+1)*10): This is a fancy way of converting the search volume to a number consistent for our equation. Without taking the log(), having a high search volume could also overcome other signals.
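
The formula above translates directly into a small function if you want to score links outside Excel. This is a sketch under a few assumptions: the parameter names and the dictionary keys in `rank_by_riskiness` are placeholders of mine, and I use `math.log10` because Excel’s LOG defaults to base 10 when no base is given.

```python
import math

def riskiness(spam_score, page_authority, domain_authority,
              anchor_count, search_volume):
    """Mirror =D+(F-E)+(LOG(G+1)*10)+(LOG(O+1)*10) from the spreadsheet."""
    return (
        spam_score
        + (domain_authority - page_authority)
        # The +1 keeps log10 defined when a count or volume is 0.
        + math.log10(anchor_count + 1) * 10
        + math.log10(search_volume + 1) * 10
    )

def rank_by_riskiness(links):
    """Sort link rows from most to least risky (keys are hypothetical)."""
    return sorted(
        links,
        key=lambda link: riskiness(
            link["spam_score"], link["pa"], link["da"],
            link["anchor_count"], link["volume"],
        ),
        reverse=True,
    )
```

The log terms grow slowly: going from 9 to 999 monthly searches adds only 20 points, so a huge search volume cannot drown out Spam Score or the authority gap.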

Once we run this equation and create a new column, we can sort by “Riskiness” and find the links with which we should be most concerned.

As you can see, examples of comment spam and paid links popped to the top of the list because the formula gives a higher value to low-quality, spammy links with risky anchor text. But wait, there’s more!

Step 5: Build a Link List

Link Explorer doesn’t just leave you hanging after doing analysis. Our goal is to help you do SEO, not just analyze it. Your next step is to start a new Link List.

The Link List feature allows you to track whether certain links are alive. If you embark on a campaign to try and remove some of these spammier links, you can create a Link List and use it to monitor the status of those links. Just create a new list by naming it, adding your domain, and then copying and pasting the concerning links.

You can now just monitor the Link List as you do your outreach to remove bad links. The Link List will track all the metrics, including whether the link has been removed.

Wrapping up

Whether you want to do a cursory backlink audit by just looking at Spam Score and PA, or a deep-dive taking into account anchor text qualification, Link Explorer + Keyword Explorer and Link Lists make it possible. With our greatly improved backlink index, you can now rest assured that the data you need is right at your fingertips and, if you need to get down-and-dirty in Excel, you can readily export it to do deeper analysis.

Good luck hunting bad backlinks!
