I have mixed feelings about the explosion of web content.
On one hand, content can be the glue that holds campaigns together (SEO, social, paid, email…everything).
On the other hand, content has gotten saturated. So many marketers are misinformed about what “good” content is.
I don’t know how many more “ultimate guides”, “complete lists” and “expert roundups” the internet can handle…
But I digress. This post is not about creating good content; it’s about how to clean up bad content.
Why does this matter? Because having dated, irrelevant and flat out bad content can negatively impact your SEO.
When I say “content”, I don’t mean just blog posts – I mean your entire website:
- Legacy product / service pages
- Irrelevant category or auto generated “tag” pages
- Dated blog, resource or informational pages
- Doorway pages that don’t connect to your site’s core architecture
- Subdomains, forums, staging domains, etc
Bad content hurts your SEO in 3 specific ways…
- Content that doesn’t resonate with your target audience will kill conversion and engagement rates.
- Google’s algorithm looks heavily at content quality, trust and relevancy (aka having crap content can hurt rankings).
- Too much low quality content can decrease search engine crawl rate, indexation rate and ultimately, traffic.
In this post, I’m going to break down, step by step, how to audit your website’s content. Specifically:
- How and where to get the right data inputs for your audit
- The parameters to assess the quality of your data / content
- The options you have to manage low (and high) quality content
I’ll be using our Website Quality Audit as a baseline for you to follow along.
The content audit process
The end goal of our content audit is to have a decision about what to do with every URL on your website – delete, redirect, update or leave as it is.
To make that decision, you’ll need to review pages manually. It’s necessary for human eyes to review pages that:
- Provide no value to your target audience
- Are no longer relevant, up to date or correct
- No longer exemplify your messaging, brand and marketing
We can leverage automation to gather and format the data, but there’s still an element of human analysis needed to make decisions.
Part 1 – Gathering the data inputs
Data is the only way we can make informed decisions about how to handle pages on our site. Our content audit pulls data from 4 sources.
1. Full website crawl from Screaming Frog
We need to review and analyze every URL on your site. This data can be easily pulled using Screaming Frog or from your Sitemap.xml file (our tool uses Sitemaps).
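If you go the Sitemap route, pulling the URL list can be scripted. Here’s a minimal sketch in Python, assuming a standard sitemaps.org style file. The function name `urls_from_sitemap` and the sample sitemap are made up for illustration; this is not our tool’s actual code.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return every <loc> URL listed in a sitemap document, in file order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content, inlined so the example runs without a network call.
SAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://yoursite.com/</loc></url>"
    "<url><loc>https://yoursite.com/blue-shoes/</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    for url in urls_from_sitemap(SAMPLE_SITEMAP):
        print(url)
```

On a real site you would read the file contents from your Sitemap.xml instead of the inline sample. Large sites often split URLs across several sitemap files, so you may need to repeat this per file.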
2. Traffic and engagement data from Google Analytics
This data helps us understand content quality by looking at organic visits per page, bounce rate and conversion rate. We set parameters to help determine the outcome for that page.
3. Backlinks data from Ahrefs
Backlink data helps determine if a page should be deleted or 301 redirected into a similar piece of content. Good links are hard to come by and we want to preserve link equity by properly managing content with links.
4. Server log files (optional)
A server log shows us how often search engines visit each page on your site. If a page is low quality and gets crawled often, it will change how we manage that page (i.e. 301 or update as opposed to delete / 404).
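To give a feel for what the log step involves, here’s a rough Python sketch that counts crawler hits per URL. It assumes your server writes the common “combined” access log format, and it matches on the user agent string only, which can be spoofed. The `crawler_hits` name and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Pulls the request path and the user agent out of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_hits(log_lines, bot_token="Googlebot"):
    """Count requests per path whose user agent contains bot_token."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and bot_token in match.group("agent"):
            counts[match.group("path")] += 1
    return counts


# Hypothetical log lines for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /blue-shoes/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [11/Oct/2023:09:12:01 +0000] "GET /blue-shoes/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [11/Oct/2023:09:15:44 +0000] "GET /blue-shoes/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '66.249.66.1 - - [12/Oct/2023:02:30:10 +0000] "GET /old-page/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    for path, hits in crawler_hits(SAMPLE_LOG).most_common():
        print(path, hits)
```

Pages that show up with a lot of crawler hits are the ones where a 301 or an update is usually safer than a straight delete.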
You’ll need to compile this data using VLOOKUPS. If you don’t know how to do that, no worries, we’d love to run this audit for you. Grab a time here to have a risk free consultation.
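If spreadsheets aren’t your thing, the same join can be done in a few lines of Python. This is only a sketch of the idea behind the VLOOKUP step: the URL list is the base, and each other source is looked up by URL. The field names (`organic_visits`, `bounce_rate`, `conversion_rate`, `backlinks`) and the sample numbers are assumptions, not the actual export columns from Google Analytics or Ahrefs.

```python
def merge_audit_data(urls, analytics, backlinks):
    """Join analytics and backlink data onto the URL list, one row per URL.

    URLs with no analytics row get 0 visits and None for the rates;
    URLs with no backlink row get 0 backlinks.
    """
    rows = []
    for url in urls:
        stats = analytics.get(url, {})
        rows.append({
            "url": url,
            "organic_visits": stats.get("organic_visits", 0),
            "bounce_rate": stats.get("bounce_rate"),
            "conversion_rate": stats.get("conversion_rate"),
            "backlinks": backlinks.get(url, 0),
        })
    return rows


# Hypothetical inputs for illustration only.
URLS = ["/blue-shoes/", "/old-page/"]
ANALYTICS = {
    "/blue-shoes/": {
        "organic_visits": 450,
        "bounce_rate": 0.42,
        "conversion_rate": 0.03,
    },
}
BACKLINKS = {"/old-page/": 7}

if __name__ == "__main__":
    for row in merge_audit_data(URLS, ANALYTICS, BACKLINKS):
        print(row)
```

The important part is that every URL from the crawl ends up in the output, even if it has no traffic or links. Those empty rows are exactly the ones the audit cares about.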
Part 2 – Bulk check URLs
Now, you’ll have every URL on your site with corresponding:
- Organic traffic
- Bounce rate
- Conversion rate
- Backlinks
The analysis starts by cross walking the Sitemap with organic traffic data.
Here’s a process diagram showing how we can cut down the number of URLs to review manually by applying the data + logic…
If the page DOES get organic traffic… | If the page DOES NOT get organic traffic… |
…and has a low bounce rate, we want to leave it alone (200). | …and has NO links pointing to it, we want to delete it (404). |
…and has a high bounce rate, we want to review for content quality (QR). | …and DOES HAVE links pointing to it, we want to redirect it into a similar page (301). |
…and has a high conversion rate, we want to leave it alone (200). | |
…and has a low conversion rate, we want to review for content relevancy (QR). | |
Our decision tree ends up in 1 of 4 actions.
- Leave as is (aka 200). If a page does receive organic traffic, has a low bounce rate and high conversion rate, we want to leave that page alone.
- Quality review (QR). If a page does receive organic traffic but has a high bounce rate or a low conversion rate, we want to review that page for content quality and relevancy. After manual quality review, you will want to delete (404), redirect (301) or rewrite the content to improve it.
- Delete (404). If a page has no organic traffic and no backlinks, it has little value to your site. You should manually review it or delete it from your site.
- Redirect to similar content (301). If a page has no organic traffic but does have backlinks, you want to preserve link equity by setting a 301 redirect into a similar piece of content.
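The 4 outcomes above boil down to a small decision function. Here’s a sketch in Python. The 100 visits a year cutoff comes from this post, but the bounce rate and conversion rate thresholds (`max_bounce`, `min_conversion`) are placeholder assumptions you should tune for your own site. I’m also assuming that either a high bounce rate or a low conversion rate is enough to trigger a review.

```python
def recommend_action(organic_visits, bounce_rate, conversion_rate, backlinks,
                     min_visits=100, max_bounce=0.70, min_conversion=0.01):
    """Return "200", "QR", "301" or "404" for a single URL.

    max_bounce and min_conversion are illustrative defaults, not
    recommended values.
    """
    if organic_visits >= min_visits:
        # Page gets organic traffic: judge it on engagement.
        if bounce_rate > max_bounce or conversion_rate < min_conversion:
            return "QR"
        return "200"
    # Page gets little or no organic traffic: judge it on links.
    return "301" if backlinks > 0 else "404"


if __name__ == "__main__":
    # Hypothetical pages for illustration only.
    print(recommend_action(450, 0.42, 0.03, 0))   # healthy page
    print(recommend_action(450, 0.91, 0.03, 0))   # high bounce rate
    print(recommend_action(12, 0.50, 0.00, 7))    # no traffic, has links
    print(recommend_action(12, 0.50, 0.00, 0))    # no traffic, no links
```

Run this over every row in your compiled data and you’re left with a short list of QR pages for human review instead of the whole site.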
Part 3 – Quality review
The goal of our audit template is to cut down on the manual work of reviewing every page on your site for quality.
However, some manual review is inevitable as machines can’t read content for quality (YET).
If you followed our template you will have every page on your site with the following recommendations:
- Quality Review (QR)
- Leave as is (200)
- Redirect into similar (301)
- Delete (404)
Let’s walk through the manual process and your final options for dealing with the pages.
1. Quality Review (QR)
Execute this option if a page…
- Gets steady traffic from search engines (100 or more visits a year)
- That traffic has poor engagement metrics
If a page is performing well in search and has low quality engagement metrics, we want to review it for relevancy.
Google doesn’t want to send visitors to your site if they’re not going to enjoy it. If they bounce right away, it’s a signal to Google that your website was not a good result for that query.
For that reason, we want to make sure the content on our site is up to date, relevant and delivering value to our visitors.
If you have URLs that triggered a QR, here’s what to do…
- Visit those pages individually
- Read them for quality – is it well written? Does it make sense?
- Read them for relevancy – is this topic still relevant? Is it on point with our brand?
From here, you have 3 options to manage:
- Rewrite or update the content to reflect updates, branding, keywords, etc.
- Delete the content if the topic is no longer relevant.
- Redirect the content into something more up to date and relevant.
I can’t tell you how to make the exact decision, but any of these will suffice.
2. Leave Content As-Is (200)
Execute this option if a page…
- Gets steady traffic from search engines (100 or more visits a year)
- Gets quality traffic from search engines (good bounce rate, driving conversions)
If a page is performing well in search and has quality engagement metrics, we want to leave it alone. Why mess with a good thing?
3. Redirect into similar (301)
Execute this option if a page…
- Gets little or no traffic from search engines (fewer than 100 visits a year)
- Has inbound links pointing to it
301 redirects pass on 100% of link equity. If a page has links pointing to it, this is always the best option.
If you have URLs that triggered a 301 result, here’s what to do…
- Find a piece of content on your site that’s similar.
- Set a 1 to 1, server side 301 redirect from the old content into the new.
If you’ve acquired links over time, a large part of your audit will be redirects. It’s best to build out a URL mapping file in Excel and pass it to a developer to ensure this goes smoothly.
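Before handing the mapping file off, it helps to sanity check it. Here’s a hedged Python sketch that builds the mapping as CSV text and rejects two common mistakes: a URL redirecting to itself, and a target that is itself being redirected (a redirect chain). The column names and the `build_redirect_map` function are my own invention for illustration.

```python
import csv
import io


def build_redirect_map(pairs):
    """Validate (old_url, new_url) pairs and return them as CSV text."""
    sources = {old for old, _ in pairs}
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["old_url", "new_url", "status"])
    for old, new in pairs:
        if old == new:
            raise ValueError(f"{old} redirects to itself")
        if new in sources:
            # The target is also being redirected, which would chain redirects.
            raise ValueError(f"{old} -> {new} would create a redirect chain")
        writer.writerow([old, new, 301])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical mapping for illustration only.
    print(build_redirect_map([("/blue-shoes/", "/products/blue-shoes/")]))
```

How the rules actually get applied depends on your server or CMS, which is why the developer hand-off still matters.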
4. Delete (404)
Execute this option if a page…
- Gets little or no traffic from search engines (fewer than 100 visits a year)
- Has NO inbound links pointing to it
There’s a ton of information in the community claiming 404 pages are bad for SEO. This is only true when the wrong pages are deleted and not properly redirected.
For example, if you’re migrating your website to new URL structures:
- Old: yoursite.com/blue-shoes/
- New: yoursite.com/products/blue-shoes/
If you don’t redirect the old into the new, you’ll be left with a 404 page that search engines won’t be able to index and rank.
However, if you deem that old page no longer valid, it gets no traffic and has no inbound links, the best option is to delete the page and remove it from Google’s index. This creates a leaner, more relevant site that is crawler friendly.
If you have URLs that triggered a 404 result, here’s what to do…
Delete!
If you find the page has value to other areas of your business, i.e. internal traffic, social media, etc, then you may not want to delete it from your site. Instead, you might want to set up a canonical or “noindex” tag. This will be discussed next.
5. Set Noindex or Canonical tag
Execute this option if a page…
- Gets little or no traffic from search engines (fewer than 100 visits a year)
- Has NO inbound links pointing to it
- Has value to your website OUTSIDE of SEO
You may not be comfortable mass deleting or redirecting pages on your site. If that’s you, there are 2 more options.
“Noindex tag”. This tells search engines NOT to index this page, removing it from their index. This will help to clean up your site WITHOUT removing the page.
“Canonical tag”. This tag points to another page on your site and tells search engines to treat that page as the authoritative version. Again, this helps to clean up your index without removing pages.
These are both viable options but should be used secondary to 404 or 301 options.
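For reference, both tags are single lines that go in the page’s head section. Here’s a small Python sketch that generates them, in case you’re templating pages in bulk. The helper names are hypothetical, and the example URL reuses the blue shoes page from earlier.

```python
from html import escape


def noindex_tag():
    """Robots meta tag asking search engines not to index the page."""
    return '<meta name="robots" content="noindex">'


def canonical_tag(target_url):
    """Link tag pointing search engines at the preferred version of a page."""
    return f'<link rel="canonical" href="{escape(target_url, quote=True)}">'


if __name__ == "__main__":
    print(noindex_tag())
    print(canonical_tag("https://yoursite.com/products/blue-shoes/"))
```

Pick one per page based on the situation: noindex when the page should simply stay out of search, canonical when a better version of the same content lives elsewhere on your site.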
Next steps
Running a content audit is a pivotal part of increasing your website’s organic traffic by improving your existing assets. It’s a lot of work, and we’d love to do it for you. Book a free consultation to find out how we can help grow your rankings, completely done for you.