Having a search engine friendly website, a site with no barriers to the search engines, is critical to success in “organic” search engine results.
Here are some relatively simple SEO site review/audit steps that should help you flush out the most important site-wide issues in a small to mid-size website.
An SEO site review or audit should be conducted periodically; for example, a few weeks to a few months after a website is redesigned or after a major update, to allow enough time for the site to be fully re-indexed.
Anytime organic search engine traffic takes a significant drop, an SEO site review might help you determine the cause.
You should also consider a site review if one hasn’t been conducted within the last couple of years, in order to incorporate new developments with the major search engines.
Finally, an SEO review or audit should be considered during site redesign planning to flush out any issues with the current website that should be addressed in the new design.
Examine Files Indexed In Search Engines
Either examine all or most of the landing pages reported by an analytics system, or scan through the listing of files indexed in one or more of the major search engines by searching on site:domainname.com in Google or Bing, for example.
Before scanning the search listings in Google, go to the end of the search results and click on the “Repeat the search with the omitted results included” link if there is one. Those omitted files often point to issues. Google will list up to a thousand files from the site, so it may take a few clicks to get to the end.
If you have a site with more than 1,000 files indexed, you can examine more files by adding a phrase to the search, such as [site:domainname.com “hp printers”], to look at specific pages in a category, for example.
“Repeat search with omitted results” link adds search results that often point to issues.
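If you run a lot of these narrowed site: searches, it can help to generate the query strings rather than type them. The following is only a minimal sketch: the function names are made up for illustration, and the Google search URL pattern is an assumption based on the commonly seen form, not something taken from any official documentation.

```python
from urllib.parse import quote_plus


def site_query(domain, phrase=None):
    """Build a site: query, optionally narrowed to an exact phrase."""
    query = f"site:{domain}"
    if phrase:
        # Straight double quotes ask the engine for an exact phrase match.
        query += f' "{phrase}"'
    return query


def google_search_url(query):
    # Assumption: the familiar /search?q= URL form.
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for phrase in (None, "hp printers"):
        q = site_query("domainname.com", phrase)
        print(q, "->", google_search_url(q))
```

Looping over a list of category phrases gives you one query per category, which is handy when the site has more files indexed than a single listing will show.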
Scan the listings and click through to a sampling of the files to look for issues such as files that shouldn’t be indexed, old pages that are still up and getting indexed, less than ideal use of HTML Page Titles or Meta Description tags, content from include pages, pop-ups, or frames that are getting indexed without the surrounding navigation, etc.
HTML Page Titles & Meta Description Tag handling
As you scan through the files, check the HTML page titles to see if they are unique, enticing page titles that will be displayed in search results. Also, check a sampling of the Meta Description tags and ensure they are unique descriptions of the page’s content (See the Google Webmaster Tools section below about duplicate tags too).
If the titles and descriptions are not unique and enticing, you’ll want to optimize any that can be manually updated.
If a CMS such as Drupal or Joomla, or an e-commerce tool, is used, see if there are settings or plug-ins available to improve HTML page title and description tag handling.
If there are programmatic issues with the page titles and description tags, and no settings or plug-ins are available to fix them (often the case in custom dynamic sites), you may need to consider modifying the programming or even a site redesign to address the issues.
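For a site with more than a handful of pages, the uniqueness check on titles and descriptions can be roughed out in a script. This is a sketch under assumptions: the page HTML is already downloaded into a dictionary keyed by URL, and the sample pages, class name, and function name are all invented for illustration. It only reports exact duplicates; judging whether a title is enticing still takes a human.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicates(pages, field):
    """Return {value: [urls]} for values of `field` used by more than one page."""
    seen = defaultdict(list)
    for url, html in pages.items():
        parser = HeadTagParser()
        parser.feed(html)
        seen[getattr(parser, field).strip()].append(url)
    return {value: urls for value, urls in seen.items() if len(urls) > 1}


# Hypothetical sample pages, for illustration only.
SAMPLE_PAGES = {
    "/printers": '<html><head><title>Acme Store</title>'
                 '<meta name="description" content="Shop printers."></head></html>',
    "/scanners": '<html><head><title>Acme Store</title>'
                 '<meta name="description" content="Shop scanners."></head></html>',
    "/about": '<html><head><title>About Acme</title>'
              '<meta name="description" content="Who we are."></head></html>',
}

if __name__ == "__main__":
    print(find_duplicates(SAMPLE_PAGES, "title"))
    print(find_duplicates(SAMPLE_PAGES, "description"))
```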
Compare The Number Of Pages Indexed
Check the number of files indexed in Google, Bing, or both, compare that number to an estimate of the total files on the site, and look for huge differences.
You can get the number of files that are reported to be indexed in Google or Bing using one of the many tools available such as URLTrends, SeoQuake for Firefox, or by performing a site:domainname.com search in Google or Bing.
The result will often be one of two scenarios:
A) Too few files indexed
Too few files or too many files indexed can both point to issues. For example, if the web site has at least 10,000 files and the search engines report fewer than 300 files indexed, that would suggest there are issues keeping the search engine crawlers from indexing many of the files. See some of the troubleshooting steps below.
B) Too many files indexed
On the other hand, if the search engines report far more files indexed than there should be on the site (for example 15,000 files are reported indexed for a site that should only have a couple of thousand files) this suggests there may be duplicate path issues or many old pages have been left up on the site.
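The comparison in scenarios A and B boils down to a ratio of reported indexed files to expected files. Here is a small sketch of that arithmetic. The 0.5 and 1.5 cut-offs are arbitrary assumptions chosen for illustration, not thresholds published by any search engine, so adjust them to taste.

```python
def classify_index_coverage(indexed, expected, low=0.5, high=1.5):
    """Label the indexed/expected ratio as 'too few', 'too many', or 'ok'.

    `low` and `high` are illustrative cut-offs, not official thresholds.
    """
    if expected <= 0:
        raise ValueError("expected must be a positive estimate of site files")
    ratio = indexed / expected
    if ratio < low:
        return "too few"
    if ratio > high:
        return "too many"
    return "ok"


if __name__ == "__main__":
    # The two examples from the text above.
    print(classify_index_coverage(300, 10_000))    # scenario A
    print(classify_index_coverage(15_000, 2_000))  # scenario B
```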
I often see webmasters keep many old pages, even entire old versions of sites, live and getting indexed, sometimes ranking for important keywords. However, these old pages usually have broken navigation links and images, out-of-date information, and countless other issues. Check the bounce rate for those old pages. It’s likely nearly 100%.
If you find old pages getting indexed, try to determine whether they are ranking for relevant keywords before you take them down.
For example, in Google Analytics, search for the page URL in Content: Top Landing Pages. Then click on the page and choose Entrance Keywords in the Landing Page Optimization section.
Alternatively, in Google’s Webmaster Tools (google.com/webmasters/tools/home) go to “Your site on the web” then click on Search Queries: Top pages.
If you find the page in this listing, click on the page URL to see the number of impressions and clicks by keyword.
If a page is being returned in search results quite often and especially if it is bringing significant traffic to the site for relevant keywords, think about including similar content on the new part of the site.
At the very least, consider setting up a 301 redirect from the old page(s) to the best page on the live site to capture that keyword traffic for a while. Then work to optimize one or more pages on the live site for those keywords.
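On most sites a 301 redirect is set up in the web server or CMS configuration, and how you do that depends on your platform. Purely to show what the redirect amounts to, here is a minimal sketch as a Python WSGI application. The old and new paths in the mapping are hypothetical, and a real site would hand unmatched requests to the rest of the application rather than return a 404.

```python
# Hypothetical mapping of retired pages to their best live replacements.
REDIRECTS = {
    "/old/hp-printers.html": "/printers/hp/",
    "/2008/specials.html": "/specials/",
}


def redirect_app(environ, start_response):
    """Answer requests for retired pages with a permanent (301) redirect."""
    path = environ.get("PATH_INFO", "")
    target = REDIRECTS.get(path)
    if target:
        # 301 signals that the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    # Sketch only: anything not in the mapping is treated as missing.
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

To try it locally, the standard library's wsgiref.simple_server.make_server can serve redirect_app on a test port.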
Ensure Site Navigation Links Can Be Followed
Check that site navigation links can be read and followed by the search engines. We often see problems with cascading drop-down menu systems and other navigation systems. Links that can’t be followed can hurt page rankings and result in pages being crawled less often for changes.
Look at a sample of pages with a page analyzer or crawler such as seo-browser.com, and examine Google’s text-only version of the cache to see if Google shows all the links, including any fly-out or drop-down links (click on some of them to make sure they work correctly).
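As a quick first pass before checking the cache, you can list which navigation links appear in the raw HTML as ordinary href links. The sketch below is a rough heuristic only, and it is an assumption on my part that links with no href, a bare "#", or a javascript: URL are the ones most worth double-checking. It does not tell you what a given search engine actually follows. The class name and sample markup are invented.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Split <a> tags into plain href links and ones worth a closer look."""

    def __init__(self):
        super().__init__()
        self.crawlable = []  # ordinary href values
        self.suspect = []    # missing href, "#", or javascript: URLs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        if not href or href == "#" or href.lower().startswith("javascript:"):
            self.suspect.append(href)
        else:
            self.crawlable.append(href)


# Hypothetical navigation markup, for illustration only.
SAMPLE_NAV = (
    '<a href="/printers/">Printers</a>'
    '<a href="javascript:openMenu()">Menu</a>'
    '<a href="#">More</a>'
    '<a onclick="go()">Scanners</a>'
)

if __name__ == "__main__":
    collector = LinkCollector()
    collector.feed(SAMPLE_NAV)
    print("plain links:", collector.crawlable)
    print("check these:", collector.suspect)
```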
Check A Sample Of Important Pages On The Site
Pick a sample of important pages at the top level and other levels, such as some product categories and specific items within an e-commerce site. See if they are indexed and look for problems such as duplicate content or paths, or page content that is not getting indexed.
Copy a snippet of what appears to be unique text from the page. Then search for it in Google with quotation marks around the text.
Many pages not indexed?
If you get no results, the page may not be indexed. See if it has been live long enough that it should be indexed. Try a search on the URL using a site:domain/filename type search.
If you get no results, check a larger sample of pages in this manner. If you find a high percentage of pages not getting indexed, you’ll need to do some more troubleshooting to determine the reason, such as navigation links that are not being followed, pages excluded by robots.txt (more below), issues with URLs, or other problems.
Check Google’s cache of pages
If a page is indexed, check it in Google’s “text-only” version of the page’s cache and/or one or more page analyzers or crawlers. Compare what is indexed or crawlable to the actual page to see if important content is not getting indexed, or if less than an ideal amount of content is getting indexed. Do this for a good sample of pages.
Check for duplicate paths
If there are multiple results for the exact content on the site, there may be duplicate path issues. Check to see if different URLs are being used to point to the same files. There should be only one path to a file.
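If you have a list of indexed URLs, for example copied from a site: search or an analytics export, a script can group the ones that probably point to the same file. The sketch below makes several assumptions that may not hold for your site: that the www and non-www hosts serve the same content, that index files and trailing slashes are interchangeable, and that the listed query parameters (hypothetical names) never change the page content.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical parameters assumed not to change page content.
IGNORED_PARAMS = {"sessionid", "sid", "ref"}
# Assumed default documents; adjust for your server.
INDEX_FILES = ("index.html", "index.htm", "index.php", "default.aspx")


def normalize(url):
    """Reduce a URL to a comparable form under the assumptions above."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    if len(path) > 1:
        path = path.rstrip("/")
    kept = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k.lower() not in IGNORED_PARAMS
    )
    query = urlencode(kept)
    return host + path + ("?" + query if query else "")


def group_duplicate_paths(urls):
    """Return {normalized: [urls]} for groups with more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    sample = [
        "http://www.domainname.com/printers/index.html",
        "http://domainname.com/printers/",
        "http://domainname.com/printers?sessionid=abc123",
        "http://domainname.com/scanners",
    ]
    print(group_duplicate_paths(sample))
```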
Check for duplicate content
If you get multiple results for a search on unique text, examine the files to see if they are indeed duplicates or very similar.
If the pages are duplicates or very similar and are located on multiple websites, this may be a case of several websites using the same source content (very common with e-commerce sites that use content provided by manufacturers, etc.).
It’s best not to use the same content that is used on other websites (unless you include “enough” unique content on the page too) or you’ll be competing with all the other sites for rankings. Occasionally, you may find content being illegally used on other sites (this is how we learned there was a complete duplicate of our own website being hosted in Asia).
Very similar content may get indexed in some dynamic web sites. For example, products in e-commerce sites may get indexed multiple times with different colors or sizes. It’s best to have very similar content indexed only once.
It may not be easy to fix some of these duplicate path and duplicate content issues without redesigning the site. However, you may be able to reduce the impact these issues can have on rankings and crawl rates by using the Canonical Link Element or trying the parameter handling in Google’s Webmaster Tools (Settings /Parameter handling) to ignore some of the parameters being used.
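To see whether the Canonical Link Element is already in place, you can read it out of the page HTML: it is a link tag in the page head with rel="canonical" and an href giving the preferred URL. The sketch below assumes you already have the HTML of each variant in hand; the class name, function name, and product URLs are invented for illustration.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Record the href of the first <link rel="canonical"> on a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_of(html):
    """Return the canonical URL declared in `html`, or None if absent."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


# Hypothetical color variants of one product page.
BLUE = '<head><link rel="canonical" href="http://domainname.com/shirts/oxford"></head>'
RED = '<head><link rel="canonical" href="http://domainname.com/shirts/oxford"></head>'
NO_TAG = "<head><title>Oxford shirt</title></head>"

if __name__ == "__main__":
    print(canonical_of(BLUE) == canonical_of(RED))  # variants agree
    print(canonical_of(NO_TAG))                     # no canonical declared
```

Variants that declare the same canonical URL are asking to be treated as one page; a variant that returns None has no canonical declared at all.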
Check Google’s Webmaster Tools
Use Google’s Webmaster Tools (google.com/webmasters/tools/home) to help flush out issues.
Duplicate Page Titles & Description Tags
In the Diagnostics section, click on “HTML Suggestions” and look for duplicates. It’s best to have unique HTML Page Titles and description Meta tags for every webpage. Update any duplicate page titles and description tags that can be manually updated.
However, duplicate titles and description tags may be reported because there are duplicate paths to files, so check the files to see if this is the issue. Again, there should be only one path to a file.
Check the Robots.txt file
Google displays the robots.txt file if there is one (Site Configuration: Crawler Access). Scan the robots.txt file to see if there are sections of the site being blocked that shouldn’t be. You can also test a sample of important URLs from the site to see if access has been blocked.
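The same kind of URL test can be run offline with Python's standard urllib.robotparser module. In this sketch the robots.txt rules and URLs are made-up examples, and the result reflects how Python's parser reads the file, which may differ in details from how a particular search engine interprets it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /products/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    important = [
        "http://domainname.com/products/hp-printer.html",
        "http://domainname.com/about.html",
        "http://domainname.com/admin/login",
    ]
    for url in blocked_urls(ROBOTS_TXT, important):
        print("blocked:", url)
```

In this example the /products/ rule blocks a product page that almost certainly should be indexed, which is exactly the kind of mistake this check is meant to catch.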