Thursday 3 March 2016

Why aren’t all my XML sitemap pages indexed in Webmaster Tools?


In the last few days I’ve encountered a surprising number of clients and even SEOs who don’t fully understand XML sitemaps, so I’m here to clear up some things.

Let’s say you half read a blog post somewhere that said “if your site doesn’t have an XML sitemap, your site will never be indexed and you will be poor, miserable and die lonely”.  So, you got your developer or SEO to make an XML sitemap for your website, or maybe you did it yourself with a free tool (because you’re cheap).  All giddy and excited, you submit your sitemap through Google Webmaster Tools and wait for the magical day for Google to crawl it.  Like Xmas morning, you creep down the stairs, log into GWT and start to cry because you see a report that looks like this:





Google Webmaster Tools Sitemap Report

“Only 262 pages indexed!” you scream.  “Why does Googlez hate me?  Imma fire my SEO and kick a baby!”

In a fevered response, you (or your SEO) go line by line through your sitemap.xml file to make sure there are no broken links or malformed URLs (good for you!), but you can’t find anything wrong. So instead, you resign yourself to being poor, miserable and dying lonely.

Well... here’s something you may not have considered...

All URLs in a sitemap.xml file must return a 200 OK response

I find myself constantly amused by the number of XML sitemaps I come across that have URLs that either 404 or redirect with a 301 or 302. What’s even more amusing is when I find URLs that have been disallowed via robots.txt.
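If you would rather not eyeball every URL by hand, the status check can be scripted. What follows is only a rough sketch using the Python standard library, not a polished tool: the helper names (`extract_urls`, `classify`, `fetch_status`) are made up for this example, and it assumes your sitemap is a plain `urlset` file in the sitemaps.org 0.9 format rather than a sitemap index.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def classify(status):
    """Map an HTTP status code to the buckets discussed in this post."""
    if status == 200:
        return "ok"
    if status in (301, 302):
        return "redirect"
    if status == 404:
        return "not found"
    return "other"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from silently following 301/302 responses."""

    def redirect_request(self, req, fp, code, msg, headers):
        return None


def fetch_status(url, timeout=10):
    """Request a URL without following redirects and return its status code."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses all arrive here
```

Running `classify(fetch_status(url))` over everything `extract_urls` returns should surface the entries that are not a clean 200. Some servers answer HEAD requests differently from GET, so treat a surprising result as a prompt to check the URL in a browser rather than as the final word.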

So, to help you all understand why the URLs in your XML sitemap may not be indexing fully, I’ve made some easy-to-follow pictures!  Why?  Because I know how much you hate reading.

Here are the reasons pages don’t get indexed:

URLs in XML Sitemap returning 404 Not Found responses

URLs in XML Sitemap returning 301 or 302 redirect responses

URLs in XML Sitemap disallowed via robots.txt

URLs in XML Sitemap returning 200 OK responses (these are the ones Google can index)

URLs with duplicate content

Many different versions of your domain name (for example, with and without www)
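Two of the items in the list above, robots.txt blocks and mixed versions of your domain name, can be checked without making a single request. Here is a minimal sketch, again in standard-library Python with function names invented for the example. It assumes you have the robots.txt text and the sitemap URLs in hand already, and it assumes `Googlebot` is the user agent you care about. Python’s robots.txt parser may not match Google’s own in every case (wildcard rules, for instance), so read the output as a hint.

```python
from collections import Counter
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def disallowed_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


def hosts_in_sitemap(urls):
    """Count URLs per hostname to expose mixed www / non-www entries."""
    return Counter(urlsplit(url).netloc.lower() for url in urls)
```

If `hosts_in_sitemap` comes back with more than one hostname, the sitemap is mixing versions of the domain, and anything `disallowed_urls` returns is a URL you are asking Google to index while telling it not to crawl.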


Now, before you start looking… no, this site doesn’t have an XML sitemap file. Why? Because they’re not necessary! An XML sitemap is only a tool to help crawlers discover pages they might not normally find, usually because you have a crappy, unspiderable JavaScript menu that plays a Megadeth song every time you hover over it with your mouse, because your usability expert told you that was the future of the web.

