

Valid HTML is bad SEO

I’ve been amazed at how many SEO firms I’ve seen recently touting “W3C validated HTML” as one of their core SEO recommendations – a fundamental misconception I thought had disappeared years ago.

It’s an easy mistake to make, but linking validation directly with SEO is plain wrong. It is bad advice for SEOs to be giving to their clients, and it gives our industry a bad name. Why is it bad advice? Consider the cost to an enterprise client such as Amazon (1,451 validation errors on its home page) of implementing W3C validated HTML throughout its website, and then consider the following:

Search engines index the majority of their content by parsing HTML files, so there is a link between parseable code and efficient indexing (and therefore ranking). However, parseable HTML is not the same as W3C validated HTML. This is an important distinction, illustrated in the examples below:

<br> <-- Not valid in XHTML 1.0 Transitional, easily parseable

<a href="http://www.example.com/" Link to example.com></a> <-- Not valid HTML, easily parseable, will not pass any anchor text, you won’t see it in your browser

<META name="keywords" content="useless tag" /> <-- Fully parseable, invalid HTML 4.01, invalid XHTML 1.0

<p <a href={example.com}> <h1 Cheap flights</p> <-- Completely unparseable, completely invalid, you won’t even see the text in your browser.

From the examples above, a general rule of thumb might be: if you can see the content in a text browser (such as Lynx), search engines can more than likely parse it, regardless of the HTML’s W3C validity.
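To make the parseable-versus-valid distinction concrete, here is a minimal sketch using Python’s standard `html.parser`. It is only a stand-in for a lenient parser: the class and function names are made up for illustration, and nothing here claims to reflect how any search engine actually parses pages.

```python
from html.parser import HTMLParser


class TextAndLinkCollector(HTMLParser):
    """Collect visible text and link targets, ignoring validity entirely."""

    def __init__(self):
        super().__init__()
        self.text = []   # non-empty text chunks in document order
        self.links = []  # href values of every <a> start tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        chunk = data.strip()
        if chunk:
            self.text.append(chunk)


def extract(markup):
    """Return (text_chunks, links) found in a fragment of markup."""
    parser = TextAndLinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.text, parser.links


# Uppercase tag, unclosed <p>, bare <br>: invalid as XHTML 1.0, yet readable.
print(extract('<META name="keywords" content="useless tag" />'
              '<p>Cheap flights<br>London to New York'))

# The malformed anchor from the examples: the link is found, no anchor text is.
print(extract('<a href="http://www.example.com/" Link to example.com></a>'))
```

The first fragment would fail validation but still yields its text; the second yields a link target and no text at all, which matches the "will not pass any anchor text" note above.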

Another reason why W3C validated HTML is not an SEO recommendation is shown in the code example below:

<p>Cheap Flights</p>
<h1>We fly to destinations across the globe from London to New York, and offer the best service from check-in to your destination</h1>

Now that’s clearly valid HTML, but what SEO would say that’s a good, optimised snippet of HTML? The keyword phrase sits in a paragraph while a long sentence fills the heading. The W3C validator tool cannot check for semantic validity, which is far more important for SEO.
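A check like this is outside what a validator does, but it is easy to sketch. The snippet below flags any `<h1>` that runs longer than a word limit. The limit of 10 words and the names `HeadingAudit` and `long_headings` are assumptions picked purely for illustration, not a rule any search engine publishes.

```python
from html.parser import HTMLParser


class HeadingAudit(HTMLParser):
    """Record the text inside each <h1> so its length can be inspected."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.headings[-1] += data


MAX_H1_WORDS = 10  # hypothetical threshold, chosen only for illustration


def long_headings(markup, limit=MAX_H1_WORDS):
    """Return (heading, word_count) for every <h1> longer than `limit` words."""
    audit = HeadingAudit()
    audit.feed(markup)
    audit.close()
    counted = [(h.strip(), len(h.split())) for h in audit.headings]
    return [(text, words) for text, words in counted if words > limit]


snippet = (
    "<p>Cheap Flights</p>\n"
    "<h1>We fly to destinations across the globe from London to New York, "
    "and offer the best service from check-in to your destination</h1>"
)
for text, words in long_headings(snippet):
    print(f"h1 has {words} words: {text[:40]}...")
```

The valid snippet above gets flagged, while swapping the two elements so that "Cheap Flights" is the heading would pass this particular heuristic.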

I knocked up a script to prove this to those who still aren’t convinced. The charts below show the number of HTML errors reported by the W3C validator on the y-axis, with the Google natural positions along the x-axis. I’d say that’s a pretty random spread of errors.

Chart: W3C error spread for cheap mortgages

Chart: W3C error spread for cheap flights

Chart: W3C error spread for car insurance

Chart: W3C error spread for credit cards
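One way to put a number on "random spread" is a rank correlation between position and error count: a value near zero suggests no monotonic relationship. This is a sketch of that idea only, not the script behind the charts. The Spearman approach is swapped in as an assumption, and the sample numbers are invented placeholders, not the data plotted above.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (var_x * var_y) ** 0.5


# Placeholder values for illustration only; NOT the data behind the charts.
positions = list(range(1, 11))
error_counts = [12, 140, 3, 57, 0, 88, 21, 9, 230, 45]
print(f"rank correlation: {spearman(positions, error_counts):.2f}")
```

If validity mattered for ranking, the error counts would be expected to climb steadily with position and the coefficient would sit close to 1.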

With all this in mind, I think it’s pretty clear that in our Amazon example, the ROI they would receive for changing their HTML templates would be very far into the red. I’d say this permeates right down to mom & pop websites in terms of likely return on investment. Hence as SEOs, giving our clients advice that actually costs them money gives our industry a bad name, and negates the great return that good SEO can bring.

Let’s work together to remove this idea from our pitches and audits and make SEO better :)


Posted in seo.

One Response


  1. I believe that using valid HTML code is just another SEO myth, like many other SEO myths, such as keyword density.
