Monthly Archives: June 2009

Valid HTML is bad SEO

I’ve been amazed at how many SEO firms I’ve seen recently touting “W3C validated HTML” as one of their core SEO recommendations – a fundamental misconception I thought had disappeared years ago.

It’s an easy mistake to make, but linking it directly with SEO is plain wrong, bad advice for SEOs to be giving to their clients, and gives our industry a bad name. Why is it bad advice? Consider the cost to an enterprise client such as Amazon (1,451 home page errors) of implementing W3C validated HTML throughout their website, and then consider the following:

Search engines index the majority of their content by parsing HTML files, so there is a link between parseable code and efficient indexing (and therefore ranking). However, parseable HTML is not the same as W3C validated HTML. This is an important distinction, illustrated in the examples below:

<br> <– Not valid in xHTML 1.0 Transitional, easily parseable

<a href="http://www.example.com/" Link to example.com></a> <– Not valid HTML, easily parseable, will not pass any anchor text, you won’t see it in your browser

<META name="keywords" content="useless tag" /> <– Fully parseable, invalid HTML 4.01, invalid xHTML 1.0

<p <a href={example.com}> <h1 Cheap flights</p> <– Completely unparseable, completely invalid, you won’t even see the text in your browser.

From the examples above, a general rule of thumb might be: if you can see the content in a text browser (such as Lynx), search engines can more than likely parse it, regardless of whether the HTML passes W3C validation.
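To make the parseable-versus-valid distinction concrete, here is a minimal Python sketch. It uses the standard library's lenient html.parser as a stand-in for a tolerant crawler; this is an assumption for illustration only, not a description of how any search engine actually parses pages. The class and function names (TextAndLinks, extract) are made up for this example.

```python
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Collects visible text and link targets, tolerating invalid markup."""

    def __init__(self):
        super().__init__()
        self.text = []   # non-empty text chunks, in document order
        self.links = []  # href values of <a> tags

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; valueless attributes get None
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def extract(html):
    """Return (text chunks, link targets) found in an HTML fragment."""
    parser = TextAndLinks()
    parser.feed(html)
    parser.close()
    return parser.text, parser.links
```

Fed the second example above (the anchor with its link text stranded inside the tag), a lenient parser like this still picks up the href but finds no anchor text at all, which is the SEO problem; whether the markup validates never enters into it.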

Another reason why W3C validated HTML is not an SEO recommendation is shown in the code example below:

<p>Cheap Flights</p>
<h1>We fly to destinations across the globe from London to New York, and offer the best service from check-in to your destination</h1>

Now that’s clearly valid HTML, but what SEO would say that’s a good, optimised snippet of HTML? The W3C validator tool cannot check for semantic validity, which is far more important for SEO.

I knocked up a script to prove this to those who still aren’t convinced. The charts below show the number of HTML errors in the W3C validator on the y-axis, with the Google natural positions along the x-axis.

Are PPC ads really affecting organic rankings?

A post on SearchEngineWatch yesterday asked the question “Are PPC Ads Now Counting in Google Organic Backlinks?” I was tempted to reject the question out of hand as tosh (it would be incredibly damaging for Google if it were really true), but after reading through the post I couldn’t discount it entirely without some research of my own.

This issue is more interesting than I initially thought. Basically, the article says that for one of the author’s client websites, a number of backlinks were showing up in Webmaster Tools that appeared to come from an AdWords/YSM campaign. Not only that, but the website ranks #1 in Google for a phrase that is only used in the PPC campaign and does not appear in the content of the ranking page:

Google cache

So after a bit of digging in Yahoo (search for ["ga nc tn & tx land sale" bluegreen] in Yahoo, and see screengrab below), what appears to be happening is that these particular ads have been syndicated across a bunch of the spammy pseudo search engines you see so many of these days.

yahoo cache

In this low-quality syndication network, the JavaScript code they’re using in these sponsored links is shown below:

<a href="[[adurl]]" ONMOUSEOVER="window.status='http://www.BluegreenCommunities.com';return true;">anchor text</a>

I counted only two 302 redirects on the ad URL redirection chain before the spider hits a robots.txt block. This means Google may have given up following these redirects before reaching the robots.txt file, hit the robots.txt file and decided to try another route, or treated the JavaScript element as a different link entirely.

Now the clue that Google is following the JavaScript code element in this case is that it is the domain, and not the ad destination URL, that gets the anchor text credit. That would account for the homepage, rather than the advertising URL, ranking #1 for the query in question.

This is rather odd, because “window.status” is not a JavaScript link function, and it implies Google’s JavaScript crawling isn’t as advanced as they want us to think. Looks to me like they’re just running a regex for anything that looks like a URL inside an onmouseover/onclick/etc. handler – this requires testing, but if that’s it, surely that’s pretty primitive?
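To show how little machinery that kind of guess would need, here is a rough Python sketch of the hypothesised approach: find inline event-handler attributes, then pull out anything inside them that looks like a URL. This is purely an illustration of the speculation above, not Google’s actual implementation; the patterns and the function name urls_in_handlers are assumptions made up for this example.

```python
import re

# Inline event-handler attributes such as onmouseover="..." or onclick='...'
HANDLER_RE = re.compile(
    r"""\bon[a-z]+\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)

# Anything that looks like an absolute http(s) URL
URL_RE = re.compile(r"""https?://[^\s'"<>;)]+""", re.IGNORECASE)


def urls_in_handlers(html):
    """Return URL-like strings found inside inline event-handler attributes."""
    found = []
    for match in HANDLER_RE.finditer(html):
        handler_body = match.group(2)  # the JavaScript between the quotes
        found.extend(URL_RE.findall(handler_body))
    return found
```

Run against the sponsored-link snippet above, this returns the domain from window.status and ignores the href entirely, which matches the observed behaviour of the domain, rather than the ad URL, picking up the anchor text credit.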

I think it’s fairly clear this is a screw-up in Google’s JavaScript crawl implementation, so it will likely be fixed in (very) short order. In the meantime, I’m thinking of setting up an AdWords campaign or two…

How Google handles alt text in H1 tags

It’s well accepted amongst SEOs that alt text in images doesn’t carry as much weight as plain text. However, if a website’s CMS automatically uses images as headings to preserve a design, is there any SEO return on the cost of implementing sIFR or image replacement techniques?

I ran a quick experiment to check this, along with an inference taken from page 16 of Google’s SEO Starter Guide:

“the alt text for that image will be treated similarly to the anchor text of a text link”

The inference here is that alt text in images does not carry as much ranking weight as plain textual content. As mentioned above, this is a generally accepted principle, but I wanted to know if this applied equally to heading tags as everything else.

I linked to 2 almost identical pages, with the heading code differing as follows:

  • Text version (8lks5c3z.htm): <h1>7i1bn504</h1>
  • Image version (04x7oszi.htm): <h1><img src="04x7oszi.jpg" alt="7i1bn504" /></h1>

The link to the page with the image-based H1 tag was placed above the link to the text-based version, to adjust for any bias in HTML order.
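For anyone wanting to repeat the test, a setup like this can be generated with a few lines of Python. This is only a sketch of one way to do it, not the script used for the test above: the helper names (random_token, build_test_pages) are hypothetical, and the page bodies are reduced to the heading alone.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def random_token(length=8):
    """Return a random lowercase alphanumeric string, e.g. a nonsense keyword."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_test_pages(keyword):
    """Build two page bodies that differ only in how the H1 carries the keyword."""
    text_name = random_token()
    image_name = random_token()
    while image_name == text_name:  # keep the two file names distinct
        image_name = random_token()
    return {
        f"{text_name}.htm": f"<h1>{keyword}</h1>",
        f"{image_name}.htm": f'<h1><img src="{image_name}.jpg" alt="{keyword}" /></h1>',
    }
```

Using a nonsense keyword from random_token() means no other page competes for the term, so any difference in indexing or ranking between the two pages can be put down to the heading markup.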

As expected, the results showed a clear bias towards the textual version. Google indexed both pages but put the image-based version in the supplemental results. Yahoo, Ask and Lycos only indexed the textual version (and Bing did not index either version). The images below outline the results:

First result in Google is text version

Both results appear with duplicate filter off

Yahoo only indexes textual version

Ask only indexes textual version

Lycos... ah well, who cares ;)
