Simple meta data bot in Perl

I recently needed to grab the page titles and meta descriptions for a bunch of specific URLs, so I knocked up a quick Perl script to do the hard work for me. Just paste the URLs you need into a file called ‘urls.txt’ in the same folder as the script below, then run the script from the command line:
Continued…
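
The full script is behind the link above. As a rough sketch of the general idea only (not the original script), the core step could look something like the following. The helper names `extract_meta` and `report_urls` are made up for illustration, the regexes assume the `name` attribute comes before `content` in the meta tag, and fetching is done with the core HTTP::Tiny module, which may differ from what the original used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: pull the <title> text and the meta description
# out of an HTML string. Regexes are a shortcut here; a real HTML
# parser would be more robust.
sub extract_meta {
    my ($html) = @_;

    my ($title) = $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is;

    # Assumes name="description" appears before content="..."
    my $desc;
    if ($html =~ m{<meta\s+name=["']description["']\s+content=(["'])(.*?)\1}is) {
        $desc = $2;
    }

    return ($title, $desc);
}

# Hypothetical driver: read one URL per line and print a tab-separated row.
sub report_urls {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my $http = HTTP::Tiny->new(timeout => 10);

    while (my $url = <$fh>) {
        chomp $url;
        next unless length $url;

        my $res = $http->get($url);
        unless ($res->{success}) {
            warn "Failed to fetch $url: $res->{status}\n";
            next;
        }

        my ($title, $desc) = extract_meta($res->{content});
        print join("\t", $url, $title // '', $desc // ''), "\n";
    }
    close $fh;
}

# Only runs when urls.txt is present in the current folder
report_urls('urls.txt') if -e 'urls.txt';
```

Keeping the extraction in its own function means it can be tried on a saved HTML file without touching the network.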

Posted in perl, seo.

Google: Sesame Street more important than Berlin Wall

Judging by Google’s doodles today, Sesame Street’s 40th Anniversary is more important than the 20th Anniversary of the fall of the Berlin Wall – at least in the UK (Google Germany seems to have its priorities right). Hmmm.

Google UK:

google-uk

Google Germany (better logo anyway IMO):

google-de



Posted in google.

Cervical Cancer Jab & SEO

daily-hate

Malcolm Coles is leading an excellent campaign to remove the Daily Hate’s misleading articles from the top of Google for a very important search term, cervical cancer jab:

jab-misinformation2

So please pass it on and add a link to the NHS for terms such as cervical cancer immunisation to your blog or website.

It’s worth noting that the Mail’s misinformation is directly wasting taxpayers’ money by forcing the NHS to buy Google AdWords to put across the facts.

* Apologies to the DailyHateMyself blog for nicking their logo – it’s for a good cause ;)

Posted in seo.

Terrible SEOmoz advice: forget about users

Rand posted some link bait on SEOmoz about how focusing on users as an SEO is a bad idea. Entitled “Terrible SEO Advice: Focus on Users, Not Engines,” the post is a one-sided rant against so-called SEOs who argue:

tactics which are engine-focused … can be ignored

I could be wrong, but surely no such people exist who call themselves SEOs? Jason Calacanis may argue this, but then again, he’s not an SEO, is he?

The post now includes an update saying “users should absolutely be the focus of your efforts.” Yet the title of the post still boils down to “don’t focus on users”, which doesn’t really add up as far as I can see. The influence SEOmoz now holds over the beginner-to-intermediate market for SEO information gives them a de facto responsibility not to spread misunderstanding and misinformation through irresponsibly titled and poorly argued blog posts such as this one, which present opinion as fact.

Forgetting the largely pointless and opinion-based graphs in this post, let’s examine the bullet points in detail (Danny also does a great job of this). Rand argues that the following wouldn’t exist without SEO directed ‘solely’ towards search engines:
Continued…

Posted in seo.

Google top searches export data

One of the best features in Google Webmaster Tools is the Top Search Queries data, which shows what search queries your website appears for, and which result in clickthroughs. In the web interface this is easy to use and provides a great overview. However, it is rather frustrating that the export feature exports this data in a format that is almost impossible to use:

webmaster

As you can see, all of the data in square brackets (columns D & E) is presented in one row, which makes it very difficult to analyse.

I put together a basic Perl script that re-orders this information and splits it into two separate spreadsheets: one for impression data and one for clickthrough data. This generally results in quite large files, but the data is far easier to digest and to manipulate using programs like Excel. Enjoy!

#!/usr/bin/perl
use strict;
use warnings;

# You'll need to change the filename to correspond to your downloaded CSV file
open(my $in, '<', 'TopSearchQueries_xxx.csv') or die "Error $!";

open(my $imp_out, '>', 'wmt-impressions.csv') or die "Error $!";
open(my $ct_out,  '>', 'wmt-clickthrus.csv')  or die "Error $!";

# Write the same six-column header row to both output files
print $imp_out "Month,Locality,Type,Keyword,Percentage,Position\n";
print $ct_out  "Month,Locality,Type,Keyword,Percentage,Position\n";

while (my $line = <$in>) {
    # Remove doubled quotes and the comma after "(Virgin Islands" before matching
    $line =~ s!""!!g;
    $line =~ s!"\(Virgin Islands, !"(Virgin Islands !g;

    if ($line =~ m{^([^,]*),        # Month / time period
                   ([^,]+),         # Locality
                   ([^,]+),         # Search type
                   "([^"]+)",       # impressions
                   (?:"([^"]+)")?   # clickthroughs, optional
                   \s*$}xi) {
        my $month = $1;
        my $locality = $2;
        my $type = $3;
        my $impr = $4;
        my $ct = $5;

        # One output row per [keyword,percentage,position] entry
        while ($impr =~ m{\[([^,]+),([^,]+),([^,]+)\] }gi) {
            my $kw = $1;
            my $pc = $2;
            my $pos = $3;
            print $imp_out "$month,$locality,$type,$kw,$pc,$pos\n";
        }

        if (defined($ct)) {
            # Loop (not a single "if") so every clickthrough entry is written
            while ($ct =~ m{\[([^,]+),([^,]+),([^,]+)\] }gi) {
                my $kw = $1;
                my $pc = $2;
                my $pos = $3;
                print $ct_out "$month,$locality,$type,$kw,$pc,$pos\n";
            }
        }
    }
}

close $in;
close $imp_out;
close $ct_out;
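
To see what the bracket-splitting step does in isolation, here is a small sketch that can be run on its own. The helper name `parse_entries` and the sample string are made up for illustration, and the exact layout of the real export (for example, whether every `]` is followed by a space) is an assumption, so this version tolerates optional whitespace after each entry.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one bracketed field into
# [keyword, percentage, position] triples.
sub parse_entries {
    my ($field) = @_;
    my @rows;

    # /g in a while loop walks through every [a,b,c] group in turn
    while ($field =~ m{\[([^,]+),([^,]+),([^,\]]+)\]\s*}g) {
        push @rows, [$1, $2, $3];
    }
    return @rows;
}

# Made-up sample data; the real export may be laid out differently
my $sample = '[perl seo,25%,3] [meta bot,10%,7] ';

for my $row (parse_entries($sample)) {
    print join(',', @$row), "\n";
}
```

Each bracketed group becomes its own row, which is the same reshaping the script above applies to the impressions and clickthrough columns.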

Posted in google, perl, seo.