Archive for The Internet

5ft Shelf

Just discovered an awesome new website. It is a very simple idea, brilliantly done. Basically you are allowed to add 5 feet* of books, movies and music to your shelf. Each item you add has a width associated with it, so you have plenty of space to play around with. Once you have added all your stuff you can receive custom recommendations based on what other users have read/added etc. There are also most-popular lists and mini shelves for specific types of books, and best of all, from a usability perspective it is an absolute pleasure to use.

Take a look: www.5ftshelf.com.

* This is because (as the site will tell you) in 1909 Dr Eliot, then President of Harvard University, claimed a liberal education could be achieved by reading a collection of books that would total no more than 5ft in width. A local publisher challenged him to name them and he responded with what became known as the Harvard Classics.

How to overcome the duplicate content issue with canonical tags

Once upon a time the only way to get your site spidered by search engines was to manually submit it to the Yahoo directory (come on, in 1997 what other search engines were worth using?). Nowadays, search engines are a lot more advanced and don’t require any manual work. This is great, but sometimes search engines can be a little too good at finding things, including things you don’t want found.

Take the following example… I run a website, www.openmeetings.co.uk, which I host on a subdomain of this website. I have changed the name servers of the subdomain so http://subdomain.simonlangley.co.uk/index.php will be shown as http://www.openmeetings.co.uk/index.php as long as you access the site through the latter URL.

Now, whilst this is a cheap way of hosting multiple websites, the one drawback is that there are two versions of every page, each with a different URL. Anyone who knows anything about SEO will tell you that this is bbbbaaaadddddd news. Search engines will penalise pages that appear to be duplicated, probably because duplicated content is not original, will be of less use to users and/or is more likely to be spam related.

Up until now (for the above scenario anyhow), it was almost impossible to prevent search engines spidering the subdomain. One way to prevent spidering was to ensure that under no circumstances were links to the subdomain used. This is fine for static HTML pages, but the second you start using PHP or other server-side scripting, the chances are subdomain links will creep in (even for a short time) and be spidered.

At the start of 2009 a new method of overcoming this problem was developed and is (apparently) supported by all the major search engines. The method uses a ‘canonical tag’ which sits in the head of your document. The canonical tag tells search engines what the actual URL of the page should be, so if (as per the scenario above) a search engine stumbles across a page whose own URL differs from the URL in the canonical tag, the duplicate will be ignored in favour of the canonical version.

Of course there are lots of other reasons why duplicate content can arise (campaign/tracking codes etc), which I won’t go into here, but this method will work for these too.

To implement a canonical tag place the following code in the head of the page:

<link rel="canonical" href="http://www.mysite.co.uk/index.htm"/>

So there you are!
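If your pages are generated by a script rather than typed by hand, the tag can be generated too. Here is a minimal Python sketch (not what this site actually runs) covering both the subdomain scenario and the tracking-code case. The names `CANONICAL_HOST`, `TRACKING_PARAMS`, `canonical_url` and `canonical_tag` are made up for illustration, and the list of tracking parameters is an assumption you would replace with your own.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed for illustration: the host that should "win", and the query
# parameters that only exist for campaign tracking.
CANONICAL_HOST = "www.openmeetings.co.uk"
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def canonical_url(requested_url: str) -> str:
    """Return the preferred URL for a page, whatever URL it was reached by."""
    parts = urlsplit(requested_url)
    # Keep only the query parameters that are not tracking codes.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    # Swap in the canonical host and drop any #fragment.
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path, urlencode(query), "")
    )


def canonical_tag(requested_url: str) -> str:
    """Build the link element to place in the head of the page."""
    href = escape(canonical_url(requested_url), quote=True)
    return f'<link rel="canonical" href="{href}"/>'


print(canonical_tag("http://subdomain.simonlangley.co.uk/index.php?utm_source=mail"))
```

Whichever URL the page is requested through, the same tag comes out, which is the whole point: every duplicate points back at one preferred address.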

I’ve heard a rumour that search engines use the canonical tag 95% of the time (who comes up with these stats is beyond me), so it’s not quite watertight, but certainly better than ye olde days.
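Whatever the real figure is, the tag can only help if it is actually in the head of the page and points at the right URL. A rough way to check a page is sketched below in Python, using the standard library's `html.parser`. `CanonicalFinder` and `find_canonical` are hypothetical names for this example, and the sample page is invented.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every rel="canonical" link found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attributes = dict(attrs)
            # rel can hold several space-separated values.
            rel = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel and attributes.get("href"):
                self.hrefs.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html_text):
    """Return the canonical URLs declared in the head of an HTML document."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hrefs


# Invented sample page for the example.
page = (
    "<html><head><title>Home</title>"
    '<link rel="canonical" href="http://www.mysite.co.uk/index.htm"/>'
    "</head><body><p>Hello</p></body></html>"
)
print(find_canonical(page))
```

An empty list means the tag is missing from the head, and more than one entry means the page is sending mixed signals; either is worth fixing before blaming the search engine.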

The way the web was in 1996

One of my favourite websites is the Wayback Machine - an Internet archive of websites dating back to 1996. It really is amazing how much some sites have changed. Here are some very good examples…

The BBC website in 1997

It seemed to be the fashion to use black backgrounds in 1997!

The Coca Cola website in 1996
I have no idea what this is about…

The McDonalds website in 1996
Ditto

The Pepsi website in 1996
Prepare for an epileptic fit looking at this background. It is mental.