Saturday, June 26, 2010

Redirect status codes (again)

My pedantry over the 303 redirect having been pointed out by my colleagues, I figure ‘in for a penny, in for a pound’. Or 301 to be precise.

The use of a canonical URL for SEO purposes is well known, as search engines are notoriously precise, and will store reputation against the exact form of a URL, including trailing slashes and case sensitivities.

The recommended best practice is to use a redirect to consolidate reputation against the canonical form. 

The important point about the redirect is that you should use a 301 status code to indicate that the redirect is permanent. This is used by the search engine to combine reputation. If you use a 302 status code the user will be redirected, which is good, but the search engine will interpret this as a temporary redirect and will keep the incorrect URL in its index as a valid content URL.

This illustration from the Google SEO Report highlights the problem:

Illustration showing SEO dilution
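For anyone who wants to see the consolidation in code, here is a rough Python sketch. The canonical rules it applies (lowercase host and path, no trailing slash) are my own assumptions for illustration, as are the function names; your site's canonical form may well differ, and lowercasing the path is only safe if your server treats paths case-insensitively.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the assumed canonical form: lowercase host and path, no trailing slash."""
    parts = urlsplit(url)
    # Keep a bare "/" for the site root rather than an empty path.
    path = parts.path.lower().rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def redirect_for(url: str):
    """Return (status, location) when a redirect is needed, otherwise None."""
    target = canonical_url(url)
    if target == url:
        return None  # already canonical, serve the content as normal
    # 301 marks the move as permanent, so reputation is combined on the target.
    return (301, target)


print(redirect_for("http://Example.com/Blog/"))
print(redirect_for("http://example.com/blog"))
```

The only thing that really matters here is the 301: swap it for a 302 and the user ends up in the same place, but the search engine keeps both URLs.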

As ever, the best reference for more information is Google, and I would recommend everyone involved in SEO read Google’s own SEO Report Card – it’s an easy read, and well worth the effort.

Wednesday, June 23, 2010

Exercising freedom of speech, or just squatting?

I’m settling comfortably into middle age so it’s about time for a Colonel Blimp moment. I have the great good fortune to have a bicycle / moped ride to work each day that takes in the Houses of Parliament and a circuit round Parliament Square. Once upon a time (and for quite a long time) there was a chap who camped out there called Brian Haw, and he was a man of principle and much respected.

In recent months however, Parliament Square has begun to look like an urban outpost of Glastonbury, and it’s completely out-of-hand and I’m outraged (albeit not from Tunbridge Wells).

Frankly, I think they’re taking the p*ss, and if I had my way they’d all be moved on each night. I’d also point out that there’s no one in Parliament at night, so they’re not demonstrating to anyone or making any point. They’re just squatting on one of the most expensive pieces of real estate in the world.

I’m all for the expression of free speech, and am more than happy for people to march up and down during the day, but I think that they should be cleared out each night – if they feel so strongly about whatever they are demonstrating against, they should be happy to come back each day. If the price of democracy is a bus ticket in each morning, so be it.

My letter to the Telegraph is in the post.

Update (06/07/2010): it has a name, it’s called Democracy Village – website here - http://democracyvillage.org/. Turns out it’s quite a big deal - Boris is attempting to get rid of them, but they have had a stay of execution for some reason. Also worth noting that Brian Haw is not affiliated to them in any way – as stated on his website - http://www.parliament-square.org.uk/.

Facebook and the Open Graph Protocol (OGP)

A couple of days ago someone asked me whether the old “avoid iframes” mantra was still relevant, and if so why. The specifics of my response may make it into another post, but in summary I suggested that if both sides (host page and iframe target) trust each other, and are working towards a common aim – i.e. they both want the solution to work - then iframes work just fine. I gave the Facebook Social Plugins as my reference for this – the Like button can be implemented as either XFBML or an iframe, and works just fine as either**.

His response (and he may have misquoted his source) was to the effect that he’d been told that that was not the case, and that the Open Graph Protocol didn’t use iframes. I could have let it lie at that, but I trust him, so am happy to lay the blame at the door of his source, in which case some serious explaining is in order.

The Open Graph Protocol is precisely that – a protocol. It is a recommended set of HTML <meta> tag attributes that can be used to give semantic meaning to a web page. There is a healthy OGP group running here - http://groups.google.com/group/open-graph-protocol/topics?pli=1 – where members are discussing possible extensions to the current protocol, but in essence it’s a bunch of machine-readable information that can be used to categorise the contents of a web page. These tags can be used by anyone, either to ‘decorate’ their own website, or to give meaning to web pages that include the tags (if you run a service that involves managing URLs – e.g. link shortening, search engines etc.). Facebook’s own use of the OGP tags is pretty basic at the moment – if you “Like” a page, Facebook does two things:

  • It registers the fact that you Like it, and posts to your stream
  • It retrieves the page itself, and extracts information that is relevant in order to make the Like information more structured.

In a standard web page all Facebook can do is extract the few standard HTML details (page URL, title tag etc.) By encouraging web developers to include the OGP tags they can provide a much richer experience on the Facebook site – they can “know” that when you Like a movie on IMDB that you are talking about a movie, and not some abstract page on the internet. This is the essence of the Semantic Web.
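To make the “it’s just meta tags” point concrete, here is a small Python sketch that pulls the og: properties out of a page using only the standard library. The sample page, its URL and the names OpenGraphParser and extract_open_graph are made up for illustration, and this is certainly not how Facebook does it; it simply shows that any consumer can read the same tags.

```python
from html.parser import HTMLParser

# Hypothetical page decorated with the four basic OGP properties.
SAMPLE_PAGE = """
<html>
<head>
<title>The Rock (1996)</title>
<meta name="description" content="Not an OGP tag, so it is ignored." />
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.example.com/movies/the-rock" />
<meta property="og:image" content="http://www.example.com/images/the-rock.jpg" />
</head>
<body></body>
</html>
"""


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        if name.startswith("og:") and attributes.get("content") is not None:
            self.properties[name] = attributes["content"]


def extract_open_graph(html: str) -> dict:
    """Return a dict of the og: properties found in the given HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


print(extract_open_graph(SAMPLE_PAGE))
```

Note that nothing in there knows or cares how the Like button itself is rendered.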

So, to say that the OGP doesn’t support iframes is a bit like saying a boiled egg can’t drive a car. It’s meaningless.

** It turns out that XFBML uses iframes anyway, so you can’t get away from them. The fb:like element is converted into an iframe by the FB JavaScript. Easiest way to view this is to use Firebug, which will show you the actual DOM post-manipulation.

Thursday, June 10, 2010

HTTP status codes and the PRG pattern

Post-Redirect-Get (PRG) is a popular pattern in websites seeking to prevent users from reposting data by accident (e.g. refreshing the checkout page and then being charged again).

Just to refresh, the process is as follows:

  1. User submits data which is POSTed to the server
  2. Server processes form data, and issues a REDIRECT
  3. Browser receives the REDIRECT and then GETs the new page

Because the final step is a GET (and not a POST), refreshing the page has no effect (assuming you’re not posting data via a GET, in which case you should have your internet connection revoked).

What you may not know is that there is a standard HTTP response code for this operation – and it’s not 302, it’s 303. It’s a tiny detail, but an important one. If you’re using PRG as a pattern, make sure that whatever server tech you’re using issues the correct status code – it’s the little things that make all the difference. Just ask Apple.

(The ASP.NET MVC Controller.Redirect method uses 302, natch.)
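As a rough illustration of the three steps, here is a framework-free Python sketch. The handle_request function, the ORDERS list and the /checkout and /orders/ paths are all hypothetical; a real application would do this inside whatever web framework it uses, but the status code is the part that matters.

```python
ORDERS = []  # in-memory stand-in for a real order store (assumption)


def handle_request(method, path, form=None):
    """Return (status, headers, body) for a tiny checkout flow."""
    if method == "POST" and path == "/checkout":
        # Steps 1 and 2: process the posted form, then redirect.
        ORDERS.append((form or {}).get("item", "unknown"))
        order_id = len(ORDERS)
        # 303 See Other tells the browser to fetch the result with a GET.
        return ("303 See Other", {"Location": f"/orders/{order_id}"}, "")

    if method == "GET" and path.startswith("/orders/"):
        # Step 3: a plain GET, which is safe to refresh.
        ref = path.rsplit("/", 1)[-1]
        if ref.isdigit() and 1 <= int(ref) <= len(ORDERS):
            return ("200 OK", {}, f"Order {ref}: {ORDERS[int(ref) - 1]}")

    return ("404 Not Found", {}, "")


status, headers, _ = handle_request("POST", "/checkout", {"item": "book"})
print(status, headers["Location"])
print(handle_request("GET", headers["Location"]))
print(handle_request("GET", headers["Location"]))  # a refresh: still one order
```

Refreshing the confirmation page only repeats the GET, so the order list never grows by accident.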

http://en.wikipedia.org/wiki/HTTP_303

http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

Wednesday, June 02, 2010

South London Geek Night – 14th July 2010

A bit of self-publicity – I’m speaking at the second “South London Geek Night” – no, I’m not telling my family – on 14th July, at the Bedford pub, in Balham.

Topic is NoSQL – it’s only 15 minutes, and the audience is mixed (the lawyer from my previous project spoke at the last one on IPR), so it won’t be overly technical – more of an overview.

If anyone has anything they particularly want to hear about, just drop me a line (or comment here).

Details here – http://southlondon.geeknights.net/

BBC homepage issues – is CouchDB the culprit?

Some of you may have noticed that the BBC Homepage now has a small notice in it:

BBC homepage screenshot

I know that the BBC homepage is one of the key case studies in production use of CouchDB, and the message here (in the comments) suggests that load is the problem:

“As I described in an earlier blog post, the new BBC homepage has been built on a whole new technical architecture. Since launching we’ve found an issue with the service we use to save users’ customisation settings. Although we ran a public beta for more than 2 months, this problem only became apparent when we moved the whole audience across to the new site, increasing the load on the platform 20 times. Despite thorough load testing before launch we were unable to accurately predict the type and combination of customisations that users would perform, and as a result we now need to re-architect the way we save your homepage customisation settings in a more efficient way.”

Let’s hope it’s solved, and soon.

NoSQL database as a mock data provider

We’ve been talking through some of the possible areas in which a document database (MongoDB, CouchDB, RavenDB) might be of use to us, and a new one came up today. The dev team were discussing database schema issues, and it seemed to me that we could bypass the entire conversation by using a schema-less database, not for production, but for testing / development.

Whilst the guys spend time refining the domain model, adding / removing attributes, changing relationships etc., the easiest thing to do data-wise is simply ignore it. Use a document database as a very simple half-way house between static mock data providers and a final-cut RDBMS (if that’s the solution). It’s also a great way to get the ops team (LOL) comfortable with using a database technology with which they are unfamiliar. No more excuses.

It’s also a neat way to remove developers from the database creation / optimisation process, should you so wish. They can ignore schema issues, and hand the problem to a database specialist. Or am I dreaming?

In addition, the ease of use means that a single dev server could be used to support multiple developers and test scenarios. Each developer could have their own database for their own abuse, and they could at the same time plug into a single, shared database for more strictly controlled test data. It would also allow testers to prime test databases with data using simpler tools than raw SQL. Everyone is a winner.
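To show what “simply ignore the schema” looks like in practice, here is a toy Python sketch. It is an in-memory stand-in that I have made up for illustration (the InMemoryDocumentStore class and its insert / find methods are hypothetical, not the API of MongoDB, CouchDB or RavenDB), but it captures the idea: documents in the same collection can carry different attributes while the domain model is still moving.

```python
import copy
import itertools


class InMemoryDocumentStore:
    """Toy schema-less store: named collections holding plain dicts."""

    def __init__(self):
        self._collections = {}
        self._ids = itertools.count(1)

    def insert(self, collection, document):
        """Store a copy of the document and return its generated _id."""
        stored = copy.deepcopy(document)
        stored["_id"] = next(self._ids)
        self._collections.setdefault(collection, []).append(stored)
        return stored["_id"]

    def find(self, collection, **criteria):
        """Return copies of documents whose fields equal all the criteria."""
        return [
            copy.deepcopy(doc)
            for doc in self._collections.get(collection, [])
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


store = InMemoryDocumentStore()
# Two "customers" with different shapes: no schema change required.
store.insert("customers", {"name": "Alice", "email": "alice@example.com"})
store.insert("customers", {"name": "Bob", "loyalty_tier": "gold",
                           "addresses": [{"city": "London"}]})

print(store.find("customers", loyalty_tier="gold"))
```

A tester could prime one of these per scenario with a handful of dictionaries, which is a lot friendlier than a folder of SQL scripts.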

I think this probably counts as an update to this post from 2005 - http://hugorodgerbrown.blogspot.com/2005/04/unit-testing-and-data-access.html , when I’d just got the hang of the provider model.