Saturday, March 27, 2010

SEO 101 (pt 3) – the search results page

  • Click here for part 1 – how search works
  • Click here for part 2 – anatomy of an HTML page

OK – so you’ve managed your SEO brilliantly, and your website appears on the first page of all the major search engines (that would be Google) for all of the keywords you’re monitoring (you are monitoring keywords, aren’t you?).

Unfortunately the job isn’t yet complete – good SEO may get you onto the results page, but the final decision isn’t up to Google, it’s up to the user, who has to decide which of the results most closely matches their query. Fortunately, there are several ways in which you can manage your appearance in the list, and it’s really not that hard.

Page title – it’s the first thing people see in the list of results – make sure it includes your name, and something about the site. Google only shows the first 60 characters or so, so make it pithy.

Description – it may not be used in the ranking, but it will appear on the search results page, so again, make it count, and try and imagine how it will read to someone who doesn’t know about you already – an overly-clever marketing strap-line may look good on the homepage, but may be misunderstood when taken out of context.

Site links – you can’t control site links, they are auto-generated, but you can remove specific links if you wish, using Google’s Webmaster tools. If you see something you don’t like, get rid of it.

URLs – search engines are very specific with regard to URLs. At a technical level, developers and network experts often do clever things to make sure that people are always directed to the correct page, but this may actually harm your ranking, as the search engine may split your “ranking” across the URLs (e.g. www.abc.com/myproduct and myproduct.abc.com may resolve to the same logical page, but to a search engine they are different pages). In terms of SEO, the recommendation is to consolidate URLs using the standard HTTP response code 301 to redirect all traffic to a single URL. (As a side note on this one, you should make sure you are using the analytics to understand where people are coming from if you are getting a lot of traffic on an unwanted URL. Affiliate sites are notorious for this.)
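
All of this is easy enough to sanity-check yourself. Below is a rough, illustrative sketch (not a tool from this post – the URLs are hypothetical) that fetches a page, reports the title length and meta description, and confirms that alternative addresses answer with a 301 pointing at the canonical URL:

```python
# Illustrative only: check the bits of a page that control how it appears in
# the results list (title length, meta description), and confirm that the
# alternative URLs 301-redirect to the single canonical address.
import http.client
from html.parser import HTMLParser
from urllib.parse import urlsplit
from urllib.request import urlopen

CANONICAL = "http://www.abc.com/myproduct"   # hypothetical canonical URL
VARIANTS = ["http://myproduct.abc.com/"]     # same logical page, different address


class SnippetParser(HTMLParser):
    """Pulls out the <title> text and the meta description."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("name", "").lower() == "description":
            self.description = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def check_snippet(url):
    parser = SnippetParser()
    parser.feed(urlopen(url).read().decode("utf-8", errors="ignore"))
    title = parser.title.strip()
    print("Title (%d chars): %s" % (len(title), title))
    if len(title) > 60:
        print("  warning: likely to be truncated on the results page")
    print("Description: %s" % (parser.description or "<missing>"))


def check_redirects(canonical, variants):
    for url in variants:
        parts = urlsplit(url)
        conn = http.client.HTTPConnection(parts.netloc)
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        location = resp.getheader("Location", "")
        ok = resp.status == 301 and location.rstrip("/") == canonical.rstrip("/")
        print("%s -> %s %s (%s)" % (url, resp.status, location, "ok" if ok else "check this"))
        conn.close()


if __name__ == "__main__":
    check_snippet(CANONICAL)
    check_redirects(CANONICAL, VARIANTS)
```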

The best way to achieve all of this is to start at the end and work backwards. What do you want your site to look like when it appears in the search results list? Look at your competitors and see what they do – use a bit of cut-and-paste magic to fake a screenshot that has all of your competitors on the same page, and then print it out, pin it up and make up your own mind – would you choose your own site?

SEO 101 (pt 2) – anatomy of an HTML page

  • Click here for part 1 – how search works

SEO isn’t only about the structure of your HTML pages – site structure, URL composition, HTTP response codes and PageRank** all count too – but the heart of any search engine is the indexing of HTML content.

It’s all too easy these days to auto-generate a boilerplate web page and then extend it with content, but ignoring the structure will come back to bite you if SEO is important to your business.

Below is a (very) simple guide to the basics – and if you need to go deeper than this, check out Google’s own documentation.

Sample HTML page, illustrating SEO elements

In summary: every part of the page is important, and even if we don’t know exactly how Google treats the contents, we do know that they reward those who pay attention to detail.

  • Click here for part 3 – the search results page

** There is some debate on the web as to how significant this is these days, something which Google themselves seem to endorse - http://www.google.com/support/forum/p/Webmasters/thread?tid=6a1d6250e26e9e48&hl=en

SEO 101 (pt 1) – how search works

(NB: If you want to know about SEO in more detail, go and visit Glyn’s blog.)

So how does a search engine work? It’s very complicated in reality – which is why Google employs such clever people – but the principles are pretty straightforward.

Crawling – the first thing a search engine needs to do is know about all the pages it needs to search through. This involves lots (and lots, and lots) of small programs (“spiders”) scraping their way through the entire content of the internet – they follow every link, and scuttle back to base with the contents (HTML) of every page. When a new page is found, links within that page are added to the backlog of pages to crawl, and the spiders just keep on doing their thing until the job is done. Which is never. Think you’ve got it bad at work?
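
As a rough illustration of that crawl loop (a toy, nothing like a real spider – the seed URL is hypothetical): keep a backlog of URLs, fetch each page, pull out its links, and push any new ones back onto the backlog.

```python
# Toy crawler: the "backlog of pages to crawl" from the paragraph above,
# implemented as a queue that grows as new links are discovered.
import re
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen

LINK_RE = re.compile(r'href="([^"#]+)"')


def crawl(seed, limit=50):
    backlog = deque([seed])   # pages still to visit
    seen = {seed}             # every page the spider knows about
    while backlog and len(seen) <= limit:
        url = backlog.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="ignore")
        except Exception:
            continue          # a real spider would retry and record the failure
        for href in LINK_RE.findall(html):
            link = urljoin(url, href)                    # resolve relative links
            if link.startswith("http") and link not in seen:
                seen.add(link)
                backlog.append(link)                     # new page found - onto the backlog
    return seen


# crawl("http://www.example.com/")   # hypothetical seed
```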

Indexing – once a page has been harvested by the spiders, the content within the page is indexed. This is part one of the secret sauce – using upwards of 200 distinct attributes of a page, Google will pull it all apart and strip out what it thinks it all means. The index is what is used to match your query to the library of web pages that Google knows about.
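
In its crudest form, indexing means building an inverted index: strip the markup, split the text into terms, and record which pages each term appears on. A sketch (one signal instead of Google’s 200-odd):

```python
# Toy inverted index: maps each term to the set of pages it appears on.
import re
from collections import defaultdict

TAG_RE = re.compile(r"<[^>]+>")


def build_index(pages):
    """pages is a dict of {url: html}."""
    index = defaultdict(set)
    for url, html in pages.items():
        text = TAG_RE.sub(" ", html).lower()
        for term in re.findall(r"[a-z0-9]+", text):
            index[term].add(url)
    return index


# Hypothetical page, just to show the shape of the thing.
pages = {"http://example.com/bread": "<h1>Home-made bread</h1> flour, water, yeast"}
index = build_index(pages)
print(index["bread"])   # the pages that match a query term
```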

Query semantics – if I type in “bread”, am I looking to buy some online, make my own, or watch old episodes of the 1980s sitcom of the same name? Who knows, but this really is rocket science. Spooky stuff, but it includes things like common phrases, popular abbreviations, semantic deconstruction of sentences, plus knowledge about you, your country etc. A PhD in Philology probably helps with this bit.

Ranking – given the size of the internet, you can type in pretty much anything and get a zillion matches between your query and Google’s index, so the next step is putting them in some kind of order. No one really knows how this works – Google used to use something they called PageRank, which was the original secret sauce, but apparently even that is less important than it used to be (see here). Whatever it is, this is the bit that’s hard to predict, so your best bet is not to bother – neither you, nor anyone selling their services to you, can game Google (more than once!). Just stick to the basics, and make your website as simple to index as possible. (That’s not strictly true – it’s not totally opaque. Being really popular does help your ranking, hence the proliferation of “link sites” which superhighway robbers use to try and force up PageRank for a site. They don’t work, and may in fact get you removed from Google’s index altogether – avoid like the plague.)
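
Since PageRank gets a mention, here is a toy version of the original idea, purely for intuition (as noted above, nobody outside Google knows how ranking really works these days): a page’s score is repeatedly shared out across the pages it links to, so links from popular pages are worth more.

```python
# Toy PageRank iteration - illustrative only.
def pagerank(links, damping=0.85, iterations=20):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            if not outgoing:
                continue                     # dangling pages ignored in this toy version
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                if target in new_rank:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: A and B both link to C, C links back to A.
print(pagerank({"A": ["C"], "B": ["C"], "C": ["A"]}))
```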

If you’re interested in finding out more about Google, the best place to start is Google itself – they even have some instructional videos - http://www.google.com/howgoogleworks/. In fact, I should have just posted this link to begin with.

  • Click here for part 2 – anatomy of an HTML page
  • Click here for part 3 – the search results page

Eric Schmidt deserves a mention

I don’t know how this slipped through the net, but I had always assumed that when Eric Schmidt took over the reins from Sergey Brin and Larry Page, Google had lost some of its engineering roots and that the money men had taken over.

Which made it all the more impressive that it somehow managed to maintain its bewildering rate of technical progress (and prowess). I have a pet theory that tech companies run by techies do better than those run by accountants (although you do need an accountant – I’m not suggesting you don’t), which Schmidt seemed to contravene.

Turns out he’s head of the nerd-herd – co-author of a popular UNIX program called lex, and ex-CTO of Sun, so he’s the real deal. He’s also a pioneer of the 70:20:10 time management model, hence the 10% time.

Eric Schmidt, I did you a disservice, and I apologise.

(Oh, and if you run a company that makes / sells software, and you can’t read code, go buy a book. At least look like you’re making the effort.)

Friday, March 26, 2010

iPad – what’s it for again?

Sitting here in an airport (with free wi-fi – are you listening BAA?) about to embark on six hours of travelling, involving two flights and a bunch of hanging around, I feel like I should be in an iPad advert. Surely this is what the iPad was invented for?

And yet, I’m struggling to think of what I would do with it that doesn’t require both an internet connection and a physical keyboard.

A touch-only wi-fi slate really wouldn’t help me right now. I love the idea of having all my daily news appear Minority Report style, and being able to swipe my way through colourful online magazines, but at the end of the day I also need a keyboard, and the ability to work offline.

You don’t need a big screen for music, an iPod will do, so the only real value is in watching movies, which I can do already on my laptop. My laptop also has the advantage of a hinge, so whilst its base is sitting flat, the screen is at a convenient angle.

There is a solution to this, apparently. As the BBC Click presenter so gushingly put it – it’s not until you see it propped up on its stand and with the physical keyboard attached that you really get it. No, I thought – you really don’t get it – it’s now like a laptop, only worse, as all the bits that were so carefully engineered to fit together have been taken apart and spread out all over the table. As a laptop, it’s rubbish.

[Disclaimer: All of the above ignores the fact that I still want one, and it will undoubtedly be a great success. I just won’t be getting rid of my laptop any time soon.]

Thursday, March 25, 2010

Social media tracking products

[Update: Katie, from Radian6, responded to this (see comments below) in record time, so they are the winner. Being serious for a minute, it was very impressive. I’m still sceptical and think that this has more to do with Katie herself, and the fact that she’s good at her job, than the merits of any specific product, but then what do I know? Not as much as she does, that’s for sure, so why not talk to her instead - @misskatiemo on Twitter. Oh, and buy her product – it’s brilliant, as has just been demonstrated.]

So, now that I’ve become obsessed with tracking myself online (see previous post), I’ve uncovered a very healthy sub-culture in social media tracking. There are lots of products one can use to track Twitter, Facebook, Digg, etc. with a bewildering array of different feature sets and target markets. It’s clearly a nascent industry (no fixed sales pitch). Let’s see if we can help it along.

Social media tracking seems to boil down to scanning various social networks for mentions of specific keywords, retweets etc. The aim is to track everywhere your name / product is mentioned, and if you’re really on the ball you can use the tools to “respond effectively” to the general chatter. Advanced features include things like “sentiment analysis” to help you understand whether people are saying nice things or not. You could just read them of course – you can get through a hell of a lot of tweets in a short period of time if you’re really trying.

Apparently Eurostar is the case study in getting this wrong – when their trains got stuck in the winter they were very slow to respond to a very active, and understandably upset, community of marooned passengers. If only they’d bought a copy of Radian6, they’d have been fine.

If anyone reading this decides to tweet about it we may be able to drive the social media tracking industry into a recursive search about itself, which can only help to drive up their collective profile.

  • Flowdock – it’s Finnish, it’s RoR, it probably does stuff you don’t understand. Be warned. In public Beta, so it’s free – get it now.
  • Raven – it’s not written by Ayende, but don’t hold that against it. Looks good, $79pcm, free 30 day trial.
  • Sysomos – they’re “redefining social media analytics” apparently, which generally means they aren’t. Includes “Automated Sentiment” tracking. Think they pinched that from Scrumbot? No free trial (why do people do that – they’ve just lost me already – that’s 0% conversion rate from my visit – put that in your sentiment-meter Sysomos.)
  • Scoutlabs – quite pricey, at $199pcm, but with a 14-day free trial. Coloured charts, graphs, all that stuff. Looks like an attempt to make social media tracking look like watching the stock market – i.e. grown-up, and neat.
  • Radian6 – the uber-dashboard – with an annoyingly sincere video to accompany their product launch, which goes on about the “game-changing” nature of their product. I think that means it’ll be really, really expensive (pricing is still TBA). They’ve also made up their own catchy phrase for all of the noise on the internet – the Social Phone. I presume they mean a phone at the bottom of a handbag, in a noisy bar, that auto-dialled your number at 3am whilst you shout loudly down the other end trying to get someone to listen to you?
    [Update: I still think the video is over-kill, and I don’t like the Social Phone, but the product does indeed seem to work.]
  • Unilyzer – as previously posted, this one is all about the stats – though you may need a PhD to decipher them.

Of course, given the nature of these products, I am assuming that someone from each of the above companies will see and respond to this post – given that that’s the point of them?

An honourable mention (and retraction of any unfair criticism) to the first person to do so.

I, Internet

In a Matrix-style revolution I have apparently merged myself into the very fabric of the internet. Or so it would appear from my personal dashboard from Unilyzer. I think it’s really for companies who want to monitor their online presence and engage with the inter-youth everywhere and anywhere, but it’s very good for a spot of personal navel-gazing.

I haven’t the foggiest what it’s telling me, and I’m not that happy about the number of zeroes in it; the fact that my name is the top item in the tag cloud also suggests that I talk about myself a lot. Although very rarely in third person – Hugo Rodger-Brown doesn’t do that.

Go get yourself one – it’s free for a single account.

unilyzer

Wednesday, March 24, 2010

Ecommerce Stakeholders – Mindmap available

I’ve published another Mindmeister mindmap, this time on the subject of “Ecommerce Stakeholders” - http://www.mindmeister.com/45646065/ecommerce-stakeholders

When working on a large ecommerce deployment, it’s all too easy to concentrate on the problem(s) right in front of you – i.e. how to hit the deadline – at the expense of the bigger picture. Ecommerce websites often exist within a complex corporate structure that includes legal, financial and marketing functions amongst others, and not engaging with these groups at the earliest opportunity is a very easy shortcut to take. If you do take this shortcut, be prepared to repent at your leisure, as the finance department will gladly can your launch rather than letting it go ahead without sufficient testing.

All of the stakeholders should be included in the entire lifecycle, from requirements gathering to final pre-launch testing and sign-off (sign-off being critical in terms of stakeholder management – people need to understand what they’re getting, and how to determine when it’s ready to go).

Of course this works both ways – it would be nice sometimes if the marketing department could wait until the site is built before launching their $$$ advertising campaign, but such is life.

Anyway – it’s public, so please use and update as you wish.

Ecommerce stakeholders mindmap

Tuesday, March 23, 2010

Site design is not just Photoshop

In order to try and explain to a design team that I’m working with that the site-design-by-photoshop approach is not delivering, I have created the mindmap below on Mindmeister. It’s supposed to illustrate all the things you should think about when designing a web UI, from the look-and-feel, through to how the page structure can affect analytics, SEO etc. I’ve made the map public, as I think it could become a useful tool for people when trying to explain to extended team members why, for instance, putting the entire site on a single page with lots of AJAX isn’t a good idea, however nice it looks.

URL is - http://www.mindmeister.com/45020006/user-interface – and it’s editable by anyone.

Effective user experience design mindmap

SEO is common sense

One of the things that often crops up in conversations with clients is “how can I up my ranking on Google”, and more often than not they’ll genuinely believe that there is some way to ‘game’ the rankings.

Well, there isn’t, clearly – otherwise Google wouldn’t exist. What is more, whilst Google is naturally secretive about how its algorithms match your query to its indexes, it’s commendably open about how it uses the content of your site to compile those indexes.

If you own / run a site, and are worried about your ranking, what you need to do is let your developers take charge – and point them at the Google reference documentation – starting here - http://googlewebmastercentral.blogspot.com/.

Creating a Google-friendly site (yes, and Yahoo, Bing, etc.) is about attention to detail: crafting each page, understanding the site structure (including URLs), every element on the page, the content etc. If you need some real-world pointers, then here’s a shameless plug for Glyn’s blog - http://darkin.wordpress.com/ – where he’s promised to reveal all. Should be good stuff.

(I feel quite strongly about this as I was beaten over the head repeatedly by my last employer (a very big online company, who should know better) with an SEO report they had commissioned from a bunch of charlatans who included pearls of wisdom such as meta keyword stuffing. Just to be clear, Google does not index meta keyword or description tags (although the description may be used to generate the snippet that appears in the results list) – details here - http://googlewebmastercentral.blogspot.com/2009/09/google-does-not-use-keywords-meta-tag.html. If your “SEO consultant” tells you to put in keywords, fire them. If you have a dedicated consultant who is not on the development team, fire them. What we need is better tools with which to understand site structure / best practice.)

[UPDATE: don’t just take my word for it – here it is on Google’s own site (http://www.google.com/support/webmasters/bin/answer.py?answer=40349): “Don't feel obligated to purchase a search engine optimization service. Some companies claim to ‘guarantee’ high ranking for your site in Google's search results. While legitimate consulting firms can improve your site's flow and content, others employ deceptive tactics in an attempt to fool search engines. Be careful; if your domain is affiliated with one of these deceptive services, it could be banned from our index.”]

Monday, March 22, 2010

Live Writer and the NullReferenceException

There’s never any excuse for a NullReferenceException, and Microsoft themselves should know better. This was thrown when I tried to launch Live Writer before unlocking a drive on my laptop secured with BitLocker (i.e. it was unavailable). Honestly, how lazy do you have to be not to trap that and pop up an intelligent message?

I’ve seen a lot of things like this lately (well, since turning on BitLocker) - software that depends on the file system but that doesn’t bother to check whether it’s available. Attention to detail, anyone?
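
The fix being asked for is not exactly rocket science. A minimal sketch of the idea (in Python rather than .NET, and with a made-up path): check that the location you depend on is actually there, and fail with a message a human can act on.

```python
# Illustrative only - check the file system before depending on it,
# rather than letting an unhandled exception bubble up to the user.
import os
import sys

DRAFTS_PATH = r"D:\LiveWriter\drafts"   # hypothetical folder on a BitLocker-locked drive


def load_drafts(path):
    if not os.path.isdir(path):
        sys.exit("Drafts folder %r is not available - is the drive unlocked?" % path)
    return os.listdir(path)


if __name__ == "__main__":
    print(load_drafts(DRAFTS_PATH))
```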

live_writer_nullreferenceexception

live_writer_nullreferenceexception_2

Rework review (37signals new book)

Rework (front cover)Rework (back cover)

Rework isn’t a book in the conventional sense – it’s basically a set of one-page polemics on the value of common sense over any formalised method / doctrine for managing projects / people and creating a start-up. I admire the guys at 37signals enormously, have spent a lot of money (willingly) on their products, and am a huge fan of what they do, and how they do it. However, even I am not entirely convinced their advice is applicable to everyone. That said, it couldn’t be much easier to read – it’s about an hour from cover-to-cover.

Summary points (stop me if you haven’t heard any of these before):

  • The best time to start a company is now
  • Start small and don’t be impatient
  • Don’t make a five-year plan – it’s just guesswork
  • Don’t depend on raising capital if you can help it
  • Don’t employ people unless you need to
  • Marketing is just spam – do it yourself
  • Leave out features in favour of quality
  • Listen to your customers – feedback is everything
  • Don’t work too hard – if you are you’re doing it wrong
  • Meetings are generally a waste of time
  • Meeting organisers are a waste of space (& money)
  • Make sure you enjoy it.

Difficult to argue with any of the above – providing you’re running a small software company. It’s the same old stuff – just use common sense. I don’t know how many more times we need to hear that as a message.

Skype Access

I just logged in at the airport using Skype credit – which was a great user experience (albeit rather costly at £0.11/min). None of the usual airport nonsense with having to register / sign up with some new service and then enter my credit card details. I just fired up Skype and there it was: screenshots below.

I’m convinced that single-sign-on / integrated services like these are going to be the big thing this year – and I think Skype really missed a trick with this – they could / should have been there before Facebook Connect.

Take the ubiquity of Facebook combined with Facebook Connect and I can start to make sense of the Chrome OS model – the idea that you could use your browser to log-in to the entire internet as a network. Windows has had the concept of binding your Windows login to a Live login for a long time now, but it’s never really taken off. It will do soon enough.

skype_credit_1

skype_credit_2 skype_credit_3 skype_credit_4

Data visualisations

There seems to be a wave of interest right now in data visualisations. There's always so much going on in this field that it's difficult to detect anything out of the norm, but following on from the Pivot show, we had Tim Berners-Lee showing off some great data apps (inc. a beautiful globe demo - http://www.youtube.com/user/TEDtalksDirector#p/u/7/3YcZ3Zqk0a8) at TED, and finally we have the US CTO, Vivek Kundra, calling for the US Government to encourage / enable the "YouTube" of data (http://news.bbc.co.uk/2/hi/technology/8576891.stm).

Talking of data, the Guardian have been doing great things releasing their data - expect some good stuff to come around during the election. Check it out here - http://www.guardian.co.uk/data-store

Friday, March 19, 2010

REST-* Architectural Goals

Looking through the REST-* collateral I’m struck by how much it relies on common sense over technical pedantry. The high-level architectural goals can be found here, and I’ve pasted the content in below just to make it easy. I particularly like the “Edge cases should be extensions” goal.

Architectural Goals

Low barrier to entry
Clients that use the specification should have a very low barrier to entry. They shouldn't need to install a library or large stack of software to use a specification. An HTTP client or web server provided by the language or platform should be enough to implement or use implementations of the specification.

Edge Cases should be Extensions
Edge cases that complicate the main specification should be defined in a separate sub-specification. Extensions should strive to be layered on top of the main specification by using facilities like HATEOAS and HTTP conneg to provide their features.

Pragmatic REST
While a specification should strive to follow RESTful principles, simplicity should never take a back seat to being a pure RESTafarian. If you need to bend the rules of REST to create a simpler design, then that's the path that should be taken.

80/20 Rule
Specifications should remain simple. Many times in specification efforts, edge cases cause a lot of bloat and complexity within the specification, making it difficult to use, understand, and implement. Specifications should cover 80% of the most common use cases. Edge cases should not be in the main specifications. REST should be able to provide the facilities (HATEOAS, HTTP conneg) to abstract away edge cases.

Avoid Envelope formats
Whenever possible, avoid envelope formats. Examples of envelope formats are SOAP and Atom. Envelope formats encourage tunneling over HTTP instead of leveraging HTTP. They also require additional complexities on both the client and the server. In most cases HTTP headers should be enough to transfer metadata about the request, response, or resource.

Isolate data formats to extensions
If possible, specifications should try not to define new data formats.
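
The “low barrier to entry” goal is worth dwelling on: the HTTP support that ships with the platform really should be enough. A minimal sketch (the endpoint is hypothetical) – a plain GET with an Accept header, i.e. ordinary HTTP content negotiation, no extra libraries:

```python
# Talking to a RESTful service with nothing but the standard library.
import json
from urllib.request import Request, urlopen

req = Request("http://api.example.com/orders/42",        # hypothetical endpoint
              headers={"Accept": "application/json"})    # HTTP conneg, nothing more
with urlopen(req) as resp:
    order = json.loads(resp.read().decode("utf-8"))
print(order)
```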

NopCommerce + Azure = ???

I’m currently investigating NopCommerce as a base platform from which to build something else, and as a starter for ten I thought I’d try and get it up-and-running on Azure. I’ve previously worked with Amazon EC2/S3, GoGrid & Google AppEngine, so was interested to see how Azure stacked up.

Getting NopCommerce running on a local machine couldn’t be much easier – it even installs the database for you – so I won’t go into any detail on that. Azure, however, is another matter entirely. I will post another time on my general thoughts about Azure, but in this first instance I just wanted to vent my frustrations.

Suffice it to say that after going through all the pain of downloading all the things I needed to download (including a new copy of Visual Studio) I have the following situation – I have a working copy of the site that runs in the local Azure Development Fabric, against a database running in SQL Azure. I have subsequently uploaded it to my application (CloudCommerce), and it doesn’t work. Which is fine, I never expected it to. The problem is this:

CropperCapture[3]

What on earth am I supposed to do with that?

I shall report back in due course…

Wednesday, March 17, 2010

Ask the crowd

Here’s a great example of how even the biggest companies are jumping on the feedback bandwagon – from around 4:00 in on this IE9 interview - http://channel9.msdn.com/posts/Charles/In-your-hands-IE-9-Surfing-on-Metal-GPU-Powered-HTML5/ – notice how they admit that the browser is maybe 50% complete, but they already want the feedback – good and bad, and particularly – the “prioritisation”.

This is how all companies should be looking at feedback – don’t even bother to prioritise your backlog before you’ve asked your customers.

If you’re looking for feedback management / integration you should be looking at UserVoice and GetSatisfaction. If you’re a bit more corporate CRM-focussed then RightNow might work for you.

Saturday, March 13, 2010

QCon 2010 (London): Review

And so QCon is over for this year, and it’s been a great conference (IMO). I’m not sure I learned anything totally new – it was more an affirmation of things I’d thought / heard / read about – and a chance to see some of these things out in the wild. The speakers ranged from big conference names to academics to genuine front-line experts – a real range. I don’t think I attended a single sales pitch, and although a few named products slipped through the net, they were all OSS projects, not commercial products. All-in-all it seemed to stay true to the “for programmers by programmers” promise.

The overall theme seemed to be that you can achieve pretty much whatever you want using bits and pieces that are already out there and tailoring them for your particular problem domain. There is no one-size-fits-all solution, and very little reason to pitch up to a big COTS vendor and buy their product suite (beyond internal accountability and CYA.) The real key is getting the right people and empowering them to solve the problem for you. Small teams, with the tools they need, no more, no less, will get you there. The same solution applies whether you’re tackling problems of scale of planetary proportions (Facebook), or focussing on extreme performance at the chip level (LMAX).

My only real regret is that I missed the Erlang / Functional Programming tracks – it’s something I’d love to know more about, but I just felt I had to either commit to the entire track or none at all. (It’s a bit like snowboarding – I’d like to learn, but if I only manage five days skiing a year I don’t want to spend it on the nursery slopes.)

Highlights were LMAX and Facebook – amazing teams breaking new ground – inspiring stuff from both, and a big thank you to the speakers, Aditya Agarwal from Facebook, and Dave Farley & Martin Thompson from LMAX.

On a technology front, the web is ubiquitous, as is mobile, though what that means is still a problem (what makes something “mobile” if my netbook runs the same software as my desktop – is it GPS, AR?) If you’re working client-side then it’s HTML/CSS/JS and HTTP; if you’re working server-side it’s offline, async, and message-based. Nothing new, although you might want to think about storing your data in somewhere other than an RDBMS. Just make sure you know why you’re doing what you’re doing, and can defend your choice if necessary.

My quick list of best practices is as follows:

  • Do hire good people (great people if you can afford it)
  • Do give those people everything they need to do the job
  • Do pay attention to detail – it counts. Just ask Apple.
  • Do learn the basics – everyone should know HTTP
  • Do use the best tool for the job / problem at hand
  • Do keep focussed on the problem that needs solving
  • Do keep learning – you can never know enough
  • Do keep it simple – if it’s not obvious, it’s a problem
  • Do embrace change
  • Do take risks!

On the other hand:

  • Don’t be a slave to habit
  • Don’t cut corners – it doesn’t help
  • Don’t over-complicate
  • Don’t build for a future that may not exist

And lastly,

  • Don’t moan – this is a great time to be in software – it’s never been so rich and varied; this is the Cambrian explosion of software development, and we should all be thankful that we were here when it happened. We’ll never have it so good.

Friday, March 12, 2010

QCon 2010 (London): Day Three

Roundup from the final day – summary thoughts and conclusions to follow in a separate post when I’ve had time to formulate them!

  • Introduction to Bespin - Mozilla's Web Based Code Editor 8/10
    Very low-key presentation that went through some of the deficiencies of the current command line environment, and a demonstration of how much better it could have been (and will be with Bespin). Another really elegant example of how much difference very small changes make, and the value of attention to detail. +1 for the demo.
  • The Present and Future of Web App Design 8/10
    Really enjoyable romp through the past, present and future of user interfaces. No breakthrough items, but a great presentation with no angle. Mobile (in all its forms) is the run-away winner, along with all the goodies that things like 3-D, GPS and AR will bring. Minority Report, here we come.
  • Mobile JavaScript development  6/10
    I didn’t really learn a lot in this presentation, other than that mobile apps don’t work very well. Apparently it’s possible to write web-apps (HTML/CSS/JS) that can access native APIs (accelerometer, camera, GPS) for most smartphones using something from phonegap.com, and HTML5 provides offline data storage. Oh, and dojo is better than jquery. Mobile is big here, and it’s clearly this year’s thing, but it seems to be still in the embryonic stage. Things like the iPad (mobile features, desktop screen-size) won’t help the confusion, but some pretty cool things are going to happen in this space this year.
  • RESTful Business Process Management 6/10
    Dry, academic, talk that demonstrated quite nicely that terms like ‘REST’ and ‘Mashup’ do not mix with terms like ‘BPM’, ‘SOA’, etc. from a cultural point of view, if nothing else. Yes, you could build an SOA using REST services, you just wouldn’t call it SOA – and neither would you use a visual tool to do it for you. Enterprise and community are not happy bed-fellows.
    One thing I did quite like was the use of the HTML ‘embedded resource pattern’ for building composite services. Embed the URI of another service in a service response and shift the burden back on to the client – exactly as HTML does with web pages, where the browser is responsible for building the composite view from all of the embedded URIs. (There’s a rough sketch of the idea just after this list.)
  • Does REST need middleware? 8/10
    Much better – this was a great wrap-up for the conference as a whole – how to use an existing, simple, well-understood protocol (HTTP) to satisfy some pretty complex requirements – SOA, reliable messaging, transactions etc. Really elegant solution (REST-*), and definitely something to watch. A great way to end.

Thursday, March 11, 2010

QCon 2010 (London): Day Two

Day two from QCon London 2010 below. General impression is that there is too much content from people whose job is presenting content to people, and not enough from those really on the front-line. The exception to this was the LMAX presentation, which stood out from everything else today.

If yesterday’s lesson was that you need to hire good people and look after them, then today’s was that you need to break big problems into smaller problems in order to solve them. Fairly obvious really, but then so was gravity once Newton had given it a name.

  • Living and working with aging software 6/10
    Slightly wayward keynote from GOF original Ralph Johnson – a verbal brain dump from someone with more years in the software industry than most. He likes refactoring, but then apparently he invented the word.
  • Introduction: Irresponsible Architectures and Unusual Architects 7/10
    Great presentation about the use of REST, and the role of the internet as an application platform. Over-intellectualised for my liking – some very simple concepts made to seem very complicated - but I liked the message. I totally agree that everyone should understand the core HTTP fundamentals – request/response pattern, HTTP verbs, status codes, header values – the specification is there, use it.
  • Scaling Applications: Complexity, Performance, Maintainability 7/10
    Good presentation from Ayende – including some code demos, which always goes down well with the crowd. Main point is the same as everyone else’s – scale out is only possible if you split systems into functional components and attack each individually – e.g. user login has different requirements from user registration, so don’t stick it all in a single table in an RDBMS. Use a service bus for async communication, data is always dirty, etc., etc. Ended with a slightly unnecessary demo of his new (json) doc database - .net replica of CouchDB, with some Mongo features (around querying). Looks great, well worth a look, although it’s clearly in the alpha stages.
  • Simplicity - the way of the unusual architect 7/10
    Dan North gave an entertaining but ultimately unsatisfying presentation on how to simplify complex problems (“simplicate”); building a shed is apparently simpler than building a nuclear power station because we, as humans, can fit the entire design into our heads prior to building it. Maybe it’s just because it’s smaller, and not full of radioactive material? Some slightly contrived points, but the message is the same – split things up to make them simple and then apply the appropriate solution. I think we get it now.
  • LMAX - How to do over 100K contended complex business transactions per second at less than 1ms latency 9/10
    Great presentation from the team at LMAX on how to achieve 100k tx/sec at a guaranteed 1ms latency. Turns out it’s about getting some very clever people together and letting them solve some simple problems. Really impressive to hear from people at the top of their game about how they did it; so much more impressive than the waffle the blogocracy come out with (he says, whilst blogging). Inspiring.
  • Command-Query Responsibility Segregation 7/10
    Udi Dahan giving us his guide to offline / async processing. Good presentation skills, and good ideas, just not sure he’s saying anything new. Essentially still the same message – break things up into smaller, more-focussed units, and solve problems using the most appropriate technology. Quite enjoying the Ayende-Dahan micro-ISV face-off though. Which service bus should I go for…?

And today’s winner is … LMAX.

Things I missed but would like to catch up on:
Introduction: Cloud Solutions
Beyond the Data Grid - Designing for Actor Model, Social Networks, Scalable Search

Wednesday, March 10, 2010

QCon 2010 (London): Day One

Roundup from day one of QCon London 2010.

  • Bad Code, Craftsmanship, … and Certification: 7/10
    Keynote speech, very entertaining, but not much new - basically, write better code, it's worth it in the long run. (And don’t bother getting certified.)
  • Project Voldemort at Gilt Groupe: 7/10
    Enjoyed this one - some great insights into using NoSQL from an ecommerce perspective (used for inventory / shopping cart if you're interested.)
  • Auntie on the Couch: 5/10
    Some interesting info on CouchDB use at the BBC, primarily around the ops side (restarts so quickly (<1s) that they can recycle processes on live production servers without triggering alerts). In summary, it works, at scale, and is easy to manage - go try it. (It powers users’ homepage preferences for one thing).
  • Facebook: Architecture and Design: 9.5/10
    Great presentation, mixture of jaw-dropping stats (8bn minutes spent every day on Facebook worldwide) and insight into how Facebook works. Key point seems to be hiring good engineers and empowering them. Key technologies (HipHop, Haystack etc.) were developed by very small teams (<5), and there are no dedicated product owners - mixture of top-down strategic goals and bottom-up innovation. Want to work there.
  • From Dev to Production: 6/10
    Good presentation, well presented, but no great insights; basically be nice to the ops team. One nice point though – try to build once and deploy the same binaries to each environment – do not run separate environment builds, but use external configuration only to differentiate between deployments.
  • Demystifying monads: 5/10
    This probably reflects worse on me than the presenter (who was, to be fair, standing in for his wife at the last minute), but I really don't get monads. Probably shouldn’t have tweeted about this during the presentation however. Turns out half the room, including the presenter, were watching the live tweet-stream. Crowd in this talk definitely at the Computer Science end of the scale.

Things I missed, but want to catch up on at some point:

sky.com: Behind Britain's Entertainment Infrastructure
Building Skype. Learnings from almost five years as a Skype Architect
Functional Languages 101: What's the Fuss?

Monday, March 08, 2010

Sample art collection viewed in Pivot

[UPDATE: I’ve turned this off for now – email me if you’d like to see it and I’ll turn it back on.]

I’ve created my first Pivot collection, which was surprisingly easy, and the results are quite good. I’ve uploaded it to an EC2 instance should anyone be interested. You’ll have to download / install the Pivot client from here - http://getpivot.com/download/ – after which you can browse my sample art collection here - http://79.125.7.206/sample_art_collection.cxml

(I won’t leave it up there forever, so it may not be available all the time.)

The collection is based on a real exhibition – I’ve downloaded all the photos from the 2009 Royal College of Art “Secret” exhibition (read about it here - http://dams.rca.ac.uk/res/sites/RCA_Secret) and then attributed them to a random set of artists (from the Wikipedia article on Young British Artists here - http://en.wikipedia.org/wiki/YBA). The acquisition dates are randomly distributed from 01/01/2000 to 28/12/2010, the valuations randomly distributed between $10,000 and $10,000,000, and the locations randomly distributed between London, NYC, Paris, Geneva and Sussex (as that’s where I was when I did it!)

See what you think…

Friday, March 05, 2010

Post-SQL Era – what happens next?

This isn’t much of an article in and of itself, but is a good jumping off point, with some good links at the bottom - http://bit.ly/ahUMF2

To quote from the article (entitled “MySQL + Memcached: End of an Era?”):

“LinkedIn has moved on with their Project Voldemort. Amazon went there a while ago. Digg declared their entrance into a new era in a post on their blog titled Looking to the future with Cassandra. Twitter has also declared their move in the article Cassandra @ Twitter: An Interview with Ryan King.”

Very, very few sites reach these heady heights (watch this video for some pocket stats, from which the slide below is reproduced), so I guess most people will be happy with MySQL for a while yet, but it’s interesting nonetheless.

2010-03-05_2327 (JESS3)

Thursday, March 04, 2010

Basecamp Visualisations

Looking at the Microsoft Labs Pivot project, it occurred to me that there are some nice visualisations you could make out of Basecamp data – project activity over time, cut by project type, company, person etc. Not sure what the point would be, or what the pictures would represent (screenshots?), but that can come later.

Having downloaded my Basecamp data (see previous post), I’m going to try and put something together. I’ll post here when I’m done (don’t wait up, I may be some time.)

[Update: in case anyone was wondering, it looks like the heavy lifting can be done by a simple XSL transform from the Basecamp export XML format to the Pivot Collection XML (CXML) format – described here. Getting the images to show is another matter.]
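
For the curious, applying such a transform is only a few lines (a sketch, assuming the lxml library and made-up file names – the stylesheet itself is where the real work lives):

```python
# Apply an XSL stylesheet mapping the Basecamp export XML onto the Pivot CXML schema.
from lxml import etree

basecamp = etree.parse("basecamp-export.xml")                # hypothetical export file
transform = etree.XSLT(etree.parse("basecamp-to-cxml.xsl"))  # hypothetical stylesheet
cxml = transform(basecamp)

with open("collection.cxml", "wb") as f:
    f.write(etree.tostring(cxml, pretty_print=True))
```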

Basecamp backups – Centripetal Software

I thought it might be time for a good news story. I am a huge fan of 37signals, and particularly their project management app – Basecamp. I’ve now introduced it into three separate companies, and it just works. (Anecdotally, I also trialled Huddle on a project, which didn’t work, despite offering more functionality, as it just wasn’t simple enough. Don’t underestimate how much work goes into making things look easy – just ask Apple. Or read this – when did you last work for a company who took this much care when revisiting an existing feature? It worked – I’ve upgraded. Twice.)

Anyway, one of the major bugbears with Basecamp, and the forums are littered with complaints about this, is the data backup feature. You can download all of your content from Basecamp, in HTML or XML format; however, this does not include files uploaded to the site. Which, if our usage is at all indicative, makes the feature almost useless, as almost every message thread has a file attachment, as do many of the comments. This is clearly deliberate, but it’s not clear whether it is to encourage client lock-in (bad idea), or to reduce their Amazon S3 charges (which makes sense). If everyone decided to back up everything daily that could cause some bandwidth charging issues, although you wouldn’t have thought it would be that hard to design some kind of automated backup policy that restricted their exposure?

In idle moments I’ve thought it must be possible to parse the Files section and download every file, but it always seemed very fiddly, and besides, how would it work – a desktop app, downloading everything locally?

So, this morning, whilst wandering through the 37signals site looking for Basecamp extensions (for something else that’s been on my mind recently), I came across these guys - http://www.centripetalsoftware.com/ – an online service offering Basecamp backups, including all files, and Writeboards.

I signed up for the free trial, and went through the registration process during which you’re prompted for a Dropbox account – which is the genius behind this*. In essence it operates like any corporate backup – you set up jobs, and they run on a schedule in the background. Before you know it your Dropbox icon is whirring away in the task bar and the files are downloaded to your local computer (and of course to your Dropbox account).

I then got an email (automated I presume, but you never know) from the founder detailing the job status, and asking for feedback. So, Mike, here it is – it’s a fantastic service, and a must-buy for anyone who depends on Basecamp.

Taking a closer look at the website (here it is again - http://www.centripetalsoftware.com/) they seem to be aiming to broaden their service to offer a similar service for other “cloud” based applications – and good luck to them I say. It’s a simple idea, well executed, and you can’t ask for much more than that. (Except perhaps a bi-directional sync between Basecamp and Dropbox – save a file to Dropbox, see it appear in Basecamp, with versioning. Thanks Mike.)

* You can back up to an FTP server, if that’s your thing.

(BTW – 37signals new book – Rework – is out March 9th.)