Tuesday, December 22, 2009


Did I spot this in: a.) a 17th Century document drawn up between Sir Hubert de Poncey and his tallow chandler, or b.) a 21st Century digital media contract?

“Now therefore, for good and valuable consideration, the receipt and sufficiency of which is hereby acknowledged, the parties hereby agree as follows”…

Friday, December 18, 2009

Who invented the Service Bus?

I’m listening to the Scott Hanselman / Udi Dahan podcast (here) on NServiceBus, and they are just discussing how the service bus model could be applied at the application level, and should not be dismissed as just an Enterprise design pattern (around 14 mins in).

This took me back to the time that James and I were working on an eVB (!?!) application for field engineers using old Windows CE mobile devices. We came unstuck with MSMQ eventually (too long to go into here), but not before we found a use for it in passing messages between multiple eVB applications running on the same device.
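For anyone wondering what a service bus looks like when shrunk down to a single application, here is a rough sketch. It is not how NServiceBus or MSMQ work internally; the class `MiniBus`, the topic names and the message fields are all made up for illustration. The only point is that senders put messages on a queue and never call the receivers directly.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[dict], None]


class MiniBus:
    """A toy publish/subscribe bus; every name here is illustrative."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Publishing only enqueues; the sender never calls a receiver directly.
        self._queue.append((topic, message))

    def dispatch(self) -> int:
        """Deliver queued messages in FIFO order; return the number of deliveries."""
        delivered = 0
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._handlers[topic]:
                handler(message)
                delivered += 1
        return delivered


bus = MiniBus()
received: List[dict] = []
bus.subscribe("job.completed", received.append)
bus.publish("job.completed", {"job_id": 42, "engineer": "james"})
bus.publish("job.ignored", {"job_id": 43})  # nobody is subscribed to this topic
delivered_count = bus.dispatch()
```

Messages on a topic with no subscribers are simply dropped here; a real bus would more likely park them somewhere.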

Can we lay claim to having invented the Service Bus (this was around ten years ago – when the internet was still 0.9, and the only clouds were in the sky, blocking out the sun)?

Wednesday, December 16, 2009

Spot the difference

On the same day I visited Google at their London HQ, I also visited Microsoft at theirs. One of these two tech behemoths has an open ‘Guest’ Wifi connection throughout the building, that anyone can use.

The other also has a ‘Guest’ network, but theirs is password-protected; you have to apply for a password 24 hours in advance. I’ll leave it up to you to guess which is which.

I cannot understand why any company cannot maintain an open guest network with internet access. Even if it’s password protected to prevent drive-by byte-theft* they can keep one password and post it at reception. It’s totally bizarre. If one company can do it, so can everyone else. These concepts aren’t protected by law.

It’s the same with security online – why is it that First Direct can manage an online banking system with a very simple login process that works from any browser, whereas others require a combination of key fobs, dongles and even (yes, that’s you Barclays) a mini-pocket calculator to generate a random password? Couldn’t they just have taken a look around at their peers and adopted best practice?

* This does happen - my builder admitted last week that he doesn’t have a working internet connection, he steals his neighbour’s; when he’s out and about he simply parks up in a residential neighbourhood and sees who’s got an open connection.

Sunday, December 13, 2009

Google-ising (pt 2)

So I’ve now visited the London branch of the Chocolate Factory and can confirm it’s everything you’d hope for; good spots include someone sitting in a deckchair working, another in a massage chair, and a Segway parked up on the fourth floor.

All joking aside, the most interesting part was the relationship between sales and engineering. As explained (by a sales person, and without irony) in the lift on the way up, sales and marketing are on the bottom floor, below engineering, both physically and metaphorically. This was backed up by the reverential awe with which we were introduced to a real live Google coder, who could only spend 30 minutes with us before being stolen away by another sales team.

A fascinating couple of hours inside the machine, unfortunately just hours before they were all presented with their new phones, so I can’t confirm anything about their existence or otherwise.

Thursday, October 29, 2009

Is Google Evil?

Hot on the heels of one giant crushing the ambitions of smaller companies (Amazon’s MySQL solution) comes another – Google’s destruction of the sat-nav market with their latest announcement (“Free Google sat-nav shakes market”).

Enthusiastic free-marketeers would probably say that the addition of Google into the market will spur Garmin and TomTom to greater innovation and ultimately a better deal for the consumer, but it can’t be pleasant having the rug pulled from under you like that.

At least in the good ol’ days they had the decency to buy your company first and then destroy, er, integrate it.

Tuesday, October 27, 2009

Amazon marches on

Why would anyone host or manage their own infrastructure these days - http://aws.amazon.com/about-aws/whats-new/2009/10/27/introducing-amazon-relational-database-service/

So now Amazon offer storage (S3), compute power (EC2), relational databases (RDS), non-relational databases (SimpleDB), queueing (SQS), Hadoop (Elastic MapReduce), people (Mechanical Turk - for when computers just don’t cut it), connectivity into your own infrastructure via VPN – really, if I owned a hosting company I’d be worried, and if I ran a start-up I’d look no further.

Apparently they sell a bunch of stuff as well - http://phx.corporate-ir.net/phoenix.zhtml?c=176060&p=irol-newsArticle&ID=1345413&highlight=

Thursday, October 22, 2009


Mozilla’s labs group have released some information on a messaging solution codenamed Raindrop. It runs on CouchDB, and seems vaguely related to the whole APML (attention profiling markup language) movement – allowing users to sift through the daily dump of information by applying qualitative filters to it (e.g. emails from my family are more important to me than Tweets from some celebrity I happen to follow – that sort of thing).

I don’t really have an opinion on it as yet, but given how much I dislike email, and how disappointed I am with Wave, it’s good to see life in this area.

Monday, October 19, 2009

“Polyglot Persistence”

As I cycled home this evening I was thinking to myself that perhaps a mixed-database environment might be the best approach. I was trying to work out how I would re-engineer our platform given the opportunity, and it’s clear that whilst some areas are ripe for the NoSQL upgrade, others, notably transaction tables, audit logs etc., are much better suited to strongly-typed relational databases.

As with everything I seem to do at the moment, I’m definitely a follower, as no sooner had I fired up my browser than I came across the following: http://johnpwood.net/2009/09/29/using-multiple-database-models-in-a-single-application/

It’s a great article, articulate and intelligent, so go read it…
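To make the mixed-database idea a bit more concrete, here is a minimal sketch of my own (it is not taken from the linked article). It assumes SQLite as the relational store for transactions, and a plain dict of JSON strings standing in for a schema-less document store; the table, function and field names are all invented.

```python
import json
import sqlite3

# Relational side: strict columns and constraints for money movements.
ledger = sqlite3.connect(":memory:")
ledger.execute(
    "CREATE TABLE transactions ("
    " id INTEGER PRIMARY KEY,"
    " account TEXT NOT NULL,"
    " amount_pence INTEGER NOT NULL)"
)

# Document side: a dict of JSON strings stands in for a document store.
catalogue = {}


def record_transaction(account: str, amount_pence: int) -> int:
    with ledger:  # commits on success, rolls back if an exception is raised
        cur = ledger.execute(
            "INSERT INTO transactions (account, amount_pence) VALUES (?, ?)",
            (account, amount_pence),
        )
    return cur.lastrowid


def save_product(product_id: str, document: dict) -> None:
    # No schema: each document can carry whatever fields it needs.
    catalogue[product_id] = json.dumps(document)


def load_product(product_id: str) -> dict:
    return json.loads(catalogue[product_id])


tx_id = record_transaction("acct-1", 799)
save_product("album-1", {"title": "Example Album", "tracks": ["One", "Two"]})
save_product("bundle-1", {"title": "Example Bundle", "includes": ["album-1"], "video": True})

total = ledger.execute("SELECT SUM(amount_pence) FROM transactions").fetchone()[0]
```

The relational side will refuse a transaction with a missing amount, while the catalogue happily accepts a new product shape without anyone having to alter a table.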

Thursday, October 15, 2009

Map/Reduce & the Mechanical Turk

So, I have a project that I have wanted to get off the ground for a long, long time, which involves people solving a problem that computers seem incapable of – making sense of the entertainment industry’s metadata mess. It’s a disaster, no one can match anything to anything with any degree of confidence, and even companies whose raison d'être is value-added metadata don’t seem capable of getting it right. I don’t entirely blame them, as having worked at the sharp end for a number of years I know how difficult it is.

Except that it isn’t really – at least not for humans. Computers can’t do it because IDs don’t match and there’s very little fixed structure. It’s a schema-less nightmare. The only significant effort I’ve seen at creating a universal schema was hopeless. (Unfortunately I was supposed to be managing it at the time!) And yet it’s quite easy to match assets to metadata across formats (digital, physical etc.) as a human. We can match images and sounds, we can do loose / fuzzy text matches, and above all we have common sense.

The problem for us people is the scale of the problem – tens of millions of assets need matching – which superficially appears best-suited for tackling programmatically. So how can we reconcile the requirement for human intervention with a problem of vast scale?

This is a map/reduce problem at its heart – we need to spread the work across as many people as we can, and then aggregate the results. Is the Amazon Mechanical Turk the solution?

Add in the spice of having no fixed schema (what happens to your precious database when the music industry decide to create a new product type that looks a bit like an album, but different?) and it’s a problem for the NoSQL generation.

So here’s my solution – stick all the available data into a non-relational document store, index with a search engine, and then present a simple user interface to allow people to validate the metadata and to perform the all-important matching process. Finally, motivate people to do the work by paying them, and use the Mechanical Turk to manage the human map/reduce function.
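As a rough illustration of the "index, then let a person decide" step, the sketch below uses Python's standard `difflib` as a very crude stand-in for a real search engine. The titles, the `shortlist` helper and the 0.6 cutoff are my own assumptions; the idea is only that the machine narrows millions of candidates down to a handful and a human makes the final call.

```python
import difflib
from typing import Dict, List


def normalise(title: str) -> str:
    # Lower-case and collapse whitespace so trivial differences don't matter.
    return " ".join(title.lower().split())


def shortlist(query: str, titles: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` titles that look similar to `query`, best first."""
    by_key: Dict[str, str] = {normalise(t): t for t in titles}
    keys = difflib.get_close_matches(normalise(query), list(by_key), n=limit, cutoff=0.6)
    return [by_key[k] for k in keys]


catalogue_titles = ["Abbey Road (Remastered)", "Abbey Road", "Let It Be", "Abbey Rd."]
candidates = shortlist("ABBEY  ROAD", catalogue_titles)
```

The shortlist is what would be shown in the user interface; deciding whether "Abbey Rd." really is the same product is left to the person.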

Some kind of validation is required to maintain the data quality (only accept matches provided by multiple people?) – who knows, perhaps if enough people join, the labels / studios themselves might get involved to officially endorse the work (think Twitter verified accounts.)
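The "only accept matches provided by multiple people" rule could be sketched as follows. This is a guess at one workable rule, not anything the Mechanical Turk provides: the vote format, the `accepted_matches` function and the threshold of two agreeing workers are all assumptions of mine.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple

Vote = Tuple[str, str, str]  # (worker_id, asset_id, proposed_metadata_id)


def accepted_matches(votes: Iterable[Vote], min_agreement: int = 2) -> Dict[str, Optional[str]]:
    # "Map" step: group answers by asset, counting each worker at most once
    # per asset (a repeat vote overwrites that worker's earlier one).
    per_asset: Dict[str, Dict[str, str]] = defaultdict(dict)
    for worker, asset, proposed in votes:
        per_asset[asset][worker] = proposed

    # "Reduce" step: accept the most common answer only if enough distinct
    # workers agree on it and it is not tied with another answer.
    results: Dict[str, Optional[str]] = {}
    for asset, answers in per_asset.items():
        ranked = Counter(answers.values()).most_common(2)
        top_id, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        results[asset] = top_id if top_count >= min_agreement and not tied else None
    return results


votes = [
    ("w1", "cd-001", "meta-9"),
    ("w2", "cd-001", "meta-9"),
    ("w3", "cd-001", "meta-4"),
    ("w1", "cd-002", "meta-7"),
    ("w1", "cd-002", "meta-7"),  # same worker twice: still only one vote
]
matches = accepted_matches(votes)
```

Assets that come back as `None` would simply go back into the pool for more people to look at.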

All I need now is someone to pay to have it done…


I’ve just done my first couple of HITs (Human Intelligence Tasks) – looking up iTunes AudioBook prices for someone – hopefully I now have $0.04 winging its way across to me. Here’s a screencast of me in action! http://screencast.com/t/GazhIehoEW

Saturday, October 10, 2009

Riak – another No-SQL option

Just reading about Riak – another non-RDBMS solution, which pushes all of the right buttons:

  • Key-value store
  • Document-based
  • Extensible (in runtime) schema
  • Flexible inter-object links (i.e. relationships)
  • Includes Map/Reduce functions for data queries
  • Natively accessible over HTTP
  • Syntactically Get-Put-Delete, not CRUD
  • Deterministic & repeatable ID generation
  • Shares some concepts with Amazon’s Dynamo
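A few of the ideas in the list above (get/put/delete, schema-free documents, links between objects, repeatable IDs) can be shown with a toy in-memory store. To be clear, this is not Riak's API or its ID scheme; `ToyStore`, `deterministic_key` and the choice of hashing the document contents are assumptions made purely for illustration.

```python
import hashlib
import json
from typing import Dict, List, Optional


def deterministic_key(document: dict) -> str:
    # The same content always produces the same key; sort_keys makes the
    # JSON encoding independent of the order the fields were written in.
    encoded = json.dumps(document, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest()


class ToyStore:
    """An in-memory key-value store with get/put/delete and simple links."""

    def __init__(self) -> None:
        self._objects: Dict[str, dict] = {}
        self._links: Dict[str, List[str]] = {}

    def put(self, document: dict, links: Optional[List[str]] = None) -> str:
        key = deterministic_key(document)
        self._objects[key] = document
        self._links[key] = list(links or [])
        return key

    def get(self, key: str) -> Optional[dict]:
        return self._objects.get(key)

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)
        self._links.pop(key, None)

    def linked(self, key: str) -> List[dict]:
        # Follow links, silently skipping any target that no longer exists.
        return [self._objects[k] for k in self._links.get(key, []) if k in self._objects]


store = ToyStore()
artist_key = store.put({"type": "artist", "name": "Example Artist"})
album_key = store.put({"type": "album", "title": "Example Album", "year": 2009}, links=[artist_key])
same_key = store.put({"year": 2009, "title": "Example Album", "type": "album"}, links=[artist_key])
```

Putting the same album twice, with the fields in a different order, lands on the same key, which is roughly what "deterministic & repeatable" buys you.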


All-in-all it seems on the face of it to be a data persistence solution built for the internet. Details can be found here - http://riak.basho.com/nyc-nosql/ – though be warned, this presentation includes lots of diagrams.


Wednesday, October 07, 2009

Measurable quality

Whilst I was boring a (City-based) friend with my thoughts, he pointed out that in the City the highest paid people are often not the bosses, but the star traders. So now we have another model to investigate – the bonus culture, where workers are openly rewarded according to their contribution. Either way, it’s possible to earn a (very) decent living by continuing to practice the very thing that made you successful in the first place, with the business “management” being taken care of by people trained in their own way to do just that. (And having no greater status than those doing the work.)

The most successful lawyers continue to practice the law, and the most successful traders continue to trade. One could argue that this is because what they do generates enough money to make this an attractive proposition, whereas software development does not. And yet… software is surely the biggest growth industry in the last 25 years – it has literally appeared out of thin air, and yet it’s created some of the largest personal fortunes ever seen. An article I read a couple of years ago (wish I could find it) had some statistics about billionaires that included a summary of those who could “program a computer”; let’s just say that there were more computer programmers in the list than lawyers or accountants. So what gives?

Back in the real world, most software programmers are neither billionaires, nor working for billionaires, but the question of how to make a respectable living still exists. Of course, the great advantage that City traders have is that their contribution is measured in numbers* – which can be ranked and rated. Which begs the question, how do you evaluate a developer’s contribution to a company’s success?

If you started a company that became SAP (as did one of the programmers in the list), and therefore had both money to spend, and the ability to recognise technical talent, how would you measure it? Do you even bother – is one person’s line of code the same as another’s?

Everyone is not equal, we all know that, but how do you prove it; if it’s not possible to prove it objectively, does that make software development art and not science?

* They also have the huge advantage of having “make more money” as the sole focus of their job.

Monday, September 14, 2009

Rise up and take control

I overheard someone on the radio this morning (I think he was Chief Economist @ Google) talking about the skills required for success in the future. He made a case for geeks taking over the world, that an ability to understand information, to analyse and manipulate it, would be more powerful in the new economy than the traditional skills of being able to control language and emotion (hence the reason so many politicians trained as lawyers.)

It struck a chord as I have been thinking along similar lines for a while - why is it that most other professional services organisations work as partnerships (lawyers, accountants, doctors, bankers even) where those who do the work own and run the company, with IT services being the exception (with, of course, a few exceptions)? Why does the IT and software development community have such low self-esteem that they are happy to work for sales and marketing teams who have little real understanding of what they are capable of?

Of course there are companies where the techies do run the company, bringing in the business expertise from outside as and when necessary - Google being the most obvious example - but I'm thinking more of the new wave of small companies like those behind Fogbugz, Basecamp etc., where tech-savvy (and I mean really savvy, not just that they read Wired) entrepreneurs have shown that a working knowledge of the HTTP protocol doesn't preclude you from being successful in business.

So why is it so rare - a company where the accountants are brought in for the boilerplate business management issues, with product management and company strategy managed by those who really understand the product from the bottom up?

Thursday, September 10, 2009


Just to demonstrate how far we have evolved from our tree-dwelling ancestors, who only had to think about how to find food and shelter, there's a lot of online noise this morning about the fact that Google has increased the size (width) of the search box on their homepage.

Since I'm no better than any other fatuous twit I may as well join in. I think that this is related to the type-ahead dropdown, which is the width of the search box. Wider dropdown means more real estate for displaying richer 'live' content. Possibly.

Honestly, who gives a ...?

Monday, August 03, 2009

It’s a car, but what kind?

In my current role all analogies must be related to cars, so I thought I’d pile in with one of my own (and offload some of my current frustrations whilst I’m about it.)

Imagine for a moment that I work for a large car company. I have been contracted to build a small team to develop a racing car, outside of the factory production process. We can do it quicker than they could normally, use different materials, and have been allowed to operate outside of the normal operational processes.

For our next project the same team have been asked to develop a rally car. It’s a bit different, and a bit more complicated, but that’s OK, we can handle it, and we’re confident of hitting the date.

And then, half way through the project, they announce that the rally car we’re building is going to be used as the model on which they intend to base their new family hatchback, and that they have already decided on a date by which they intend to retire their existing top-selling model. Only it’s the same date as we are expected to deliver the rally version. And we’re not getting any more money.

Several weeks later, having roped in the corporation’s finance, legal, marketing, sales & advertising teams, and gotten buy-in to the date from the executive board, they are all asking why we’re struggling to meet the deadline.


Tuesday, July 14, 2009

Spot the bottleneck

This article (http://www.internetretailing.net/news/waitrose-picks-precise-to-manage-website-performance-and-availability) is quite interesting for a couple of reasons:

  1. Did they really need a contract with a third party to work out that some SQL statements were causing performance issues?
  2. The fact that improving SQL performance is their primary mechanism for performance gains.
  3. IBM provided the platform.
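On the first point, most databases will tell you how they intend to run a statement if you ask them. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` purely as a stand-in (I have no idea what the platform in the article looks like); the `orders` table, the index name and the `plan` helper are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)")
db.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is a readable description
    # of one step; the exact wording varies between SQLite versions.
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, (7,)).fetchall()
    return "; ".join(row[-1] for row in rows)


before = plan(QUERY)  # expected to describe a scan of the whole table
db.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(QUERY)   # expected to mention the new index
```

Comparing `before` and `after` is about the cheapest performance review going, which is rather my point.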

It would be very interesting to get a view into their existing architecture – to see how much tuning they have done so far.

(PS – I know performance is difficult – you only need to look at the website I currently work with/for/on to see that I don’t have all the answers; my argument is that if they have the money to employ someone specifically for the task that suggests they have exhausted all possible internal solutions. There is surely nothing more dispiriting for a development team than to be held to account for a performance issue and then denied the resources to solve it, only to see a third party parachuted in and given all the assistance they require.)


The other thing that I wanted to mention was the issue of scalability – we’re being asked to scale 1,000% in three months – should I be scared? An increase of 35% seems trivial – why can’t they scale out to cover that increase?

Friday, July 03, 2009

Tipping Point?

The mercury has now popped out of the top - when Computerworld starts picking up on these things we can now assume they have gone mainstream: http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9135086

To paraphrase ((c) Computerworld.com)
The movement's chief champions [...] learned to get by at their cash-strapped startups without Oracle by building their own data storage solutions, emulating those being built by Google and Amazon.

Now that their open source data stores manage hundreds of terabytes or even petabytes of data for thriving Web 2.0 and cloud computing vendors, switching back is neither technically, economically or even ideologically feasible.

Tuesday, June 30, 2009

Yet more bad news for RDBMS enthusiasts

The temperature's rising, and it's surely only a matter of time before normalisation is wiped from the developer's best-practice lexicon. Another well written article killing the myth here - http://www.roadtofailure.com/2009/06/19/social-media-kills-the-rdbms/

As previously stated here - the sheer scale of internet applications has exposed the short-comings of traditional databases in all but the most severe environments (banks?)

My favourite section of the article is the list of things that the author will not be missing with his new solution:

[Quote (c) Bradford Stephens, Road To Failure blog]

* Transactions. Our data is written in from a Hadoop cluster in large batches. If something fails, we’ll just grab the HDFS block and try again.
* Joins. Nothing is more evil than normalization when you need to shard data across multiple servers. If we need to search on 15 primary fields, we’re fine with copying our data set 15 times, with each field a primary key for its table.
* Backup and Complex Replication. All of our data is imported from HDFS. If high-availability is a must, we can simply use Zookeeper to keep track of what nodes die, and then bring up a new one and feed it the data needed in ~ 60 seconds. With scales of hundreds of millions of documents, no one will miss a few hundred thousand for that brief period of time.
* Consistency. If our users are analyzing millions of documents, they’re not going to care if there’s 15,000 unique Authors, or 15,001.

Agreed - if you're a financial institution, the difference between 15,000,000,000 and 15,000,000,001 is important, but for the rest of us, it just isn't.
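The "copy the data set once per searchable field" idea from the Joins bullet is easy to show in miniature. This is only my reading of it, with three fields rather than fifteen; the documents, field names and `build_tables` function are invented.

```python
from collections import defaultdict
from typing import Dict, List

documents = [
    {"id": "d1", "author": "alice", "source": "blog", "lang": "en"},
    {"id": "d2", "author": "bob", "source": "forum", "lang": "en"},
    {"id": "d3", "author": "alice", "source": "forum", "lang": "fr"},
]


def build_tables(docs: List[dict], fields: List[str]) -> Dict[str, Dict[str, List[dict]]]:
    # One "table" per searchable field. Each table holds a full copy of every
    # document, keyed by that field's value, so a lookup never needs a join.
    tables: Dict[str, Dict[str, List[dict]]] = {f: defaultdict(list) for f in fields}
    for doc in docs:
        for field in fields:
            tables[field][doc[field]].append(dict(doc))
    return tables


tables = build_tables(documents, ["author", "source", "lang"])
alice_docs = [d["id"] for d in tables["author"]["alice"]]
forum_docs = [d["id"] for d in tables["source"]["forum"]]
```

The price is storage and the chance of the copies drifting apart, which is exactly the consistency trade-off the quote is happy to make.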

Tuesday, June 23, 2009

Graph Databases

Something that has really resonated with me over recent weeks is the concept of the graph database. I’ve spent most of my professional career railing against RDBMS software and the frustration of database cost/scale/performance, and although graph databases (or key-value databases) won’t solve the database dilemma, it’s very encouraging to find such a vibrant community of experts trying to tackle these issues.

Here is a great presentation which introduces the concepts - http://markorodriguez.com/Lectures_files/risk-symposium2009.pdf
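For a flavour of what makes a graph model different, here is a tiny sketch of my own (not from the presentation): data held as labelled edges, and a query expressed as a walk along those edges rather than a join. The names, labels and the `reachable` function are made up.

```python
from collections import deque
from typing import List, Set, Tuple

# Edges are (source, label, target) triples; the data is invented.
edges: List[Tuple[str, str, str]] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("alice", "likes", "riak"),
]


def neighbours(node: str, label: str) -> List[str]:
    return [dst for src, lbl, dst in edges if src == node and lbl == label]


def reachable(start: str, label: str, max_hops: int) -> Set[str]:
    """Breadth-first walk along edges with the given label, up to max_hops."""
    seen: Set[str] = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # don't expand beyond the hop limit
        for nxt in neighbours(node, label):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                frontier.append((nxt, hops + 1))
    return seen


within_two = reachable("alice", "knows", max_hops=2)
```

In a relational database each extra hop would typically mean another self-join; here it is just another turn of the loop.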

Monday, June 08, 2009


Many years ago I wrote about the death of the ACID transaction and the rise of the compensating transaction in loosely-coupled systems (here). I've never liked databases, and their sensitive, delicate demeanour, so I'm particularly pleased to read more and more about the rise of massively scalable (and robust) "databases" based on denormalised key-value pairs - Cassandra (Facebook), BigTable (Google) & Dynamo (Amazon) to name but three.

I know none of these is exactly new, but I think the ideas behind them are being shared within the broader community these days, and that can only be a good thing.
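Since I mentioned compensating transactions, here is roughly what I mean, as a minimal sketch. There is no rollback; each step comes with an explicit "undo" that is run if a later step fails. The order-processing steps, the `run_with_compensation` function and the data are all hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None], Callable[[], None]]  # (name, action, compensation)


def run_with_compensation(steps: List[Step]) -> List[str]:
    """Run steps in order; if one fails, undo the completed ones in reverse."""
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"did {name}")
            done.append(step)
        except Exception:
            log.append(f"failed {name}")
            for prev_name, _, compensate in reversed(done):
                compensate()
                log.append(f"undid {prev_name}")
            break
    return log


stock = {"widget": 5}
payments: List[int] = []


def reserve() -> None:
    stock["widget"] -= 1


def release() -> None:
    stock["widget"] += 1


def charge() -> None:
    payments.append(799)


def refund() -> None:
    payments.pop()


def ship() -> None:
    raise RuntimeError("courier unavailable")  # simulated failure


log = run_with_compensation([
    ("reserve", reserve, release),
    ("charge", charge, refund),
    ("ship", ship, lambda: None),
])
```

Unlike an ACID rollback, the intermediate states (stock reserved, card charged) were briefly visible to everyone else, which is the trade you make in a loosely-coupled system.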

Tuesday, June 02, 2009

Google Wave

We’ve been using Basecamp to collaborate on our team for the past year, and one of the things that it highlights is how incredibly poor an experience email provides. A typical email thread (say 10 replies) can encompass a number of people who are cc’d in (or dropped) from any single message, making retrospective auditing of a decision very, very complex. (In fact it’s impossible if you weren’t on the critical email in the chain.)

Having got used to Basecamp, we now use its Messages function in preference to email precisely because it provides a single conversational thread where anyone can see the decisions being made in chronological order, irrespective of when they joined.

One of the features of Basecamp we haven’t really got comfortable with is the Chat feature, which is functionally equivalent to the Messages, but in ‘real’ time – i.e. it’s a better IM, where Messages are a better email.

Google Wave seems like a better Chat and a better Messages function, combined in one. I have no idea if it’s a ‘killer app’, and some of the initial press has suggested it’s just too ambitious, too complicated for non-technical users, but I for one applaud Google’s ambition in at least addressing the problem. Email is well past its sell-by date, and I think that for tech-savvy power users Wave (or an equivalent) could become a de facto communication medium.

Wednesday, May 27, 2009

Where does innovation come from (clue – it’s not IT)?

Somewhere in my previous post I stated that in the online economy innovation comes from the bottom not the top, something that I thought at the time was fairly uncontroversial.

Last week I attended an IT seminar, and two things struck me. First, I really don't work in IT - although I don't know what the alternative is; does the VP product development at Google put "IT Consultant" on their passport? Probably not, but what else is there? Anything else seems a little pretentious.

Second, my comment about innovation was quite controversial. The assembled crowd (mostly CIOs / IT Directors) nodded in agreement when the panel suggested that innovation was a luxury in the current climate, and that all that really mattered today was business value as measured by cost reductions and efficiency gains.

But if your business is technology, how can you not innovate? I was astounded at the assumption that innovation was something that could be turned off. I may be very fortunate in my current job but my role is essentially controlling the unstoppable flood of innovation from our development team, and directing it towards some appropriate business objective. Turning it off would be unthinkable, if even possible without losing the team itself.

Someone at the seminar gleefully announced that the fabled Google 20% time was all but gone now, and even the mighty search giant had succumbed to market forces. Well, possibly, according to the HR team, but I'll bet a lot of money that the innovation continues unabated, 20% time or not. Google's scheme was more about encouraging what goes on anyway, with or without formal recognition.

The Internet (capital 'I') has matured to the point where it now represents an industry sector in its own right. On the Internet innovation is endemic, and comes from the youngest, keenest, coolest people in the room and not the oldies in the comfortable chairs. And it's a lot more fun than IT.

Tuesday, May 05, 2009

Old v. New

Some quick thoughts about the comparison between old Enterprise projects and the new style of web projects:

Category | Enterprise | Web
Project style | Waterfall - Design, Develop, Test, Deploy, RUP | Agile - rapid iterations
Innovation | Top-down (Business Requirements) | Bottom-up (Google 20% time, Communities)
Team structure | Vertical - developers code complete spikes from web to database | Horizontal - front-end team is split from (and a client of) back-end team
Software | COTS Products - expensive, well-supported commercial products (ATG, IBM, MSFT) | OSS Frameworks & Patterns, Community-led initiatives, Bespoke 'glue', products for specific functionality
Development Environment | Restricted software tools, homogenous environment, complex approval process | Heterogeneous, open, embracing new technologies (best tool for the job)
Development Languages | Java, .NET, C++ | PHP, Perl, RoR, Python, JavaScript, …
Buzzwords | SOA, SaaS, Compliance | jQuery, Memcache, LAMP, JSON, API, OAuth, Mashup
Web Services | SOAP, WS-* | REST, JSON, POX
Documentation | Offline - licensed developer accounts; formal training courses | Online - updated on the fly, available to everyone
Architecture | N-Tier | Distributed
Infrastructure | Best-of-breed hardware, design against failure | Low-end hardware, expect failure and design accordingly
Database | Normalised, strict schema, referential integrity | Denormalised, dirty data, lazy-loaded
Data access | Direct – ODBC, JDBC, ADO.NET | ORM Framework, Cache, Services
Requirements | Secure, Scalable | Flexible, Fast
Hero clients / employers | Government, | Google, Yahoo!, Amazon, eBay, Facebook, BBC, Twitter, Next Big Thing
Success Criteria | How big can we grow, how secure is our data? | How fast can we react to change, how far can we scale?

Monday, April 27, 2009

Internet-scale application development

I have a posting in me somewhere about the significance of internet-scale applications and their impact on software development but for some reason I can't get it all out. So I thought I'd post a placeholder anyway, just to prove that I was thinking about it.

The summary is that the Internet has now exceeded Enterprise as the gold standard in software - platforms like Amazon, Facebook, Google, Yahoo etc. have pioneered ways of working and architectural patterns that make traditional "n-tier" enterprise applications look like toys.

Expect more soon, once I've worked my thoughts into some logical order.