Sunday, March 13, 2011

QCon 2011 (London): Day Two

Day two was (for me) primarily about the big guys – Google, Twitter, Netflix & Best Buy all made appearances, so this was less about the Special Forces of day one, and more about how that theory scales.

Keynote: Innovation at Google (Patrick Copeland) 8/10:

Great talk from Patrick Copeland on Google culture. Hard to summarise in short form, but essentially the emphasis is on innovators (people) and not on ideas. Ideas are worthless without execution: everyone has an idea (Google themselves have 100-200k employee ideas in their idea-database), but very few people can make something out of one. Look for the spiky, difficult people – constantly challenging, never taking the easy route. Be prepared to fail, and fail fast – use data mercilessly to measure an idea, and if it’s not going anywhere, kill it – Wave is the poster child for this philosophy. Google ‘dog-food’ anything and everything – not just new ideas per se, but even within the core product. Every search results page served is a complex multi-variate test (I’ve noticed this myself), measured and analysed. As in the previous day’s talks, no one should get too attached to anything, as everything is expendable. Fun section on the concept of the “pretendo-type” or “pretotype” – an example being the Palm guy who made a fake Palm Pilot out of wood and carried it around with him to see if it felt right. (See pretotype.org.)

One possible newsworthy item here was the impending launch of “Androgen” – a “v. fast prototyping tool for Android”. Watch this space.

Building Best Buy’s RESTful Commerce Engine (Brian Sletten) 8/10:

Starting to get into deep tech here, which was great. I worked with BBY a few years back, so have some appreciation of the complexity of creating software for such a behemoth, and they seem to have done a great job on their commerce API. The secret – well, there is no silver bullet (of course), but they seem to have taken a pragmatic approach – start simple, listen to your users and build out. I loved the purity of their implementation – particularly the use of hypermedia to provide fully-formed URLs to client software. You call the initial URL, and it will return all of the available service URLs. If they release a new service, they add it to the manifest, and you use it when you need to. Additional features included the use / extension of the link/rel attribute.
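To make the manifest idea concrete, here is a minimal client-side sketch. It assumes the initial URL returns a JSON document listing links by relation name. The JSON shape, the `api.example.com` URLs and the helper names are all hypothetical, and none of this is taken from Best Buy's actual API.

```python
import json

# Hypothetical manifest, standing in for the response from the initial URL.
# The structure, rel names and URLs are illustrative assumptions only.
MANIFEST_JSON = """
{
  "links": [
    {"rel": "products", "href": "https://api.example.com/v1/products"},
    {"rel": "stores", "href": "https://api.example.com/v1/stores"}
  ]
}
"""


def service_urls(manifest_text):
    """Map each link relation in the manifest to its fully-formed URL."""
    manifest = json.loads(manifest_text)
    return {link["rel"]: link["href"] for link in manifest["links"]}


def url_for(manifest_text, rel):
    """Return the URL published for a relation, or None if it is not listed."""
    return service_urls(manifest_text).get(rel)


# The client never builds URLs itself; it only follows what the manifest offers.
print(url_for(MANIFEST_JSON, "products"))
# A service that has not been released yet is simply absent from the manifest.
print(url_for(MANIFEST_JSON, "reviews"))
```

The point of the design is that the client hard-codes one entry URL and a handful of relation names, so a new service appearing in the manifest breaks nothing for existing callers.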

Our old friend “small teams, co-located” cropped up again – is there anyone who still believes that anonymous, distributed teams of developer-drones can succeed?

One thing that they found very useful is BDD – they have a very complete set (1,000+) of Cucumber tests that they can run, which apparently help enormously with smoke / regression testing of new releases.

Data Architecture at Twitter Scale (Nick Kallen) 7/10:

Now we’re getting very deep – this was the first of a number of very technical presentations, which I can’t really do justice to here – though I would recommend the presentation when it finally appears on InfoQ. It was a really interesting data history lesson on Twitter – how they migrated from the original Rails / MySQL implementation to the (incredibly exotic) system they run today. Suffice to say, I now understand why the API restricts you to the last 3,200 tweets. They are fortunate in being able to optimise for some incredibly precise scenarios, and so have built an architecture that supports those specific use cases. As one example, all of a user’s read-only timeline is cached in memory – so that when someone tweets, all of their followers receive a message (pub-sub) that is then prepended to their own in-memory timeline. They have internal SLAs around every scenario – which includes propagating updates within a second – just imagine what that means when Ashton Kutcher tweets… (they can support 4.8m messages/second).
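The in-memory timeline idea can be sketched as a toy fan-out-on-write cache. This is my own illustration, not Twitter's code: the class and method names are hypothetical, everything lives in a single process, and capping each timeline at 3,200 entries is an assumption borrowed from the API limit mentioned above.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 3200  # assumption: reuse the API limit as the cache cap


class TimelineCache:
    """Toy fan-out-on-write: each tweet is pushed to every follower's timeline."""

    def __init__(self, limit=TIMELINE_LIMIT):
        self.limit = limit
        self.followers = defaultdict(set)  # author -> set of follower names
        self.timelines = {}                # user -> deque of tweets, newest first

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def tweet(self, author, text):
        # Write path: one prepend per follower, so cost grows with follower count.
        for follower in self.followers[author]:
            timeline = self.timelines.setdefault(
                follower, deque(maxlen=self.limit)
            )
            # appendleft on a full bounded deque drops the oldest entry.
            timeline.appendleft((author, text))

    def read(self, user, count=20):
        # Read path: no joins or queries, just a slice of a precomputed list.
        return list(self.timelines.get(user, ()))[:count]


cache = TimelineCache()
cache.follow("bob", "alice")
cache.tweet("alice", "hello from QCon")
print(cache.read("bob"))
```

The trade-off is that reads become trivially cheap while writes do work proportional to the number of followers, which is exactly why a celebrity tweet is the interesting case.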

Couple of interesting bits: all engineering solutions are considered transient (following the theme), good enough is considered good enough, and until recently social graph events (following/unfollowing) exceeded tweet events. Counter-intuitive, but true.

Behind the Scenes at VISA (John Davies) 8/10:

This was a first – the only unrecorded session (I believe) – and it was a walk into the murky world of the world’s finance backbone. I hesitate to repeat anything here – I fear for my safety! John’s own presentation was neutered by the policy wonks at VISA, so this was largely a Q&A session. Interesting factoids: VISA runs on a mainframe (is not distributed), it runs at 99.99999% availability, it provides the US government with their GDP figures, and it can afford to lose two data centres without interrupting service. No one knows where the DCs are, and their core systems have probably only changed 10% in twenty years (contrast with Dan North’s software half-life measured in months). This is the opposite of Agile, and probably good for it – this is, next to ICBM programming, about as serious as it gets. But that also presents VISA with a problem – when Hyves (Dutch Facebook) introduced a payment system, VISA card transactions dropped 25% in nine months – if Facebook (/ Google / ?) were to do the same, VISA’s entire business could collapse, and with their bunker-mentality they would have no way to react. So, they’re getting out into the community, hence John’s presence at QCon – VISA needs the development community more than we need them – but this could present some life-changing opportunities to some.

Be warned though – if you work at VISA you may have to sign quite a long, restrictive, covenant.
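As an aside on the 99.99999% availability figure quoted above, it is worth working out what seven nines actually allows. The function below is just my own back-of-envelope arithmetic, assuming a 365-day year; it is not anything VISA presented.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def downtime_seconds_per_year(availability_percent):
    """Seconds of permitted downtime per year at a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return SECONDS_PER_YEAR * unavailable_fraction


# Seven nines, as quoted in the session.
print(round(downtime_seconds_per_year(99.99999), 2))
# Three nines, for comparison.
print(round(downtime_seconds_per_year(99.9)))
```

Seven nines works out to roughly three seconds of downtime a year, against nearly nine hours at three nines.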

Netflix’s Cloud Data Architecture (Siddharth Anand) 8/10:

Some great facts re. Netflix – they were spending $600m/year on USPS postage before embracing the web, streaming costs 1% of posting a DVD, at peak they account for 20% of downstream bandwidth in the US, they have $2bn in revenue, and they do all this with 400 employees, 15 of whom are in IT operations. That’s 15. Extraordinary stuff.

Anyway, they decided to move everything to AWS, which they did, in commendable JFDI style. One interesting point – they never attempted a wholesale data import from DC to cloud, but instead moved a user’s records on demand, when the user first accessed the new site. This meant that they were able to move everyone across over a two-to-three-week period.
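The on-demand migration pattern is simple enough to sketch. This is a guess at the shape of the idea rather than Netflix's implementation: plain dicts stand in for the data-centre and cloud stores, and the class and method names are hypothetical.

```python
class LazyMigratingStore:
    """Copy a user's record from the legacy store to the cloud on first access."""

    def __init__(self, legacy, cloud):
        self.legacy = legacy  # dict standing in for the data-centre database
        self.cloud = cloud    # dict standing in for the cloud data store

    def get(self, user_id):
        if user_id not in self.cloud:
            record = self.legacy.get(user_id)
            if record is None:
                return None  # unknown user in both stores
            # First access on the new site: migrate just this user's record.
            self.cloud[user_id] = dict(record)
        return self.cloud[user_id]

    def migrated_fraction(self):
        """Share of legacy users whose records now live in the cloud."""
        if not self.legacy:
            return 1.0
        moved = sum(1 for user_id in self.legacy if user_id in self.cloud)
        return moved / len(self.legacy)


legacy = {"u1": {"queue": ["Brazil"]}, "u2": {"queue": []}}
store = LazyMigratingStore(legacy, cloud={})
store.get("u1")
print(store.migrated_fraction())
```

Active users migrate themselves within days, and whoever has not logged in by the end can presumably be swept up with a much smaller batch job.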

Yet again, as if it needed re-emphasising – small teams of capable people, empowered to do the job.

Using Hypermedia Services for System Integration (Tim Ewald) 6/10:

This was the end session of the REST track, and I felt a bit like I’d walked into a private conference. This talk was quite opinionated, and there was a lot of speaker-audience chat about how important it was to use link/rel for hypermedia, or some other document annotation. I don’t really care – the theory of hypermedia is interesting in general, and specific implementations are interesting (qv BBY session), but frankly the academic details are not. This was a classic case of over-thinking. A bit more JFDI, a little less ivory-tower please.
