Big Data

Image by Kevin Krejci via Flickr

The CloudCamp unconference returned to London for the 14th time this evening, regaling a capacity crowd in the Crypt below Clerkenwell’s St James Church with several hours of discussion and debate on the somewhat elusive topic of ‘Big Data’.

Rather rough notes of the proceedings follow, after the break.

LEF’s Simon Wardley kicked proceedings off as usual, once again managing to pepper an on-topic canter through the subject with a seemingly never-ending stream of Flickr images of cats… and analogies to electricity. You possibly had to be there? His core message, though? There’s nothing new under the sun… and the cycles of change just keep on coming.

Next, Peter Matthews from CA Labs, on “is big data mutually compatible with the cloud?” Erm, yes. Big data volumes are so large that the data is difficult to move around… which creates opportunities for lock-in that vendors may wish to seize. And then he was out of time.

Next, Fujitsu’s Mark Wilson on ‘Structuring Big Data.’ He’s actually talking about Linked Data, a topic I’ve dug into before here and elsewhere. Linked Data could be/might be the effective realisation of the decade-old Semantic Web dream. Big Data means masses of unstructured or semi-structured content, presenting a management headache of previously unanticipated proportions. Linked Data, he argues, creates the mechanism to link all of this data together from across disparate sources. Yes, but it’s easier to say than to do… And in 5 minutes he really couldn’t explain enough to persuade the audience. Linked Data should be “the optimal reference source,” he said. It should be “a broker for all data sources,” and we should “think about integration, not duplication.” Yeeeeees… But.

Next, Canonical’s Nick Barcet, talking around scalability, Ubuntu, package management, configuration management, etc. Not wholly sure what the point was, I’m afraid.

Next, Chris Swan from UBS – big data and security. “If you’ve got security controls that aren’t properly monitored, then they don’t matter.”

Next, Tom Leyden of Amplidata – Big “Unstructured” Data in the Cloud. Data storage to increase 30x over the next decade, but staff will only increase 50% over the same period. Challenge in the 90s, as existing storage and analysis technologies struggled to cope with new data volumes. Seeing similar problems today with data streaming from the sensor web, etc. Traditional file systems cannot cope. Object Storage the way forward?

Next, Alex Farquhar – “Cloud v Big Data.” Not really versus… but the intersection of the two. Too much discussion of his company, Forward. Just talking about how his company uses cloud to provision IT resources. Might work as a conference presentation or case study – not sure it fits as a 5-minute lightning chat. Around 60TB of data at Forward. Diverse and vital. Using a Hadoop cluster – 24 nodes on-premise. Rationale (proximity to the cluster) seemed odd. That can be true, but not clear that it really needs to be the case here?

Next, Alaric Snell-Pym, on Scaling Hadoop. Trying to overcome Hadoop’s I/O bottleneck. Explaining basics of Hadoop and Map/Reduce – no one else has. Explains use of HDFS and ‘selective reading’ to manage lots of small tables and overcome the problems of I/O.
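For readers new to the idea, here is a minimal sketch of the map/reduce pattern in plain Python. It is not Hadoop's API and not anything shown in the talk; the function names and the sample lines are illustrative assumptions, and everything runs in one process rather than across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, purely for illustration.
    sample = ["big data in the cloud", "the cloud and big data"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce steps run on many nodes, and the shuffle moves data between them over disk and network, which is roughly where the I/O cost mentioned in the talk comes from.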

Next, Matt Wood from Amazon. Talking about genetics and the human genome. It’s an analogy. Human Genome Project took years and millions of dollars. Development of gene sequencing machines led to a step change – dramatic drop in cost of sequencing DNA. Like the cloud, anyone? But… the machines create an analysis challenge, because they generate so much data. Cloud offers “collection of productivity tools” to help scientists work with this data collaboratively and (relatively) affordably. A perfect example of a lightning presentation, unlike most of those that preceded it.

And finally, an impromptu slot from HP’s Joe Weinman. A quick overview of current thinking behind his latest book. This one could have gone for much longer… Good stuff.

And that’s the lightning talks finished. Now, the panel, and Simon Wardley’s search for “experts” and “volunteers.”

…and unfortunately, your scribe was ‘volunteered’ as an ‘expert’ by Mr Wardley… and here end the notes. It was great to have Amazon’s Werner Vogels sneak in, and lob comments into the panel, though…

Great event, though with the usual mix of people you wish could have talked for longer… and people you wish hadn’t spoken.
