Edinburgh journey times, from Mapumental

As the Big Data hype machine continues its relentless attempt to gobble everything in its path, new business units and entire new domains buying into the promise find themselves faced with unanticipated data volume and complexity. They see the potential for data-based decision making, but still face (short-term?) challenges in actually managing, analysing or interpreting the data they now collect.

Early iterations of core tools such as Hadoop were raw and unpolished, driving the emergence of a niche group of developers and data analysts with the specialist skills to cope. Those tools become easier to use with each new release, which goes some way toward countering panicked claims of a massively yawning skills gap.

There is, of course, still a need for people with specialist knowledge, hard-won skills, and painfully gained experience. But you no longer (if you ever really did) need to install a Hadoop cluster with your bare hands and juggle complex statistical formulae in your head to benefit from the growing prevalence of data in all aspects of business.

At the ‘softer’ end of the market, specifically, there has been an explosion of new startups rushing to offer tools that make it easier to create visualisations and dashboards to deliver some value from the data whilst hiding its complexity. Some established players in the visualisation and business analytics market, such as IPO-prepping Tableau and IPO-considering Rosslyn Analytics, are hoping to press home their lead. Elsewhere, visualisation and business intelligence pros eye the influx of new users with horror, reprising the wailing of typesetters and designers when desktop publishing tools like Ventura and PageMaker first appeared; these amateurs, they protest, cannot do it properly. They do not understand. They — shock, horror — might mislead people.

We’ve been here before. A $50bn market opportunity will tend to see smart young things (and charlatans) emerge from the woodwork in droves. The market will settle. Some of the incumbents will survive, some of the new entrants will succeed, and most of the customers will (eventually) work out what they really need. Visualisation’s equivalents of using a different font for every paragraph (just because you can) and the <blink> tag will fade, just as the originals did in desktop publishing and web ‘design’.

The new data visualisation tools from companies like chart.io demonstrate an interesting trend towards atomisation of business tasks. We’ve seen this in the consumer space for a long time, particularly since the advent of channels like Apple’s App Store. But in business, the trend has been towards consolidation, with fewer and fewer software tools trying to do more and more. These bloated behemoths become ever harder to use and ever harder to support, and their developers find themselves increasingly constrained by dependencies and legacy as they try to innovate or encourage the next upgrade cycle.

Now, though, we’re seeing renewed enthusiasm for the small, focused application. It needn’t do much, but it must do what it does compellingly and well. Increased provision of APIs helps, of course, with users more easily able to extract data from a source, work with it, and then pass the result on to the next application in the chain. In principle, a recipe for freedom of choice, best of breed, and healthy innovation. Whether it continues to work in practice remains to be seen.

Many of these topics surfaced in a conversation earlier this week, as I spoke with Vik Singh, CEO and co-founder of Infer. The company has been in the news this week because it secured $10m in new funding. But Singh talks passionately of the need to “do one thing well.” Infer, he suggests, does that. They’ve identified a problem — lead scoring — in which the focused application of some effort can quickly lead to a “measurable lift on the top line” of their customers. Infer can be deployed (as SaaS) in a matter of days, integrated with existing tools such as Salesforce and Marketo, and quickly start to deliver demonstrable returns.

“We focus on a particular problem… Our team understands what matters. We don’t focus on counting features. We focus on finding the least amount of features to deliver the most value.”

It’s a refreshing attitude, but does it create a product, or a company, that can grow? Singh certainly seems convinced that it does, and talks of ways that the basic notion of lead scoring can be extended as customers move away from intuition and best guesses towards a data-based decision-making process.

Tools are increasingly emerging to help business decision makers be better informed by the data that matters to their organisation. Some of those tools are best used by people with the technical and business skills to understand data and its implications. Some are, perhaps, too easy to use, and will encourage half-baked analyses based upon inappropriate data. In purely Darwinian terms, those tools (and the companies that invested too heavily in them) will most likely fall by the wayside. We’re at an early stage in a journey that needs to see all of us become more data literate. Good tools, smart companies, and some great lighthouse examples can all help with that. Just as use of desktop publishing and web design tools settled down, we’ll see the same with data manipulation tools. I’m not particularly worried about this industry’s ability to deliver good infrastructure tools, good data ingest, storage, and manipulation tools, or good data visualisation tools.

I’m more concerned to ensure that we nurture the softer skills. Almost anyone can make a chart (they teach it to seven-year-olds!). Many people can even make the right chart. Far fewer people can craft the narrative that makes the data in that chart sing, that uses it as a tool to persuade, to shape, and to take decisions. We’re cracking the technical side of the problem. Now we need to find, and to celebrate, the storytellers.

And just to finish off, it was interesting to see Stephen Wolfram’s blog post this week, Data Science of the Facebook World. Lots of data, lots of graphs, and the beginnings of a story to make the data sing…

Image, produced by Mapumental, shows average public transport journey times into central Edinburgh.