Albert Einstein. Image © 1947.

Albert Einstein, you may have heard, was a clever man. He scribbled equations on blackboards, thought big thoughts, and all of that. But, allegedly, he also said:

Everything should be made as simple as possible, but not simpler.

These words have resonated with me recently, as I’ve heard pitches from one company after another, all of which are trying to cut through the complexity of data to make it accessible. Their goals appear laudable, but all too often I find myself wondering just how simple this stuff can be. If we make it too simple, do we run the risk of unleashing a flood of half-baked ‘analysis,’ undertaken by people who really shouldn’t be allowed near a calculator, let alone a Hadoop cluster? On the other hand, there’s a persuasive argument to be made for democratising access to data and tools, freeing organisations from over-reliance upon their new High Priests of Data.

Not every data question should require a data scientist, but maybe we really shouldn’t be making it too easy for people to tackle the hard questions without support from someone who knows what they’re doing.

How simple, then, is too simple? And can we use data in a different way, in order to offer enough simplicity, smartly?

One company which has tried to do that in an interesting way is Datahero, and I spoke with co-founder Chris Neumann this week to learn some more.

There’s been some good coverage of the company over the past few weeks, and if you’ve not come across them before then it’s worth taking a skim through the Related articles at the end of this piece. I won’t bother repeating those pieces here.

The thing that interested me — other than Chris’ obvious passion and enthusiasm for his subject — was the way in which Datahero plans to use a mix of data analysis, user experience design and machine learning in order to guide the user toward analyses and visualisations that are likely to be of use to them. Chris is quick to stress that the company isn’t looking too closely at the data values its customers upload. Instead, the system studies the structure of the data (this column contains dates, this column contains place names, etc) and any associated metadata in order to make recommendations for logical ways to visualise the dataset.
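To make the idea of recommending from structure rather than values a little more concrete, here is a minimal Python sketch. It is emphatically not Datahero's implementation, which I haven't seen; the function names, the three coarse column types, the single date format and the chart rules are all assumptions of mine, chosen only to illustrate the principle.

```python
from datetime import datetime


def infer_column_type(values):
    """Guess a coarse type ("date", "number" or "category") for one column.

    Hypothetical helper: only ISO-style dates (YYYY-MM-DD) are recognised.
    """
    if not values:
        return "category"

    def all_parse(parser):
        # True only if every value in the column parses without error.
        try:
            for value in values:
                parser(value)
            return True
        except ValueError:
            return False

    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    if all_parse(float):
        return "number"
    return "category"


def suggest_chart(column_types):
    """Map a {column name: type} dict to a chart suggestion (assumed rules)."""
    kinds = list(column_types.values())
    if "date" in kinds and "number" in kinds:
        return "line chart"      # a value changing over time
    if "category" in kinds and "number" in kinds:
        return "bar chart"       # a value compared across groups
    if kinds.count("number") >= 2:
        return "scatter plot"    # two measures against each other
    return "table"               # nothing obvious to plot


sample = {
    "month": ["2013-01-01", "2013-02-01", "2013-03-01"],
    "revenue": ["1200", "1350.5", "1100"],
}
types = {name: infer_column_type(vals) for name, vals in sample.items()}
print(types, "->", suggest_chart(types))
```

Note that the suggestion depends only on the inferred column types, never on what the revenue figures actually are, which is the distinction Chris was keen to draw.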

As the number of users grows, the roadmap also includes Amazon-style recommendations. If a lot of other people who upload a quarterly sales forecast graph it in a certain way, then it makes sense to recommend that type of graph to a new user who uploads data with a similar structure. The recommendation won’t always be right, but it should go a long way toward minimising the fear of staring at columns and columns of data without a good idea about where to turn first.
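A crude way to picture this kind of "people with data like yours chose..." recommendation is to count which chart earlier users picked for each data structure. The sketch below is my own illustration under stated assumptions, not anything Datahero has described: the `ChartRecommender` class is hypothetical, and it treats two datasets as similar only when they have exactly the same set of column types, which a real system would surely relax.

```python
from collections import Counter, defaultdict


class ChartRecommender:
    """Hypothetical popularity-based recommender keyed on dataset structure."""

    def __init__(self):
        # structure signature -> how often each chart type was chosen
        self._choices = defaultdict(Counter)

    @staticmethod
    def signature(column_types):
        # Sort the types so that column order does not matter.
        return tuple(sorted(column_types))

    def record(self, column_types, chart):
        """Remember that a user with this structure picked this chart."""
        self._choices[self.signature(column_types)][chart] += 1

    def recommend(self, column_types):
        """Return the most popular chart for this structure, or None."""
        counts = self._choices.get(self.signature(column_types))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


recommender = ChartRecommender()
for _ in range(3):
    recommender.record(["date", "number"], "line chart")
recommender.record(["date", "number"], "bar chart")

print(recommender.recommend(["number", "date"]))  # same structure, new user
print(recommender.recommend(["category"]))        # structure not seen before
```

With no history for a given structure the sketch simply returns nothing, which is one reason this sort of recommendation only becomes useful as the number of users grows.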

Smart recommendations, clever algorithms, and an engaging UI will not, on their own, turn everyone into a data scientist. But nor, really, should they. What they can do, though, is what Neumann described as “enabling the 99%”: those people who have a dump of data from Mailchimp or Google Analytics or Salesforce or Excel, and who don’t really know how to begin sensibly visualising the multitude of columns and rows of numbers.

It’s an intriguing idea, and it will be interesting to see how successfully the system can deliver value to a potentially huge pool of individuals who find themselves data-rich, but skills-poor.

Even if technically successful, of course, the challenge will be persuading those same users to continue paying for the service. If they think they’ll use it often enough to pay for, won’t they end up acquiring enough of the skills to work with their own data anyway? Unless, of course, Datahero manages to grow and add complexity along with its users…

Image of Albert Einstein, ©1947. Image sourced from the Library of Congress’ Prints and Photographs division with identifier cph.3b46036. Shared on Wikipedia, and deemed to be in the Public Domain.