As part of GigaOM’s Structure:Data Conference (taking place in New York City on 20-21 March), Jo Maitland and I are going to host a Mapping Session on Data Marketplaces. What are they, what are they doing, why do they matter, and how does their future look? The session is intended to be highly interactive, so attendees should come armed with questions, opinions, data, and perspectives to share. It is free to registered Structure:Data attendees, but space is limited, so sign up here if interested.

About a year ago, I began a series of podcasts to explore the scale and nature of the opportunity facing a growing collection of so-called Data Markets. I spoke with the most visible champions of the idea such as Iceland’s DataMarket, Infochimps, Factual and Microsoft, as well as smaller companies like AggData with different perspectives to share.

Some of the conclusions from the series appeared in a report (subscription required) for GigaOM Pro, and I’ve continued to watch the space, and to provide more detailed assessments of it to several clients.

Companies like Factual and DataMarket continue to grow. Factual adds new data and products with regularity, whilst DataMarket has opened a US office and – with their new energy portal – begun to demonstrate that vertical sites may ultimately prove more appealing than one-size-fits-all repositories of mixed data.

Another early entrant, Infochimps, is still going strong, but shows every sign of pivoting away from provision of a data market toward offering Big Data management and processing tools. Every time I visit the site, the data seems harder to discover (keep scrolling; it’s at the very bottom of that looooonnnggg home page). Infochimps, clearly, is finding more customers willing to pay for software and services than for boring old data. It is possible that they’re right, and mainstream adoption of the data market idea will require far more emphasis upon the delivery of services and outcomes, with the data becoming a necessary foundation rather than an end in itself. At the end of the day, that might well be a healthy change of emphasis.

Kasabi, of course, has gone altogether, after parent company Talis (see disclosure) prematurely decided to focus its attentions elsewhere. And then there’s Microsoft. The Azure Data Market is still there, and it’s still steadily gaining data, but it doesn’t look to me as if Microsoft is actually trying very hard. The company’s offering just seems to be bubbling along gently. If Microsoft really focused effort, attention and money here they’d be far more visible by now.

So, as you’d expect, we’ve seen some successes, some failures, and some really odd decisions in the past year. Outside of tightly defined verticals such as Finance, prospective providers and consumers of data remain confused as to the worth of data. Everyone has a vague idea that money can and should change hands for these ephemeral bits and bytes, but providers are – frankly – guessing how much to charge. One of the biggest opportunities for a data market is as an aggregation of data created by others, but to be truly effective that aggregation requires effort. Someone has to harmonise the data, someone has to document metadata, data acquisition methodologies, etc. Someone has to create and maintain a mechanism to describe the relative merits of one (expensive) dataset of New York delis over another (cheaper) one. Someone has to reconcile the ‘123 Market’ in one data set with ‘Flat 6, 123 Market St.’ in another, and ‘Apartment 6, 121-123 Market Street’ in a third. That is where the real value lies, but it’s also where many of the most intangible costs lurk. How much is cleansing and harmonisation worth? Without it, though, they’re not data markets; they’re data jumble sales.
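To give a feel for why that reconciliation is real work, here is a rough Python sketch that treats the three Market Street variants above as probably the same building. Everything in it is an assumption made for illustration: the function names, the tiny suffix list, and the rule that overlapping house-number ranges on the same street count as a match. It is not how any particular data market does it, and a real service would need a proper gazetteer and a good deal of human judgement.

```python
import re

# Hypothetical, deliberately tiny suffix list; real address data needs far more.
STREET_SUFFIXES = {"st", "street"}

# Leading unit designator such as "Flat 6, " or "Apartment 6, " (assumed format).
UNIT_PATTERN = re.compile(r"^(flat|apartment|apt|unit)\s+\w+,\s*", re.IGNORECASE)

# House number or number range, followed by the rest of the address.
NUMBER_PATTERN = re.compile(r"^(\d+)(?:-(\d+))?\s+(.*)$")


def parse_address(raw):
    """Return (low_number, high_number, street_name), or None if unparseable."""
    text = UNIT_PATTERN.sub("", raw.strip())
    match = NUMBER_PATTERN.match(text)
    if match is None:
        return None
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    # Lowercase each word, trim punctuation, and drop street suffixes.
    words = [w.strip(".,").lower() for w in match.group(3).split()]
    words = [w for w in words if w and w not in STREET_SUFFIXES]
    return low, high, " ".join(words)


def probably_same_building(a, b):
    """Heuristic: same street name and overlapping house-number ranges."""
    pa, pb = parse_address(a), parse_address(b)
    if pa is None or pb is None:
        return False
    return pa[2] == pb[2] and pa[0] <= pb[1] and pb[0] <= pa[1]


variants = [
    "123 Market",
    "Flat 6, 123 Market St.",
    "Apartment 6, 121-123 Market Street",
]
all_match = all(
    probably_same_building(a, b) for a in variants for b in variants
)
```

Even this toy version has to make decisions somebody must own: whether ‘123 Market’ means the whole building or one flat in it, and whether a number range should count as a match at all. Those decisions are the intangible cost.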

More explicitly focusing on vertical markets, as DataMarket has begun to do, clearly creates opportunities to build and sustain communities of interest. The trick is getting a community that is focused enough to cohere whilst remaining large enough to deliver revenue. Subsuming the data beneath services also has potential; there may be more interest in a service to help you select the best energy provider for your needs than in a site that lets you download a spreadsheet of energy price fluctuations.

And finally, one area that none of the data markets have really touched upon is the whole opportunity around personal data and data lockers. We all know that personal data has value (both monetary and other). We also recognise a growing sense of discomfort with the ways in which commercial organisations take and reuse data about ourselves and our interactions with their services. But there are a growing number of startups (and Government projects!) looking at ways to return control of the data to the individual, and to offer money, services, and features in return for sharing some of that data with others. As these ideas develop, how might they play in the data market landscape? Watch this space, I think.

So, if any of that is of interest, or if you’d like to discuss some other aspect of the data market world, then do join us in New York next month.

Structure:Data fills Wednesday 20 March and Thursday 21 March for me. I get in from the UK on the evening of Monday 18 March, and still have some gaps in my schedule for Tuesday 19 March. So if you’re in or near Manhattan and want to meet up, please do get in touch.

And, given the conference venue, I should get a welcome opportunity to revisit the High Line.

Disclosure: I am a former employee of, and current shareholder in, Talis.

Photograph of New York’s High Line Park by Flickr user A. Strakey