In a pair of blog posts yesterday, Andreas Blumauer of Austria’s Semantic Web Company touched on an area that has been absorbing my attention recently, and raised some questions worth exploring here.
I am travelling to San Diego next week to speak about the importance of evolving Enterprise attitudes to data. Borrowing some nice turns of phrase from Sir Tim Berners-Lee’s recent TED talk and JP Rangaswami’s keynote to Powered by Cloud, I’ll be suggesting, amongst other things, that enterprises ‘stop hugging their data’ and move ‘from data centre to data centric.’
The Linked Data initiative, which began in March of 2007 as a community project supported by W3C’s Semantic Web Education & Outreach (SWEO) Interest Group (of which I was a member), has been a huge success. Described by Berners-Lee as ‘the Web done right,’ the notion of Linked Data rests upon the acceptance of four simple principles, yet opens the door to previously unanticipated re-use of data scattered across the Web.
The most rapid adoption has, unsurprisingly, been seen in terms of liberally licensed data already visible on the Web in some form. DBpedia, for example, is a community effort to extract structured information from Wikipedia and expose the individual facts for use across the Web. There have also been examples — as always justified by hacker mentality, ‘academic freedom,’ the imprimatur of ‘research,’ or the expectation that the perpetrators are ‘too small’ to be noticed — in which data have been appropriated to the cause without due care and attention to the rights of the data owner, but these isolated cases should certainly not detract from the value of the broader effort.

Public Interest data from organisations such as the BBC has also begun to appear in the ‘Linked Data Cloud’, and the frequency and strength of reciprocal links between participating resources are growing rapidly.
Enterprise data is effectively invisible to this Cloud, which brings me back to Andreas’ first post. In it, he asks:
“Since the [Linked Data] cloud is kind of the basic infrastructure which drives the whole process – this layer should remain a freely accessible one. But how could new business models be built on top of it (and constantly spend money on maintaining and extending the underlying infrastructure)?
Where could enterprises start using Linked Data? Only by retrieving data from the ‘outside’ and mash it up with the ‘inside’ – only one way?”
I can certainly see cases in which cautious corporates will be willing to consume without contributing in return, and there’s clearly work to do in demonstrating the value that they could gain from more balanced participation; participation that should never mean unwillingly ‘giving away’ competitive advantage or sensitive data.
We have an annoying tendency to view data in our databases as an indivisible mass, vigorously and unthinkingly applying the same (expensive) protections to an uninteresting and low-value factoid of underlying context as we do to the core attributes of our next big lead.
Andreas concludes this post by suggesting something very similar to JP Rangaswami’s notion of ‘data centric’:
“Information has no ‘place’ anymore, energy can’t be shipped around the world. We should rethink the meaning of a ‘data store’ and information will flow without flooding us. Linked Data might become the essence.”
Andreas’ second post followed after he’d listened to the most recent episode of the Semantic Web Gang, which I chair. During the show, recorded last month, we discussed the latest release from Thomson Reuters’ Open Calais activity, which sees it embrace Linked Data’s principles whilst continuing to run and grow a viable global business.
Andreas extrapolates from the conversation to suggest that a viable business model for the data-curating Enterprise might be to expose timely and accurate enrichments to the Linked Data ecosystem; enrichments that customers might pay a premium to access more quickly or in more convenient forms than are available for free. He also sees a market for application builders that optimise the flow of information, and both of these are certainly possible.
The Linked Data (or Data Web) opportunity is far greater, though, and too little attention is being devoted to it by Linked Data’s advocates as they concentrate their efforts on big public datasets of the sort Berners-Lee discussed in Long Beach last week. Big public data sets are important, and Berners-Lee is right to suggest that more Open and Linked access to the outputs of scholarship will help in our efforts to tackle many of the world’s ills. There is as much value locked up inside our commercial enterprises, though, and the rationale that will ultimately lead us to unlock it is quite different.
It is that rationale which we need to get right, almost certainly without mentioning ‘RDF’, ‘Semantic Web,’ or even ‘Open.’
And if you’re in Southern California next week too, why not come and say ‘Hi’…?
‘Sunrise over San Diego’ image © Alon Banks, 2007
Paul Miller works at the interface between the worlds of Cloud Computing and the Semantic Web, providing the insights that enable you to exploit the next wave as we approach the World Wide Database.
Comments
Paul,
The case for enterprise Linked Data is pretty old. Enterprises have always sought and invested in technologies that provide “Open Database Connectivity”.
A point of Linked Data inertia for enterprise, which isn’t technical or conceptual, lies in the perception that “put your data on the Web” literally means: put all your data on the Web. The message for enterprises should be more along the lines of: mesh relevant internal data with external data, and where you see economic value, publish. This is what you see with Calais, BBC, and more to come.
As I’ve articulated repeatedly, in relation to the enterprise, Linked Data is about “shrinking Data Source Names” (DSNs). DSNs are well known and understood in the enterprise realm; enterprises just need to know that, via the Linked Data meme, the following happens:
1. The DSN shrinks, i.e., it’s reduced to an ID or URI
2. The ID or URI is now scoped at the record level rather than container level (table or database)
3. Courtesy of HTTP intrinsics, the representation of the results of a query targeted at the ID or URI (DSN) is negotiable (you don’t need a new Report Writing tool to view data analysis, just put a URI in your browser and “self description” via the entity-attribute-value graph that is RDF does its thing).
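The negotiation described in point 3 can be sketched roughly as follows. This is only an illustration, not anything taken from the comment itself: the DBpedia URI, the `FORMATS` table and the `build_request` helper are assumptions chosen for the example, and whether a given server honours each `Accept` value depends entirely on that server. The sketch builds the requests without sending them.

```python
from urllib.request import Request

# Illustrative record-level identifier (an assumption for this sketch);
# any dereferenceable Linked Data URI could stand in for it.
RECORD_URI = "http://dbpedia.org/resource/San_Diego"

# Media types a client might ask for. The short names are hypothetical labels.
FORMATS = {
    "html": "text/html",               # human-readable page for a browser
    "rdfxml": "application/rdf+xml",   # RDF serialised as XML
    "turtle": "text/turtle",           # RDF serialised as Turtle
}


def build_request(uri: str, fmt: str) -> Request:
    """Return a GET request asking for one representation of `uri`."""
    return Request(uri, headers={"Accept": FORMATS[fmt]})


if __name__ == "__main__":
    # Same URI every time; only the requested representation changes.
    for name in FORMATS:
        req = build_request(RECORD_URI, name)
        print(req.get_method(), req.full_url, "Accept:", req.get_header("Accept"))
```

Passing one of these requests to `urllib.request.urlopen` would perform the actual dereference; the point of the sketch is simply that the identifier stays fixed while the client chooses the form in which the record comes back.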
Most importantly, how does Linked Data handle security (enterprise, Web, or a combination of both)? Until the recent foaf+ssl initiative, this was completely problematic for Linked Data purity (i.e., the follow-your-nose process of de-referencing hyperdata links). A simple example: how do I avoid inadvertently exposing customer or private data to the wrong people via Linked Data?
To conclude, enterprises will get on board with Linked Data because they’ve always thought they had linked data until they hit operating system, DBMS, application, and network lock-in. The ingenious use of HTTP to construct Entity IDs (URIs) solves that hurdle. The emergence of foaf+ssl addresses the security concerns (note: in the enterprise realm of ODBC, JDBC, etc., client-side certificates are rife and normal occurrences).
Links:
1. http://bit.ly/1DgFul – My Linked Data Planet Presentation (target audience: Enterprise)
2. http://www.slideshare.net/bblfish/building-secure-open-distributed-social-networks-presentation – foaf+ssl in a nutshell courtesy of Linked Data exploitation
3. http://esw.w3.org/topic/foaf+ssl – main information space re. this effort.
Kingsley
Thanks Paul for the dialogue. In the meantime, another idea came to mind for how Linked Data infrastructure could be used: for more accurate predictions, for instance in the financial world.
It might be interesting to hear Tom Tague’s opinion on whether OpenCalais + Reuters Content + LOD cloud in combination will be able to serve such purposes.
Andreas
I think I can hear Tom’s vehement response… but maybe I’ll let him deliver it…
Paul
Kingsley – good points, as usual…
Exactly – which is where I was going with
Paul
Andreas:
You win the prize. Your point is fundamental to some of the underlying strategy behind Calais. We have an exploding universe of content – from content publishers like Thomson Reuters to user generated content to social media to …
People want to make decisions, gain insights and understand context using all of it – but the trust you can put into different content sources varies wildly. So – if we can gain interoperability using (Calais + Thomson Reuters Data + Linked Data Cloud + Other content assets) – we set the stage to empower users to bring *everything* together for the task at hand.
Professionals trust Thomson Reuters content – so we’re confident we’ll have a role to play in all that. In fact – as we’re part of the trusted content universe then the more content the better.
Make sense?
Regards,
Tom
So when an enterprise determines there is value in the internal-external data mesh, they will publish using their URIs.
URIs are the digital brand insignia.
Basically, the size of the brand emblem is shrinking, but the opportunity space is increasing exponentially courtesy of Linked Data density.
For example, traditional media (TV, Radio, Newspapers) will feel lots of pain on the traditional media side (GIANT emblems such as TV screens, Radios, and Newspapers) but once they realize there is a natural (and relative) “master-details” relationship exploitable within the Web (courtesy of Linked Data), they will get with the program, as demonstrated by Calais already.
Units of “context” separated by quality, exposed by URIs, is how this will eventually play out (enterprise and personal fronts).
Kingsley
Edited version:
So when an enterprise determines there is value to be exploited via the selected internal-external data mesh, they will expose units of value via published URIs.
URIs are the digital brand insignia.
Basically, the size of the brand emblem is shrinking, but the opportunity space is increasing exponentially, courtesy of Linked Data density.
For example, traditional media (TV, Radio, Newspapers) will feel lots of pain on the traditional media side (GIANT emblems such as TV screens, Radios, and Newspapers) but once they realize there is a natural (and relative) “master-details” relationship in play, exploitable within the Web (courtesy of Linked Data), they will get with the program, as demonstrated by the Calais effort.
“Units of context” separated by quality and associated with URIs is how this will eventually play out (enterprise and personal fronts), market and economy wise.
Kingsley