
Before going any further, let’s get a few things crystal clear:

  1. The recent success of the Linked Data meme is long overdue, very welcome, and entirely capable of carrying the Web of Data far beyond its current niche adherents. A lot of my current work involves arguing that more organisations should adopt this approach;
  2. The Resource Description Framework, RDF, is a key, and powerful, piece in W3C’s Semantic Web Architecture. Since its earliest days, I have played various parts in advocating the potential of RDF and will continue to do so;
  3. RDF is an obvious means of publishing — and consuming — Linked Data powerfully, flexibly, and interoperably. I will continue to argue this, and to advocate its wider adoption.

So far, so good.

The problem, I contend, comes when well-meaning and knowledgeable advocates of both Linked Data and RDF conflate the two and infer, imply or assert that ‘Linked Data’ can only be Linked Data if expressed in RDF.

This dogmatism makes me deeply uncomfortable, and I find myself unable to agree with the underlying premise.

The rest of this post attempts to explain why, hopefully more lucidly than I or those with whom I was debating managed on Friday evening via the largely unsuitable medium of the 140-character tweet.

Andy Powell started things off lucidly enough on Friday, asking:

“is there an agreed name for an approach that adopts the 4 principles of #linkeddata minus the phrase, ‘using the standards (RDF, SPARQL)’ ??”

I was amongst those to respond, suggesting, as I usually do, that:

“well, personally, I’d argue that Linked Data does NOT require that phrase. But I know others disagree…  ;-)”

Other pieces of that conversation can be extracted from the stream; start by scrolling to the bottom, find Andy’s tweet, and work back toward the top.

It’s worth noting that two of those arguing most vehemently against me were former colleagues Ian Davis and Leigh Dodds. I have massive respect for the technical prowess of both (which is certainly greater than my own), and have learned a great deal from Ian in particular over the years that we have known one another. This issue, though, is one on which we have long disagreed, and it was interesting to see the subject of many a difference of opinion in the bars of various conference hotels spill into this public arena.

Anyway, now let me try to explain what I meant.

Perhaps the most commonly cited definition for Linked Data is the one to which Andy was referring; Sir Tim Berners-Lee’s Linked Data – Design Issues document. It’s worth noting that this document is clearly flagged (in the current version amended on 18 June 2009, at least) as being both a ‘personal view only’ and ‘imperfect but published.’ So a very long way from being a ‘standard,’ ‘specification,’ or ‘definition,’ but certainly still a pretty good starting point, and one to which I often direct clients and others.

Berners-Lee begins,

“The Semantic Web isn’t just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data.”

(my emphasis)

That sounds good, doesn’t it? Indeed, we talked about that on the Linked Data panel I moderated at the recent Semantic Technology Conference, and I’ve embedded the video here.

It is the next section of Berners-Lee’s document that is used to validate the view that Linked Data needs RDF:

“1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names

3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

4. Include links to other URIs. so that they can discover more things.

(my emphasis)”

On one reading, an unambiguous validation of the view with which I disagree. On another, a suggestion of best practice, expressed as part of a ‘personal view’ with which we are perfectly entitled to take issue.
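To make that second reading concrete, here is a minimal sketch of what following principles one, two and four without RDF might look like. It is only an illustration, not a proposal for a format: the example.org URIs, the key names (`id`, `links`) and the helper functions are all hypothetical, and plain JSON is simply one of many non-RDF serialisations someone might choose.

```python
import json
from urllib.parse import urlparse

# Hypothetical description of a thing, serialised as plain JSON rather than RDF.
# The example.org URIs and the "id" / "links" key names are invented for illustration.
description = {
    "id": "http://example.org/id/person/alice",
    "name": "Alice Example",
    "links": {
        "worksFor": "http://example.org/id/organisation/acme",
        "basedNear": "http://example.org/id/place/edinburgh",
    },
}


def is_http_uri(value):
    """Return True if value is a string that parses as an http(s) URI with a host."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def follows_link_principles(doc):
    """Check principles 1, 2 and 4 only: the thing is named by an HTTP URI
    and the description links to at least one other HTTP URI."""
    named = is_http_uri(doc.get("id"))
    other_links = [
        uri
        for uri in doc.get("links", {}).values()
        if is_http_uri(uri) and uri != doc.get("id")
    ]
    return named and len(other_links) > 0


print(json.dumps(description, indent=2))
print(follows_link_principles(description))
```

Whether a description like this deserves the name ‘Linked Data’ is exactly the point in dispute; the sketch only shows that the naming and linking principles can be met without touching principle three’s parenthesis.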

Would the zealots be calmed by the simple insertion of ‘preferably’ or ‘ideally,’ immediately after point three’s second comma? Maybe. Or perhaps the fires of Linked Data’s self-appointed Inquisition would be stoked for Berners-Lee himself.

Talk of Linked Data, Open Data, the Web of Data and related concepts in recent years has led to a quite remarkable shift in attitude amongst individuals, public bodies and private corporations. Almost everywhere my work takes me, clever people are seriously grappling with the implications of consuming from or contributing to these emerging ecosystems. Not all of their questions have good answers, and not all of the technological, strategic and business implications have necessarily been fully worked through. But these people are asking the questions, and they are asking them in all seriousness. That is a dramatic and welcome shift.

Some, such as the BBC, Thomson Reuters and the UK Government’s Central Office of Information, are sufficiently persuaded of the benefits to take risks, opening up what was previously closed and taking a lead. Others will follow, as fears are assuaged, doubts eased, and benefits realised.

Despite this undoubted progress, the green shoots of a Linked Data ecology remain delicate. By moving from a message that stresses the value of unambiguous and web-addressable naming (HTTP URIs), providing ‘useful information,’ and enabling people to ‘discover more things’ by linking, toward a message that elevates one of the best mechanisms (RDF) for achieving this to become the only permissible approach, we do the broader aims great harm.

Yes, those already in the club will probably be very pleased with the purity and functionality of the toys in their playground. But they will have barred a far larger group with data to share, a willingness to learn, and an enthusiasm to engage. At best, they will have slowed the growth of the pool of Linked Data quite dramatically. At worst, they will have created an increasingly irrelevant backwater that more pragmatic people will simply route around. Perhaps, in their pragmatism, those people will now never look seriously at RDF and its power, scared away by the fervour of those who sought to elevate it too high, and too fast.

What are we after? More Linked Data, or more RDF? I sincerely hope it’s the former.

So let’s see loads more Linked Data, and plenty of evangelism as to why RDF could be the best way to do it. But let’s not ostracise the vast majority of potential participants, contributors and beneficiaries in the world of Linked Data, just because they haven’t wholeheartedly embraced RDF yet.
