
Thank you to everyone who took the time to share a wide range of views in response to yesterday’s post, whether in its comments, on Twitter, or out on your own blogs. Although reduced to silence throughout the day because of other commitments, I have been reading and learning from all of you. And, despite the sometimes intemperate language of my original post, your contributions have all been thoughtful, measured, and informative.

Several comments raised the duality of RDF: RDF the model and RDF the format (which can itself be expressed in more forms than the RDF/XML that most people think of). Kingsley’s right, of course, when he asks:

“Is RDF a Data Model or a Format re this discussion. The answer to this question is of utmost importance re coherence.”

Honestly, I’m not sure that I know which it was meant to be… but I can fairly safely suggest that the concerns I expressed become increasingly pronounced as we move from ‘model’ toward ‘format.’ I’m still worried about insisting upon the RDF model in anything other than its loosest sense, but can at least see a glimmer of justification for doing so… whereas insisting upon the format seems several steps too far.
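To make the model/format distinction concrete, here is a minimal sketch in Python. It assumes made-up example.org URIs, and the JSON layout is an ad hoc illustration of my own rather than any standard RDF serialisation. The point is only that one statement in the model can be written down in more than one format and read back unchanged.

```python
import json

# One statement in the abstract model: (subject, predicate, object).
# These example.org URIs are illustrative placeholders, not real identifiers.
TRIPLE = (
    "http://example.org/people/alice",
    "http://example.org/vocab/knows",
    "http://example.org/people/bob",
)


def to_ntriples(triple):
    """Write a URI-only triple as a single N-Triples line."""
    subject, predicate, obj = triple
    return f"<{subject}> <{predicate}> <{obj}> ."


def to_adhoc_json(triple):
    """Write the same triple in an ad hoc JSON layout (hypothetical, not JSON-LD)."""
    subject, predicate, obj = triple
    return json.dumps({"subject": subject, "predicate": predicate, "object": obj})


def from_adhoc_json(text):
    """Read the ad hoc JSON layout back into a (subject, predicate, object) tuple."""
    data = json.loads(text)
    return (data["subject"], data["predicate"], data["object"])


# Two different formats, one underlying statement.
as_ntriples = to_ntriples(TRIPLE)
as_json = to_adhoc_json(TRIPLE)
round_tripped = from_adhoc_json(as_json)
```

Both strings carry the same statement; only the syntax differs, which is why insisting on a particular format is a stronger demand than insisting on the model.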

I also liked the simplicity with which Alan Dix and Elliot Smith responded to Rob Styles’ ‘Paul Miller is right… and so is Ian Davis,’ writing:

“Surely the critical issue is whether the semantics are available, not whether they are in RDF. If a csv file is published AND suitable semantics are available, then you know which columns are URIs or whatever else.”

and

“if you publish data on the web and a suitable semantics for interpreting that data and linking it to other data, then why isn’t it Linked Data? It just so happens that RDF has a clear(er) semantics describing the interpretation of its data elements (URIs in particular) than a spreadsheet does; it doesn’t mean you couldn’t apply similar semantics to a spreadsheet if you were so inclined.”

Indeed.
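As a rough sketch of what ‘a csv file AND suitable semantics’ might look like in practice, the Python below pairs a small CSV with a separately published description of its columns. Everything here is an assumption made for illustration: the data, the example.org URIs, and the shape of the `SEMANTICS` dictionary are all invented, and this is not a standard metadata format.

```python
import csv
import io

# A small CSV as it might be published on the web (invented data).
CSV_TEXT = """id,name,birthplace
http://example.org/people/alice,Alice,http://example.org/places/leeds
http://example.org/people/bob,Bob,http://example.org/places/york
"""

# Hypothetical, separately published semantics for that CSV: which column
# identifies the thing described, what each other column means, and
# whether its values are URIs (links) or plain text.
SEMANTICS = {
    "subject_column": "id",
    "columns": {
        "name": {
            "property": "http://example.org/vocab/name",
            "is_uri": False,
        },
        "birthplace": {
            "property": "http://example.org/vocab/birthplace",
            "is_uri": True,
        },
    },
}


def statements(csv_text, semantics):
    """Return (subject, property, value, is_uri) tuples for every described cell."""
    result = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = row[semantics["subject_column"]]
        for column, meaning in semantics["columns"].items():
            result.append(
                (subject, meaning["property"], row[column], meaning["is_uri"])
            )
    return result


all_statements = statements(CSV_TEXT, SEMANTICS)

# The links to other resources are simply the statements whose values are URIs.
links = [(s, p, o) for s, p, o, is_uri in all_statements if is_uri]
```

With the semantics to hand, a consumer knows which cells are links and what each link means, without the data itself ever having been converted to an RDF syntax.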

Although I actually agree with every single word, Justin Leavesley’s comment possibly gets close to the nub of things:

“Yes the same mistake was made with the rise of the web.

Once you had URIs and HTTP you already had plain text which is a perfectly good way to encode content. By adopting the STANDARD convention of HTML, all sort of existing text based formats with their various mark ups were locked out. That locked out a lot of content that already existed and required anyone who wanted to play to convert existing content into a html format.

Of course it did have the small side effect that to consume web content you only needed a browser that understood one convention i.e. html.

The same is true of RDF. XML is the equivalent of ascii in this regard. Sure it is a good way to write down data, but it isn’t sufficient to actually use that data unless you understand the various special conventions.

RDF gives you a standard way to understand TYPES of data that you have never seen before. You simply cannot do that with XML alone. You must build a convention at a different level from syntax, which can be expressed as XML. We have, its called RDF!

Ask yourself the question. Why hasn’t the linking of data taken off before? If there is all this data out there, why didn’t it just get linked together?

Because linking between different conventions isn’t very useful.

The problem has never been the linking of data, that is easy as soon as you have URIs. It is meaningfully linking different data so that you have something useful not just a mess. This itself then pulls in more data. Just as we have seen with the growth of the web and just as we are now seeing with the growth of ‘linked data’”

There are surely far more failed attempts to constrain prematurely in the name of ‘standardisation’ than successful ones. If we’re trying to grow and nurture a market (in more senses than just the commercial), shouldn’t we be more permissive? I’d far rather be engaged in ‘selling’ (to maintain the market metaphor) the benefits of RDF than apologising for its imposition, wouldn’t you?

RDF (definitely the syntax, possibly the model) is a point in time solution to a set of problems that we collectively consider worthy of resolution. The problems will still be there — and hopefully still worthy of resolution — long after the next technical solution has come along.

A lot of the comments, too, talk about ‘converting’ to RDF. Toby, for example, writes:

“Linked data certainly needs to be *linked*, and after that, it’s pretty important to describe the relationship that each link between resources represents (i.e. “this is a link to a parent resource”, “this is link to a resource that represents a place nearby to this place”).

Once you have that, the idea of a triple emerges almost by itself, and what you have is suddenly starting to look very much like RDF. If your format is not RDF, then it’s likely to be convertable to RDF fairly trivially.”

Yes… but if, for the sake of argument, I happen to have ‘the idea of a triple’ in some other form, I may not want or need to convert. RDF is a solution, not the end-goal.

Alan Morrison says something similar, again assuming (?) RDF to be something more than it necessarily is:

“The RDF family provides a metadata umbrella that non-RDF can fit under. It’s possible to avoid religious arguments by allowing alternatives as long as they can be converted to fit under the umbrella.”

Finally, for now, Bruce D’Arcus writes,

“The microdata in HTML5 discussions suggests to me that the first thing that goes out the window when you accept RDF as optional (or more typically, a more pejorative unneeded overkill) is ironically the feature most important to both RDF and linked data: the URI (microdata allows one to use string or reverse DNS identifiers instead for property names and types).”

I’d like to learn more about that, and understand the forces at play there…

And after all the comment and discussion… I’m still convinced that RDF’s model and format are important and useful, and still convinced that they should not be mandatory for Linked Data. Mandatory for ‘Linked Data in RDF,’ yes. Mandatory for ‘Linked Data,’ no.
